Anytopo#2598
Conversation
|
Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
|
In order to merge the PR, the tests will need to be part of the main merlin testing. Go ahead and move them there and we can review the code and release the PR for testing. |
|
There are some known issues with the GitHub macOS runners and how they interact with SST. You can safely ignore the results of the GitHub macOS builders for now. They are experimental and the official testing is done on other machines. |
|
Looks like that those issues were actually fixed a week or so ago. I dug a little deeper and it looks like you introduced a dependance on a third-party library that isn't installed on those systems (NetworkX). You'll need to do a check for that and not load it if it's not there (i.e. anytopo won't be available if NetworkX isn't there. The polarfly and polarstar topos have a similar issue with a different library. Give me a bit to look through the code and I can send some pointers on the best way to do this. |
| from collections import defaultdict | ||
| from typing import Callable, Optional | ||
|
|
||
| try: |
There was a problem hiding this comment.
This looks like the code segment that is failing when networks isn't installed. It looks like this code is executing when the module is loaded, so the ImportError below ends up failing in the module loading. So, this causes every simulation that imports sst.merlin to fail, not just those that use topoany. Probably need to set a variable if the modules are missing and fail when trying to use the topology instead of at module load time.
|
Looks like polarstar and polarfly depend on the same third-party Python libraries as anytopo, so we already have builders set up with those libraries, so anytopo should test on those and be skipped on all the ones without those libraries. |
|
Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
Ok that's good news. Thanks for checking. Another question is that some of my tests require extra python libraries, for example I imported the library "galois" for generating the 'Slimfly; topology. This import is not built with SST, but only is required when running the test. |
|
You'll need to add an @unittest.skipIf similar to what the polar* topology tests do so that they don't try to execute if the libraries aren't found. Note that the testsuite_Default_merlin.py file also tries to load the required libraries so that they can check if they are in sys.modules. |
560ee03 to
18399d4
Compare
|
Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
|
I have added new tests for Anytopo, and all tests passed. So I think this PR is ready for review. |
|
Sorry, with the SC conference last week, Thanksgiving this week and vacation next week, it will be a bit before I can dive deep into this PR. I took a quick glance through it and didn't see anything that immediately stood out as an issue. I'll revisit this the week of December 8th and do a complete review. |
|
Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
|
Sorry for the delay, things got a bit hectic during December. I will be looking at this PR this week, though it may take me a bit to get through since I'll have to remind myself of the details in the merlin code. |
| class ExtendedRequest : public SST::Interfaces::SimpleNetwork::Request { | ||
| private: | ||
| // Map from plugin name/key to metadata | ||
| std::map<std::string, std::shared_ptr<PluginMetadata>> metadata; |
There was a problem hiding this comment.
Using shard_ptr in classes that end up in events is not supported since std::shared_ptr is not serializable for events used in synchronization. This is because pointers are not tracked during event serialization, so nothing will end up being shared outside of the rank it was created on. I know this will cause some memory increase, but, in general, elements in the merlin library are assumed to be able to run across multiple ranks. I'm going to continue to review the code, but I think this will need to be changed before we can merge it.
| { | ||
| public: | ||
| // Static shared data accessible by both SourceRoutingPlugin and topology classes | ||
| static std::map<int, int> endpoint_to_router_map; // Shared endpoint-to-router mapping |
There was a problem hiding this comment.
Generally, static variables shouldn't be used in SubComponents. They should use the SharedObjects provided by the core. Since both of the classes that need access are derived from SubComponent, they will have access to those APIs.
There was a problem hiding this comment.
Ok, a minor question:
these SharedObjects support array of common data types (int, bool, ...) right?
Can we also share arbitrary objects like mutex of iostream?
There was a problem hiding this comment.
The SharedObjects pretty much support any Object that is serializable, which means that a mutex would not be supported. If you need to have a shared Mutex, then that will still need to be a static variable. I/O generally doesn't fall under the same restrictions as variables used during the simulation.
|
Hi, no worries, we are not in any hurry. Recently I have been thinking about the design of this branch. |
|
Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
|
Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
|
Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
| void serialize_order(SST::Core::Serialization::serializer &ser) override { | ||
| Request::serialize_order(ser); | ||
|
|
||
| // Manually serialize the metadata map since std::variant isn't directly supported |
There was a problem hiding this comment.
std::variant is supported by the serializer now, but it was developed primarily for checkpointing and may not work properly with pointer tracking turned off. I'd have to look through that code to know if pointer tracking matters. It may be worth trying to see if you can just serialize the vector directly.
Co-authored-by: Copilot <copilot@github.com>
…ing.h/cc" -> "sourceRouting.h/cc"
8efb1e9 to
cb8dbc2
Compare
|
Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
|
I have updated the branch, but the CI action is not triggered.. I don't know why. |
|
I have built and ran tests locally, all passed |
|
Status Flag 'Pre-Test Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED by label AT: PRE-TEST INSPECTED! Autotester is Removing Label; this inspection will remain valid until a new commit to source branch is performed. |
|
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements
Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_Make-Dist
Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MT-2
Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MR-2
Build InformationTest Name: SST__Autotest_OSX-15-XC16_OMPI-4.1.6_PY3.10_sst-elements
Using Repos:
Pull Request Author: ziyuezzy |
|
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 8 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Job: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements
Job: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_Make-Dist
Job: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MT-2
Job: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MR-2
Job: SST__Autotest_OSX-15-XC16_OMPI-4.1.6_PY3.10_sst-elements
|
|
Looks like at least one header file (ExtendedRequest.h) is missing from Makefile.am so the make dist test is failing. Please make sure any files that were added are in Makefile.am. If you have any questions about where specific files belong in the file, let me know. |
|
Status Flag 'Pre-Test Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED by label AT: PRE-TEST INSPECTED! Autotester is Removing Label; this inspection will remain valid until a new commit to source branch is performed. |
|
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements
Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_Make-Dist
Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MT-2
Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MR-2
Build InformationTest Name: SST__Autotest_OSX-15-XC16_OMPI-4.1.6_PY3.10_sst-elements
Using Repos:
Pull Request Author: ziyuezzy |
|
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED Pull Request Auto Testing has PASSED (click to expand)Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements
Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_Make-Dist
Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MT-2
Build InformationTest Name: SST__Autotest_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MR-2
Build InformationTest Name: SST__Autotest_OSX-15-XC16_OMPI-4.1.6_PY3.10_sst-elements
|
|
Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
|
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
cb8dbc2 to
976b77d
Compare
|
Status Flag 'Pre-Test Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |

Hi Devs,
This is a pull request, previously mentioned in issue #2584
There are documentations of this work in
"sst-elements-src/src/sst/elements/merlin/interfaces/endpointNIC/documentations/README.md", and"sst-elements-src/src/sst/elements/merlin/topology/anytopo_ultility/README.md"In a few words, this branch extends the Merlin module in SST-element, such that any network topology can be built with an input networkx graph. Another key feature is to support source routing (the endpoint NIC determines the routed paths). And a few popular HPC network topologies (dragonfly, slimfly, polarfly, jellyfish) are already defined in
"sst-elements-src/src/sst/elements/merlin/topology/anytopo_ultility/"dir, together with some tests in"sst-elements-src/src/sst/elements/merlin/topology/anytopo_ultility/tests".If you would like to accept this pull requests, maybe we can move these tests to the main merlin test dir?
It worth mention that the reorderedlinkcontrol has been modified to fit in the new 'ExtendedRequest' framework, this has been included in tests as well, see
EndpointNIC(use_reorderLinkControl=True ...in the tests.I believe that this pull request will have the following contribution to Merlin:
Please let me know if there is any feedback/comments/questions for the code.
Best regards,
Z.