Phylanx Minimum Viable Product #1281
Comments
@stevenrbrandt I have implemented an event filter, but I need to figure out how to integrate it with the HPX build. I use an external JSON library for parsing. I think the library is already part of the build, but I need to figure out the include path support for CMake. See UO-OACISS/apex@87d26c8 for some info.
@khuck I see. Is there also an env variable?
@stevenrbrandt OK, I tested the filter with HPX and all appears to work. Yes, there's an environment variable. It currently has a default value of empty string, so if you want that to be defaulted, let me know. The variable is:
export APEX_EVENT_FILTER_FILE=/mnt/nvme_superfast/scratch/khuck/hpx/build/filter.json
and for my test, the file had the contents:
{
"include": ["fibonacci.*"]
}
...which matches a string that starts with 'fibonacci' and has any number of additional characters. It is the usual C++ regular expression syntax. If you want to exclude all timers that start with 'fibonacci', you would have a file with:
{
"exclude": ["fibonacci.*"]
}
...and you can mix the two. I haven't really optimized the implementation, but I will if I see that it is introducing too much overhead. Any optimizations shouldn't change this interface, though. For Phylanx, it might be interesting to do a test where we only include timers that start with |
...and this is in the |
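To make the include/exclude behaviour described above concrete, here is a minimal Python sketch of how such a filter could be evaluated. It is not the APEX implementation. It assumes that a pattern must match the whole timer name, that an empty include list means "keep everything", and that exclude wins over include when the two are mixed. The thread does not confirm any of these. The helper names (`load_filter`, `keeps`) and the sample timer names are hypothetical.

```python
import json
import re


def load_filter(text):
    """Parse filter JSON text into compiled include and exclude pattern lists."""
    spec = json.loads(text)
    include = [re.compile(p) for p in spec.get("include", [])]
    exclude = [re.compile(p) for p in spec.get("exclude", [])]
    return include, exclude


def keeps(name, include, exclude):
    """Return True if a timer called `name` would pass the filter."""
    # Assumption: a pattern has to match the entire name, not a substring.
    if any(p.fullmatch(name) for p in exclude):
        return False  # Assumption: exclude takes precedence over include.
    if include:
        return any(p.fullmatch(name) for p in include)
    return True  # Assumption: no include list means nothing is filtered out.


# Hypothetical filter that mixes both keys.
FILTER_TEXT = '{"include": ["fibonacci.*"], "exclude": ["fibonacci_serial.*"]}'
include, exclude = load_filter(FILTER_TEXT)

# Hypothetical timer names, for illustration only.
timers = ["fibonacci", "fibonacci_future", "fibonacci_serial", "hpx_main"]
kept = [t for t in timers if keeps(t, include, exclude)]
print(kept)
```

With these assumptions, only the two `fibonacci` timers that are not excluded survive. If APEX resolves mixed include/exclude lists differently, only the `keeps` function would need to change.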
@khuck the filter works!
[pops champagne bottle]
At the moment, Phylanx isn't usable. However, I don't think the number of issues that need to be solved to get us to a minimum viable product is that many. There are a lot of other features that I want and think are important, but these 7 are all critical.
1. Apex integration needs to be fixed. At this point PhySL + APEX + mpirun = code that hangs. This means that JetLag no longer works.
2. We need the ability to filter out events from OTF2. In part, this is a stop-gap until (3) can be fixed.
3. Traveler needs to scale to millions of events in OTF2. While I think we can fix (1) and (2) readily enough, I suspect this issue will take longer. No one can use this tool for anything real unless/until this is possible.
4. We need a way to create distributed arrays that persist across calls to Phylanx. We shouldn't have to set everything up all over again with each call. It's hard for me to imagine anyone still wanting to use the tool if they need to work around this limitation.
5. We need to fill in the missing distributed array creation primitives. See ticket. At the moment, one can't create a distributed array from an ordinary one (Note: one can hack this feature by saving with file_write_csv() and loading with file_read_csv_d(), but we should create a real solution). There are probably lots of other types of primitives missing as well, but I don't think we'll have a handle on what those are until (4) and (5) are fixed.
6. We need to have a successful multi-node run (not just a multi-locality run). I'm not 100% sure we've ever had one. I think @NanmiaoWu's runs were actually all on a single node, even though she asked srun for multiple nodes. One of the things that would help here is to ask localities to tell us what host they are running on. (Note: If file_read() had an option to return the contents of a file as a string, we could read /etc/hostname.) (hostname() was added by Adding hostname() primitive #1283.)
7. We need tests that ensure the above items remain fixed. I don't think CircleCI can properly cover them. It will be tricky to figure something out.
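As a rough illustration of the multi-node check in item 6, the sketch below groups localities by the host name each one reports and flags a run as multi-node only if more than one distinct host appears. It is plain Python and does not call Phylanx. The `(locality_id, hostname)` pairs are made-up sample data, and how they would be collected (for example from the hostname() primitive) is left open here.

```python
from collections import defaultdict


def group_by_host(reports):
    """Map each host name to the sorted list of localities that reported it."""
    hosts = defaultdict(list)
    for locality, host in reports:
        hosts[host].append(locality)
    return {host: sorted(ids) for host, ids in hosts.items()}


def is_multi_node(reports):
    """True only if the localities are spread over more than one host."""
    return len(group_by_host(reports)) > 1


# Hypothetical reports: four localities that all landed on one node,
# which is the situation suspected for the earlier srun runs.
single_node = [(0, "node01"), (1, "node01"), (2, "node01"), (3, "node01")]

# Hypothetical reports: four localities spread over two nodes.
multi_node = [(0, "node01"), (1, "node01"), (2, "node02"), (3, "node02")]

print(is_multi_node(single_node))  # False
print(is_multi_node(multi_node))   # True
print(group_by_host(multi_node))
```

A check along these lines could also serve as the assertion in a regression test for item 7, provided the test environment can actually allocate more than one node.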