Phylanx Minimum Viable Product #1281

Open
4 of 7 tasks
stevenrbrandt opened this issue Oct 7, 2020 · 6 comments

@stevenrbrandt
Member

stevenrbrandt commented Oct 7, 2020

At the moment, Phylanx isn't usable. However, I don't think many issues need to be solved to get us to a minimum viable product. There are a lot of other features that I want and think are important, but these 7 are all critical.

  1. APEX integration needs to be fixed. At this point PhySL + APEX + mpirun = code that hangs. This means that JetLag no longer works.

  2. We need the ability to filter out events from OTF2. In part, this is a stop-gap until (3) can be fixed.

  3. Traveler needs to scale to millions of events in OTF2. While I think we can fix (1) and (2) readily enough, I suspect this issue will take longer. No one can use this tool for anything real unless/until this is possible.

  4. We need a way to create distributed arrays that persist across calls to Phylanx. We shouldn't have to set everything up all over again with each call. It's hard for me to imagine anyone still wanting to use the tool if they need to work around this limitation.

  5. We need to fill in the missing distributed array creation primitives. See ticket. At the moment, one can't create a distributed array from an ordinary one. (Note: one can hack this feature by saving with file_write_csv() and loading with file_read_csv_d(), but we should create a real solution.) There are probably lots of other types of primitives missing as well, but I don't think we'll have a handle on what those are until (4) and (5) are fixed.

  6. We need to have a successful multi-node run (not just a multi-locality run). I'm not 100% sure we've ever had one. I think @NanmiaoWu's runs were actually all on a single node, even though she asked srun for multiple nodes. One of the things that would help here is to ask localities to tell us what host they are running on. (Note: if file_read() had an option to return the contents of a file as a string, we could read /etc/hostname.) (hostname() was added by #1283, "Adding hostname() primitive".)

  7. We need tests that ensure the above items remain fixed. I don't think CircleCI can properly cover them. It will be tricky to figure something out.
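As a rough illustration of the multi-node check described above, here is a minimal Python sketch that works outside Phylanx: each process launched by srun prints its rank and host name, and a small helper then counts how many distinct hosts appear. The function names are hypothetical, and reading the rank from the SLURM_PROCID environment variable is an assumption about the launcher. This is not the hostname() primitive from #1283, only a way to sanity-check the job placement itself.

```python
import os
import socket
from collections import Counter


def describe_rank():
    """Return 'rank<TAB>hostname' for the current process.

    Assumption: the job was started by srun, which sets SLURM_PROCID.
    Falls back to rank "0" when the variable is missing.
    """
    rank = os.environ.get("SLURM_PROCID", "0")
    return f"{rank}\t{socket.gethostname()}"


def summarize_hosts(lines):
    """Count processes per host from 'rank<TAB>hostname' lines."""
    hosts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the collected output
        _rank, host = line.split("\t", 1)
        hosts[host] += 1
    return hosts


def is_multi_node(lines):
    """True only if the collected lines name more than one distinct host."""
    return len(summarize_hosts(lines)) > 1


if __name__ == "__main__":
    # Run once per process; collect the output and pass it to is_multi_node().
    print(describe_rank())
```

If every line reports the same host, the run was multi-locality but not multi-node, which is exactly the situation suspected above.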

@khuck
Contributor

khuck commented Oct 13, 2020

@stevenrbrandt I have implemented an event filter, but I need to figure out how to integrate it with the HPX build, because I use an external JSON library for parsing. I think the library is already part of the build, but I need to figure out the include path support for CMake. See UO-OACISS/apex@87d26c8 for some info.

@stevenrbrandt
Member Author

@khuck I see. Is there also an env variable?

@khuck
Contributor

khuck commented Oct 13, 2020

@stevenrbrandt OK, I tested the filter with HPX and all appears to work. Yes, there's an environment variable. It currently defaults to an empty string, so if you want a different default, let me know. The variable is:

export APEX_EVENT_FILTER_FILE=/mnt/nvme_superfast/scratch/khuck/hpx/build/filter.json

and for my test, the file had the contents:

{
    "include": ["fibonacci.*"]
}

...which matches any string that starts with 'fibonacci' followed by any number of additional characters. It is the usual C++ regular expression syntax. If you want to exclude all timers that start with 'fibonacci', you would have a file with:

{
    "exclude": ["fibonacci.*"]
}

...and you can mix the two.

I haven't really optimized the implementation, but I will if I see that it is introducing too much overhead. Any optimizations shouldn't change this interface, though. For Phylanx, it might be interesting to do a test where we only include timers that start with '/phylanx.*' and hide everything else, although that would hide the HPX activity under the covers.
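To make the proposed Phylanx test concrete, here is a small Python sketch that writes a filter file in the JSON layout shown above and previews which timer names a given pattern list would keep. It is not APEX code. The helper names are hypothetical, the timer names are made up, and two points of semantics are assumptions: that a pattern must match the whole timer name, and that an exclude match wins when both lists are present.

```python
import json
import os
import re
import tempfile


def write_filter_file(path, include=None, exclude=None):
    """Write a filter file with optional "include" and "exclude" lists."""
    spec = {}
    if include:
        spec["include"] = list(include)
    if exclude:
        spec["exclude"] = list(exclude)
    with open(path, "w") as handle:
        json.dump(spec, handle, indent=4)
    return spec


def would_keep(timer_name, spec):
    """Preview the filter for one timer name (assumed semantics, see above)."""

    def matches(patterns):
        # Assumption: a pattern has to match the entire timer name.
        return any(re.fullmatch(p, timer_name) for p in patterns)

    # Assumption: an exclude match always drops the timer.
    if matches(spec.get("exclude", [])):
        return False
    # Assumption: with no include list, everything not excluded is kept.
    include = spec.get("include")
    return True if include is None else matches(include)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "filter.json")
    spec = write_filter_file(path, include=["/phylanx.*"])
    # Only processes launched from this script would see the variable.
    os.environ["APEX_EVENT_FILTER_FILE"] = path
    for name in ["/phylanx/example_primitive", "example_hpx_task"]:  # made-up names
        print(name, would_keep(name, spec))
```

Python's re module and C++ std::regex agree on simple patterns such as these, but they are different engines, so more elaborate patterns should be checked against APEX itself.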

@khuck
Contributor

khuck commented Oct 13, 2020

...and this is in the develop branch of APEX.

@stevenrbrandt
Member Author

@khuck the filter works!

@khuck
Contributor

khuck commented Nov 3, 2020

[pops champagne bottle]
