Multiple Instances of Simulation Objects #20
Replies: 2 comments
-
I'll add that my experience working with @clorton's code, which uses instantiated classes, was pretty good.
-
It sounds like we do foresee a need to run multiple simulations within a single process, and thus would need multiple instances of the simulation, demographics, settings, and other objects. I was noting that we thought we would do this in EMOD but in practice never really did, and so was wondering whether that lesson could apply here. @krosenfeld-IDM notes that LASER isn't EMOD, so not all lessons can or should apply. But I think we'd be remiss if we didn't try to apply as much learning as possible from past experience.
I suspect we will not be doing this in LASER when running at scale, remotely on a cluster. I imagine we'll be running one sim per process there (indeed, I think that's built into the nature of COMPS), so this would be for running locally. But we do want to be able to do a lot locally. I still have some pause: with LASER we are often running very large populations by design, with big memory impacts, so I will be surprised if we end up running many simulations (e.g., sweeps) in a single process. The branching-trajectories use case, though, would be sequential.
As for the concerns about hidden dependencies, that was interesting to read about. In my career I've never actually encountered such issues, so I must confess it feels a bit theoretical to me. I would still love for us to stick with a functional programming approach to make the codebase accessible to more people, and I will continue to argue that our ever-growing R-inclined community will appreciate this. But it sounds like, on net, "level 3 OOP" (see wiki) is what we need for what we are building for LASER.
-
From @jonathanhhb 's write-up about Consciously-Choosing-OOP‐ness:
It's helpful to hear about the experience with EMOD/DTK, but I'm not sure that using standard Python modules represents all cost and no benefit (where the implied benefit is being able to have multiple simulation instances). I think there is potentially a lot of benefit when it comes to someone being able to understand, use, and expand the code.
It may take some effort to organize around the class instance rather than rely on the global instantiations allowed by, e.g., singletons, but this organization can provide key structure that makes code more readable, comprehensible, and ultimately usable. I have an example here of how having a single instance can enable really surprising behavior; see this repo. But I'm also not sure that avoiding singletons will be harder! Python is most commonly structured around instantiated classes, so my prior is that it would, in fact, be easier.
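To make the hidden-dependency concern concrete, here is a minimal sketch contrasting a module-level singleton with instantiated classes. Every name in it (`Settings`, `GLOBAL_SETTINGS`, `Simulation`, `beta`) is hypothetical and chosen for illustration only; none of it is LASER's actual API.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    """Hypothetical settings object (illustrative only, not LASER's API)."""
    beta: float = 0.25
    population: int = 1000


# Singleton style: one module-level instance shared by every caller.
GLOBAL_SETTINGS = Settings()


def infections_global(infected: int) -> float:
    # Hidden dependency: the result depends on GLOBAL_SETTINGS,
    # which the function signature never mentions.
    return GLOBAL_SETTINGS.beta * infected


class Simulation:
    """Instance style: each simulation owns its settings explicitly."""

    def __init__(self, settings: Settings):
        self.settings = settings

    def infections(self, infected: int) -> float:
        return self.settings.beta * infected


# With instances, two differently parameterized simulations coexist.
sim_a = Simulation(Settings(beta=0.25))
sim_b = Simulation(Settings(beta=0.5))
result_a = sim_a.infections(10)
result_b = sim_b.infections(10)

# With the singleton, a change made for one run silently affects
# every other caller of infections_global().
before = infections_global(10)
GLOBAL_SETTINGS.beta = 0.5
after = infections_global(10)

print(result_a, result_b)  # each instance keeps its own beta
print(before, after)       # same call, different answer after the mutation
```

The two `Simulation` instances never interfere with each other, whereas the same call to `infections_global(10)` returns a different value once anything, anywhere, mutates the shared object.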
I agree it's important to think about potential use cases/environments (e.g., COMPS, K8s), but in doing so we have to take care not to unnecessarily limit flexibility. It's important to make sure that things will be able to run now, but if we conclude that the easier thing works for cases A, B, and C, so we don't need to do the harder thing that allows for more flexibility, it might very well rule out case G down the road. @KevinMcCarthyAtIDM and I came up with examples like the ones listed, but what about branching "trajectories", simultaneous "island" populations, maybe even multi-pathogen stuff? This is just off the top of my head; these could be hard to do, but will be impossible with a singleton.
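As a rough illustration of the branching "trajectories" case, the sketch below runs a toy simulation to a branch point and then copies the instance to explore two futures from the same state. The `Simulation` class, its `step` method, and the `growth` parameter are all assumptions made up for this example; the only real library calls are `copy.deepcopy` and `random.Random`.

```python
import copy
import random


class Simulation:
    """Hypothetical toy simulation (illustrative only, not LASER's API)."""

    def __init__(self, infected: int, seed: int):
        self.infected = infected
        self.tick = 0
        self.rng = random.Random(seed)  # per-instance RNG, no global state

    def step(self, growth: float) -> None:
        # Proportional growth plus a small random number of new cases.
        self.infected += int(self.infected * growth) + self.rng.randint(0, 2)
        self.tick += 1


def run(sim: Simulation, steps: int, growth: float) -> Simulation:
    for _ in range(steps):
        sim.step(growth)
    return sim


# Shared history up to the branch point.
base = run(Simulation(infected=10, seed=1), steps=5, growth=0.1)
base_infected = base.infected

# Branch: each copy carries its own state, including the RNG state.
control = run(copy.deepcopy(base), steps=5, growth=0.1)
intervention = run(copy.deepcopy(base), steps=5, growth=0.0)

print(base.tick, control.tick, intervention.tick)
print(base.infected, control.infected, intervention.infected)
```

Because all state lives on the instance, the branches are independent of each other and of `base`, and they can be run sequentially within one process. With a single global simulation object there would be nothing to copy without first untangling it from module state.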
I think it would be helpful to hear the case for and understand the benefits of singletons. What is there to add to the experience from EMOD (which, by design, should be very different)? Can there be substantial performance gains from e.g., instantiation control (i.e., laziness)? Are there answers to my concerns about hidden dependencies and the ability to track the state?