The great restructuring, episode 15: the monorepo strikes back, single SciML version #1082
Comments
A drawback to extensions is that you cannot really access symbols defined in them. So they have to work pretty much purely by method extensions. Does all the code that is desired to be put into extensions do that here? In the code given as:
does the
I think the requirement is so that you get access to the |
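On the point above that extensions have to work purely by method extension, here is a minimal sketch. Plain modules stand in for a parent package and its extension, and every name (`CoreSketch`, `ExtSketch`, `Tsit5Sketch`, `solve_sketch`, `helper`) is hypothetical, not taken from any SciML package.

```julia
# Hypothetical names throughout; plain modules stand in for a package and its extension.
module CoreSketch
    abstract type AbstractAlg end
    struct Tsit5Sketch <: AbstractAlg end
    # Generic function with no methods yet: the hook an extension adds methods to.
    function solve_sketch end
end

module ExtSketch
    import ..CoreSketch: solve_sketch, Tsit5Sketch
    # Helper that exists only inside the "extension" namespace.
    helper(x) = 2x
    # The only thing callers can reach is this new method on the parent's function.
    solve_sketch(x, ::Tsit5Sketch) = helper(x)
end

result = CoreSketch.solve_sketch(21, CoreSketch.Tsit5Sketch())
```

`helper` never becomes a name in `CoreSketch`, so callers only see the added method. In a real package extension the module itself is only reachable through `Base.get_extension`, which is part of why the method-extension style is the practical one.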
What about packages that extend SciML packages but are not part of SciML? We cannot depend on package extensions, can we? Moreover, I would have expected |
Keeping a look at this, but given the magnitude of things I will have to think through things before I can contribute any useful opinion. From experience, working with extensions is a proper pain. E.g. If I have function I guess we have no idea when a solution to JuliaLang/julia#55516 could drop? |
regarding "Small question" #1: Say I was still in the troubleshooting phase of a project and getting my infrastructure in place and I'm trying out different solvers to see which is most appropriate to use. Would the workflow in this suggestion be to 1) test with |
Well there's two different choices that could be made here.
I would be inclined to go with 2 for a mostly different reason than those, which is that we know large files and just loading large amounts of code is one of the biggest latency issues right now, and that's not necessarily going to get any smaller. So having a giant file of algorithm struct constructors would be something that would add up in terms of load time. IIRC the OrdinaryDiffEq algorithms page was like 0.15 seconds itself, so all solvers together could be a floor of like 0.4 seconds to loading (in its current implementation) just from the algorithm structs. So, putting those into the package shims could be the right thing to do not just for handling dependencies but also to make the loading floor lower. |
I think I'm generally in favour of this. One thing I do want to mention though is that on this issue:
Why not keep the same behaviour we currently have? Beyond that, I want to flag the fact that there are still a lot of UX issues with extensions, e.g. various things are broken with them in 1.10 (cryptic errors about being unable to merge manifests when running test suites for example), and most users don't know how to actually access an extension module (i.e. I don't think any of those are reasons to not do it, but they are maybe reasons to be cautious about this. I would suggest perhaps starting with a smaller set of consolidations. For instance, we could turn Doing this on a smaller scale might help us find sharp edges and work out solutions to them before we go all in and consolidate everything into one mega-repo. |
I was told today that it's unlikely to happen in v1.13, and so probably not within the next ~2 years. I don't think we can keep this version hell going on for that long without having a riot, so given that's not happening with new Base Julia tooling, we need to figure out a solution with duct tape and bubble gum. That said, one non-extension way to do this that was proposed is a purely CI solution. What @oscardssmith had sent me last night is that we could do this through a CI-based lock-step versioning system. Quoting:
|
Error while trying to register: Action not recognized: register_all |
Go away registrator, not now. Read the room. |
Error while trying to register: Action not recognized: register_all |
That's definitely the plan. I think we'd do just the DiffEq stuff first and see how that goes. But I'm curious then for the next step if people would prefer NonlinearSolve separate or together with it. At least I know I want all of DiffEq together, since DelayDiffEq makes no sense in a different repo and that has been an issue for at least 7 years at this point. |
Somewhat other way around. |
Ah! Okay, that makes sense and seems manageable for someone like me. Thanks. |
Yes, those couldn't be extensions. They would just have to be separate packages as always, with the same warts as before. But normally external packages should just be relying on public interfaces. Really the big issues are for example the solver packages that rely on the exact implementation of ODEIntegrator and such. Though https://github.com/NumericalMathematics/PositiveIntegrators.jl is this kind of edge case that's using OrdinaryDiffEqCore internals... I'd prefer to just get that into the system at that point as it's hard to ever make it fully robust if separate. We could at least keep the same downstream test infra for this kind of thing. |
One thing I want to draw attention to with a monorepo approach is testing. I have similar problems with DI, where you could argue that I run 14 downstream tests every time I bat an eyelid. Now imagine if every docstring modification triggered a full run of every single SciML test ever written. This would be nightmarish for user experience and energy consumption alike. As a side note, if we're discussing custody of the kids, I'd like to bring ADTypes into the DI repo, for exactly the same lockstep versioning reasons. These days, it is DI which defines the semantics of ADTypes, so it makes no sense that we're able to add something to ADTypes without having it immediately implemented in DI. |
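One way to keep a monorepo from running every test on every change is to select test suites from the changed paths. The sketch below assumes a hypothetical layout where each subpackage lives under `lib/<PackageName>/`; the function name `affected_packages` and the rule of skipping `.md` files are illustrative assumptions, not an existing SciML CI feature.

```julia
# Hypothetical layout: each subpackage lives under lib/<PackageName>/.
# Returns the sorted names of subpackages whose non-Markdown files were touched.
function affected_packages(changed_files::Vector{String})
    pkgs = Set{String}()
    for path in changed_files
        parts = split(path, '/')
        # Ignore anything that is not lib/<PackageName>/<something>.
        (length(parts) >= 3 && parts[1] == "lib") || continue
        # Ignore pure Markdown changes (READMEs, changelogs).
        endswith(path, ".md") && continue
        push!(pkgs, String(parts[2]))
    end
    return sort!(collect(pkgs))
end

changed = [
    "lib/OrdinaryDiffEqTsit5/src/alg.jl",
    "lib/OrdinaryDiffEqCore/README.md",
    "docs/src/index.md",
]
selected = affected_packages(changed)
```

This only narrows by directory. A docstring edit inside a `.jl` file still selects that subpackage, and a change to a core subpackage would also need to select everything that depends on it, which this sketch does not attempt.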
One other thought: I think only reasonably mature and stable packages should be included in this monorepo approach. For example, in Neuroblox.jl we have a hard upper bound on ModelingToolkit.jl that we only increment when we've vetted that new versions don't break our stuff. This is partially our fault for using some MTK internals, but also partially the 'fault' of MTK, where it can be very hard for people working on that library to fully know and understand all the possible negative downstream effects of their changes. I'd be a little nervous that an approach like this would mean that I'd have to hold back and not update stuff from the entire SciML ecosystem just because I want to protect myself from MTK breakage. On the other hand, maybe that's a good thing, and I can definitely imagine that I'll run into problems caused by having an old MTK and a new version of other stuff. |
I don't disagree, which is why this is a bit scary. |
You kind of run into the problem described here, then https://youtu.be/TiIZlQhFzyk?t=1106. Basically, |
I don't think that'd be a bad thing. The only reason we don't want If someone already paid the cost, there's not really any reason to make it error. |
So is this the structure you are thinking of approximately?
# module OrdinaryDiffEqCore.jl
abstract type Solver end
function solve(prob, alg::Solver)
    throw(BackendNotLoadedError(alg))
end
# module OrdinaryDiffEqTsit5.jl
struct Tsit5 <: Solver
    ...
end
# module OrdinaryDiffEqCore/OrdinaryDiffEqTsit5Ext.jl
function OrdinaryDiffEqCore.solve(prob, alg::Tsit5)
    ...
end
If so I don't really see the value of adopting this structure when we could have a bot / Registrator based solution that will bump all versions of packages (regardless of whether they have changed or not) in lockstep with the OrdinaryDiffEq / SciML version. One issue here, no matter which path we go down, is that a release of SciML + all constituent packages will be required even for a minor bugfix, which could create a lot of noise. We could potentially maintain per-package or per-folder changelogs, which would decrease the noise you have to wade through in a single file, and have a summary in SciML.jl's main changelog. But in general there would be a lot of "irrelevant" releases. You could version each package separately but that doesn't help the release situation for SciML.jl in general. |
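For readers who want to run the structure outlined in this comment, here is a self-contained sketch. Plain modules stand in for the three packages, the module names (`CoreDemo`, `Tsit5Demo`, `Tsit5ExtDemo`) are hypothetical, the field of `BackendNotLoadedError` is an assumption, and the solver body is a stand-in because the real one is elided in the thread.

```julia
# Sketch only: plain modules stand in for the core package, the stub package,
# and the extension. The error type's field is an assumption.
module CoreDemo
    abstract type Solver end
    struct BackendNotLoadedError <: Exception
        alg::Any
    end
    # Fallback: reached only while no more specific method has been loaded.
    solve(prob, alg::Solver) = throw(BackendNotLoadedError(alg))
end

module Tsit5Demo
    import ..CoreDemo
    struct Tsit5 <: CoreDemo.Solver end
end

# Before the "extension" is loaded, dispatch lands on the fallback and throws.
err = try
    CoreDemo.solve(1.0, Tsit5Demo.Tsit5())
catch e
    e
end

module Tsit5ExtDemo
    import ..CoreDemo
    import ..Tsit5Demo
    # Stand-in body: returns the problem unchanged.
    CoreDemo.solve(prob, alg::Tsit5Demo.Tsit5) = prob
end

# After loading, the more specific method wins over the fallback.
ok = CoreDemo.solve(1.0, Tsit5Demo.Tsit5())
```

The behaviour depends on load order: the same call throws or succeeds depending on whether the method-adding module has been evaluated, which is the coupling between loaded packages that other comments in this thread worry about.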
Sure, but SciML gets to keep the dog. But yes since MTK, OrdinaryDiffEq, NonlinearSolve, etc. have now moved to DI, ADTypes is no longer as much of a common interface of DI and SciML, since it's now really just SciML. DI should probably pull it in and downstream test SciMLSensitivity on its changes. We should follow up on that separately. Integrals.jl is I think the last straggler. |
Unfortunately I don't know how to write my thoughts down without some duplication of what's already been written here and at JuliaLang/julia#55516. The fundamental benefit of monorepos is that they linearize changes. This lets the developers not have to think about compatibility when developing. For monorepos with multiple packages (which we want for many reasons), this is also their fundamental problem: from the perspective of users and Julia tooling these packages can skew to versions which don't work with each other. Keeping everything in lockstep (e.g. forcing all versions to be the same and setting the compat bounds to that version) is a solution, and the one proposed here, right? It does have downsides though: bumping anything bumps everything, with all the follow-on from that: duplicated downloads, precompilation, etc. It also linearizes the entire ecosystem around SciML. Consider the cases where libraries build on disparate SciML packages, and a user wants to use two of these libraries that are stuck at different versions of SciML that nonetheless would have worked together. Letting versions float from each other is actually useful! One approach I have not seen mentioned that would make this easier is to formalize even internal interfaces, i.e. versioning them with nearly empty packages that act like tags. (Like, but not completely the same as the common use of |
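The lockstep idea described in this comment (same version everywhere, compat bounds pinned to that version) can be checked mechanically. The sketch below uses the `TOML` standard library on two hypothetical `Project.toml` snippets; the version numbers and the function name `in_lockstep` are made up for illustration.

```julia
using TOML

# Hypothetical Project.toml contents for two subpackages of one monorepo.
core_toml = """
name = "OrdinaryDiffEqCore"
version = "7.1.0"
"""

tsit5_toml = """
name = "OrdinaryDiffEqTsit5"
version = "7.1.0"

[compat]
OrdinaryDiffEqCore = "=7.1.0"
"""

# True when all projects share one version and pin sibling packages to exactly it.
function in_lockstep(projects::Vector{<:AbstractDict})
    versions = Dict(p["name"] => p["version"] for p in projects)
    length(unique(values(versions))) == 1 || return false
    for p in projects
        for (dep, bound) in get(p, "compat", Dict())
            haskey(versions, dep) || continue   # external dependency, not checked here
            bound == "=" * versions[dep] || return false
        end
    end
    return true
end

projects = [TOML.parse(core_toml), TOML.parse(tsit5_toml)]
lockstep_ok = in_lockstep(projects)
```

A check like this could run in CI before release. It encodes the strict variant with equality bounds, so it also makes the stated downside concrete: any skew at all, even a harmless one, fails the check.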
Yes, the noise would be... substantial. Stefan said he was calculating statistics of the General registry and noticed that I personally averaged around 2-4 package releases per day. SciML has 200 packages now. So if we did a single SciML.jl with lockstep CI releasing per this suggestion, I personally would open 400-800 general registry PRs every single day. 🤷 But it does seem like it's coming ahead as the leading solution. |
Yes there is because you don't control the set of transitive dependencies that end up getting loaded. You don't want your code to break because some random package in your dependency graph restructured and maybe started using a different diffeq solver. By structuring it like that you make users vulnerable to code breakage without an associated breaking change. |
I don't think that's helpful for beginners. Someone that starts with Julia and wants to learn how to solve an ODE numerically has no reason to first go through the SciML castle until they trickle down to the OrdinaryDiffEq pillar. Separate docs are more accessible in my opinion. The current system does a good job of supporting this by having separate docs that are connected by the MultiDocumenter.jl top-page-header. I have to say that I read this conversation but it is not transparent to me what is the currently favored approach. Is it the approach proposed in the very first comment, with empty extension packages?
This doesn't sound like an issue at all to me actually. It just warrants a major version bump. As a user I have already removed entirely all |
If everything would be in one big SciML, wouldn't it be harder for people "from the sidelines" to contribute? A docstring update in LinearSolve triggering a new version of "everything" might scare people away. In some sense this would be opposite of the trend in Julia core - moving out SparseArrays etc. into separate repos in order to be able to develop these independently from the core. I guess we need to have a good balance here, and the boundaries could be just as stated - defined by the different subfields of numerical methods such as ODE solvers, linear solvers, nonlinear solvers. And it would be sufficient to be competent in just one of those in order to be not scared to contribute. |
As mentioned, there would be clear advantages and disadvantages to this. Julia is extremely composable as a language, which itself is a double-edged sword, and I feel SciML is a living example of it. As a package developer strongly relying on the SciML ecosystem, it is amazing to be able to access each individual package and to have infinite granularity, but at the same time this highly complicates managing compatibilities, and you can often end up with weird package combinations which make it harder to debug issues. From my point of view, if this new monolithic package approach could still offer some decent level of granularity while not imposing precompilation penalization to downstream users (i.e. having to precompile lots of stuff that you won't need), then this could strike a nice balance. I like the idea of "what SciML version are you using?", although that perimeter should be properly discussed, since there will be different expectations for different people. |
Most of this conversation is above my pay grade, but I wanted to say that however you decide to structure it the user experience should be consistent across the entire ecosystem. I am mainly thinking of structuring things in some sort of hierarchy like the |
Right now I am partial to something like Oscar proposed, where we keep everything in CI and checks for OrdinaryDiffEq only. That seems the least risky while we figure stuff out (I am still a bit scared about the extensions, in my experience those are always incredibly messy to develop). |
The SciML ecosystem has often been at the forefront of pushing Julia to its limits (and beyond), and being wildly successful while doing it. Considering the myriads of repos and the associated maintenance hell, I absolutely do see the need for change. However, the approach as proposed in the OP seems to (ab)use the extension system for something that it was not designed to do. IMHO this is a classic recipe for conservation of pain: It would resolve one set of issues, but only while creating a whole set of new issues. Given that the SciML ecosystem is too vast and important for such kinds of shenanigans, I kindly suggest to use one of the other - more band-aid-like - solutions that have been proposed (e.g., enhancing the CI infrastructure). This should buy some time to develop a proper solution for the underlying issues, which can then be pushed upstream to Julia, and from which other projects might benefit as well. Note: I am not an active SciML developer, only a heavy user. I thus might have easily missed some boundary conditions/constraints. Also, I am aware that proper solutions require (much) more time, possibly exceeding the realm of possibilities in a volunteer-driven project such as SciML. |
Alright, so I think the summary of the plan that seems to align with the recommendations of this thread is:
Some extra comments.
All things in engineering are questions of trade-offs, so here are some extra comments on the trade-offs we are making.
Extensions
I totally agree with extensions being scary, which is why I have been hesitant to do anything and was asking for Base Julia to give me a feature. @KristofferC has highlighted that they have even more issues than I was originally thinking about, so that makes that path a no-go. But the CI should be a good enough solution. The CI solution doesn't exist though 😅, so we need to get something fast.
Single Versioning Downside
Single versioning through CI will mean we will average about 100 General registry PRs a day. Hopefully no one gets mad 😅 but it sounds like everyone here agrees it's the best option so if someone wonders why we're flooding General I'll just point to this thread and 🤷 we all agreed it was the best option 😅. It's a known trade-off and not necessarily a bad one with the right email filters; a low-tech solution is a good solution.
Package Boundaries At Solvers
For the solver boundaries, that's fine. For the most part the solver interfaces are pretty well-defined at this point, so OrdinaryDiffEq doesn't use internals of LinearSolve or NonlinearSolve, it uses documented interfaces. So for the most part those boundaries are working well. The one issue is in the Base libraries. In particular, there was a time where NonlinearProblem was using DiffEqBase. We now have NonlinearSolveBase, but there is still a small vestige of this living in DiffEqBase https://github.com/SciML/DiffEqBase.jl/blob/v6.167.2/src/solve.jl#L1063-L1118. If this was repo'd together, moving that piece of code from DiffEqBase to NonlinearSolveBase will be "breaking" in the sense that there will be incompatible versions if you grab NonlinearSolveBase prior to having this dispatch and DiffEqBase after moving it. It's not breaking in a semver sense because this dispatch actually living in DiffEqBase is an undocumented and non-public aspect of the API. Should we just do a major version bump with release notes saying "Please just merge this without even testing, no package except NonlinearSolveBase needs to care about this change"? IIUC @devmotion that's what you're asking for? In the last 5 months I've spent a lot of time doing repo cleaning so for the most part, that's one of the two remaining major issues in our interface packages. That one will happen hopefully in the next month or so, and it would be seamless if we had SciMLBase / DiffEqBase / NonlinearSolveBase / OrdinaryDiffEqCore in the same set, so effectively what we're saying is a value statement, that keeping NonlinearSolve.jl separate from DiffEq is more valuable from a teaching/scary factor "the repo is too big, I don't understand most of this, it will keep new contributors away" perspective than the issue of temporary breakage or CompatHelpers. The other major issue, which is unsolved by this, is the backwards dependency of SciMLBase to ModelingToolkit, where MTK defines how any symbolic interaction works. This means that SciMLBase defines
DiffEq Reconnection
This is one no one else has really commented on, but I don't think it impacts the comments mentioned about keeping SciML being easy to enter and contribute to. DelayDiffEq and StochasticDiffEq have always relied on internals of OrdinaryDiffEq, so them in separate repositories has always been an issue, especially DelayDiffEq. So moving those back in is something I believe will make @devmotion happy. Those are complex repos so I'm not sure keeping them separate helps newbies suddenly pick up Ito calculus 😅.
Quick Guesstimates
Just so we're on the same page, a quick guesstimate of where breaks came from in the last year:
Some Future Planning
2024 had some difficult times due to a few major interface changing events:
That was not a fun year, refactoring to improve loading times is no one's free time hobby. But it's a necessity. In 2025 we're planning major projects:
So the only thing I'm quite scared about in the next year is that the FillArray / SparseArray / LinearAlgebra stuff is going to break something that makes bad assumptions. Optimization.jl, if collapsing OptimizationBase into the same repo and having this CI solution, can be done in a way that is nicer than how DiffEqBase / NonlinearSolveBase has gone. @TorkelE @devmotion @Datseris @isaacsas are you okay with these trade-offs? |
Yes, Optimization.jl having some interface breaks and oddities is known. That's what I mean by "Clean up in Optimization.jl. This one won't be fun and will have probably a major version.". It's something I noted in the State of SciML talk at the last JuliaCon that Optimization.jl is the solver library that needs the most work because its solver / base interface is somewhat inverted. If we move OptimizationBase into the same repo as Optimization, then we can iterate on it all in one repo, using the new CI tooling, and then just do one breaking update when the flip is complete. So, this work was mostly just waiting on making a decision on what to do about repo/package splits. Hopefully we get things going on OrdinaryDiffEq with the new CI, and if that's working then we start this project. |
Regarding this, why not just go with this structure:
# DiffEqCore/src/DiffEqCore.jl
module DiffEqCore
struct Tsit5
    ...
end
function solve end
# Define but don't export Tsit5 struct
end
# DiffEqCore/ext/DiffEqCoreDiffEqTsit5Ext.jl
module DiffEqCoreDiffEqTsit5Ext
import DiffEqCore: Tsit5, solve
# Define the extension
function solve(prob, ::Tsit5)
    ...
end
end
and then your stub package would just do:
module DiffEqTsit5
using DiffEqCore: Tsit5
export Tsit5
end
That way, by default users don't have access to |
Is this big enough to increase the merge time of General Registry PRs for everyone else? |
If there's enough complaints we can take one of the benchmarking nodes and add it to the general registry. Like, if the solution here is just a bit of $$$ then we'll pay the cost, it's worth it.
That is close to the current form of LinearSolve.jl. The issues are:
|
As far as I am concerned the recent PRs from Aayush fixed this! (pending testing in the wild)
In general I am on board. I like the locking single versioning approach and I am not too worried about many PRs at the registry, things have been looking fantastic in Julia the last year w.r.t. the automated registry. I am also on board with not going this package extensions route, seems too brittle. Some things are not clear to me, please clarify:
|
I personally don't think that is acceptable. You might have to stagger releases a bit, so that users might have to wait a week or so to get the latest typo fix. |
I'm extremely looking forward to these two 😅 |
In terms of noise in General - if we guarantee that all version bumps will be synchronized for a set of repos, does it make sense to then add a method to Registrator that will create a single PR for a monorepo version bump? Then we don't have to worry about noise either, and presumably the version bump will be well reviewed enough on this end. |
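To make the single-PR idea concrete, this sketch computes what one synchronized bump would carry: the same next version for every package in the set. It uses only Base's `VersionNumber`; the package set, the versions, and the function name `lockstep_bump` are hypothetical, and nothing here is an existing Registrator feature.

```julia
# Hypothetical package set and versions; only Base's VersionNumber is used.
current = Dict(
    "OrdinaryDiffEqCore"  => v"7.1.0",
    "OrdinaryDiffEqTsit5" => v"7.1.0",
    "DelayDiffEq"         => v"7.1.0",
)

# Compute the next patch version and assign it to every package in the set.
function lockstep_bump(versions::Dict{String,VersionNumber})
    newest = maximum(values(versions))
    next = VersionNumber(newest.major, newest.minor, newest.patch + 1)
    return Dict(name => next for name in keys(versions))
end

bumped = lockstep_bump(current)
```

The resulting dictionary is the payload a single registry PR would need, one entry per package, instead of one PR per package.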
It's really too bad that there isn't just a way to do this with the new |
If we are literally talking about hundreds of package registrations per day that get their
This could indeed be built upon the
The suggestions presented here make me scared enough to want to start working on proper support for this heh... |
One thing I was thinking about is that the I managed to hack together a little demo here: https://github.com/MasonProtter/ExampleMonorepo/ where I was able to trick This basically seems like what we want here, right? It just needs a blessed mechanism to trigger it and nicer syntax. What's neat about this is that you only need to register the monorepo, and only the monorepo needs to go in your Project.toml. Then, people who use the monorepo just choose which extensions are loaded at the toplevel of their package and they only pay for what they use. E.g. in the DiffEq example using it would just look like
(SomePackage) pkg> add DiffEqCore
and then
module SomePackage
using DiffEqCore
DiffEqCore.@using Tsit5
...
end
and then you'd only load the DifferentialEquations.jl would then just be something like
module DifferentialEquations
using DiffEqCore
DiffEqCore.@using Tsit5
DiffEqCore.@using SomeOtherSolver
...
export DiffEqCore, Tsit5, SomeOtherSolver
end |
So dealing with extensions directly runs into a wall when dealing with extension-only dependencies. I put together a demo of an extension-like functionality here: JuliaLang/julia#58051 that'd be more adapted for this usecase. |
Yes, that would get rid of effectively all of the noise. And probably lessen the CI burden.
We might just have it all be the current OrdinaryDiffEq repo, if only because most stuff is already there and because the DiffEq repo is like 2GB of old nonsense, and so using the cleaner repo is better in the end. We would lose the star count though 😅. Maybe there's a nicer way to clean the repo, move to main branch and purge master or something.
That's in any of the plans. We do not want to lose the functionality of having minimal exports. That has been a major win for startup times. We're now just trying to figure out how to deal with the version mess and the bumping nightmares that it's causing.
It already does not depend on NonlinearSolve:
|
As SciML has grown, so have its structural needs. Back in 2017, we realized that DifferentialEquations.jl was too large and so it needed to be split into component packages for OrdinaryDiffEq, StochasticDiffEq, etc. in order to allow for loading only a part of the system as DiffEq itself was too big. In 2021, we expanded SciML to have LinearSolve, NonlinearSolve, etc. as separate interfaces so they can be independently used, and as such independent packages. In 2024 we split many of the solver packages into independent packages like OrdinaryDiffEqTsit5, in order to accommodate users who only wanted lean dependencies to simple solvers. And now we have hundreds of packages across hundreds of repos, though many of the subpackages are in the same repo.
If we had started this system from scratch again, I don't think we'd have so many repos. With the subrepo infrastructure of modern Julia, keeping OrdinaryDiffEq and StochasticDiffEq together would have been nice. In fact, DelayDiffEq.jl touches many of the OrdinaryDiffEqCore internals, so it should be versioned together with OrdinaryDiffEq. This leads to issues like JuliaLang/julia#55516 being asked of the compiler team.
The core issue here is that the module structure does not necessarily reflect the public/private boundaries, but rather the module system is designed for separate loading, separate compilation, conditional dependencies, and ultimately handling startup/latency issues. This means that it really does not make sense for OrdinaryDiffEqTsit5 and OrdinaryDiffEqCore to have different versions, since they really should be handled in lock-step, and any versioning that isn't matching is always suspect. This leads to odd issues with bumping, and nasty resolution of manifests. The issue is that the semver system is designed around package boundaries being built around public interfaces with announcing public breakage, but this simply does not work out when the common breakage is non-public internals simply because the package boundary is at some internal function, again not because it's a generally good idea for the packages but instead because it's a requirement to achieve lower startup times.
So what can we do? The proposal is not so simple, but it can work. Instead of waiting on a fix for JuliaLang/julia#55516, we can do some clever tricks on the tools that we now have in Julia 2025 land in order to pull this off. It would look like this:
using OrdinaryDiffEqTsit5
that actually loads the extension code. With this, the user code would look like:
Note that using OrdinaryDiffEqTsit5 would be a requirement, as OrdinaryDiffEq would no longer trigger any solvers. If this isn't nice, we could have the single version package be OrdinaryDiffEqCore, with a higher level OrdinaryDiffEq that just uses a few solvers.
Small Questions
"Note that using OrdinaryDiffEqTsit5 would be a requirement, as OrdinaryDiffEq would no longer trigger any solvers. If this isn't nice, we could have the single version package be OrdinaryDiffEqCore, with a higher level OrdinaryDiffEq that just uses a few solvers." - Is that too janky?
Big Question
If we go to monorepo, what's in and what's out? The advantage of having more things in a single repo is clear. Most of the issues around downstream testing, keeping package versions together, etc. are all gone. You know your code is good to go if it passes tests in this repo because that would have "everything".
However, what is everything? Should we have the following:
Or if we're going to do this, should we just have a single SciML?
Then you can just ask, "what version is your SciML?"
I think that would actually be really nice because the splits in the interface between SciMLOperators, SciMLBase, DiffEqBase, NonlinearSolveBase, OptimizationBase, OrdinaryDiffEqCore, etc. are also somewhat arbitrary distinctions, and moving code at the interface level there has always been a multi-repo mess due to the artificiality of the split and the release process w.r.t. semver at this level. Additionally, SciMLSensitivity is a package adding sensitivity analysis to all solvers, even NonlinearSolve.jl, so it would neatly fit as an extension to all of the packages. If that's the case, is the core package here just SciMLBase, and everything else is an extension to SciMLBase?
That said, how far do we go? Is ModelingToolkit in there? Is Catalyst in there? Symbolics.jl and SymbolicUtils.jl? Do we then restructure the whole SciML documentation as a single Vitepress doc? Or build many different Documenter docs from one repo?
Some CI questions
Conclusion
This is going to be a lot of work, so I'm looking for comments before commencing.
@devmotion @oscardssmith @isaacsas @TorkelE @thazhemadam @asinghvi17