Discussion and NNVM #1
Hmm, yeah, I was looking into the project; however, I think I need a "crash course" to fully understand how NNVM actually represents the graph and how the autodiff is executed (I guess the principle is quite similar in that respect). I would be quite happy to discuss ideas and see if we could work together, though just to note, there are a few things I'm quite keen on keeping.

As to your point 2, I do not fully agree. I think it is correct, and NNVM is probably right on this, to allow adding new operators. However, I still think the implementer should be required to provide some strict basic information for new operators, so that optimizers can later work with them, and so that checks can catch breakage at graph build time rather than at runtime. Some of that could be optional, but things like how an operator maps input shapes and types to output shapes and types are, I think, always necessary. Additionally, if the optimization of the primitive operations gets to a good level, I'm not convinced we will ever need more than those operations.

If you want, you can contact me by email for further discussion.
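To make the "fail at build time" point concrete, here is a minimal sketch of what requiring that information could look like. This is a hypothetical interface (not NNVM's actual API); `OperatorInfo` and `AddNode` are names invented for illustration:

```cpp
// Hypothetical sketch: every operator must supply shape and type inference,
// so a bad combination fails while the graph is built, not while it runs.
#include <cstdexcept>
#include <functional>
#include <string>
#include <vector>

using Shape = std::vector<size_t>;

struct OperatorInfo {
  std::string name;
  // Mandatory: map input shapes to output shapes, or throw on mismatch.
  std::function<std::vector<Shape>(const std::vector<Shape>&)> infer_shape;
  // Mandatory: map input dtypes to output dtypes, or throw on mismatch.
  std::function<std::vector<int>(const std::vector<int>&)> infer_type;
};

// Called during graph construction; inference errors surface here.
std::vector<Shape> AddNode(const OperatorInfo& op,
                           const std::vector<Shape>& in_shapes) {
  if (!op.infer_shape || !op.infer_type) {
    throw std::invalid_argument(op.name + ": missing mandatory inference");
  }
  return op.infer_shape(in_shapes);  // throws => build-time error
}
```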
I totally agree that operators need to provide information, and there should be guidelines for what information is expected. This can essentially be done by providing a few attributes, as in https://github.com/dmlc/nnvm/blob/master/include/nnvm/op_attr_types.h

However, the basic info cannot be fully enumerated (imagine you want a cost function for a graph-scheduling pass), and some info should be disposable (assume a planner contains all the passes needed to produce an optimized graph, while an executor contains none of this info and simply executes the result). That inspired the design of NNVM. The primitive ops are more for compilation, while the graph has concerns beyond compilation: differentiation, scheduling, and memory planning do not need such fine-grained information. At the same time, at the high level there is always a need to support customized operators that still benefit from these optimizations and mix with primitive ops that can be compiled.

So to summarize, NNVM's goal is mostly aligned, except that the project aims to provide more extensibility and abstraction for higher-level optimizations. The primitive ops can be built with NNVM's registry to provide ways for compilation and lower-level optimizations. This separation is very helpful because it enables immediate adoption of NNVM in higher-level frameworks (they can reuse their own op definitions), while still being able to partially transform back to primitive ops for the benefit of compilation later.
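For reference, a registration using the attributes from the linked op_attr_types.h could look roughly like the sketch below. The op name `myadd` and the `SameShape` helper are assumptions for illustration; the `NNVM_REGISTER_OP` macro and the `FInferShape` attribute come from NNVM's headers:

```cpp
// Sketch: registering an elementwise op with a strict shape attribute.
#include <vector>
#include <nnvm/op.h>
#include <nnvm/op_attr_types.h>

namespace {

// Hypothetical helper: all inputs and outputs share the first input's shape.
bool SameShape(const nnvm::NodeAttrs& attrs,
               std::vector<nnvm::TShape>* in_shapes,
               std::vector<nnvm::TShape>* out_shapes) {
  if (in_shapes->empty() || (*in_shapes)[0].ndim() == 0) return false;
  for (auto& s : *in_shapes)  s = (*in_shapes)[0];
  for (auto& s : *out_shapes) s = (*in_shapes)[0];
  return true;
}

}  // namespace

NNVM_REGISTER_OP(myadd)
.describe("elementwise add, registered with a strict shape attribute")
.set_num_inputs(2)
.set_num_outputs(1)
.set_attr<nnvm::FInferShape>("FInferShape", SameShape);
```

Because attributes are registered by name, a pass that does not need `FInferShape` never has to see it, which is the "disposable info" property described above.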
Hey, I just saw this project from your reddit post. We are working on something related called nnvm: https://github.com/dmlc/nnvm

Here are a few things I'd like to clarify:

1. NNVM does do symbolic differentiation, which enables higher-order differentiation as long as the gradient operator's gradient is well defined. In fact, mxnet also creates new nodes for gradients, except that the front end exposes a module-based API (see the sketch after this list).
2. NNVM's optimization relies on attributes such as the shape function and the type-inference function, which are strict. The observation is that, at such a high level, it is hard to enumerate all the operations a DL system wants. That said, it is indeed helpful to have a collection of primitive operators, which can be hinted at through additional attributes; we used such a hint for the fusion module.

I am hoping to see if there is any chance for discussion, and to lure you into NNVM so we can build interesting abstractions together, or make the abstraction better for everyone.
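On point 1, a rough sketch of how a gradient is expressed as new graph nodes via the `FGradient` attribute (declared in op_attr_types.h) is below. The op names `myexp` and `mul` are assumptions, not ops NNVM ships with; since the gradient is itself a graph over registered ops, differentiating it again yields higher-order derivatives:

```cpp
// Sketch: d/dx exp(x) = exp(x) * dy, built as graph nodes, not numbers.
#include <vector>
#include <nnvm/node.h>
#include <nnvm/op.h>
#include <nnvm/op_attr_types.h>

NNVM_REGISTER_OP(myexp)
.set_num_inputs(1)
.set_num_outputs(1)
.set_attr<nnvm::FGradient>(
    "FGradient",
    [](const nnvm::NodePtr& n,
       const std::vector<nnvm::NodeEntry>& out_grads) {
      // Create a "mul" node whose inputs are this op's own output and the
      // incoming output gradient; its result is the input gradient.
      nnvm::NodePtr mul = nnvm::Node::Create();
      mul->attrs.op = nnvm::Op::Get("mul");  // assumes "mul" is registered
      mul->attrs.name = n->attrs.name + "_grad";
      mul->inputs = {nnvm::NodeEntry{n, 0, 0}, out_grads[0]};
      return std::vector<nnvm::NodeEntry>{nnvm::NodeEntry{mul, 0, 0}};
    });
```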