
Einsum op #240

Closed
wants to merge 10 commits

Conversation

dc-dc-dc
Contributor

Proposed changes

Adds einsum op

A step closer to adding einops support, as mentioned in #172.

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

@gboduljak
Contributor

I think this PR is quite cool. I have some performance concerns. Should we use something like opt_einsum to compute the optimal einsum contraction path? If we go this route, we may only need to implement tensordot and similar operators.
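
For reference, here is a rough sketch of using opt_einsum purely as a path planner, with numpy arrays standing in for MLX arrays (the contraction string and sizes are just examples):

```python
# Sketch: ask opt_einsum for a (near-)optimal pairwise contraction order.
# Assumes the opt_einsum package; numpy arrays are only stand-ins for shapes.
import numpy as np
import opt_einsum as oe

a = np.random.rand(8, 16, 32)   # ijk
b = np.random.rand(32, 64)      # km
c = np.random.rand(32)          # k

path, info = oe.contract_path("ijk,km,k->im", a, b, c, optimize="greedy")
print(path)  # list of operand-index pairs to contract, in order, e.g. [(1, 2), (0, 1)]
print(info)  # summary with per-step estimated cost and intermediate shapes
```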

@dc-dc-dc
Contributor Author

dc-dc-dc commented Dec 22, 2023

Not too familiar with this lib, but it looks like opt_einsum builds on top of einsum from various libraries. Is it intended to be used for an einsum implementation?

@gboduljak
Contributor

gboduljak commented Dec 24, 2023

> Not too familiar with this lib, but it looks like opt_einsum builds on top of einsum from various libraries. Is it intended to be used for an einsum implementation?

Optimized einsum is agnostic to the backend. This means we could just implement a wrapper that calls the MLX backend. It is used in JAX's einsum implementation (https://jax.readthedocs.io/en/latest/_modules/jax/_src/numpy/lax_numpy.html#einsum).
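
As a rough illustration of that backend-agnostic split (planning vs. execution), opt_einsum can build a contraction expression from shapes alone and evaluate it later against a chosen backend; numpy stands in here for what a hypothetical MLX backend would provide:

```python
# Sketch: plan once from shapes, execute against a pluggable backend.
# Assumes opt_einsum; "numpy" stands in for a hypothetical MLX backend.
import numpy as np
import opt_einsum as oe

expr = oe.contract_expression("ijk,km,k->im", (8, 16, 32), (32, 64), (32,))

a = np.random.rand(8, 16, 32)
b = np.random.rand(32, 64)
c = np.random.rand(32)
out = expr(a, b, c, backend="numpy")  # an MLX backend would plug in here
print(out.shape)  # (8, 64)
```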


@awni left a comment (Member)


This will be great to have! Thanks for adding it! I left a few comments as a start.

I think the main thing is if/when/how we dispatch to matmul. Otherwise einsum will be really slow in cases where it should not be.

  i += 1;
}

// Accumulator initialized to ones with the shape of the first input.
auto acc = ones_like(inputs_arr.at(0), s);
Member

Start with inputs_arr[0] rather than including a new array for accumulation. There should be at least one input, right?
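
For context, a minimal Python sketch (numpy as a stand-in, not the PR's C++ code) of the naive strategy being discussed, with the accumulator starting from the first operand instead of an array of ones:

```python
# Sketch of the naive einsum strategy for "ij,jk->ik" (numpy stand-in).
import numpy as np

a = np.random.rand(4, 5)   # ij
b = np.random.rand(5, 6)   # jk

acc = a[:, :, None]        # start from the first operand, aligned to (i, j, k)
acc = acc * b[None, :, :]  # broadcast-multiply the remaining operand in
out = acc.sum(axis=1)      # sum over the contracted axis j

assert np.allclose(out, np.einsum("ij,jk->ik", a, b))
```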

Member


Actually, it would be good to check for that at the top and throw if it isn't the case, and maybe add a test case that checks that we throw for that.

Contributor Author


Strangely, this does not work. If I switch it to use the first array and then accumulate, it gets some strange results on the test cases.

Member


Did you ever get to the bottom of this? Let me know, I can take a look as well.

Contributor Author


That would be helpful. I messed with it a little but didn't find anything concrete.
I tried calling eval on inputs_arr.at(0) to see if maybe certain ops didn't take effect, but it didn't resolve it.

mlx/ops.cpp Outdated
Comment on lines 2844 to 2886
// Broadcast-multiply every input into the accumulator, then sum over the
// contracted axes and transpose into the requested output order.
for (auto arr : inputs_arr) {
  acc = multiply(acc, arr, s);
}
return transpose(sum(acc, sum_axis, false, s), rhs_order, s);
Member


In some/many common cases we should dispatch to matmul rather than multiply-and-sum, as the latter will be a lot slower and memory-inefficient. Is it possible to include that logic (or some of it) now?
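
To make the concern concrete, here is a small numpy sketch (not MLX code) of why a matmul fast path matters for patterns like ij,jk->ik: the multiply-and-sum route materializes a full (i, j, k) intermediate, which matmul avoids:

```python
# Sketch: multiply-and-sum vs. dispatching to matmul for "ij,jk->ik".
import numpy as np

a = np.random.rand(128, 256)  # ij
b = np.random.rand(256, 64)   # jk

# Naive route: builds a (128, 256, 64) intermediate before reducing.
naive = (a[:, :, None] * b[None, :, :]).sum(axis=1)

# Fast path: recognize the pattern and call matmul directly.
fast = a @ b

assert np.allclose(naive, fast)
```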

@dc-dc-dc
Contributor Author

After messing with this a little more, there are still some edge cases that this doesn't cover:

  • fast path for matmul
  • ij,jk

Will close for now and re-open when it's in a better spot.

@dc-dc-dc closed this Dec 28, 2023
@angeloskath
Member

@dc-dc-dc FYI, I think it was a great PR! You could have changed it to a draft to keep working on it in the open and get feedback. @awni, let me know what you think, and sorry to both of you that I didn't get to reviewing this earlier.

I started a review yesterday and looked into what other frameworks implement, what opt_einsum expects, etc. My two cents regarding how I would approach it:

  • split it into pairwise contractions
  • sort the contractions based on some cost, then run them
  • almost all implementations use a greedy search, which yields near-optimal results and is much, much simpler to implement

To make myself clearer: for the contraction ijk,km,k->im, for instance, the ideal einsum would perform the following

op0 = op0.sum(1)
op1 = op1 * op2[:, None]  # this could also be op0 * op2[None] depending on the sizes of op0, op1
return op0 @ op1

I think summing over axes that only appear in one input (and not the output) is straightforward. Subsequently, we have the following candidate contractions: ik,km->ikm, ik,k->ik, km,k->km. Each contraction should keep all axes that appear in the result or in the other arguments. From these it is obvious that the last two are the fastest. We could use a naive FLOPS estimator to do the sorting (namely, the product of the sizes of all axes in the contraction).
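
To make the cost-based ordering concrete, here is a rough Python sketch (made-up dimension sizes, not MLX code) of a greedy planner that scores each candidate pair by the product of its axis sizes and contracts the cheapest pair first:

```python
# Sketch: greedy pairwise contraction ordering with a naive FLOPS estimate.
from itertools import combinations

dim = {"i": 64, "j": 32, "k": 128, "m": 16}   # made-up sizes
operands = ["ik", "km", "k"]                  # after summing j out of "ijk"
output = "im"

def contract_output(a, b, others, output):
    # Keep every axis that still appears in the output or in another operand.
    keep = set(output) | set("".join(others))
    return "".join(ax for ax in dict.fromkeys(a + b) if ax in keep)

def cost(a, b):
    # Naive FLOPS estimate: product of the sizes of all axes in the contraction.
    c = 1
    for ax in set(a + b):
        c *= dim[ax]
    return c

while len(operands) > 1:
    i, j = min(
        combinations(range(len(operands)), 2),
        key=lambda pair: cost(operands[pair[0]], operands[pair[1]]),
    )
    a, b = operands[i], operands[j]
    rest = [op for idx, op in enumerate(operands) if idx not in (i, j)]
    new = contract_output(a, b, rest, output)
    print(f"contract {a},{b}->{new}")
    operands = rest + [new]
# prints: contract km,k->km, then contract ik,km->im
```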

As an aside, if you want to tackle something simpler to start with (again, I think your PR was great!), you could implement tensordot. Then you would already have a baseline for einsum from opt_einsum, since we could very easily implement a backend for it to test against our own einsum.
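
For what it's worth, tensordot itself is commonly built from transpose, reshape, and matmul; a minimal numpy sketch (not MLX code) for a single contracted axis:

```python
# Sketch: tensordot over one pair of axes via transpose/reshape/matmul.
import numpy as np

def tensordot_1axis(a, b, a_axis, b_axis):
    a_moved = np.moveaxis(a, a_axis, -1)   # contracted axis of `a` to the end
    b_moved = np.moveaxis(b, b_axis, 0)    # contracted axis of `b` to the front
    out_shape = a_moved.shape[:-1] + b_moved.shape[1:]
    a2d = a_moved.reshape(-1, a_moved.shape[-1])
    b2d = b_moved.reshape(b_moved.shape[0], -1)
    return (a2d @ b2d).reshape(out_shape)

a = np.random.rand(3, 4, 5)
b = np.random.rand(5, 6)
assert np.allclose(tensordot_1axis(a, b, 2, 0), np.tensordot(a, b, axes=([2], [0])))
```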

@dc-dc-dc
Contributor Author

I would've preferred to switch it to a draft, but I didn't see an option to do so, so I closed it for now. As I was messing with it more, I noticed some more missed cases that need to be covered. I might take a quick stab at dot / tensordot and come back to this. In the meantime, if someone wants to take this as a base and continue forward, you have my approval 😄
