Abstract optimizations #5

Open
botev opened this issue Jan 23, 2017 · 0 comments
botev commented Jan 23, 2017

To begin with, we more or less need only a single optimization: kernel fusion. The optimization basically states that whenever we have a composition of nodes where the intermediate result will never be used again, we can "fuse" the computation, avoiding extra looping. This would require a special FusionOp which would contain a graph in itself.

Example: f = tanh(a + b). Normally this would become n0 = a + b followed by n0 = tanh(n0) (assuming the memory optimizer works well). However, on a GPU these are still two kernels; on a CPU, two loops. Fusing them would mean we move from:

// Two separate passes over the data (two kernels on a GPU):
Zip::from(&mut n0).and(&a).and(&b).for_each(|ni, &ai, &bi| {
    *ni = ai + bi;
});
for ni in n0.iter_mut() {
    *ni = ni.tanh();
}

to

// A single fused pass: the whole composition runs once per element.
Zip::from(&mut n0).and(&a).and(&b).for_each(|ni, &ai, &bi| {
    *ni = (ai + bi).tanh();
});
@botev botev added this to the Draft milestone Jan 23, 2017