To begin with, we more or less need only a single optimization: kernel fusion. The optimization basically states that whenever we have a composition of nodes where the first one will never be used again, we can "fuse" the computation, avoiding extra looping. This would require a special FusionOp which would contain a graph in itself.
Example: f = tanh(a + b). Normally this would become n0 = a + b and n0 = tanh(n0) (assuming the memory optimizer works well). However, on a GPU these are still 2 kernels, and on a CPU two loops. Fusing this would mean we move from:
for (ni, ai, bi) in Zip::new((&mut n0, &a, &b)) {
    *ni = ai + bi;
}
for ni in &mut n0 {
    *ni = tanh(*ni);
}
to:
for (ni, ai, bi) in Zip::new((&mut n0, &a, &b)) {
    *ni = tanh(ai + bi);
}
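A minimal sketch of what such a FusionOp could look like, assuming hypothetical names (NodeId, Op, ops, inputs) rather than whatever graph types the crate actually uses:

// Hypothetical sketch only; the real graph types will differ.
type NodeId = usize;

enum Op {
    Add(NodeId, NodeId),
    Tanh(NodeId),
    // A fused sub-graph, evaluated in a single pass over the data.
    Fusion(FusionOp),
}

struct FusionOp {
    // Operations of the fused region, in topological order.
    ops: Vec<Op>,
    // External inputs consumed by the region (a and b in the example above).
    inputs: Vec<NodeId>,
}

The executor would then lower a Fusion node into a single kernel launch (or a single loop on the CPU) instead of one per contained op.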