|
Is there a reason you closed this? It looks cool. |
|
Oh, I just accidentally committed it to the original main branch. If you find it interesting, maybe I can reopen it? |
Some modifications to the shape of the input of `semiring.matmul` to make it compatible.
|
Very neat. I'll take a read. Do you find that this gives a speedup? Seems hard to parallelize. |
|
You mean the speedup compared to using the gradient identity? At first I tried computing only the prefix sums and using back-propagation to get the marginals of each prefix sequence. But that requires O(N) separate backward passes. The alternative is to replicate the whole graph N times to get O(1) parallel complexity, but that easily hits the GPU memory limit. |
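To make the cost argument concrete, here is a minimal pure-Python sketch, assuming a linear-chain model in the log semiring. It is not the code from this PR: `prefix_log_partitions`, `prefix_marginals`, `numeric_grad`, and the toy potentials are all hypothetical names and values chosen for illustration. It shows the gradient identity (the marginal of a state equals the derivative of the prefix log-partition with respect to that state's log-potential) and why each prefix needs its own backward pass.

```python
import math

def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def prefix_log_partitions(unary, trans):
    """One forward pass: alpha[t][j] and log Z_t for every prefix 0..t."""
    n, k = len(unary), len(unary[0])
    alpha = [list(unary[0])]
    for t in range(1, n):
        alpha.append([
            unary[t][j] + logsumexp([alpha[t - 1][i] + trans[i][j] for i in range(k)])
            for j in range(k)
        ])
    return alpha, [logsumexp(a) for a in alpha]

def prefix_marginals(alpha, log_z, unary, trans, t):
    """Marginals p(z_s = j | prefix 0..t) for s <= t.

    This is one backward pass, started at position t. Each prefix length
    needs its own pass, hence O(N) passes for all N prefixes.
    """
    k = len(unary[0])
    beta = [0.0] * k  # backward message at position t
    out = [None] * (t + 1)
    for s in range(t, -1, -1):
        out[s] = [math.exp(alpha[s][j] + beta[j] - log_z[t]) for j in range(k)]
        if s > 0:
            # Move the backward message from position s to position s - 1.
            beta = [
                logsumexp([trans[i][j] + unary[s][j] + beta[j] for j in range(k)])
                for i in range(k)
            ]
    return out

def numeric_grad(unary, trans, t, s, j, eps=1e-5):
    """Central finite difference of log Z_t w.r.t. unary[s][j] (gradient identity check)."""
    up = [row[:] for row in unary]
    dn = [row[:] for row in unary]
    up[s][j] += eps
    dn[s][j] -= eps
    z_up = prefix_log_partitions(up, trans)[1][t]
    z_dn = prefix_log_partitions(dn, trans)[1][t]
    return (z_up - z_dn) / (2 * eps)

# Toy log-potentials (made up): 4 positions, 2 states.
unary = [[0.1, -0.3], [0.5, 0.2], [-0.4, 0.7], [0.0, 0.3]]
trans = [[0.2, -0.1], [-0.5, 0.4]]

alpha, log_z = prefix_log_partitions(unary, trans)
# N separate backward passes, one per prefix length.
all_marginals = [
    prefix_marginals(alpha, log_z, unary, trans, t) for t in range(len(unary))
]
```

The forward pass is shared across all prefixes, but the backward message depends on where the prefix ends, so the backward work cannot be reused between prefixes in this formulation. Running the N passes at once would mean holding N copies of the backward computation in memory, which is the trade-off described above.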
