Value expectation and 1st order CKY #93
Conversation
Hmm, for some reason the tests are not running on this. Trying to figure out why.
        .sum(2)
    )
    assert torch.isclose(
        E_val, log_probs.exp().unsqueeze(-1).mul(struct_vals).sum(0)
Just curious. Why not just make this the implementation of expected value? It seems just as good and perhaps more efficient.
Sorry, maybe I'm confused but isn't this enumerating over all possible structures explicitly?
Oh sorry, my comment was confusing.
I think a valid way of computing an expectation over any "part-level value" is to first compute the marginals (.marginals), then do an elementwise multiply (.mul), then sum. Doesn't that give you the same thing as the semiring?
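For concreteness, here is a tiny pure-Python check of that identity on a toy model (made-up potentials and values, no torch-struct involved): when f decomposes additively over parts, summing marginals times part values matches the brute-force expectation.

```python
import itertools
import math

# made-up unary log-potentials and per-part values: 3 positions x 2 states
phi = [[0.5, -0.2], [1.0, 0.3], [-0.4, 0.8]]
vals = [[2.0, 1.0], [0.0, 3.0], [1.5, -1.0]]

structs = list(itertools.product([0, 1], repeat=3))
score = lambda z: sum(phi[i][s] for i, s in enumerate(z))
logZ = math.log(sum(math.exp(score(z)) for z in structs))

# brute force: E[f(z)] = sum_z p(z) * f(z), with f(z) a sum of part values
E_brute = sum(
    math.exp(score(z) - logZ) * sum(vals[i][s] for i, s in enumerate(z))
    for z in structs
)

# marginals route: mu[i][s] = P(z_i = s), then E[f] = <mu, vals>
mu = [[0.0, 0.0] for _ in range(3)]
for z in structs:
    p = math.exp(score(z) - logZ)
    for i, s in enumerate(z):
        mu[i][s] += p
E_marg = sum(mu[i][s] * vals[i][s] for i in range(3) for s in (0, 1))

assert abs(E_brute - E_marg) < 1e-9
```

The two routes agree exactly because the value decomposes additively; linearity of expectation pushes the sum over structures inside each part.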
Oh wow, I didn't realize this! I just tested it out and it appears to be more efficient for larger structure sizes. I guess this is due to the fast log semiring implementation? I'll update things to use this approach instead.
Yeah, I think that is right... I haven't thought about this too much, but my guess is that this is just better on GPU hardware since the expectation is batched at the end. But it seems worth understanding when this works. I don't think you can compute Entropy this way? (but I might be wrong)
Makes sense. I also don't think entropy can be done this way -- I just tested it out and the results didn't match the semiring. I will switch to this implementation in the latest commit and get rid of the value semiring.
Fwiw I ran a quick speed comparison you might be interested in:

```python
B, N, C = 4, 200, 10
phis = torch.randn(B, N, C, C).cuda()
vals = torch.randn(B, N, C, C, 10).cuda()
```

Results from running w/ genbmm:

```python
%%timeit
LinearChainCRF(phis).expected_value(vals)
# >>> 100 loops, best of 3: 6.34 ms per loop

%%timeit
LinearChainCRF(phis).marginals.unsqueeze(-1).mul(vals).reshape(B, -1, vals.shape[-1]).sum(1)
# >>> 100 loops, best of 3: 5.64 ms per loop
```

Results from running w/o genbmm:

```python
%%timeit
LinearChainCRF(phis).expected_value(vals)
# >>> 100 loops, best of 3: 9.67 ms per loop

%%timeit
LinearChainCRF(phis).marginals.unsqueeze(-1).mul(vals).reshape(B, -1, vals.shape[-1]).sum(1)
# >>> 100 loops, best of 3: 8.83 ms per loop
```
torch_struct/distributions.py
Outdated
    """
    Compute expected value for distribution :math:`E_z[f(z)]` where f decomposes additively over the factors of p_z.

    Params:
This should be "Parameters:"
torch_struct/distributions.py
Outdated
    Compute expected value for distribution :math:`E_z[f(z)]` where f decomposes additively over the factors of p_z.

    Params:
        * values (*batch_shape x *event_shape, *value_shape): torch.FloatTensor that assigns a value to each part
Let's put the types in the first parens, and use :class:`torch.FloatTensor`.
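A sketch of what the suggested docstring format might look like (the signature and wording here are illustrative, not the actual torch-struct code):

```python
def expected_value(self, values):
    """
    Compute expected value for distribution :math:`E_z[f(z)]` where f
    decomposes additively over the factors of p_z.

    Parameters:
        values (:class:`torch.FloatTensor`): tensor of shape
            (*batch_shape x *event_shape x *value_shape) that assigns a
            value to each part.
    """
```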
    samples = []
    for k in range(nsamples):
        if k % 10 == 0:
        if k % batch_size == 0:
Oh yeah, sorry this is my fault. 10 is a global constant. Let's put it on MultiSampledSemiring.
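A minimal sketch of that refactor (attribute name assumed from the thread, not the actual torch-struct code): hoist the literal onto the semiring class so the sampling loop reads it from one place.

```python
class MultiSampledSemiring:
    # hypothetical home for the former magic constant 10
    batch_size = 10

def sample_batch_starts(nsamples):
    # indices at which the sampling loop would start a new batch
    return [k for k in range(nsamples) if k % MultiSampledSemiring.batch_size == 0]
```

For example, `sample_batch_starts(25)` yields `[0, 10, 20]`; changing the batching granularity then means touching only the class attribute.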
torch_struct/distributions.py
Outdated
    Implementation uses width-batched, forward-pass only

    * Parallel Time: :math:`O(N)` parallel merges.
    * Forward Memory: :math:`O(N^2)`
This can't be right... isn't the event shape O(N^3) alone?
Oops, yeah, that's left over from modifying the CKYCRF class.
torch_struct/full_cky_crf.py
Outdated
@@ -0,0 +1,114 @@
import torch
from .helpers import _Struct, Chart
from tqdm import tqdm
Be sure to run python setup.py style to run flake8. It will catch these errors.
torch_struct/helpers.py
Outdated
    Returns:
        v (torch.Tensor): the resulting output of the dynamic program
        edges (List[torch.Tensor]): the log edge potentials of the model.
Changing this to logpotentials throughout.
torch_struct/helpers.py
Outdated
    [scores], as in `Alignment`, `LinearChain`, `SemiMarkov`, `CKY_CRF`.
    An exceptional case is the `CKY` struct, which takes log potential parameters from production rules
    for a PCFG, which are by definition independent of position in the sequence.
    charts: Optional[List[Chart]] = None, the charts used in computing the dp. They are needed if we want to run the
Going to remove this for simplicity.
    for k in range(v.shape[0]):
        obj = v[k].sum(dim=0)

    with torch.autograd.enable_grad():  # in case input potentials don't have grads enabled.
torch_struct/semirings/semirings.py
Outdated
    return xs


def ValueExpectationSemiring(k):
Are you sure we don't have this already? Could have sworn someone added it.
I'm not 100% sure. I looked and didn't see it anywhere in master, so I went ahead with it. Maybe it's in another branch? There's the entropy semiring, which is very similar.
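For reference, the core of a first-order expectation semiring (the same family as the entropy semiring) can be sketched in a few lines of plain Python. Elements are pairs (p, p·v); addition is elementwise, and multiplication follows the product rule. This is a toy illustration on a made-up two-position model, not the torch-struct implementation.

```python
import itertools

# first-order expectation semiring: elements are pairs (p, p * v)
def plus(a, b):
    return (a[0] + b[0], a[1] + b[1])

def times(a, b):
    # product rule keeps the second slot equal to (total weight) * (weighted value sum)
    return (a[0] * b[0], a[0] * b[1] + a[1] * b[0])

def lift(p, v):
    # inject one part with weight p and value v
    return (p, p * v)

# toy model: 2 independent positions, 2 states each (made-up numbers)
w = [[0.6, 0.4], [0.2, 0.8]]   # part weights
v = [[1.0, 3.0], [2.0, 0.0]]   # part values

total = (1.0, 0.0)             # multiplicative identity
for i in range(2):
    step = (0.0, 0.0)          # additive identity
    for s in range(2):
        step = plus(step, lift(w[i][s], v[i][s]))
    total = times(total, step)

# total = (Z, sum_z w(z) f(z)); the expectation is the ratio of the slots
Z = sum(w[0][a] * w[1][b] for a, b in itertools.product([0, 1], repeat=2))
Ef = sum(w[0][a] * w[1][b] * (v[0][a] + v[1][b])
         for a, b in itertools.product([0, 1], repeat=2))
assert abs(total[0] - Z) < 1e-12
assert abs(total[1] - Ef) < 1e-12
```

Running any dynamic program in this semiring accumulates both the partition function and the weighted value sum in one pass, which is exactly what a value-expectation semiring needs to do.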
Thanks for the PR. Lots of nice stuff in here.

Quick dev question: when I try running

Interesting, yeah, not sure how to run those automatically. I will look into it.
Changes are: