
Conversation

@cvondrick

LossLayer::Reshape() uses old-style code for accessing blobs, which will cause a CHECK failure for N-d blobs. This patch fixes it.
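
For context, a minimal sketch of the kind of change involved, assuming the standard Caffe Blob API (the exact check and message are illustrative, not quoted from the patch):

```cpp
// Old-style accessors such as num() go through the legacy 4-axis shape
// interface, which CHECK-fails for blobs with more than 4 axes:
CHECK_EQ(bottom[0]->num(), bottom[1]->num());

// Using the generic shape accessor only constrains the first dimension and
// works for blobs of any dimensionality:
CHECK_EQ(bottom[0]->shape(0), bottom[1]->shape(0));
```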

@longjon
Contributor

longjon commented Jul 24, 2015

Thanks for noticing this. The check pretty clearly needs to go since there's no reason loss layers must take 4-d bottoms.

Options:

  • Always check the first dimension (this PR as is). This preserves the old check, but does this behavior make sense in general? (How strictly should we preserve the meaning of the batch dimension? In some of my own experiments I've needed to drop it entirely.)
  • Omit the check altogether. Loss layers become fully general (with respect to bottom dimensions), but layers which had this check lose it.
  • Keep the check, but only for 4-d bottoms (roughly as sketched after this list). This preserves the check for layers which previously had it, but makes four a magic number for new loss layers, which doesn't seem right.
  • Omit the check here, but distribute it to the individual layers which previously had it. This results in a certain amount of duplication, but might be the right thing to do in general.
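
As a rough illustration of the third option (a hypothetical sketch, not code from this PR):

```cpp
// Only enforce the old first-dimension check when both bottoms are 4-d,
// so higher-dimensional blobs pass through unchecked.
if (bottom[0]->num_axes() == 4 && bottom[1]->num_axes() == 4) {
  CHECK_EQ(bottom[0]->shape(0), bottom[1]->shape(0));
}
```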

@jeffdonahue
Contributor

Thanks for the analysis @longjon -- since ND blobs landed, there are a lot of parts of the code that still awkwardly/incorrectly assume that blobs have a first dimension corresponding to a "batch", whose items are independent instances. As Caffe becomes more general, the notions of instances and batches get less well-defined. For example, with sequence problems one might have data with dimensions NxTxD (N independent instances of T timesteps with D-dimensional data); "sequence-aware" layers will want to treat the data as a batch of N separate instances, but other layers doing basic processing on individual timesteps (e.g. convolution with the same conv filters on each frame of a video) will want to treat the data as a batch of T*N separate instances, and so on.

I'm not sure of the best way to address this in general -- with the ND blob PR I added an axis argument to many individual layers as a way of specifying where instances begin, but maybe it would be better in the long run to just leave it to the user to reshape data as appropriate for the layer, except in cases where a layer's functionality very neatly lends itself to processing along any axis.
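
To illustrate the axis-argument pattern described above, here is a hedged sketch using the Caffe Blob helpers (the helper name and parameter are illustrative, not part of any actual layer):

```cpp
#include "caffe/blob.hpp"

// Illustrative helper (hypothetical name): split a blob's shape around `axis`
// into "instances" (everything before the axis) and per-instance data
// (everything from the axis onward) -- the pattern the per-layer axis
// arguments follow.
template <typename Dtype>
void SplitAroundAxis(const caffe::Blob<Dtype>& blob, int raw_axis,
                     int* outer_num, int* inner_num) {
  const int axis = blob.CanonicalAxisIndex(raw_axis);  // handles negative axes
  *outer_num = blob.count(0, axis);   // e.g. N, or N*T for per-timestep layers
  *inner_num = blob.count(axis + 1);  // e.g. spatial positions per instance
}
```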

Anyway, I'd be in favor of your last option -- removing the check here and moving it to individual layers. I'd also be fine with merging this for now as it's a strict improvement over what's currently there.

@cvondrick
Author

I think removing it and only adding it to layers that expect it is a good idea.

However, offhand, I can't think of a loss layer that requires this check?

@longjon
Contributor

longjon commented Jul 29, 2015

Well, no need to be offhand, let's take a glance:

  • ContrastiveLossLayer seems to assume without checking that all three bottoms have the same num; there's probably already a missing check there.
  • EuclideanLossLayer doesn't strictly need this check, but right now performs an implicit reshape along other dimensions, as long as the count is correct (maybe it should check the whole shape; maybe not). EuclideanLossLayer also uses the num dimension to normalize, so perhaps it should have this check, or some kind of option to control that.
  • HingeLossLayer uses the num dimension for labels and for normalization. It should probably be upgraded to work like SoftmaxLossLayer.
  • InfogainLossLayer both iterates and normalizes over the num dimension, so it probably needs this check.
  • MultinomialLogisticLossLayer currently iterates and normalizes over the num dimension, so it probably needs this check (although it should be upgraded to be like SoftmaxLossLayer).
  • SigmoidCrossEntropyLossLayer appears to normalize over num, but only checks count (giving it a similar implicit reshape to EuclideanLossLayer). It should probably be upgraded to check shape and allow normalization along other dimensions.
  • Finally, SoftmaxLossLayer is the gold standard, more or less; it looks like there is some possible implicit reshaping, but the iteration/normalization axes are a parameter, and the num check should be dropped.

So it looks like all but Softmax are assuming this check, whereas Softmax shouldn't have this check, a few layers should have extra checks, and most layers should be upgraded to be more general.

If you want to simply distribute this check to the non-SoftmaxLoss layers, that seems like an improvement.
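
For concreteness, a hypothetical sketch of what the distributed check could look like inside an individual layer's Reshape() (ExampleLossLayer is an illustrative name, not an actual Caffe layer):

```cpp
template <typename Dtype>
void ExampleLossLayer<Dtype>::Reshape(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  // The base class sets up the scalar loss top blob.
  LossLayer<Dtype>::Reshape(bottom, top);
  // Re-add the first-dimension check locally, using shape(0) so N-d bottoms
  // are handled correctly.
  CHECK_EQ(bottom[0]->shape(0), bottom[1]->shape(0))
      << "Data and label must have the same first dimension.";
}
```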

@Noiredd
Member

Noiredd commented Mar 2, 2018

This has been done by now in #4559, so I'll close this but leave #2097 open so we still have a pointer to this discussion.

@Noiredd closed this Mar 2, 2018
