ToTensor aliases its input data, and operations like Normalize work in-place. Together these create a nasty footgun when implementing one's own data loader with the data residing in RAM: every time a sample is fetched, the stored array itself is normalized again, so as training progresses the dataset becomes more and more "normalized", which is difficult to debug. Also, the fact that the current implementation of datasets.MNIST doesn't display this behavior seems to depend on an (arguably buggy) implementation detail of PIL.Image, as I argue in my answer to this Stack Overflow question.
I propose that the in-place operation at least be made explicit in the docs for Normalize, or (a breaking change) that Normalize work out of place by default, with an added inplace flag.
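A minimal sketch of the failure mode, using plain torch and NumPy rather than torchvision itself. The names data, normalize_inplace_, get_item_aliasing and get_item_safe are hypothetical and only illustrate the issue: torch.from_numpy stands in for the aliasing conversion, and sub_/div_ stand in for an in-place normalize. It assumes the data is kept as a float32 array in RAM.

```python
import numpy as np
import torch

# Hypothetical in-RAM dataset: four float32 "images" of shape (C, H, W).
data = np.full((4, 1, 2, 2), 0.5, dtype=np.float32)


def normalize_inplace_(t, mean, std):
    # Stand-in for an in-place normalize: mutates t and returns it.
    return t.sub_(mean).div_(std)


def get_item_aliasing(i):
    # torch.from_numpy shares memory with the NumPy array, so the
    # in-place normalize also rewrites data[i].
    return normalize_inplace_(torch.from_numpy(data[i]), 0.5, 0.5)


def get_item_safe(i):
    # Cloning first gives the tensor its own storage, so data[i] is untouched.
    return normalize_inplace_(torch.from_numpy(data[i]).clone(), 0.5, 0.5)


# Fetching the same sample twice (as in two epochs) gives different values.
first = get_item_aliasing(0).clone()
second = get_item_aliasing(0).clone()

# With the clone, repeated fetches agree and the stored data is unchanged.
safe_a = get_item_safe(1)
safe_b = get_item_safe(1)

print(first.flatten().tolist(), second.flatten().tolist())
print(data[0].flatten().tolist(), data[1].flatten().tolist())
```

The clone in get_item_safe is the workaround on the user side; an out-of-place default in Normalize would make it unnecessary.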