Skip to content

Conversation

@albertvillanova
Copy link
Member

@albertvillanova albertvillanova commented May 23, 2022

Now the builder name attribute is set from from the builder class name.

This PR sets the builder name attribute from the module name instead. Some motivating reasons:

  • The dataset ID is relevant and unique among all datasets and this is directly related to the repository name, i.e., the name of the directory containing the dataset
  • The name of the module (i.e. the file containing the loading loading script) is already relevant for loading: it must have the same name as its containing directory (related to the dataset ID), as we search for it using its directory name
  • On the other hand, the name of the builder class is not relevant for loading: in our code, we just search for a class which is subclass of DatasetBuilder (independently of its name). We do not put any constraint on the naming of the builder class and indeed it can have a name completely different from its module/direcotry/dataset_id

IMO it makes more sense to align the caching directory name with the dataset_id/directory/module name instead of the builder class name.

Fix #4381.

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented May 23, 2022

The documentation is not available anymore as the PR was closed or merged.

@albertvillanova albertvillanova marked this pull request as draft May 23, 2022 16:19
@albertvillanova albertvillanova marked this pull request as ready for review May 24, 2022 08:29
Copy link
Member

@lhoestq lhoestq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks ! I agree this is the right way to fix this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug in caching 2 datasets both with the same builder class name

3 participants