Skip to content

Conversation

@mariosasko
Copy link
Collaborator

As per @stas00's suggestion in #2500, this PR inserts a space between the logo and the lib name (🤗Datasets -> 🤗 Datasets) for consistency with the Transformers lib. Additionally, more consistent names are used for Datasets Hub, etc.

Copy link
Contributor

@stas00 stas00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for doing this polish, @mariosasko! Great work.

I left a few small remarks in the changes.

README.md Outdated
</p>

`🤗Datasets` is a lightweight library providing **two** main features:
`🤗 Datasets` is a lightweight library providing **two** main features:
Copy link
Contributor

@stas00 stas00 Jul 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, here and everywhere else in this doc needs to be either datasets or 🤗 Datasets - no backticks!

backticks are for the module name, which is not 🤗 Datasets but datasets

Copy link
Contributor

@stas00 stas00 Jul 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since there is only one python package called datasets if you go with the former, you probably don't need the emoji.

I'd say to be consistent with transformers docs, most of the time ideally it should be: 🤗 Datasets (as in 🤗 Transformers).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you are writing:

Main differences between 🤗 Datasets and tfds

a few paras below, which looks very unbalanced.

Here you probably want to go for one of:

  1. datasets and tfds
  2. 🤗 Datasets and TensorFlow tfds

Just a suggestion of course...

🤗 Datasets currently provides access to ~100 NLP datasets and ~10 evaluation metrics and is designed to let the community easily add and share new datasets and evaluation metrics. You can browse the full set of datasets with the live 🤗 Datasets viewer.

🤗Datasets originated from a fork of the awesome TensorFlow Datasets and the HuggingFace team want to deeply thank the TensorFlow Datasets team for building this amazing library. More details on the differences between 🤗Datasets and tfds can be found in the section Main differences between 🤗Datasets and tfds.
🤗 Datasets originated from a fork of the awesome TensorFlow Datasets and the HuggingFace team want to deeply thank the TensorFlow Datasets team for building this amazing library. More details on the differences between 🤗 Datasets and tfds can be found in the section Main differences between 🤗 Datasets and tfds.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
🤗 Datasets originated from a fork of the awesome TensorFlow Datasets and the HuggingFace team want to deeply thank the TensorFlow Datasets team for building this amazing library. More details on the differences between 🤗 Datasets and tfds can be found in the section Main differences between 🤗 Datasets and tfds.
🤗 Datasets originated from a fork of the awesome TensorFlow Datasets and the HuggingFace team want to deeply thank the TensorFlow Datasets team for building this amazing library. More details on the differences between 🤗 Datasets and tfds can be found in the section Main differences between 🤗 Datasets and `tfds`.

but see my earlier comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine this way. We could also change it to "Tensorflow Datasets"

-----------------------------------------

In this page, we will show you how to share a dataset with the community on the `datasets hub <https://huggingface.co/datasets>`__.
In this page, we will show you how to share a dataset with the community on the `Datasets Hub <https://huggingface.co/datasets>`__.
Copy link
Contributor

@stas00 stas00 Jul 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency may want to add 🤗 here and in all the instances of Datasets Hub below?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added them, thanks !

Copy link
Member

@lhoestq lhoestq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing consistency :) This is much cleaner now

Also feel free to ignore the CI errors, it's red just because some tags are missing in some datasets for which the docstring changed a bit

@lhoestq lhoestq merged commit 4aff493 into huggingface:master Jul 13, 2021
@mariosasko mariosasko deleted the more-consistent-names branch July 13, 2021 17:13
@mariosasko mariosasko mentioned this pull request Jul 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants