@lhoestq
Member

lhoestq commented Mar 30, 2022

I updated our tasks.json file with the new task taxonomy that is aligned with models.

The rule that defines a task is the following:

Two tasks are different if and only if the steps of their pipelines are different, i.e. if they can’t reasonably be implemented using the same coherent code. The level of granularity/complexity of that code is still to be defined; ideally I’d like to say “HF user’s level”. This is the same definition as in transformers.

I will update the tags of all the datasets in this repository in a separate PR, to keep this one readable.

Main changes:

  • conditional-text-generation is split between summarization, translation, text-generation and text2text-generation
  • speech-processing is split into automatic-speech-recognition, audio-classification, etc.
  • structure-prediction is renamed token-classification
  • abstractive-qa now belongs to text2text-generation
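
For anyone retagging a dataset by hand, the category changes above could be sketched as a small lookup. This is only an illustration: `OLD_TO_NEW` and `candidate_categories` are hypothetical names that do not exist in `datasets`, and the `speech-processing` entry lists only the two replacements named above, since the "etc." is left open.

```python
from __future__ import annotations

# Hypothetical lookup built only from the "Main changes" list above.
# A split maps one old category to several candidates; a human (or the
# dataset's task ids) still has to pick the right one.
OLD_TO_NEW = {
    "conditional-text-generation": [
        "summarization",
        "translation",
        "text-generation",
        "text2text-generation",
    ],
    # Only the two replacements named in this PR; the list is not exhaustive.
    "speech-processing": [
        "automatic-speech-recognition",
        "audio-classification",
    ],
    "structure-prediction": ["token-classification"],
}


def candidate_categories(old_category: str) -> list[str]:
    """Return the new categories an old tag may map to.

    Categories that were not renamed or split are returned unchanged.
    """
    return OLD_TO_NEW.get(old_category, [old_category])


print(candidate_categories("structure-prediction"))  # ['token-classification']
print(candidate_categories("text-classification"))   # ['text-classification']
```

A rename yields a single candidate, while a split yields several, so a split cannot be applied automatically without looking at the dataset itself.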

Here is just a simplified YAML dump of tasks.json:

audio-classification:
- keyword-spotting
- speaker-identification
- speaker-intent-classification
- emotion-recognition
- speaker-language-identification
audio-to-audio: []
automatic-speech-recognition: []
conversational:
- dialogue-generation
feature-extraction: []
fill-mask:
- slot-filling
- masked-language-modeling
image-classification:
- multi-label-image-classification
- multi-class-image-classification
image-segmentation:
- instance-segmentation
- semantic-segmentation
- panoptic-segmentation
image-to-text:
- image-captioning
multiple-choice:
- multiple-choice-qa
- multiple-choice-coreference-resolution
object-detection:
- face-detection
- vehicle-detection
question-answering:
- extractive-qa
- open-domain-qa
- closed-domain-qa
sentence-similarity: []
tabular-classification: []
tabular-to-text:
- rdf-to-text
summarization:
- news-articles-summarization
- news-articles-headline-generation
table-to-text: []
table-question-answering: []
text-classification:
- acceptability-classification
- entity-linking-classification
- fact-checking
- intent-classification
- multi-class-classification
- multi-label-classification
- natural-language-inference
- semantic-similarity-classification
- sentiment-classification
- topic-classification
- semantic-similarity-scoring
- sentiment-scoring
- sentiment-analysis
- hate-speech-detection
- text-scoring
text-generation:
- dialogue-modeling
- language-modeling
text-retrieval:
- document-retrieval
- utterance-retrieval
- entity-linking-retrieval
- fact-checking-retrieval
text-to-image: []
text-to-tabular:
- relation-extraction
- semantic-role-labeling
text-to-speech: []
text2text-generation:
- text-simplification
- explanation-generation
- abstractive-qa
- open-domain-abstractive-qa
- closed-domain-qa
- open-book-qa
- closed-book-qa
time-series-forecasting:
- univariate-time-series-forecasting
- multivariate-time-series-forecasting
token-classification:
- named-entity-recognition
- part-of-speech-tagging
- parsing
- lemmatization
- word-sense-disambiguation
- coreference-resolution
translation: []
visual-question-answering: []
voice-activity-detection: []
zero-shot-classification: []
zero-shot-image-classification: []
reinforcement-learning: []
other: []
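
To show how this shape could be consumed, here is a minimal sketch that inverts a small excerpt of the dump into a task-id lookup. It assumes the simplified "category: [task ids]" shape shown above, not the actual layout of tasks.json, and `TAXONOMY_EXCERPT`, `build_reverse_index` and `unknown_task_ids` are hypothetical names.

```python
from __future__ import annotations

from collections import defaultdict

# Excerpt copied from the simplified dump above (not the full taxonomy).
TAXONOMY_EXCERPT = {
    "question-answering": ["extractive-qa", "open-domain-qa", "closed-domain-qa"],
    "text2text-generation": ["abstractive-qa", "closed-domain-qa", "open-book-qa"],
    "token-classification": ["named-entity-recognition", "part-of-speech-tagging"],
    "translation": [],
}


def build_reverse_index(taxonomy: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each task id to every category that lists it."""
    index: dict[str, list[str]] = defaultdict(list)
    for category, task_ids in taxonomy.items():
        for task_id in task_ids:
            index[task_id].append(category)
    return dict(index)


def unknown_task_ids(task_ids: list[str], index: dict[str, list[str]]) -> list[str]:
    """Return the task ids that are not present in the taxonomy."""
    return [task_id for task_id in task_ids if task_id not in index]


index = build_reverse_index(TAXONOMY_EXCERPT)
print(index["closed-domain-qa"])  # ['question-answering', 'text2text-generation']
print(unknown_task_ids(["extractive-qa", "structure-prediction"], index))  # ['structure-prediction']
```

Note that `closed-domain-qa` appears under two categories in the dump, which is why the index maps each task id to a list rather than to a single category.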

Feel free to comment and give suggestions, especially if you think we can also align this list with other projects.

cc @julien-c @osanseviero @severo @lewtun @yjernite @albertvillanova @mariosasko @polinaeterna

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Mar 30, 2022

The documentation is not available anymore as the PR was closed or merged.

@julien-c
Member

Yay! This is exciting! Note that we would probably be able to generate this JSON directly from huggingface/hub-docs' Types.ts file (cc @osanseviero)

@osanseviero
Contributor

The following issue should make this much easier 😄 huggingface/hub-docs#83

lhoestq requested a review from osanseviero April 1, 2022 15:16
@lhoestq
Member Author

lhoestq commented Apr 1, 2022

So far I think I've addressed all the comments I got on Slack, but feel free to do a review @osanseviero and let me know if it sounds good to you.

@lewtun
Member

lewtun commented Apr 5, 2022

It just occurred to me that we should probably restart the datasets-tagging space once this is merged to update all the task categories there: https://huggingface.co/spaces/huggingface/datasets-tagging

@lhoestq
Member Author

lhoestq commented Apr 5, 2022

Yes, let me update it now

@lhoestq
Member Author

lhoestq commented Apr 5, 2022

Contributor

osanseviero left a comment

Thank you!

As discussed, once huggingface/hub-docs#83 is done, this JSON file will be generated automatically.

lhoestq merged commit 3b695f2 into master Apr 8, 2022
lhoestq deleted the update-task-list branch April 8, 2022 12:20
@julien-c
Member

The current automated export is visible at #4154.
