Merged
3 changes: 2 additions & 1 deletion datasets/event2Mind/README.md
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@
languages:
- en
paperswithcode_id: event2mind
pretty_name: Event2Mind
---

# Dataset Card for "event2Mind"
@@ -164,4 +165,4 @@ The data fields are the same among all splits.

### Contributions

Thanks to [@thomwolf](https://github.com/thomwolf), [@patrickvonplaten](https://github.com/patrickvonplaten), [@lewtun](https://github.com/lewtun) for adding this dataset.
5 changes: 3 additions & 2 deletions datasets/factckbr/README.md
@@ -18,9 +18,10 @@ task_categories:
task_ids:
- fact-checking
paperswithcode_id: null
pretty_name: FACTCK BR
---

# Dataset Card for [Dataset Name]
# Dataset Card for FACTCK BR

## Table of Contents
- [Dataset Description](#dataset-description)
@@ -142,4 +143,4 @@ The FACTCK.BR dataset contains 1309 claims with its corresponding label.

### Contributions

Thanks to [@hugoabonizio](https://github.com/hugoabonizio) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/fake_news_english/README.md
@@ -18,6 +18,7 @@ task_categories:
task_ids:
- multi-label-classification
paperswithcode_id: null
pretty_name: Fake News English
---

# Dataset Card for Fake News English
@@ -165,4 +166,4 @@ doi = {10.1145/3201064.3201100}

### Contributions

Thanks to [@MisbahKhan789](https://github.com/MisbahKhan789), [@lhoestq](https://github.com/lhoestq) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/fake_news_filipino/README.md
@@ -18,6 +18,7 @@ task_categories:
task_ids:
- fact-checking
paperswithcode_id: fake-news-filipino-dataset
pretty_name: Fake News Filipino
---

# Dataset Card for Fake News Filipino
@@ -156,4 +157,4 @@ Jan Christian Blaise Cruz, Julianne Agatha Tan, and Charibeth Cheng

### Contributions

Thanks to [@anaerobeth](https://github.com/anaerobeth) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/farsi_news/README.md
@@ -18,6 +18,7 @@ task_categories:
task_ids:
- language-modeling
paperswithcode_id: null
pretty_name: FarsiNews
---

# Dataset Card Creation Guide
@@ -153,4 +154,4 @@ https://github.com/sci2lab/Farsi-datasets

### Contributions

Thanks to [@Narsil](https://github.com/Narsil) for adding this dataset.
1 change: 1 addition & 0 deletions datasets/fashion_mnist/README.md
@@ -15,6 +15,7 @@ task_categories:
task_ids:
- other-other-image-classification
paperswithcode_id: fashion-mnist
pretty_name: FashionMNIST
---

# Dataset Card for FashionMNIST
1 change: 1 addition & 0 deletions datasets/few_rel/README.md
@@ -22,6 +22,7 @@ task_categories:
task_ids:
- other-other-relation-extraction
paperswithcode_id: fewrel
pretty_name: Few-Shot Relation Classification Dataset
---

# Dataset Card for few_rel
1 change: 1 addition & 0 deletions datasets/financial_phrasebank/README.md
@@ -19,6 +19,7 @@ task_ids:
- multi-class-classification
- sentiment-classification
paperswithcode_id: null
pretty_name: FinancialPhrasebank
---

# Dataset Card for financial_phrasebank
3 changes: 2 additions & 1 deletion datasets/finer/README.md
@@ -18,6 +18,7 @@ task_categories:
task_ids:
- named-entity-recognition
paperswithcode_id: finer
pretty_name: Finnish News Corpus for Named Entity Recognition
---

# Dataset Card for [Dataset Name]
@@ -155,4 +156,4 @@ IOB2 labeling scheme is used.

### Contributions

Thanks to [@stefan-it](https://github.com/stefan-it) for adding this dataset.
11 changes: 6 additions & 5 deletions datasets/fquad/README.md
@@ -23,9 +23,10 @@ task_ids:
- extractive-qa
- closed-domain-qa
paperswithcode_id: fquad
pretty_name: "FQuAD: French Question Answering Dataset"
---

# Dataset Card for "fquad"
# Dataset Card for FQuAD

## Table of Contents
- [Dataset Description](#dataset-description)
@@ -63,10 +64,10 @@ paperswithcode_id: fquad
### Dataset Summary

FQuAD: French Question Answering Dataset
We introduce FQuAD, a native French Question Answering Dataset.

FQuAD contains 25,000+ question and answer pairs.
Finetuning CamemBERT on FQuAD yields an F1 score of 88% and an exact match of 77.9%.
Developed to provide a SQuAD equivalent in the French language. Questions are original and based on high-quality Wikipedia articles.
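FQuAD follows the nested article → paragraph → QA-pair layout of SQuAD (an assumption based on the "SQuAD equivalent" description above; field names below mirror the SQuAD v1.1 schema). A minimal sketch of walking that structure to count question–answer pairs on a toy record:

```python
# Toy FQuAD-style record. Field names ("data", "paragraphs", "qas", ...)
# are assumed from the SQuAD v1.1 schema the dataset mirrors.
data = {
    "data": [
        {
            "title": "Lune",
            "paragraphs": [
                {
                    "context": "La Lune est l'unique satellite naturel de la Terre.",
                    "qas": [
                        {
                            "question": "Quel est l'unique satellite naturel de la Terre ?",
                            "answers": [{"text": "La Lune", "answer_start": 0}],
                        }
                    ],
                }
            ],
        }
    ]
}

def count_questions(squad_like: dict) -> int:
    """Count QA pairs across every article and every paragraph."""
    return sum(
        len(para["qas"])
        for article in squad_like["data"]
        for para in article["paragraphs"]
    )

print(count_questions(data))  # -> 1
```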

### Supported Tasks and Leaderboards
@@ -116,7 +117,7 @@ The data fields are the same among all splits.

### Data Splits

The FQuAD dataset has 3 splits: _train_, _validation_, and _test_. The _test_ split, however, is not released publicly at the moment. The splits contain disjoint sets of articles. The following table contains stats about each split.

Dataset Split | Number of Articles in Split | Number of paragraphs in split | Number of questions in split
--------------|------------------------------|--------------------------|-------------------------
@@ -134,7 +135,7 @@ Test | 10 | 532 | 2189
The text used for the contexts is from the curated list of French High-Quality Wikipedia [articles](https://fr.wikipedia.org/wiki/Cat%C3%A9gorie:Article_de_qualit%C3%A9).
### Annotations

Annotations (spans and questions) are written by students of the CentraleSupélec school of engineering.
Wikipedia articles were scraped and Illuin used an internally-developed tool to help annotators ask questions and indicate the answer spans.
Annotators were given paragraph-sized contexts and asked to generate 4 to 5 non-trivial questions about information in the context.

1 change: 1 addition & 0 deletions datasets/freebase_qa/README.md
@@ -18,6 +18,7 @@ task_categories:
task_ids:
- open-domain-qa
paperswithcode_id: freebaseqa
pretty_name: FreebaseQA
---

# Dataset Card for FreebaseQA
3 changes: 2 additions & 1 deletion datasets/gap/README.md
@@ -2,6 +2,7 @@
languages:
- en
paperswithcode_id: gap
pretty_name: GAP Benchmark Suite
---

# Dataset Card for "gap"
@@ -187,4 +188,4 @@ The data fields are the same among all splits.

### Contributions

Thanks to [@thomwolf](https://github.com/thomwolf), [@patrickvonplaten](https://github.com/patrickvonplaten), [@otakumesi](https://github.com/otakumesi), [@lewtun](https://github.com/lewtun) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/generics_kb/README.md
@@ -25,6 +25,7 @@ task_categories:
task_ids:
- other-other-knowledge-base
paperswithcode_id: genericskb
pretty_name: GenericsKB
---

# Dataset Card for Generics KB
@@ -205,4 +206,4 @@ publisher = {Allen Institute for AI},

### Contributions

Thanks to [@bpatidar](https://github.com/bpatidar) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/german_legal_entity_recognition/README.md
@@ -18,6 +18,7 @@ task_categories:
task_ids:
- named-entity-recognition
paperswithcode_id: legal-documents-entity-recognition
pretty_name: Legal Documents Entity Recognition
---

# Dataset Card Creation Guide
@@ -142,4 +143,4 @@ paperswithcode_id: legal-documents-entity-recognition
[More Information Needed]
### Contributions

Thanks to [@abhishekkrthakur](https://github.com/abhishekkrthakur) for adding this dataset.
13 changes: 7 additions & 6 deletions datasets/germaner/README.md
@@ -1,7 +1,7 @@
---
annotations_creators:
- crowdsourced
language_creators:
- found
languages:
- de
@@ -18,9 +18,10 @@ task_categories:
task_ids:
- named-entity-recognition
paperswithcode_id: null
pretty_name: GermaNER
---

# Dataset Card Creation Guide
# Dataset Card for GermaNER

## Table of Contents
- [Dataset Description](#dataset-description)
@@ -72,8 +73,8 @@ An example instance looks as follows:

```
{
'id': '3',
'ner_tags': [1, 5, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8],
'tokens': ['Bayern', 'München', 'ist', 'wieder', 'alleiniger', 'Top-', 'Favorit', 'auf', 'den', 'Gewinn', 'der', 'deutschen', 'Fußball-Meisterschaft', '.']
}
```
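The integer `ner_tags` in the instance above map to IOB2 labels through the dataset's `ClassLabel` feature. Assuming the alphabetical label order commonly used by the loader (`B-LOC`, `B-ORG`, `B-OTH`, `B-PER`, `I-LOC`, `I-ORG`, `I-OTH`, `I-PER`, `O` — an assumption worth verifying against `features["ner_tags"].feature.names`), a quick sketch decoding the tags:

```python
# Assumed IOB2 label order; verify against the loaded dataset's
# features["ner_tags"].feature.names before relying on it.
LABELS = ["B-LOC", "B-ORG", "B-OTH", "B-PER",
          "I-LOC", "I-ORG", "I-OTH", "I-PER", "O"]

tokens = ['Bayern', 'München', 'ist', 'wieder', 'alleiniger', 'Top-',
          'Favorit', 'auf', 'den', 'Gewinn', 'der', 'deutschen',
          'Fußball-Meisterschaft', '.']
ner_tags = [1, 5, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8]

# Pair each token with its decoded label name.
decoded = [(tok, LABELS[tag]) for tok, tag in zip(tokens, ner_tags)]
print(decoded[:2])
```

Under that label order, "Bayern München" decodes as an ORG span and the remaining tokens as `O`, which is consistent with the example sentence.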
@@ -190,7 +191,7 @@ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
You must give any other recipients of the Work or Derivative Works a copy of this License; and
You must cause any modified files to carry prominent notices stating that You changed the files; and
You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and
If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License.

You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions.
3 changes: 2 additions & 1 deletion datasets/germeval_14/README.md
@@ -1,5 +1,6 @@
---
paperswithcode_id: null
pretty_name: GermEval14
---

# Dataset Card for "germeval_14"
@@ -168,4 +169,4 @@ The data fields are the same among all splits.

### Contributions

Thanks to [@thomwolf](https://github.com/thomwolf), [@jplu](https://github.com/jplu), [@lewtun](https://github.com/lewtun), [@lhoestq](https://github.com/lhoestq), [@stefan-it](https://github.com/stefan-it), [@mariamabarham](https://github.com/mariamabarham) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/giga_fren/README.md
@@ -19,6 +19,7 @@ task_categories:
task_ids:
- machine-translation
paperswithcode_id: null
pretty_name: GigaFren
---

# Dataset Card Creation Guide
@@ -145,4 +146,4 @@ Here are some examples of questions and facts:
[More Information Needed]
### Contributions

Thanks to [@abhishekkrthakur](https://github.com/abhishekkrthakur) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/gigaword/README.md
@@ -2,6 +2,7 @@
languages:
- en
paperswithcode_id: null
pretty_name: gigaword
---

# Dataset Card for "gigaword"
@@ -176,4 +177,4 @@ The data fields are the same among all splits.

### Contributions

Thanks to [@lewtun](https://github.com/lewtun), [@lhoestq](https://github.com/lhoestq), [@thomwolf](https://github.com/thomwolf) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/glucose/README.md
@@ -18,6 +18,7 @@ task_categories:
task_ids:
- sequence-modeling-other-common-sense-inference
paperswithcode_id: glucose
pretty_name: GLUCOSE
---

# Dataset Card for [Dataset Name]
@@ -232,4 +233,4 @@ Creative Commons Attribution-NonCommercial 4.0 International Public License
```
### Contributions

Thanks to [@TevenLeScao](https://github.com/TevenLeScao) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/glue/README.md
@@ -64,9 +64,10 @@ task_ids:
wnli:
- text-classification-other-coreference-nli
paperswithcode_id: glue
pretty_name: GLUE (General Language Understanding Evaluation benchmark)
---

# Dataset Card for "glue"
# Dataset Card for GLUE

## Table of Contents
- [Dataset Description](#dataset-description)
3 changes: 2 additions & 1 deletion datasets/gnad10/README.md
@@ -18,6 +18,7 @@ task_categories:
task_ids:
- topic-classification
paperswithcode_id: null
pretty_name: 10k German News Articles Datasets
---

# Dataset Card for 10k German News Articles Datasets
@@ -153,4 +154,4 @@ This dataset is licensed under the Creative Commons Attribution-NonCommercial-Sh

### Contributions

Thanks to [@stevhliu](https://github.com/stevhliu) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/go_emotions/README.md
@@ -23,6 +23,7 @@ task_ids:
- multi-label-classification
- text-classification-other-emotion
paperswithcode_id: goemotions
pretty_name: GoEmotions
---

# Dataset Card for GoEmotions
@@ -181,4 +182,4 @@ The GitHub repository which houses this dataset has an

### Contributions

Thanks to [@joeddav](https://github.com/joeddav) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/google_wellformed_query/README.md
@@ -16,6 +16,7 @@ size_categories:
licenses:
- CC-BY-SA-4.0
paperswithcode_id: null
pretty_name: GoogleWellformedQuery
---

# Dataset Card Creation Guide
@@ -154,4 +155,4 @@ Query-wellformedness dataset is licensed under CC BY-SA 4.0. Any third party con

### Contributions

Thanks to [@vasudevgupta7](https://github.com/vasudevgupta7) for adding this dataset.
3 changes: 2 additions & 1 deletion datasets/grail_qa/README.md
@@ -18,6 +18,7 @@ task_categories:
task_ids:
- question-answering-other-knowledge-base-qa
paperswithcode_id: null
pretty_name: Grail QA
---

# Dataset Card for Grail QA
@@ -178,4 +179,4 @@ Test | 13,231

### Contributions

Thanks to [@mattbui](https://github.com/mattbui) for adding this dataset.