From b8b84e0c930e77f7372e8dbbf514c4845428b4d5 Mon Sep 17 00:00:00 2001
From: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
Date: Fri, 12 Aug 2022 11:48:51 +0200
Subject: [PATCH] Fix documentation card of recipe_nlg dataset

---
 datasets/recipe_nlg/README.md | 71 +++++++++++++++++++++++++++++++----
 1 file changed, 64 insertions(+), 7 deletions(-)

diff --git a/datasets/recipe_nlg/README.md b/datasets/recipe_nlg/README.md
index b5f6d10e7dc..81840d84f5c 100644
--- a/datasets/recipe_nlg/README.md
+++ b/datasets/recipe_nlg/README.md
@@ -66,7 +66,10 @@ pretty_name: RecipeNLG
 
 ### Dataset Summary
 
-[More Information Needed]
+RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation.
+
+While the RecipeNLG dataset is based on the Recipe1M+ dataset, it greatly expands the number of recipes available.
+The new dataset provides over 1 million new, preprocessed and deduplicated recipes on top of the Recipe1M+ dataset.
 
 ### Supported Tasks and Leaderboards
 
@@ -74,21 +77,52 @@ pretty_name: RecipeNLG
 
 ### Languages
 
-[More Information Needed]
+The dataset is in English.
 
 ## Dataset Structure
 
 ### Data Instances
 
-[More Information Needed]
+```
+{'id': 0,
+ 'title': 'No-Bake Nut Cookies',
+ 'ingredients': ['1 c. firmly packed brown sugar',
+  '1/2 c. evaporated milk',
+  '1/2 tsp. vanilla',
+  '1/2 c. broken nuts (pecans)',
+  '2 Tbsp. butter or margarine',
+  '3 1/2 c. bite size shredded rice biscuits'],
+ 'directions': ['In a heavy 2-quart saucepan, mix brown sugar, nuts, evaporated milk and butter or margarine.',
+  'Stir over medium heat until mixture bubbles all over top.',
+  'Boil and stir 5 minutes more. Take off heat.',
+  'Stir in vanilla and cereal; mix well.',
+  'Using 2 teaspoons, drop and shape into 30 clusters on wax paper.',
+  'Let stand until firm, about 30 minutes.'],
+ 'link': 'www.cookbooks.com/Recipe-Details.aspx?id=44874',
+ 'source': 0,
+ 'ner': ['brown sugar',
+  'milk',
+  'vanilla',
+  'nuts',
+  'butter',
+  'bite size shredded rice biscuits']}
+```
 
 ### Data Fields
 
-[More Information Needed]
+- `id` (`int`): ID.
+- `title` (`str`): Title of the recipe.
+- `ingredients` (`list` of `str`): Ingredients.
+- `directions` (`list` of `str`): Instruction steps.
+- `link` (`str`): URL link.
+- `source` (`ClassLabel`): Origin of each recipe record, with possible value {"Gathered", "Recipes1M"}:
+  - "Gathered" (0): Additional recipes gathered from multiple cooking web pages, using automated scripts in a web scraping process.
+  - "Recipes1M" (1): Recipes from "Recipe1M+" dataset.
+- `ner` (`list` of `str`): NER food entities.
 
 ### Data Splits
 
-[More Information Needed]
+The dataset contains a single `train` split.
 
 ## Dataset Creation
 
@@ -146,11 +180,34 @@ pretty_name: RecipeNLG
 
 ### Licensing Information
 
-[More Information Needed]
+I (the "Researcher") have requested permission to use the RecipeNLG dataset (the "Dataset") at Poznań University of Technology (PUT). In exchange for such permission, Researcher hereby agrees to the following terms and conditions:
+
+1. Researcher shall use the Dataset only for non-commercial research and educational purposes.
+2. PUT makes no representations or warranties regarding the Dataset, including but not limited to warranties of non-infringement or fitness for a particular purpose.
+3. Researcher accepts full responsibility for his or her use of the Dataset and shall defend and indemnify PUT, including its employees, Trustees, officers and agents, against any and all claims arising from Researcher's use of the Dataset including but not limited to Researcher's use of any copies of copyrighted images or text that he or she may create from the Dataset.
+4. Researcher may provide research associates and colleagues with access to the Dataset provided that they first agree to be bound by these terms and conditions.
+5. If Researcher is employed by a for-profit, commercial entity, Researcher's employer shall also be bound by these terms and conditions, and Researcher hereby represents that he or she is fully authorized to enter into this agreement on behalf of such employer.
 
 ### Citation Information
 
-[More Information Needed]
+```bibtex
+@inproceedings{bien-etal-2020-recipenlg,
+    title = "{R}ecipe{NLG}: A Cooking Recipes Dataset for Semi-Structured Text Generation",
+    author = "Bie{\'n}, Micha{\l} and
+      Gilski, Micha{\l} and
+      Maciejewska, Martyna and
+      Taisner, Wojciech and
+      Wisniewski, Dawid and
+      Lawrynowicz, Agnieszka",
+    booktitle = "Proceedings of the 13th International Conference on Natural Language Generation",
+    month = dec,
+    year = "2020",
+    address = "Dublin, Ireland",
+    publisher = "Association for Computational Linguistics",
+    url = "https://www.aclweb.org/anthology/2020.inlg-1.4",
+    pages = "22--28",
+}
+```
 
 ### Contributions