hobson (Contributor) commented Jun 16, 2022

The YAML header at the top of the README.md file was edited to add task tags because I couldn't find the existing tags in the JSON.
A separate pull request will modify dataset_infos.json to add these tags.

The Enron dataset (dataset id aeslc) is only tagged with:

arxiv:1906.03497
languages:en
pretty_name:AESLC

Using the email subject_line field as a label or target variable, it is possible to create models for the following task_ids (in order of relevance):

'task_ids:summarization'
'task_ids:summarization-other-conversations-summarization'
'task_ids:other-other-query-based-multi-document-summarization'
'task_ids:summarization-other-aspect-based-summarization'
'task_ids:summarization-other-headline-generation'
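
As a rough sketch of how subject_line could serve as the target, the snippet below turns records into (source, target) pairs for a summarization model. The `email_body` field name, the `to_summarization_pairs` helper, and the toy records are assumptions for illustration only; real records would come from loading the aeslc dataset.

```python
from typing import Dict, Iterable, List, Tuple


def to_summarization_pairs(
    records: Iterable[Dict[str, str]],
    source_field: str = "email_body",    # assumed name of the body field
    target_field: str = "subject_line",  # field named in this PR
) -> List[Tuple[str, str]]:
    """Return (source, target) pairs, skipping records with an empty side."""
    pairs = []
    for record in records:
        source = record.get(source_field, "").strip()
        target = record.get(target_field, "").strip()
        if source and target:
            pairs.append((source, target))
    return pairs


# Toy records (made up), standing in for rows of the dataset.
toy_records = [
    {"email_body": "Please review the attached draft by Friday.\n",
     "subject_line": "Draft review"},
    {"email_body": "", "subject_line": "Empty body is skipped"},
]
pairs = to_summarization_pairs(toy_records)
```

Each pair can then be fed to any sequence-to-sequence training loop, with the email body as input and the subject line as the reference summary.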

The subject might also be used for the task_category "task_categories:summarization"

E-mail chains might be used for the task category "task_categories:dialogue-system"
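
For illustration, here is a minimal sketch of how `key:value` tags like the ones above could be grouped into a YAML header block. The `tags_to_yaml_header` helper is hypothetical and the rendered layout is an assumption; it is not necessarily the exact header committed to README.md, and it only handles list-valued keys such as task_ids and task_categories.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def tags_to_yaml_header(tags: Iterable[str]) -> str:
    """Group 'key:value' tags by key and render a YAML front-matter block."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for tag in tags:
        key, _, value = tag.partition(":")  # split on the first colon only
        grouped[key].append(value)
    lines = ["---"]
    for key, values in grouped.items():  # dicts keep insertion order
        lines.append(f"{key}:")
        lines.extend(f"- {value}" for value in values)
    lines.append("---")
    return "\n".join(lines)


# A subset of the tags proposed in this PR.
proposed_tags = [
    "task_categories:summarization",
    "task_ids:summarization",
    "task_ids:summarization-other-conversations-summarization",
]
header = tags_to_yaml_header(proposed_tags)
print(header)
```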

hobson (Contributor, Author) commented Jun 17, 2022

Associated community discussion is here.
The paper referenced in dataset_infos.json is here. It mentions the email-subject-generation task, which is not a tag used by any other dataset, so it was not added in this pull request. The summarization task is mentioned as a related task.

HuggingFaceDocBuilderDev commented Jun 17, 2022

The documentation is not available anymore as the PR was closed or merged.

lhoestq merged commit 7125c98 into huggingface:main on Jul 8, 2022