Merged
24 commits
7dac9cf
removed spacy import
RakshitKhajuria Jun 4, 2023
004c942
updated hf notebook for summarization models and removed api key
RakshitKhajuria Jun 4, 2023
7aab333
updated hf notebook
RakshitKhajuria Jun 4, 2023
11c2e52
updated notebook with new QA benchmark datasets
RakshitKhajuria Jun 4, 2023
b6bd1a7
added dyslexia_word_swap in utils
RakshitKhajuria Jun 4, 2023
9da3ee1
added new robustness test in notebooks
RakshitKhajuria Jun 4, 2023
ba497ea
Updated notebooks
RakshitKhajuria Jun 4, 2023
63ecc0f
updated slangifytypo alias name
RakshitKhajuria Jun 4, 2023
86bca65
updated new tests on website
RakshitKhajuria Jun 4, 2023
399fa85
updated website for truthfulqa data
RakshitKhajuria Jun 4, 2023
782a9f8
updated website for MMLU data
RakshitKhajuria Jun 4, 2023
9645958
updated website for narrativeqa, hellaswag,quac data
RakshitKhajuria Jun 4, 2023
5a1c19c
updated website for qpenbookQA
RakshitKhajuria Jun 4, 2023
60eb452
updated spacy model path
RakshitKhajuria Jun 5, 2023
67be7fa
update tutorials page
ArshaanNazir Jun 5, 2023
d1d235d
Merge branch 'docs/update-nb-docs' of https://github.com/JohnSnowLabs…
ArshaanNazir Jun 5, 2023
13d01d9
updated representation links
RakshitKhajuria Jun 5, 2023
2e6e85d
Merge branch 'docs/update-nb-docs' of https://github.com/JohnSnowLabs…
RakshitKhajuria Jun 5, 2023
c95be54
add one-liner toxicity
ArshaanNazir Jun 5, 2023
ac06b7d
update supported tasks
ArshaanNazir Jun 5, 2023
eaade01
update supported hubs
ArshaanNazir Jun 5, 2023
29a7fae
update config for llms
ArshaanNazir Jun 5, 2023
310092d
add toxicity test to website
ArshaanNazir Jun 5, 2023
ab3eab9
updated installation and data pages for toxicity
RakshitKhajuria Jun 5, 2023
41 changes: 38 additions & 3 deletions demo/tutorials/AI21_QA_Summarization_Testing_Notebook.ipynb
@@ -185,15 +185,42 @@
"source": [
"We have specified task as QA, hub as AI21 and model as `j2-jumbo-instruct`.\n",
"\n",
"For dataset we used BoolQ-test-tiny which includes 50 lines from BoolQ-test. Other available datasets are:\n",
"For dataset we used `BoolQ-test-tiny` which includes 50 lines from BoolQ-test. Other available datasets are:\n",
"\n",
"#### BoolQ\n",
"* `BoolQ-test-tiny`\n",
"* `BoolQ-test`\n",
"* `BoolQ-combined`\n",
"#### NQ-open\n",
"* `NQ-open-test`\n",
"* `NQ-open-combined`\n",
"* `NQ-open-test-tiny`\n",
"\n"
"#### TruthfulQA\n",
"* `TruthfulQA-combined`\n",
"* `TruthfulQA-test`\n",
"* `TruthfulQA-tiny`\n",
"* `TruthfulQA-train`\n",
"#### MMLU\n",
"* `MMLU-dev-tiny`\n",
"* `MMLU-test-tiny`\n",
"* `MMLU-val-tinyt`\n",
"#### OpenBookQA\n",
"* `OpenBookQA-test`\n",
"* `OpenBookQA-train`\n",
"* `OpenBookQA-dev`\n",
"* `OpenBookQA-test-tiny`\n",
"* `OpenBookQA-train-tiny`\n",
"* `OpenBookQA-dev-tiny`\n",
"#### QUAC\n",
"* `Quac-val`\n",
"* `Quac-val-tiny`\n",
"* `Quac-train`\n",
"* `Quac-train-tiny`\n",
"#### NarrativeQA\n",
"* `NarrativeQA-test`\n",
"* `NarrativeQA-test-tiny`\n",
"* `HellaSwag-test`\n",
"* `HellaSwag-test-tiny`"
]
},
{
@@ -213,7 +240,11 @@
"* `strip_punctuation`\n",
"* `titlecase`\n",
"* `uppercase`\n",
"* `number_to_word`"
"* `number_to_word`\n",
"* `add_abbreviation`\n",
"* `add_speech_to_text_typo`\n",
"* `add_slangs`\n",
"* `dyslexia_word_swap`"
]
},
{
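As a rough illustration of what a few of the robustness test names listed above suggest, here is a minimal sketch that applies three of them as plain string perturbations. The functions `uppercase`, `titlecase`, `strip_punctuation` and the helper `perturb` are hypothetical stand-ins written for this example; they are not the library's implementation, and the real tests may behave differently (for instance, stripping only trailing punctuation).

```python
import string

# Hypothetical stand-ins for three of the listed robustness tests.
# These are NOT the library's implementations, only an illustration.

def uppercase(text: str) -> str:
    """Return the text with every letter upper-cased."""
    return text.upper()

def titlecase(text: str) -> str:
    """Return the text with the first letter of each word capitalized."""
    return text.title()

def strip_punctuation(text: str) -> str:
    """Return the text with all ASCII punctuation characters removed."""
    return text.translate(str.maketrans("", "", string.punctuation))

# Map test names (as they appear in the notebook lists) to the sketch functions.
PERTURBATIONS = {
    "uppercase": uppercase,
    "titlecase": titlecase,
    "strip_punctuation": strip_punctuation,
}

def perturb(text: str, test_name: str) -> str:
    """Apply the named perturbation to a single input string."""
    return PERTURBATIONS[test_name](text)

if __name__ == "__main__":
    question = "is the sky blue?"
    for name in PERTURBATIONS:
        print(f"{name}: {perturb(question, name)}")
```

A robustness test of this kind compares the model's answer on the original input with its answer on the perturbed input; the comparison step is left out here because it depends on the model and task.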
@@ -1801,6 +1832,10 @@
"* `titlecase`\n",
"* `uppercase`\n",
"* `number_to_word`\n",
"* `add_abbreviation`\n",
"* `add_speech_to_text_typo`\n",
"* `add_slangs`\n",
"* `dyslexia_word_swap`\n",
"\n",
"Available Bias tests for summarization task are:\n",
"\n",
@@ -191,14 +191,42 @@
"source": [
"We have specified task as QA, hub as OpenAI and model as text-davinci-003, text-davinci-002 whatever model available from azure openai services.\n",
"\n",
"For dataset we used BoolQ-test-tiny which includes 50 lines from BoolQ-test. Other available datasets are:\n",
"For dataset we used `BoolQ-test-tiny` which includes 50 lines from BoolQ-test. Other available datasets are:\n",
"\n",
"#### BoolQ\n",
"* `BoolQ-test-tiny`\n",
"* `BoolQ-test`\n",
"* `BoolQ-combined`\n",
"#### NQ-open\n",
"* `NQ-open-test`\n",
"* `NQ-open-combined`\n",
"* `NQ-open-test-tiny`\n",
"#### TruthfulQA\n",
"* `TruthfulQA-combined`\n",
"* `TruthfulQA-test`\n",
"* `TruthfulQA-tiny`\n",
"* `TruthfulQA-train`\n",
"#### MMLU\n",
"* `MMLU-dev-tiny`\n",
"* `MMLU-test-tiny`\n",
"* `MMLU-val-tinyt`\n",
"#### OpenBookQA\n",
"* `OpenBookQA-test`\n",
"* `OpenBookQA-train`\n",
"* `OpenBookQA-dev`\n",
"* `OpenBookQA-test-tiny`\n",
"* `OpenBookQA-train-tiny`\n",
"* `OpenBookQA-dev-tiny`\n",
"#### QUAC\n",
"* `Quac-val`\n",
"* `Quac-val-tiny`\n",
"* `Quac-train`\n",
"* `Quac-train-tiny`\n",
"#### NarrativeQA\n",
"* `NarrativeQA-test`\n",
"* `NarrativeQA-test-tiny`\n",
"* `HellaSwag-test`\n",
"* `HellaSwag-test-tiny`\n",
"\n"
]
},
@@ -219,7 +247,11 @@
"* `strip_punctuation`\n",
"* `titlecase`\n",
"* `uppercase`\n",
"* `number_to_word`"
"* `number_to_word`\n",
"* `add_abbreviation`\n",
"* `add_speech_to_text_typo`\n",
"* `add_slangs`\n",
"* `dyslexia_word_swap`"
]
},
{
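The test names added in this hunk would typically be enabled through a test configuration. The sketch below builds such a configuration as a plain Python dictionary. The nested `tests` / `defaults` / `robustness` layout, the `min_pass_rate` key, and the helper `build_robustness_config` are assumptions made for illustration; none of them appear in this diff, so check the library documentation before relying on this shape.

```python
# Test names taken from the robustness list in the hunk above.
ROBUSTNESS_TESTS = [
    "uppercase",
    "titlecase",
    "number_to_word",
    "add_abbreviation",
    "add_speech_to_text_typo",
    "add_slangs",
    "dyslexia_word_swap",
]

def build_robustness_config(test_names, min_pass_rate=0.65):
    """Build a nested config dict enabling each named robustness test.

    The layout (tests -> defaults / robustness -> test name -> min_pass_rate)
    is an ASSUMED shape for illustration, not confirmed by this diff.
    """
    return {
        "tests": {
            "defaults": {"min_pass_rate": min_pass_rate},
            "robustness": {
                name: {"min_pass_rate": min_pass_rate} for name in test_names
            },
        }
    }

config = build_robustness_config(ROBUSTNESS_TESTS)

if __name__ == "__main__":
    for name, settings in config["tests"]["robustness"].items():
        print(name, settings)
```

Building the dictionary from a list keeps the test names in one place, which makes a renamed alias easier to update consistently across notebooks.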
@@ -1793,6 +1825,10 @@
"* `titlecase`\n",
"* `uppercase`\n",
"* `number_to_word`\n",
"* `add_abbreviation`\n",
"* `add_speech_to_text_typo`\n",
"* `add_slang_typo`\n",
"* `dyslexia_word_swap`\n",
"\n",
"Available Bias tests for summarization task are:\n",
"\n",
40 changes: 38 additions & 2 deletions demo/tutorials/Cohere_QA_Summarization_Testing_Notebook.ipynb
@@ -194,14 +194,42 @@
"source": [
"We have specified task as QA, hub as Cohere and model as `command-xlarge-nightly`.\n",
"\n",
"For dataset we used BoolQ-test-tiny which includes 50 lines from BoolQ-test. Other available datasets are:\n",
"For dataset we used `BoolQ-test-tiny` which includes 50 lines from BoolQ-test. Other available datasets are:\n",
"\n",
"#### BoolQ\n",
"* `BoolQ-test-tiny`\n",
"* `BoolQ-test`\n",
"* `BoolQ-combined`\n",
"#### NQ-open\n",
"* `NQ-open-test`\n",
"* `NQ-open-combined`\n",
"* `NQ-open-test-tiny`\n",
"#### TruthfulQA\n",
"* `TruthfulQA-combined`\n",
"* `TruthfulQA-test`\n",
"* `TruthfulQA-tiny`\n",
"* `TruthfulQA-train`\n",
"#### MMLU\n",
"* `MMLU-dev-tiny`\n",
"* `MMLU-test-tiny`\n",
"* `MMLU-val-tinyt`\n",
"#### OpenBookQA\n",
"* `OpenBookQA-test`\n",
"* `OpenBookQA-train`\n",
"* `OpenBookQA-dev`\n",
"* `OpenBookQA-test-tiny`\n",
"* `OpenBookQA-train-tiny`\n",
"* `OpenBookQA-dev-tiny`\n",
"#### QUAC\n",
"* `Quac-val`\n",
"* `Quac-val-tiny`\n",
"* `Quac-train`\n",
"* `Quac-train-tiny`\n",
"#### NarrativeQA\n",
"* `NarrativeQA-test`\n",
"* `NarrativeQA-test-tiny`\n",
"* `HellaSwag-test`\n",
"* `HellaSwag-test-tiny`\n",
"\n"
]
},
@@ -222,7 +250,11 @@
"* `strip_punctuation`\n",
"* `titlecase`\n",
"* `uppercase`\n",
"* `number_to_word`"
"* `number_to_word`\n",
"* `add_abbreviation`\n",
"* `add_speech_to_text_typo`\n",
"* `add_slangs`\n",
"* `dyslexia_word_swap`"
]
},
{
@@ -703,6 +735,10 @@
"* `titlecase`\n",
"* `uppercase`\n",
"* `number_to_word`\n",
"* `add_abbreviation`\n",
"* `add_speech_to_text_typo`\n",
"* `add_slangs`\n",
"* `dyslexia_word_swap`\n",
"\n",
"Available Bias tests for summarization task are:\n",
"\n",