
Releases: Pacific-AI-Corp/langtest

John Snow Labs NLP Test 1.5.0: Amplifying Model Comparisons, Bias Tests, Runtime Checks, Harnessing HF Datasets for Superior Text Classification and Introducing Augmentation Proportion Control

16 Jun 14:49
192ca0b



πŸ“’ Overview

NLP Test 1.5.0 πŸš€ comes with brand-new features, including the ability to compare different models from the same or different hubs within a single Harness across robustness, representation, bias, fairness and accuracy tests. It adds support for runtime checks and for passing custom replacement dictionaries for bias testing, as well as support for Hugging Face datasets on the text classification task, along with many other enhancements and bug fixes!

A big thank you to our early-stage community for their contributions, feedback, questions, and feature requests πŸŽ‰

Make sure to give the project a star right here ⭐


πŸ”₯ New Features & Enhancements

  • Adding support for Model Comparisons #514
  • Adding support for passing custom replacement dictionaries #509
  • Adding support for Hugging Face datasets for the text classification task #511
  • Adding support for runtime checks #515
  • Adding support for Augmentation Proportion Control #506
  • Adding new tutorial notebooks #526

πŸ› Bug Fixes

  • Review issues with add-context for QA #507

❓ How to Use

Get started now! πŸ‘‡

pip install nlptest

Create your test harness in 3 lines of code πŸ§ͺ

# Import the Harness class
from nlptest import Harness

# Define a dictionary of models to compare (model name -> hub)
models = {
  "ner.dl": "johnsnowlabs",
  "en_core_web_sm": "spacy"
}

# Create a Harness object that compares both models on the same test data
h = Harness(task='ner', model=models, data='/Path-to-test-conll')

# Generate test cases, run them and view a report
h.generate().run().report()
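
This release also introduces augmentation proportion control (#506), which lets you cap how much of the training data each perturbation touches when augmenting. Here is a minimal sketch of how that might be wired up, assuming an augment() method that accepts per-test proportions; the keyword arguments, test names and file paths below are illustrative assumptions rather than the confirmed API:

# Hypothetical sketch: augment the training data after testing, limiting the
# share of examples each perturbation may modify (all names below are assumptions)
custom_proportions = {
    'add_typo': 0.3,   # apply add_typo to at most ~30% of the training examples
    'lowercase': 0.2   # apply lowercase to at most ~20%
}

h.augment(
    input_path='/Path-to-train-conll',        # hypothetical training data path
    output_path='/Path-to-augmented-conll',   # hypothetical output path
    custom_proportions=custom_proportions     # assumed keyword argument
)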

πŸ“– Documentation


❀️ Community support

  • Slack: for live discussion with the NLP Test community, join the #nlptest channel
  • GitHub: for bug reports, feature requests, and contributions
  • Discussions: to engage with other community members, share ideas, and show off how you use NLP Test!

We would love to have you join the mission πŸ‘‰ open an issue, a PR, or give us some feedback on features you'd like to see! πŸ™Œ


♻️ Changelog

What's Changed

Full Changelog: v1.4.0...v1.5.0

John Snow Labs NLP Test 1.4.0: Enhancing Support for Toxicity Tests and New QA Benchmark Datasets (NarrativeQA, TruthfulQA, QuAC, HellaSwag, MMLU and OpenBookQA)

06 Jun 14:17
8cc420c


πŸ“’ Overview

NLP Test 1.4.0 πŸš€ comes with brand-new features, including the ability to test Large Language Models for toxicity and support for new QA benchmark datasets (NarrativeQA, TruthfulQA, QuAC, HellaSwag, MMLU and OpenBookQA) across robustness, representation, fairness and accuracy tests. It also adds several new robustness tests, along with many other enhancements and bug fixes!

A big thank you to our early-stage community for their contributions, feedback, questions, and feature requests πŸŽ‰

Make sure to give the project a star right here ⭐


πŸ”₯ New Features & Enhancements

  • Adding support for NarrativeQA dataset #487
  • Adding support for toxicity task #488
  • Adding support for TruthfulQA dataset #477
  • Adding support for new dyslexia swap test for robustness testing #474
  • Adding support for new slangificator test for robustness testing #463
  • Adding support for new abbreviation test for robustness testing #471
  • Adding support for OpenBookQA dataset #479
  • Adding support for MMLU dataset #481
  • Adding support for the HellaSwag dataset #486
  • Adding new tutorial notebooks #497

❓ How to Use

Get started now! πŸ‘‡

pip install nlptest

Create your test harness in 3 lines of code πŸ§ͺ

import os

# Set OpenAI API key
os.environ['OPENAI_API_KEY'] = ''

# Import and create a Harness object
from nlptest import Harness
h = Harness(task='toxicity', model='text-davinci-002', hub='openai', data='toxicity-test-tiny')

# Generate test cases, run them and view a report
h.generate().run().report()
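
The same three-line flow applies to the new QA benchmark datasets. Here is a minimal sketch using MMLU, assuming the dataset identifier follows the same 'NAME-test-tiny' convention as 'toxicity-test-tiny' above; the exact string is an assumption, so check the docs for the supported dataset names:

import os

# Set OpenAI API key
os.environ['OPENAI_API_KEY'] = ''

# Import and create a Harness object ('MMLU-test-tiny' is an assumed dataset identifier)
from nlptest import Harness
h = Harness(task='question-answering', model='text-davinci-002', hub='openai', data='MMLU-test-tiny')

# Generate test cases, run them and view a report
h.generate().run().report()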

πŸ“– Documentation


❀️ Community support

  • Slack: for live discussion with the NLP Test community, join the #nlptest channel
  • GitHub: for bug reports, feature requests, and contributions
  • Discussions: to engage with other community members, share ideas, and show off how you use NLP Test!

We would love to have you join the mission πŸ‘‰ open an issue, a PR, or give us some feedback on features you'd like to see! πŸ™Œ


♻️ Changelog

What's Changed

New Contributors

Full Changelog: v1.3.0...v1.4.0

John Snow Labs NLP Test 1.3.0: Enhancing Support for Evaluating Large Language Models in Summarization

25 May 11:47
2683c1e


πŸ“’ Overview

NLP Test 1.3.0 πŸš€ comes with brand-new features, including the ability to test Large Language Models on the summarization task, with support for robustness, bias, representation, fairness and accuracy tests on the XSum dataset. It also adds fairness tests for the Question Answering datasets, along with many other enhancements and bug fixes!

A big thank you to our early-stage community for their contributions, feedback, questions, and feature requests πŸŽ‰

Make sure to give the project a star right here ⭐


πŸ”₯ New Features & Enhancements

  • Adding support for summarization with the XSum dataset #433
  • Adding support for fairness tests for testing LLMs on Question Answering #430
  • Adding support for accuracy/fairness tests for testing LLMs on summarization #446
  • Adding a new robustness test called add_ocr_typo #428

πŸ› Bug Fixes

  • Review issues with QAEval in OpenAI Natural Questions #444

❓ How to Use

Get started now! πŸ‘‡

pip install nlptest

Create your test harness in 3 lines of code πŸ§ͺ

import os

# Set OpenAI API key
os.environ['OPENAI_API_KEY'] = ''

# Import and create a Harness object
from nlptest import Harness
h = Harness(task='summarization', model='text-davinci-002', hub='openai', data='XSum-test', config='config.yml')

# Generate test cases, run them and view a report
h.generate().run().report()
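
If you prefer to keep the test configuration in code rather than in config.yml, the same settings can be expressed as a dictionary. The sketch below enables the new add_ocr_typo robustness test alongside a fairness check; the configure() call and the key names and thresholds are assumptions intended to mirror the YAML config, not the confirmed API:

import os
os.environ['OPENAI_API_KEY'] = ''

from nlptest import Harness

# Create the Harness and configure tests in code (keys below are illustrative)
h = Harness(task='summarization', model='text-davinci-002', hub='openai', data='XSum-test')
h.configure({
    'tests': {
        'defaults': {'min_pass_rate': 0.65},
        'robustness': {'add_ocr_typo': {'min_pass_rate': 0.66}},      # new in this release
        'fairness': {'min_gender_rouge1_score': {'min_score': 0.50}}  # illustrative test name
    }
})

# Generate test cases, run them and view a report
h.generate().run().report()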

πŸ“– Documentation


❀️ Community support

  • Slack: for live discussion with the NLP Test community, join the #nlptest channel
  • GitHub: for bug reports, feature requests, and contributions
  • Discussions: to engage with other community members, share ideas, and show off how you use NLP Test!

We would love to have you join the mission πŸ‘‰ open an issue, a PR, or give us some feedback on features you'd like to see! πŸ™Œ


♻️ Changelog

What's Changed

New Contributors

Full Changelog: v1.2.0...v1.3.0

John Snow Labs NLP Test 1.2.0: Announcing Support for Cohere, AI21, Azure OpenAI and Hugging Face Inference API

13 May 10:59
0698507



πŸ“’ Overview

NLP Test 1.2.0 πŸš€ comes with brand-new features, including support for testing Cohere, AI21, Hugging Face Inference API and Azure OpenAI LLMs with robustness, bias, accuracy and representation tests on the BoolQ and Natural Questions datasets, along with many other enhancements and bug fixes!

A big thank you to our early-stage community for their contributions, feedback, questions, and feature requests πŸŽ‰

Make sure to give the project a star right here ⭐


πŸ”₯ New Features & Enhancements

  • Adding support for 4 new LLM APIs for the Question Answering task #388
  • Adding support for bias tests for testing LLMs on Question Answering #404
  • Adding support for representation tests for testing LLMs on Question Answering #405
  • Adding support for accuracy tests for testing LLMs on Question Answering #394
  • Adding a new robustness test called number_to_word #377

πŸ› Bug Fixes

  • Fixed bias tests to enable multi-token name replacements #400
  • Fixed issue in ethnicity/religion-names #393
  • Fixed issue in default HF text classification model #402

❓ How to Use

Get started now! πŸ‘‡

pip install nlptest

Create your test harness in 3 lines of code πŸ§ͺ

import os

# Set OpenAI API key
os.environ['OPENAI_API_KEY'] = ''

# Import and create a Harness object
from nlptest import Harness
h = Harness(task='question-answering', model='gpt-3.5-turbo', hub='openai', data='BoolQ-test', config='config.yml')

# Generate test cases, run them and view a report
h.generate().run().report()
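
The new hubs plug into the same Harness API. Here is a minimal sketch for Cohere, assuming the hub identifier is 'cohere', the model name is a then-current Cohere model, and the key is read from a COHERE_API_KEY environment variable; all three names are assumptions, and the same pattern should apply to AI21, Azure OpenAI and the Hugging Face Inference API:

import os

# Assumed environment variable name for the Cohere API key
os.environ['COHERE_API_KEY'] = ''

# Import and create a Harness object (the 'cohere' hub and model name are illustrative)
from nlptest import Harness
h = Harness(task='question-answering', model='command-xlarge-nightly', hub='cohere', data='BoolQ-test', config='config.yml')

# Generate test cases, run them and view a report
h.generate().run().report()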

πŸ“– Documentation


❀️ Community support

  • Slack: for live discussion with the NLP Test community, join the #nlptest channel
  • GitHub: for bug reports, feature requests, and contributions
  • Discussions: to engage with other community members, share ideas, and show off how you use NLP Test!

We would love to have you join the mission πŸ‘‰ open an issue, a PR, or give us some feedback on features you'd like to see! πŸ™Œ


♻️ Changelog

What's Changed

New Contributors

Full Changelog: v1.1.0...v1.2.0

John Snow Labs NLP Test 1.1.0: Announcing Support for Testing LLMs

02 May 15:09
932d8e7



πŸ“’ Overview

NLP Test 1.1.0 πŸš€ comes with brand-new features, including the ability to test Large Language Models on Question Answering tasks, with support for OpenAI-based LLMs and for robustness tests on the BoolQ and Natural Questions datasets!

A big thank you to our early-stage community for their contributions, feedback, questions, and feature requests πŸŽ‰

Make sure to give the project a star right here ⭐


πŸ”₯ New Features & Enhancements

  • Support for testing OpenAI LLMs on Question Answering #361
  • Support for BoolQ and Natural Questions datasets #361
  • Improved layout for configuring tests #361
  • Improved warning and error messaging #361

πŸ› Bug Fixes

  • Fixed overlapping and mis-formatted country names in dictionaries #347

❓ How to Use

Get started now! πŸ‘‡

pip install nlptest

Create your test harness in 3 lines of code πŸ§ͺ

import os

# Set OpenAI API key
os.environ['OPENAI_API_KEY'] = ''

# Import and create a Harness object
from nlptest import Harness
h = Harness(task='question-answering', model='gpt-3.5-turbo', hub='openai', data='BoolQ-test', config='config.yml')

# Generate test cases, run them and view a report
h.generate().run().report()

πŸ“– Documentation


❀️ Community support

  • Slack: for live discussion with the NLP Test community, join the #nlptest channel
  • GitHub: for bug reports, feature requests, and contributions
  • Discussions: to engage with other community members, share ideas, and show off how you use NLP Test!

We would love to have you join the mission πŸ‘‰ open an issue, a PR, or give us some feedback on features you'd like to see! πŸ™Œ


♻️ Changelog

What's Changed

Full Changelog: v1.0.2...v1.1.0

John Snow Labs NLP Test 1.0.2: Patch Release

21 Apr 18:08
e3f78b7



πŸ“’ Overview

NLP Test 1.0.2 πŸš€ comes with several improvements and bug fixes, including a 7x speed-up in test generation, support for installation from conda-forge, brand-new Sphinx docs, fixes for token mismatches, and many other enhancements!

A big thank you to our early-stage community for their feedback, questions, and feature requests πŸŽ‰ A special thank you to @sugatoray for becoming the library's first contributor from outside of John Snow Labs! πŸ₯³

Make sure to give the project a star right here ⭐


πŸ”₯ New Features & Enhancements

  • 7x speed-up through multithreading-based parallelization and other optimizations #325 #321
  • Support for installation from the conda-forge channel conda-forge/staged-recipes#22525
  • Brand-new Sphinx docs and website updates #335
  • Cleaner outputs when generating and running tests #317 #329

πŸ› Bug Fixes

  • Fixed token mismatch issues occurring in various edge-cases #328 #331
  • Fixed representation and fairness test attribute errors in text classification #325
  • Standardized model outputs for default text classification code blocks #325

❓ How to Use

Get started now! πŸ‘‡

pip install nlptest
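
Or install it from conda-forge, new in this release (assuming the conda-forge package name matches the PyPI name):

conda install -c conda-forge nlptest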

Create your test harness in 3 lines of code πŸ§ͺ

# Import and create a Harness object
from nlptest import Harness
h = Harness(task='ner', model='dslim/bert-base-NER', hub='transformers')

# Generate test cases, run them and view a report
h.generate().run().report()

πŸ“– Documentation


❀️ Community support

  • Slack: for live discussion with the NLP Test community, join the #nlptest channel
  • GitHub: for bug reports, feature requests, and contributions
  • Discussions: to engage with other community members, share ideas, and show off how you use NLP Test!

We would love to have you join the mission πŸ‘‰ open an issue, a PR, or give us some feedback on features you'd like to see! πŸ™Œ


♻️ Changelog

What's Changed

New Contributors

Full Changelog: v1.0.1...v1.0.2

John Snow Labs NLP Test 1.0.1: Patch Release

07 Apr 11:54
de77d4d



πŸ“’ Overview

NLP Test 1.0.1 πŸš€ comes with several improvements and bug fixes, including a cleaner display format for expected and actual results on NER tests, support for a default spaCy text classifier, a fix for token mismatches in transformers, and many other enhancements!

A big thank you to our early-stage community for their feedback, questions, and feature requests. πŸŽ‰

Make sure to give the project a star right here ⭐


πŸ”₯ New Features & Enhancements

  • Clean display for actual and expected results on NER tests #301
  • Added default spaCy text classifier support #285
  • Removed memory location display when calling Harness methods #302
  • Enhanced error messages for spaCy model downloads #286
  • Standardized NER model outputs for all supported libraries #289

πŸ› Bug Fixes

  • Fixed swap_entities augmentation failures #284
  • Linked replace_to_inter_racial_lastnames and replace_to_native_american_lastnames to transformation #300
  • Fixed the token mismatch issue occurring with transformers #279

❓ How to Use

Get started now! πŸ‘‡

pip install nlptest

Create your test harness in 3 lines of code πŸ§ͺ

# Import and create a Harness object
from nlptest import Harness
h = Harness(task='ner', model='dslim/bert-base-NER', hub='transformers')

# Generate test cases, run them and view a report
h.generate().run().report()

πŸ“– Documentation


❀️ Community support

  • Slack: for live discussion with the NLP Test community, join the #nlptest channel
  • GitHub: for bug reports, feature requests, and contributions
  • Discussions: to engage with other community members, share ideas, and show off how you use NLP Test!

We would love to have you join the mission πŸ‘‰ open an issue, a PR, or give us some feedback on features you'd like to see! πŸ™Œ


♻️ Changelog

What's Changed

New Contributors

Full Changelog: v1.0.0...v1.0.1

John Snow Labs - NLP Test 1.0.0: An open-source library for delivering safe & effective models into production!

03 Apr 20:22
0abe698



πŸ“’ Overview

We are very excited to release John Snow Labs' latest library: NLP Test! πŸš€ This is our first major step towards building responsible AI.

NLP Test is an open-source library for testing NLP models and datasets from all major NLP libraries in a few lines of code. πŸ§ͺ The library has one goal: delivering safe & effective models into production. 🎯

Make sure to give the project a star right here ⭐


πŸ”₯ Features

  • Generate & run over 50 test types in a few lines of code πŸ’»
  • Test all aspects of model quality: robustness, bias, representation, fairness and accuracy
  • Automatically augment training data based on test results πŸ’ͺ
  • Support for popular NLP libraries: Spark NLP, Hugging Face Transformers & spaCy
  • Support for popular NLP tasks: Named Entity Recognition and Text Classification πŸŽ‰

❓ How to Use

Get started now! πŸ‘‡

pip install nlptest

Create your test harness in 3 lines of code πŸ§ͺ

# Import and create a Harness object
from nlptest import Harness
h = Harness(task='ner', model='dslim/bert-base-NER', hub='transformers')

# Generate test cases, run them and view a report
h.generate().run().report()
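
The same Harness works across the supported libraries. For example, here is a minimal sketch that tests a spaCy NER pipeline instead of a transformers model, assuming en_core_web_sm is installed locally:

# Import and create a Harness object for a spaCy pipeline
from nlptest import Harness
h = Harness(task='ner', model='en_core_web_sm', hub='spacy')

# Generate test cases, run them and view a report
h.generate().run().report()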

πŸ“– Documentation


❀️ Community support

  • Slack: for live discussion with the NLP Test community, join the #nlptest channel
  • GitHub: for bug reports, feature requests, and contributions
  • Discussions: to engage with other community members, share ideas, and show off how you use NLP Test!

We would love to have you join the mission πŸ‘‰ open an issue, a PR, or give us some feedback on features you'd like to see! πŸ™Œ


πŸš€ Mission

While there is a lot of talk about the need to train AI models that are safe, robust, and fair, few tools have been made available to data scientists to meet these goals. As a result, the front line of NLP models in production systems reflects a sorry state of affairs.

We propose here an early-stage, open-source community project that aims to fill this gap, and we would love for you to join us on this mission. We aim to build on the foundation laid by previous research such as Ribeiro et al. (2020), Song et al. (2020), Parrish et al. (2021), van Aken et al. (2021) and many others.

John Snow Labs has a full development team allocated to the project and is committed to improving the library for years to come, as we do with our other open-source libraries. Expect frequent releases, with new test types, tasks, languages, and platforms added regularly. We look forward to working together to make safe, reliable, and responsible NLP an everyday reality.