[Feature] Model Card for Language Model Performance

### Priority

P2-High

### OS type

Ubuntu

### Hardware type

Xeon-GNR

### Running nodes

Single Node

### Description

The `Model Card Generator Script` is intended to streamline the generation of Model Cards for large language models (LLMs) by evaluating them on a diverse set of academic benchmarks with `lm-evaluation-harness`. The resulting Model Cards document performance metrics, fairness considerations, and ethical implications. They are available in both HTML and Markdown formats and give a structured, transparent overview of the model's strengths and limitations.

The Model Card Generator can also produce static or interactive graphics that show how performance metrics change as the threshold varies from 0 to 1, which helps users select a threshold that balances the trade-offs between metrics.

The script supports both command-line and function-call usage, so it can be integrated into a variety of workflows.
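
To make the threshold analysis concrete, here is a minimal sketch of sweeping a threshold from 0 to 1 and tabulating metrics in Markdown. It is not the script's actual implementation. The function names (`metrics_at_threshold`, `sweep_thresholds`, `to_markdown_table`), the choice of precision, recall, and F1, and the binary-label input format are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def metrics_at_threshold(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Compute precision, recall, and F1 for one threshold (hypothetical helper)."""
    # A score at or above the threshold counts as a positive prediction.
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Report 0.0 when a metric is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"threshold": threshold, "precision": precision, "recall": recall, "f1": f1}


def sweep_thresholds(
    y_true: Sequence[int], y_score: Sequence[float], steps: int = 11
) -> List[Dict[str, float]]:
    """Evaluate metrics at `steps` evenly spaced thresholds in [0, 1]."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    thresholds = [i / (steps - 1) for i in range(steps)]
    return [metrics_at_threshold(y_true, y_score, t) for t in thresholds]


def to_markdown_table(rows: List[Dict[str, float]]) -> str:
    """Render the sweep as a Markdown table, e.g. for a Model Card section."""
    lines = [
        "| threshold | precision | recall | f1 |",
        "| --- | --- | --- | --- |",
    ]
    for r in rows:
        lines.append(
            f"| {r['threshold']:.2f} | {r['precision']:.3f} "
            f"| {r['recall']:.3f} | {r['f1']:.3f} |"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy labels and scores, invented for this example only.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(to_markdown_table(sweep_thresholds(labels, scores)))
```

A table like this shows where precision and recall cross, which is the kind of trade-off the interactive graphics are meant to surface.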

@ashahba @daniel-de-leon-user293 @qgao007 

[Feature] Model Card for Language Model Performance #236
