"Too many notes." — Emperor Joseph II to Mozart
Joseph is an AI text detection system that combines machine learning with information theory.
- 🥊 Generative Adversarial Network (GAN) - The model fights itself to get better!
- 📊 Information Theory - Shannon entropy, burstiness, lexical diversity and more!
- 🎯 Sensitive to modern LLMs - entropy features work on GPT-3.5 output... yes, we're a little behind the times. 👇
- 🐳 Fully containerized - runs anywhere with Docker
- 🌐 Web UI + REST API - easy to use, easy to integrate
⚠️ Alpha Model Limitations: This lightweight model is trained primarily on ChatGPT-3.5 and early GPT-4 data and performs significantly worse on GPT-5 output (F1 score ~0.62). See the Model Evaluation Report for detailed performance metrics.
Install Docker if you haven't already.
```shell
git clone https://github.com/JamesABaker/joseph.git
cd joseph
docker compose up --build
```

The application will be available at:
- Web UI: http://localhost:8000
- API Docs: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
- Open http://localhost:8000 in your browser
- Paste or type text into the textarea
- Click "Detect AI"
- View the results
Visit http://localhost:8000/docs for interactive API documentation with a built-in testing interface.
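The API can also be called programmatically. A minimal stdlib-only sketch is below; note that the endpoint path (`/detect`) and the `"text"` request field are assumptions for illustration, so check http://localhost:8000/docs for the actual schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from the Docker setup above


def build_request(text: str) -> urllib.request.Request:
    """Build a JSON POST request for the detector.

    NOTE: the "/detect" path and the {"text": ...} payload shape are
    assumptions -- consult the interactive /docs page for the real schema.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/detect",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(text: str) -> dict:
    """Send the text to the running service and return the parsed JSON reply."""
    with urllib.request.urlopen(build_request(text)) as resp:
        return json.load(resp)
```

With the container running, `detect("some passage to classify")` returns the service's JSON verdict.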
Using uv (recommended)
```shell
# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install production dependencies
uv pip install -e .

# For training (optional)
uv pip install -e ".[training]"

# For development and testing
uv pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run tests
uv run pytest -v
```

- Architecture: GAN-based discriminator (adversarially trained)
- Features: 10 statistical metrics (no large language models)
- Shannon Entropy: Character-level information density
- Burstiness: Sentence length variation (human writing varies more)
- Lexical Diversity: Unique word ratio (type-token ratio)
- Word Length Variance: Human vocabulary shows wider range
- Punctuation Diversity: Humans use more varied punctuation
- Vocabulary Richness: Yule's K statistic measures lexical repetition
- Sentence Statistics: Length means and standard deviations
- Character Ratios: Special characters and uppercase patterns
- Performance: 91.8% accuracy, 87.2% F1-score on ChatGPT-3.5 & 4 detection
- Memory: ~80MB RAM (lightweight, no transformers)
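Three of the metrics above can be sketched in a few lines of Python. This is a minimal illustration of the statistics themselves, not the project's actual feature extractor:

```python
import math
import re
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths.

    Human writing tends to mix short and long sentences, so it scores higher.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    mean = sum(lengths) / len(lengths)
    variance = sum((n - mean) ** 2 for n in lengths) / len(lengths)
    return math.sqrt(variance) / mean if mean else 0.0


def type_token_ratio(text: str) -> float:
    """Lexical diversity: unique words divided by total words."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0
```

For example, `shannon_entropy("aaaa")` is 0.0 (no surprise per character), while perfectly uniform sentence lengths drive `burstiness` to 0.0, the pattern the detector associates with machine-generated text.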
- Batch processing API endpoint
- API rate limiting
- User 2FA authentication
- Method refinement
- Deployment
- Volume data persistence for users
- Retrain on GPT-5+ data
MIT License - feel free to use this project for learning, development, or production.
