@samdickson22 samdickson22 commented Oct 30, 2025

What

Adds Cerebras as a new AI provider to Jan with support for 8 models, including Llama 4 Scout, Llama 3.3 70B, GPT OSS 120B, and Qwen variants.

Why

Cerebras offers ultra-fast AI inference (2000-3000 tokens/s) with OpenAI-compatible endpoints, making it an excellent addition to Jan's provider ecosystem. This integration enables users to leverage Cerebras's high-performance models directly through Jan's interface.

How

Provider Configuration:

  • Added Cerebras to predefinedProviders array in web-app/src/consts/providers.ts
  • Configured OpenAI-compatible endpoint: https://api.cerebras.ai/v1
  • Added 8 models with proper capability flags:
    • Production models: Llama 4 Scout (109B), Llama 3.1 8B, Llama 3.3 70B, GPT OSS 120B, Qwen 3 32B
    • Preview models: Qwen 3 235B Instruct, Qwen 3 235B Thinking, Qwen 3 Coder 480B
  • Tool calling support
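Since the integration is purely configuration, the Cerebras entry in `predefinedProviders` is just a data object. A minimal sketch of what it might look like — field and type names here are illustrative, not Jan's actual provider schema (model ids are taken from this PR; capability flags reflect the final state where only three models keep tool calling):

```typescript
// Hypothetical shape; Jan's real provider type may use different field names.
interface ProviderModel {
  id: string;
  name: string;
  capabilities: string[]; // e.g. ['completion', 'tools']
}

interface ModelProvider {
  provider: string;
  base_url: string;
  api_key: string;
  models: ProviderModel[];
}

const cerebrasProvider: ModelProvider = {
  provider: 'cerebras',
  base_url: 'https://api.cerebras.ai/v1',
  api_key: '',
  models: [
    { id: 'llama-4-scout-17b-16e-instruct', name: 'Llama 4 Scout', capabilities: ['completion'] },
    { id: 'llama-3.3-70b', name: 'Llama 3.3 70B', capabilities: ['completion', 'tools'] },
    { id: 'gpt-oss-120b', name: 'GPT OSS 120B', capabilities: ['completion', 'tools'] },
    // ...the remaining Llama and Qwen models follow the same pattern
  ],
};
```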

Visual Assets:

  • Added Cerebras logo (PNG) from LobeHub icon collection
  • Updated getProviderLogo() function in web-app/src/lib/utils.ts
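The logo change amounts to one new mapping entry. A sketch of the lookup — the real `getProviderLogo()` in `web-app/src/lib/utils.ts` covers more providers and may use a different structure:

```typescript
// Illustrative logo lookup; the actual implementation in utils.ts may differ.
const providerLogos: Record<string, string> = {
  openai: '/images/model-provider/openai.png',
  cerebras: '/images/model-provider/cerebras.png', // entry added by this PR
};

function getProviderLogo(provider: string): string | undefined {
  return providerLogos[provider.toLowerCase()];
}
```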

Documentation:

  • Created comprehensive setup guide at docs/src/pages/docs/desktop/remote-models/cerebras.mdx
  • Includes model descriptions with performance specs
  • Documents features (streaming, tool calling)
  • Lists limitations (unsupported OpenAI parameters)
  • Provides troubleshooting section
  • Added navigation entry in _meta.json

Technical Approach:
This is a configuration-driven integration that leverages Jan's existing OpenAI-compatible provider infrastructure. No custom code or API handlers are needed; everything works through the standard token.js fallback mechanism.
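Because the endpoint is OpenAI-compatible, requests need no provider-specific handling — any OpenAI-style client works as-is. A hedged sketch of what a chat completion request looks like on the wire (the payload builder and env var name are illustrative, not Jan code):

```typescript
// Builds an OpenAI-style chat completion request for the Cerebras endpoint.
// Nothing here is Cerebras-specific except the URL.
function buildChatRequest(model: string, prompt: string) {
  return {
    url: 'https://api.cerebras.ai/v1/chat/completions',
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${process.env.CEREBRAS_API_KEY ?? ''}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: prompt }],
        stream: true, // Cerebras supports OpenAI-style SSE streaming
      }),
    },
  };
}

// Usage: const { url, init } = buildChatRequest('llama-3.3-70b', 'Hello');
// await fetch(url, init);
```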

Testing

Manual Testing Required:

  • Provider appears in Settings > Providers
  • Toggle functionality works
  • API key configuration saves correctly
  • Models can be fetched from Cerebras API
  • Chat completions work with streaming
  • Tool calling works
  • Documentation renders correctly
  • All documentation links work

Automated Testing:
TypeScript compilation verified; no errors in the modified files.

Breaking Changes

None. This is a purely additive change that doesn't modify existing provider behavior.

Files Changed

  • web-app/src/consts/providers.ts - Provider configuration (+91 lines)
  • web-app/src/lib/utils.ts - Logo reference (+2 lines)
  • web-app/public/images/model-provider/cerebras.png - Provider logo (53KB PNG)
  • docs/src/pages/docs/desktop/remote-models/cerebras.mdx - Documentation (new file)
  • docs/src/pages/docs/desktop/remote-models/_meta.json - Navigation metadata (+3 lines)

Additional Notes

Model performance specs based on Cerebras documentation:

  • Llama 4 Scout: ~2600 tokens/s (deprecating Nov 3, 2025)
  • GPT OSS 120B: ~3000 tokens/s (fastest)
  • Preview models: 1400-2000 tokens/s (evaluation only, scheduled deprecation)

API compatibility: OpenAI-compatible but does not support frequency_penalty, logit_bias, presence_penalty, parallel_tool_calls, or service_tier.
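A client that reuses generic OpenAI request options would need to drop those fields before sending. A purely illustrative helper (not part of this PR) showing the idea:

```typescript
// OpenAI request parameters that Cerebras rejects, per the limitations above.
const UNSUPPORTED_PARAMS = [
  'frequency_penalty',
  'logit_bias',
  'presence_penalty',
  'parallel_tool_calls',
  'service_tier',
] as const;

// Returns a copy of an OpenAI-style request body with unsupported fields removed.
function stripUnsupportedParams(body: Record<string, unknown>): Record<string, unknown> {
  const cleaned = { ...body };
  for (const key of UNSUPPORTED_PARAMS) delete cleaned[key];
  return cleaned;
}
```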

Add Cerebras as a new AI provider with:
- OpenAI-compatible API endpoint (https://api.cerebras.ai/v1)
- 8 models including Llama 4 Scout, Llama 3.3 70B, GPT OSS 120B, and Qwen variants
- Tool calling support for gpt-oss-120b and llama-3.3-70b
- Ultra-fast inference speeds (2000-3000 tokens/s)
- Complete documentation with setup guide and troubleshooting
@samdickson22 samdickson22 changed the base branch from main to dev October 30, 2025 00:45
All 8 Cerebras models support tool calling according to their official
documentation. Updated capabilities to include 'tools' for:
- llama-4-scout-17b-16e-instruct
- llama3.1-8b
- qwen-3-32b
- qwen-3-235b-a22b-instruct-2507
- qwen-3-235b-a22b-thinking-2507
- qwen-3-coder-480b

Also corrected Llama 4 Scout parameter count from 109B to 17B.
- Corrected Llama 4 Scout parameter count from 109B to 17B
- Added tool calling support notation for all 8 models
- Updated Features section to list all models with tool calling capability
Disable tool calling for 5 Cerebras models that reject JSON schema
validation fields (minimum, maximum, default). Only gpt-oss-120b,
llama-3.3-70b, and qwen-3-coder-480b support tools with Jan's RAG
tool schemas.

Root cause: Models have inconsistent JSON schema validation strictness.
Most models reject requests containing unsupported fields like minimum/maximum
in tool parameter schemas, while 3 models are more lenient.

Error returned by strict models:
"Unsupported JSON schema fields: {'maximum', 'minimum'}"

Models with tools disabled:
- llama-4-scout-17b-16e-instruct
- llama3.1-8b
- qwen-3-32b
- qwen-3-235b-a22b-instruct-2507
- qwen-3-235b-a22b-thinking-2507
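One possible workaround (not implemented in this PR, purely illustrative) would be to strip the rejected fields from tool parameter schemas before sending them to the strict models:

```typescript
// JSON schema fields the stricter Cerebras models reject in tool parameters.
const REJECTED_SCHEMA_FIELDS = new Set(['minimum', 'maximum', 'default']);

// Recursively removes rejected fields from a JSON schema object.
// Naive sketch: it would also strip a property literally named 'default'
// under 'properties'; a production version would track schema context.
function sanitizeToolSchema(schema: unknown): unknown {
  if (Array.isArray(schema)) return schema.map(sanitizeToolSchema);
  if (schema !== null && typeof schema === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, value] of Object.entries(schema)) {
      if (REJECTED_SCHEMA_FIELDS.has(key)) continue;
      out[key] = sanitizeToolSchema(value);
    }
    return out;
  }
  return schema;
}
```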