This document describes how the coding-agent currently loads models, applies overrides, resolves credentials, and chooses models at runtime.
Primary implementation files:
- `src/config/model-registry.ts` — loads built-in + custom models, provider overrides, runtime discovery, auth integration
- `src/config/model-resolver.ts` — parses model patterns and selects initial/smol/slow models
- `src/config/settings-schema.ts` — model-related settings (modelRoles, provider transport preferences)
- `src/session/auth-storage.ts` — API key + OAuth resolution order
- `packages/ai/src/models.ts` and `packages/ai/src/types.ts` — built-in providers/models and the `Model`/`compat` types
Default config path: `~/.omp/agent/models.yml`
Legacy behavior still present:
- If `models.yml` is missing and `models.json` exists at the same location, it is migrated to `models.yml`.
- Explicit `.json`/`.jsonc` config paths are still supported when passed programmatically to `ModelRegistry`.
```yaml
providers:
  <provider-id>:
    # provider-level config

equivalence:
  overrides:
    <provider-id>/<model-id>: <canonical-model-id>
  exclude:
    - <provider-id>/<model-id>
```

`provider-id` is the canonical provider key used across selection and auth lookup.

`equivalence` is optional and configures canonical model grouping on top of concrete provider models:

- `overrides` maps an exact concrete selector (`provider/modelId`) to an official upstream canonical id
- `exclude` opts a concrete selector out of canonical grouping
```yaml
providers:
  my-provider:
    baseUrl: https://api.example.com/v1
    apiKey: MY_PROVIDER_API_KEY
    api: openai-completions
    headers:
      X-Team: platform
    authHeader: true
    auth: apiKey
    discovery:
      type: ollama
    modelOverrides:
      some-model-id:
        name: Renamed model
    models:
      - id: some-model-id
        name: Some Model
        api: openai-completions
        reasoning: false
        input: [text]
        cost:
          input: 0
          output: 0
          cacheRead: 0
          cacheWrite: 0
        contextWindow: 128000
        maxTokens: 16384
        headers:
          X-Model: value
        compat:
          supportsStore: true
          supportsDeveloperRole: true
          supportsReasoningEffort: true
          maxTokensField: max_completion_tokens
          openRouterRouting:
            only: [anthropic]
          vercelGatewayRouting:
            order: [anthropic, openai]
          extraBody:
            gateway: m1-01
            controller: mlx
```

Supported `api` values: `openai-completions`, `openai-responses`, `openai-codex-responses`, `azure-openai-responses`, `anthropic-messages`, `google-generative-ai`, `google-vertex`

- `auth`: `apiKey` (default) or `none`
- `discovery.type`: `ollama`
Required:

- `baseUrl`
- `apiKey`, unless `auth: none`
- `api` at the provider level or on each model

Must define at least one of:

- `baseUrl`
- `modelOverrides`
- `discovery`

`discovery` requires a provider-level `api`.

Model entries:

- `id` is required
- `contextWindow` and `maxTokens` must be positive if provided
ModelRegistry pipeline (on refresh):

- Load built-in providers/models from `@oh-my-pi/pi-ai`.
- Load the `models.yml` custom config.
- Apply provider overrides (`baseUrl`, `headers`) to built-in models.
- Apply `modelOverrides` (per provider + model id).
- Merge custom `models:`
  - same `provider + id` replaces the existing entry
  - otherwise append
- Apply runtime-discovered models (currently Ollama, llama.cpp, and LM Studio), then re-apply model overrides.
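The custom-model merge step above can be sketched as follows; the model shape and helper name are illustrative assumptions, not the real registry types:

```typescript
// Minimal stand-in for a registry model entry (assumed shape).
interface ModelEntry {
  provider: string;
  id: string;
  name: string;
}

// Merge custom models into the built-in list: a matching provider + id
// replaces the existing entry in place; anything else is appended.
function mergeCustomModels(builtIn: ModelEntry[], custom: ModelEntry[]): ModelEntry[] {
  const merged = [...builtIn];
  for (const model of custom) {
    const idx = merged.findIndex(
      (m) => m.provider === model.provider && m.id === model.id,
    );
    if (idx >= 0) merged[idx] = model; // replace, keeping original order
    else merged.push(model); // new model: append
  }
  return merged;
}
```

Replacing in place (rather than removing and re-appending) preserves the registry order that later provider-precedence logic relies on.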
The registry keeps every concrete provider model and then builds a canonical layer above them. Canonical ids are official upstream ids only, for example:

- `claude-opus-4-6`
- `claude-haiku-4-5`
- `gpt-5.3-codex`
Example:

```yaml
providers:
  zenmux:
    baseUrl: https://api.zenmux.example/v1
    apiKey: ZENMUX_API_KEY
    api: openai-codex-responses
    models:
      - id: codex
        name: Zenmux Codex
        reasoning: true
        input: [text]
        cost:
          input: 0
          output: 0
          cacheRead: 0
          cacheWrite: 0
        contextWindow: 200000
        maxTokens: 32768

equivalence:
  overrides:
    zenmux/codex: gpt-5.3-codex
    p-codex/codex: gpt-5.3-codex
  exclude:
    - demo/codex-preview
```

Build order for canonical grouping:

- exact user override from `equivalence.overrides`
- bundled official-id matches from built-in model metadata
- conservative heuristic normalization for gateway/provider variants
- fallback to the concrete model's own id
Current heuristics are intentionally narrow:

- embedded upstream prefixes can be stripped when present, for example `anthropic/...` or `openai/...`
- dotted and dashed version variants normalize only when they map to an existing official id, for example `4.6 -> 4-6`
- ambiguous families or versions are not merged without a bundled match or explicit override
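Under these narrow heuristics, normalization might look like the sketch below; the official-id set and function name are assumptions for illustration:

```typescript
// Assumed sample of bundled official ids; the real set comes from
// built-in model metadata.
const OFFICIAL_IDS = new Set(["claude-opus-4-6", "gpt-5.3-codex"]);

function canonicalize(concreteId: string): string {
  // Heuristic 1: strip an embedded upstream prefix like "anthropic/".
  const stripped = concreteId.replace(/^(anthropic|openai)\//, "");
  if (OFFICIAL_IDS.has(stripped)) return stripped;
  // Heuristic 2: dotted -> dashed version variants, but only when the
  // result is a known official id (e.g. "4.6" -> "4-6").
  const dashed = stripped.replace(/(\d)\.(\d)/g, "$1-$2");
  if (OFFICIAL_IDS.has(dashed)) return dashed;
  // No bundled match and no explicit override: keep the concrete id.
  return concreteId;
}
```

Note how the dashed rewrite is only accepted when it lands on a known official id, so ambiguous variants fall through to the concrete id rather than being merged speculatively.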
When multiple concrete variants share a canonical id, resolution uses:

- availability and auth
- `config.yml` `modelProviderOrder`
- the existing registry/provider order if `modelProviderOrder` is unset
Disabled or unauthenticated providers are skipped.
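A minimal sketch of this variant-selection order, assuming a simplified availability flag in place of the real auth checks:

```typescript
// Assumed simplified shape: `available` stands in for "provider enabled
// and credentials resolvable".
interface Variant {
  provider: string;
  id: string;
  available: boolean;
}

function pickVariant(
  variants: Variant[],
  modelProviderOrder?: string[],
): Variant | undefined {
  // Disabled or unauthenticated providers are skipped up front.
  const usable = variants.filter((v) => v.available);
  if (!modelProviderOrder) return usable[0]; // registry-order fallback
  // Otherwise honor the configured global provider precedence; providers
  // not listed in the order sort last.
  const rank = (v: Variant) => {
    const i = modelProviderOrder.indexOf(v.provider);
    return i === -1 ? Number.MAX_SAFE_INTEGER : i;
  };
  return [...usable].sort((a, b) => rank(a) - rank(b))[0];
}
```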
Session state and transcripts continue to record the concrete provider/model that actually executed the turn.
Provider defaults vs per-model overrides:

- Provider `headers` are the baseline.
- Model `headers` override provider header keys.
- `modelOverrides` can override model metadata (`name`, `reasoning`, `input`, `cost`, `contextWindow`, `maxTokens`, `headers`, `compat`, `contextPromotionTarget`).
- `compat` is deep-merged for nested routing blocks (`openRouterRouting`, `vercelGatewayRouting`, `extraBody`).
If Ollama is not explicitly configured, the registry adds an implicit discoverable provider:

- provider: `ollama`
- api: `openai-completions`
- base URL: `OLLAMA_BASE_URL` or `http://127.0.0.1:11434`
- auth mode: keyless (`auth: none` behavior)

Runtime discovery calls `GET /api/tags` on Ollama and synthesizes model entries with local defaults.
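The synthesis step might look like the sketch below, shown as a pure transform over a sample `/api/tags` payload; the entry shape and local-default values are assumptions:

```typescript
// Minimal slice of an Ollama /api/tags response item (assumed shape).
interface OllamaTag {
  name: string; // e.g. "llama3.1:8b"
}

// Turn discovered tags into registry-style model entries with local
// defaults: zero cost and a conservative context window.
function synthesizeOllamaModels(tags: OllamaTag[]) {
  return tags.map((t) => ({
    provider: "ollama",
    id: t.name,
    name: t.name,
    api: "openai-completions",
    cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
    contextWindow: 8192, // assumed local default, not the real value
    maxTokens: 4096,
  }));
}
```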
If llama.cpp is not explicitly configured, the registry adds an implicit discoverable provider. Note: unlike the other local providers, it uses the newer `openai-responses` API instead of `openai-completions`.

- provider: `llama.cpp`
- api: `openai-responses`
- base URL: `LLAMA_CPP_BASE_URL` or `http://127.0.0.1:8080`
- auth mode: keyless (`auth: none` behavior)

Runtime discovery calls `GET /models` on llama.cpp and synthesizes model entries with local defaults.
If LM Studio is not explicitly configured, the registry adds an implicit discoverable provider:

- provider: `lm-studio`
- api: `openai-completions`
- base URL: `LM_STUDIO_BASE_URL` or `http://127.0.0.1:1234/v1`
- auth mode: keyless (`auth: none` behavior)

Runtime discovery fetches models (`GET /models`) and synthesizes model entries with local defaults.
You can configure discovery yourself:
```yaml
providers:
  ollama:
    baseUrl: http://127.0.0.1:11434
    api: openai-completions
    auth: none
    discovery:
      type: ollama
  llama.cpp:
    baseUrl: http://127.0.0.1:8080
    api: openai-responses
    auth: none
    discovery:
      type: llama.cpp
```

Extensions can register providers at runtime (`pi.registerProvider(...)`), including:

- model replacement/append for a provider
- custom stream handler registration for new API IDs
- custom OAuth provider registration
When requesting a key for a provider, the effective order is:

- Runtime override (CLI `--api-key`)
- Stored API key credential in `agent.db`
- Stored OAuth credential in `agent.db` (with refresh)
- Environment variable mapping (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc.)
- ModelRegistry fallback resolver (provider `apiKey` from `models.yml`, env-name-or-literal semantics)
`models.yml` `apiKey` behavior:

- The value is first treated as an environment variable name.
- If no such env var exists, the literal string is used as the token.

If `authHeader: true` and the provider `apiKey` is set, models get an `Authorization: Bearer <resolved-key>` header injected.
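The env-name-or-literal resolution and header injection can be sketched as follows; the helper names are assumptions, not the real fallback resolver:

```typescript
// Treat the configured value as an env var name first; if no such
// variable exists, fall back to using the value as a literal token.
function resolveApiKey(
  apiKey: string,
  env: Record<string, string | undefined>,
): string {
  return env[apiKey] ?? apiKey;
}

// With authHeader: true, the resolved key is injected as a bearer token.
function buildAuthHeaders(
  apiKey: string | undefined,
  authHeader: boolean,
  env: Record<string, string | undefined>,
): Record<string, string> {
  if (!apiKey || !authHeader) return {};
  return { Authorization: `Bearer ${resolveApiKey(apiKey, env)}` };
}
```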
Keyless providers:

- Providers marked `auth: none` are treated as available without credentials.
- `getApiKey*` returns `kNoAuth` for them.

`getAll()` returns the loaded model registry (built-in + merged custom + discovered). `getAvailable()` filters to models that are keyless or have resolvable auth. So a model can exist in the registry but not be selectable until auth is available.
`model-resolver.ts` supports:

- exact `provider/modelId`
- exact canonical model id
- exact model id (provider inferred)
- fuzzy/substring matching
- glob scope patterns in `--models` (e.g. `openai/*`, `*sonnet*`)
- optional `:thinkingLevel` suffix (`off|minimal|low|medium|high|xhigh`)

`--provider` is legacy; `--model` is preferred.
Resolution precedence for exact selectors:

- an exact `provider/modelId` bypasses coalescing
- an exact canonical id resolves through the canonical index
- an exact bare concrete id still works
- fuzzy and glob matching run after the exact paths
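A sketch of these exact paths, with illustrative index shapes (the real resolver's data structures are not shown here):

```typescript
// Assumed lookup indexes: concrete selectors, canonical ids, bare ids.
interface Indexes {
  concrete: Map<string, string>; // "provider/modelId" -> same selector
  canonical: Map<string, string>; // canonical id -> chosen "provider/modelId"
  bareIds: Map<string, string>; // bare concrete id -> "provider/modelId"
}

function resolveExact(selector: string, ix: Indexes): string | undefined {
  // 1. explicit provider/modelId bypasses coalescing
  if (selector.includes("/") && ix.concrete.has(selector)) {
    return ix.concrete.get(selector);
  }
  // 2. canonical id resolves through the canonical index
  if (ix.canonical.has(selector)) return ix.canonical.get(selector);
  // 3. bare concrete id still works
  return ix.bareIds.get(selector);
  // (fuzzy and glob matching would run only after these exact paths)
}
```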
`findInitialModel(...)` uses this order:

- explicit CLI provider+model
- first scoped model (if not resuming)
- saved default provider/model
- known provider defaults (e.g. OpenAI/Anthropic/etc.) among available models
- first available model
Supported model roles: `default`, `smol`, `slow`, `plan`, `commit`
Role aliases like `pi/smol` expand through `settings.modelRoles`. Each role value can also append a thinking selector such as `:minimal`, `:low`, `:medium`, or `:high`.

If a role points at another role, resolution follows the chain to the target model as usual, and an explicit thinking suffix on the referring role takes precedence for that role's use.
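The role-expansion rules above can be sketched as follows; the settings shape and recursion limit are assumptions:

```typescript
type Roles = Record<string, string>;

// Expand a role value, splitting off an optional ":thinkingLevel" suffix
// and following role-to-role references. An explicit suffix on the
// referring role wins over the target's own suffix.
function expandRole(
  value: string,
  roles: Roles,
  depth = 0,
): { model: string; thinking?: string } {
  if (depth > 8) throw new Error("role alias cycle"); // assumed guard
  const m = /^(.*?)(?::(off|minimal|low|medium|high|xhigh))?$/.exec(value)!;
  const [, base, thinking] = m;
  const alias = base.replace(/^pi\//, ""); // e.g. "pi/smol" -> "smol"
  if (roles[alias]) {
    const inner = expandRole(roles[alias], roles, depth + 1);
    return { model: inner.model, thinking: thinking ?? inner.thinking };
  }
  return { model: base, thinking };
}
```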
Related settings:

- `modelRoles` (record)
- `enabledModels` (scoped pattern list)
- `modelProviderOrder` (global canonical-provider precedence)
- `providers.kimiApiFormat` (`openai` or `anthropic` request format)
- `providers.openaiWebsockets` (`auto|off|on` websocket preference for the OpenAI Codex transport)

`modelRoles` may store either:

- `provider/modelId` to pin a concrete provider variant
- a canonical id such as `gpt-5.3-codex` to allow provider coalescing
For `enabledModels` and CLI `--models`:

- exact canonical ids expand to all concrete variants in that canonical group
- explicit `provider/modelId` entries stay exact
- globs and fuzzy matches still operate on concrete models

Both surfaces keep provider-prefixed models visible and selectable. They now also expose canonical/coalesced models:

- `/model` includes a canonical view alongside provider tabs
- `--list-models` prints a canonical section plus the concrete provider rows

Selecting a canonical entry stores the canonical selector. Selecting a provider row stores the explicit `provider/modelId`.
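A sketch of this pattern expansion, with an assumed canonical-group map and a simple `*`-only glob (the real matcher also does fuzzy matching, not shown):

```typescript
// Escape regex metacharacters so only "*" acts as a wildcard.
const escapeRe = (s: string) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

function expandPattern(
  pattern: string,
  canonicalGroups: Map<string, string[]>, // canonical id -> concrete selectors
  concrete: string[], // all "provider/modelId" selectors
): string[] {
  // Exact canonical id: expand to every concrete variant in the group.
  const group = canonicalGroups.get(pattern);
  if (group) return group;
  // Explicit provider/modelId stays exact.
  if (concrete.includes(pattern)) return [pattern];
  // Globs operate on concrete models.
  const re = new RegExp("^" + pattern.split("*").map(escapeRe).join(".*") + "$");
  return concrete.filter((s) => re.test(s));
}
```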
Context promotion is an overflow recovery mechanism for small-context variants (for example `*-spark`) that automatically promotes to a larger-context sibling when the API rejects a request with a context length error.
When a turn fails with a context overflow error (e.g. `context_length_exceeded`), `AgentSession` attempts promotion before falling back to compaction:

- If `contextPromotion.enabled` is true, resolve a promotion target (see below).
- If a target is found, switch to it and retry the request; no compaction is needed.
- If no target is available, fall through to auto-compaction on the current model.
Selection is model-driven, not role-driven:

- `currentModel.contextPromotionTarget` (if configured)
- otherwise, the smallest larger-context model on the same provider + API

Candidates are ignored unless credentials resolve (`ModelRegistry.getApiKey(...)`).
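The target-selection order can be sketched as follows; the model shape and the `hasAuth` availability predicate are assumptions standing in for the real credential checks:

```typescript
interface PModel {
  provider: string;
  api: string;
  id: string;
  contextWindow: number;
  contextPromotionTarget?: string;
  hasAuth: boolean; // assumed stand-in for ModelRegistry.getApiKey(...)
}

function findPromotionTarget(current: PModel, all: PModel[]): PModel | undefined {
  const usable = all.filter((m) => m.hasAuth); // skip unauthenticated candidates
  // 1. explicit per-model target, if configured (provider/id or bare id)
  if (current.contextPromotionTarget) {
    const hit = usable.find(
      (m) =>
        `${m.provider}/${m.id}` === current.contextPromotionTarget ||
        (m.provider === current.provider && m.id === current.contextPromotionTarget),
    );
    if (hit) return hit;
  }
  // 2. smallest larger-context sibling on the same provider + API
  return usable
    .filter(
      (m) =>
        m.provider === current.provider &&
        m.api === current.api &&
        m.contextWindow > current.contextWindow,
    )
    .sort((a, b) => a.contextWindow - b.contextWindow)[0];
}
```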
If switching from or to `openai-codex-responses`, the session provider state key `openai-codex-responses` is closed before the model switch. This drops websocket transport state so the next turn starts clean on the promoted model.
Promotion uses temporary switching (`setModelTemporary`):

- recorded as a temporary `model_change` in session history
- does not rewrite the saved role mapping
Configure the fallback directly in model metadata via `contextPromotionTarget`, which accepts either:

- `provider/model-id` (explicit)
- `model-id` (resolved within the current provider)
Example (`models.yml`) for Spark -> non-Spark on the same provider:

```yaml
providers:
  openai-codex:
    modelOverrides:
      gpt-5.3-codex-spark:
        contextPromotionTarget: openai-codex/gpt-5.3-codex
```

The built-in model generator also assigns this automatically for `*-spark` models when a same-provider base model exists.
`models.yml` supports this `compat` subset:

- `supportsStore`
- `supportsDeveloperRole`
- `supportsReasoningEffort`
- `maxTokensField` (`max_completion_tokens` or `max_tokens`)
- `openRouterRouting.only` / `openRouterRouting.order`
- `vercelGatewayRouting.only` / `vercelGatewayRouting.order`
These are consumed by the OpenAI-completions transport logic and combined with URL-based auto-detection.
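As one example of how the transport might consume `compat`, here is a sketch of applying `maxTokensField` to an outgoing request body; the request shape is an assumption:

```typescript
interface Compat {
  maxTokensField?: "max_completion_tokens" | "max_tokens";
}

// Write the token limit under whichever field the model's compat entry
// names, defaulting to the classic "max_tokens".
function applyMaxTokens(
  body: Record<string, unknown>,
  compat: Compat,
  maxTokens: number,
): Record<string, unknown> {
  const field = compat.maxTokensField ?? "max_tokens";
  return { ...body, [field]: maxTokens };
}
```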
A local keyless OpenAI-compatible server:

```yaml
providers:
  local-openai:
    baseUrl: http://127.0.0.1:8000/v1
    auth: none
    api: openai-completions
    models:
      - id: Qwen/Qwen2.5-Coder-32B-Instruct
        name: Qwen 2.5 Coder 32B (local)
```

An Anthropic-compatible proxy with bearer auth:

```yaml
providers:
  anthropic-proxy:
    baseUrl: https://proxy.example.com/anthropic
    apiKey: ANTHROPIC_PROXY_API_KEY
    api: anthropic-messages
    authHeader: true
    models:
      - id: claude-sonnet-4-20250514
        name: Claude Sonnet 4 (Proxy)
        reasoning: true
        input: [text, image]
```

Overriding a built-in OpenRouter model behind a corporate proxy:

```yaml
providers:
  openrouter:
    baseUrl: https://my-proxy.example.com/v1
    headers:
      X-Team: platform
    modelOverrides:
      anthropic/claude-sonnet-4:
        name: Sonnet 4 (Corp)
        compat:
          openRouterRouting:
            only: [anthropic]
```

Most model configuration now flows through `models.yml` via `ModelRegistry`.
One notable legacy path remains: web-search Anthropic auth resolution still reads `~/.omp/agent/models.json` directly in `src/web/search/auth.ts`. If you rely on that specific path, keep JSON compatibility in mind until that module is migrated.
If `models.yml` fails schema or validation checks:

- the registry keeps operating with built-in models
- the error is exposed via `ModelRegistry.getError()` and surfaced in UI/notifications