v0.2.8

@raghotham raghotham released this 27 May 21:03

Release v0.2.8

Highlights

  • Server-side MCP with auth firewalls now works in the Stack, for both Agents and Responses
  • New APIs for retrieving chat completions, plus UI views that display them
  • Enable keyword search for sqlite-vec
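
For readers who want a feel for the server-side MCP highlight, here is a minimal sketch of a Responses-style request body that attaches one MCP server as a tool and passes it an authorization header. The field names (`type`, `server_label`, `server_url`, `headers`) follow the publicly documented OpenAI Responses API MCP tool shape; these notes do not spell out which fields the Stack accepts, so treat them as assumptions. The helper `build_mcp_responses_request`, the model name, and the URL are hypothetical and used for illustration only.

```python
import json


def build_mcp_responses_request(model, prompt, server_label, server_url, token=None):
    """Build a Responses-style request body with one MCP server attached as a tool.

    Hypothetical helper: the tool fields mirror the OpenAI Responses API
    MCP tool shape, which is an assumption about what the Stack accepts.
    """
    tool = {
        "type": "mcp",
        "server_label": server_label,
        "server_url": server_url,
    }
    if token is not None:
        # Assumed per-server auth header, intended for the MCP endpoint.
        tool["headers"] = {"Authorization": f"Bearer {token}"}
    return {"model": model, "input": prompt, "tools": [tool]}


if __name__ == "__main__":
    body = build_mcp_responses_request(
        model="example-model",                   # placeholder model id
        prompt="List the open issues.",
        server_label="issues",
        server_url="https://mcp.example.com/sse",  # placeholder URL
        token="example-token",                   # placeholder credential
    )
    print(json.dumps(body, indent=2))
```

The body would then be sent to the server's Responses endpoint with whatever client you already use; the sketch stops short of the network call on purpose.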

What's Changed

  • feat: use openai-python for openai inference provider by @mattf in #2193
  • feat: allow the interface on which the server will listen to be configured by @grs in #2015
  • fix: replace all instances of --yaml-config with --config by @cdoern in #2196
  • fix: update llama stack build --run to use new start_stack.sh signature by @mattf in #2191
  • feat: add huggingface post_training impl by @cdoern in #2132
  • feat: --image-type argument overrides value in --config build.yaml by @mattf in #2179
  • chore(github-deps): bump astral-sh/setup-uv from 5.4.1 to 6.0.1 by @dependabot in #2197
  • feat: introduce OAuth2TokenAuthProvider and notion of "principal" by @ashwinb in #2185
  • feat: introduce APIs for retrieving chat completion requests by @ehhuang in #2145
  • fix: Pass external_config_dir to BuildConfig by @manstis in #2190
  • fix: remove wrong deprecated warning by @leseb in #2202
  • feat: Propagate W3C trace context headers from clients by @bbrowning in #2153
  • fix: Setting default value for metadata_token_count in case the key is not found by @franciscojavierarceo in #2199
  • chore: Updated readme by @franciscojavierarceo in #2219
  • ci: enable ruff output format for github by @leseb in #2214
  • chore: collapse all local hook under the same repo by @leseb in #2217
  • fix: Pass model parameter as config name to NeMo Customizer by @JashG in #2218
  • feat: Add "instructions" support to responses API by @derekhiggins in #2205
  • fix: synchronize concurrent coroutines checking & updating key set by @grs in #2215
  • feat: add additional auth provider that uses oauth token introspection by @grs in #2187
  • feat: add llama stack rm command by @akoserwal in #2127
  • feat(quota): add server‑side per‑client request quotas (requires auth) by @liangwen12year in #2096
  • chore: remove k8s auth in favor of k8s jwks endpoint by @leseb in #2216
  • chore: clarify cache_ttl to be key_recheck_period by @leseb in #2220
  • chore: refactor workflow writing by @leseb in #2225
  • docs: misc cleanup by @leseb in #2223
  • feat(sqlite-vec): enable keyword search for sqlite-vec by @varshaprasad96 in #1439
  • fix: use proper service account for kube auth by @leseb in #2227
  • fix: only print routes that match the runtime config by @leseb in #2226
  • feat(providers): sambanova safety provider by @jhpiedrahitao in #2221
  • feat: implement get chat completions APIs by @ehhuang in #2200
  • fix: openai provider model id by @ehhuang in #2229
  • feat: add MCP tool signature to Responses API by @ashwinb in #2232
  • feat(ui): implement chat completion views by @ehhuang in #2201
  • feat: accept MCP authorization headers for MCP toolgroups by @ashwinb in #2230
  • chore: add sqlalchemy to test dependencies by @ehhuang in #2236
  • fix: signature change to match OpenAI SDK by @ashwinb in #2237
  • feat: allow using llama-stack-library-client from verifications by @ashwinb in #2238
  • feat: add list responses API by @ehhuang in #2233
  • feat: start ui server in llama stack run by @ehhuang in #2170
  • fix(security): Upgrade setuptools to v80.8.0. Fixes CVE-2025-47273 by @terrytangyuan in #2242
  • feat: add responses input items api by @ehhuang in #2239
  • docs: Update CHANGELOG.md by @terrytangyuan in #2241
  • fix: skip failing tests by @raghotham in #2243
  • fix: disable test_responses_store by @ashwinb in #2244
  • feat: enable MCP execution in Responses impl by @ashwinb in #2240
  • fix(telemetry): get rid of annoying sqlite span export error by @ashwinb in #2245
  • chore: split routers into individual files (safety) by @ashwinb in #2248
  • chore: split routers into individual files (datasets) by @ashwinb in #2249
  • chore: split routers into individual files (inference, tool, vector_io, eval_scoring) by @ashwinb in #2258
  • chore: split routing_tables into individual files by @ashwinb in #2259
  • fix: use pypi browser agent by @raghotham in #2260
  • chore: make cprint write to stderr by @raghotham in #2250
  • fix(tools): do not index tools, only index toolgroups by @ashwinb in #2261
  • fix: match mcp headers in provider data to Responses API shape by @ashwinb in #2263
  • chore: removed unused class by @leseb in #2268
  • chore: allow to pass CA cert to remote vllm by @leseb in #2266
  • fix: index non-MCP toolgroups at registration time by @ashwinb in #2272
  • test: disable test_inference_store test urrrggg by @ashwinb in #2273
  • fix: handle None external_providers_dir in build with run arg by @Ygnas in #2269
  • chore: mark blobpath as optional by @leseb in #2271
  • docs: fix evals notebook preview by @Bobbins228 in #2277
  • chore: fix visible comments in pr template by @Bobbins228 in #2279
  • chore: remove dependencies.json by @leseb in #2281
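
As background for the W3C trace context change listed above (#2153), the sketch below parses a `traceparent` header as defined by the W3C Trace Context specification for version `00`: four lowercase-hex fields separated by hyphens. It is not taken from the Stack's code; `parse_traceparent` is a hypothetical helper that only shows what a client is expected to send.

```python
import re

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def parse_traceparent(value):
    """Parse a version-00 style traceparent header; return None if invalid."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The spec forbids version "ff" and all-zero trace or parent ids.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        # Bit 0 of the flags byte is the "sampled" flag.
        "sampled": bool(int(flags, 16) & 0x01),
    }


if __name__ == "__main__":
    header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    print(parse_traceparent(header))
```

A client that sets this header on its requests lets the server attach its own spans to the caller's trace. The sketch accepts only the four-field layout and ignores the companion `tracestate` header.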

Full Changelog: v0.2.7...v0.2.8