llm-d incubation
Incubating components of llm-d, a Kubernetes-native, high-performance distributed LLM inference framework.
Popular repositories
- llm-d-modelservice (Public): Helm charts for deploying models with llm-d
- workload-variant-autoscaler (Public): Variant optimization autoscaler for distributed inference workloads
Repositories
8 repositories, including:
- llm-d-fast-model-actuation (Public)
- offline-batch-gateway (Public): An llm-d-compatible implementation of the OpenAI Batch inference API (see the sketch after this list)
- workload-variant-autoscaler (Public): Variant optimization autoscaler for distributed inference workloads
- llm-d-ci (Public)
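Since offline-batch-gateway implements the OpenAI Batch inference API, a client can submit jobs with the standard OpenAI Python SDK pointed at the gateway. The sketch below illustrates the general Batch API flow (upload a JSONL file of requests, then create a batch); the gateway base URL, API key, and model name are assumptions for illustration, not values taken from the repository.

```python
"""Minimal sketch: submitting an offline batch to an OpenAI-compatible
Batch API endpoint such as the offline-batch-gateway is described as providing.
Base URL, API key, and model name below are hypothetical."""
import json

from openai import OpenAI  # standard OpenAI Python client

# Point the client at the gateway instead of api.openai.com (URL is hypothetical).
client = OpenAI(
    base_url="http://offline-batch-gateway.example.svc:8000/v1",
    api_key="not-needed-for-a-local-gateway",
)

# The Batch API takes a JSONL input file where each line is one request.
requests_jsonl = "\n".join(
    json.dumps({
        "custom_id": f"req-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "example-model",  # hypothetical model name
            "messages": [{"role": "user", "content": prompt}],
        },
    })
    for i, prompt in enumerate(["Hello", "Summarize llm-d in one sentence"])
)

# Upload the request file, then create the batch job against it.
input_file = client.files.create(
    file=("batch.jsonl", requests_jsonl.encode()), purpose="batch"
)
batch = client.batches.create(
    input_file_id=input_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)

# Results are fetched later, e.g. via client.batches.retrieve(batch.id).
print(batch.id, batch.status)
```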