Backend service module that extends the AI DIAL Admin Panel, simplifying and democratizing the building and deployment of AI applications inside Kubernetes. The service is deployed alongside the main admin app and is used by both the admin backend and frontend (feature-flagged).
The DIAL Deployment Manager Backend makes complex Kubernetes-based AI deployments accessible without requiring deep Kubernetes expertise. It provides a streamlined interface for managing container image builds, deployments, and lifecycle operations for AI workloads.
Current Focus:
- MCP (Model Context Protocol) servers
- DIAL interceptors
- NIM support for simplified inference deployment
Roadmap:
- KServe support for simplified inference deployment
- Agile network rules management in untrusted clusters
- DIAL adapters
- DIAL applications
For more information about the parent AI DIAL Admin Panel, visit the ai-dial-admin-backend repository or DIAL Documentation.
- Prerequisites
- Features
- Configuration
- Getting Started
- Deploying NVIDIA NIM Models
- Security
- Contributing
- License
- Documentation
- Java 21 or higher
- Gradle 7.x or higher
- Docker and Docker Compose (for containerized deployment)
- Kubernetes cluster (for production deployments)
- Database (H2/PostgreSQL/SQL Server) - H2 is sufficient for local development
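A quick way to sanity-check the toolchain before building (output formats vary by platform; the Gradle wrapper is bundled with the repository, and kubectl is only needed for cluster deployments):

```bash
java -version              # expect Java 21 or higher
./gradlew --version        # uses the bundled Gradle wrapper
docker --version
docker compose version
kubectl version --client   # only needed when deploying to a cluster
```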
- MCP Deployment Management: Complete lifecycle management for Model Context Protocol servers
- Container Image Building: Automated image building with Kaniko in Kubernetes jobs
- Image Definition Management: Support for multiple image definition types (MCP, DIAL Interceptor) with versioning
- Knative-Based Deployments: Serverless container deployments with auto-scaling and automatic HTTPS endpoints
- Real-Time Status Updates: Server-Sent Events (SSE) for real-time build status and deployment monitoring
- Kubernetes Event Streaming: Real-time event streaming for deployment monitoring and troubleshooting
- Disposable Resource Management: Automatic tracking and cleanup of Kubernetes resources with lifecycle state management
- Multi-Database Support: Flexible database options (H2, PostgreSQL, SQL Server) for different deployment scenarios
- Multiple Authentication Methods: Support for JWT and Basic Auth authentication
- Identity Providers Support: Integration of different identity providers (Keycloak, Azure AD, and others) for identity management
- Health Monitoring: Comprehensive health checks and metrics endpoints (Prometheus)
- Private Registry Support: Support for private Git repositories and Docker registries in image builds
- Environment Variable Management: Sophisticated handling of environment variables with automatic conversion of sensitive vars to Kubernetes Secrets
Complete list of configuration properties can be found in the Configuration Documentation.
- Kubernetes Connection: Configure via config file (`K8S_CONNECT_TYPE=CONFIG_FILE`) or token-based authentication (`K8S_CONNECT_TYPE=TOKEN`)
- Docker Registry: Configure registry URL, protocol, and authentication (`DOCKER_REGISTRY`, `DOCKER_REGISTRY_AUTH`)
- Database: Support for H2, PostgreSQL, and SQL Server via the `DATASOURCE_VENDOR` environment variable
- Security/Authentication: Configure via `CONFIG_REST_SECURITY_MODE` (none, basic, oidc)
- Knative Deployment: Configure namespaces, timeouts, and scaling parameters
- Build Pipeline: Configure build namespaces, images, and resource limits
The application supports extensive configuration via environment variables. See Configuration Documentation for the complete list.
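For illustration, a minimal local setup might export a handful of the variables above before starting the service. This is only a sketch with placeholder values; in particular, the exact accepted values for `DATASOURCE_VENDOR` are listed in the Configuration Documentation:

```bash
# Placeholder values for local experimentation only.
export K8S_CONNECT_TYPE=CONFIG_FILE        # or TOKEN for token-based authentication
export DOCKER_REGISTRY=registry.example.com
export DATASOURCE_VENDOR=H2                # assumed value; see Configuration Documentation
export CONFIG_REST_SECURITY_MODE=basic     # none | basic | oidc

./gradlew bootRun
```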
From the project's root directory:
```bash
# Run locally with Gradle
./gradlew bootRun

# Or build and run the Docker image
docker build -t aidial/ai-dial-admin-deployment-manager-backend:latest .
docker run -p 8080:8080 aidial/ai-dial-admin-deployment-manager-backend:latest
```

Verify the installation:

```bash
curl -X GET --location "http://localhost:8080/api/v1/health"
```

Expected response:

```json
{
  "status": "UP"
}
```

Import the Postman collection to test other requests.
For local development with H2 database, default credentials are provided in the docker-compose.yml file. For details on generating new credentials, see the Development Guide.
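Assuming that docker-compose.yml sits at the repository root, a typical local loop is a sketch like this:

```bash
# Start the local stack in the background and tail its logs.
docker compose up -d
docker compose logs -f

# Tear the stack down when finished.
docker compose down
```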
NVIDIA NIM (NVIDIA Inference Microservices) is a technology that provides optimized, containerized AI inference microservices for deploying large language models and other AI workloads. NIM models are pre-optimized containers that enable fast, efficient inference deployment with minimal configuration, leveraging NVIDIA's GPU acceleration and optimization technologies.
The DIAL Deployment Manager Backend supports deploying NIM models through Kubernetes Custom Resource Definitions (CRDs), allowing you to manage NIM-based inference services alongside other deployment types.
Legal Notice: NIM models and the NIM operator are available only with an NVIDIA Enterprise License Key for production use. For evaluation, development, and production deployment of NIM models, you must obtain the appropriate NVIDIA enterprise license and configure the necessary authentication credentials (NGC secrets). Please refer to the NVIDIA NIM documentation for licensing information, setup instructions, and the latest requirements.
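Before NIM deployments can pull images, the cluster needs NGC credentials. A common pattern is sketched below; the secret names and namespace are placeholders, and the exact secrets expected by the NIM operator are described in NVIDIA's documentation:

```bash
# Image pull secret for nvcr.io; the NGC username is the literal string "$oauthtoken".
kubectl create secret docker-registry ngc-registry-secret \
  --docker-server=nvcr.io \
  --docker-username='$oauthtoken' \
  --docker-password="$NGC_API_KEY" \
  --namespace nim

# Generic secret carrying the NGC API key for the NIM runtime (name is an assumption).
kubectl create secret generic ngc-api-secret \
  --from-literal=NGC_API_KEY="$NGC_API_KEY" \
  --namespace nim
```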
For detailed developer documentation, including REST API endpoints, development commands, architecture details, and configuration guides, see the Development Guide.
Edit diagram in draw.io
For information about security practices and reporting security issues, please refer to our SECURITY.md document.
Security is disabled in the default configuration. Using the default configuration in a production environment is strongly discouraged.
For a production environment:
- Set the `CONFIG_REST_SECURITY_MODE` environment variable to either `oidc` or `basic`
- (optional) Set the `MS_SQL_SERVER_OPS` environment variable to `encrypt=true;` if the application runs against SQL Server
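A hardened container run might therefore look like the sketch below (image tag taken from Getting Started; all other values are placeholders, and registry/database settings are omitted):

```bash
# Enable OIDC security and force encrypted SQL Server connections.
docker run -p 8080:8080 \
  -e CONFIG_REST_SECURITY_MODE=oidc \
  -e MS_SQL_SERVER_OPS='encrypt=true;' \
  aidial/ai-dial-admin-deployment-manager-backend:latest
```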
The system supports two authentication methods:
- Basic Authentication (Default)
  - Configure username and password in `application.yml`:
    ```properties
    spring.security.user.name=your_username
    spring.security.user.password=your_password
    ```
  - Enable with: `config.rest.security.mode=basic` (see the example request after this list)
- JWT Authentication
  - Configure Identity Provider settings using environment variables or with the `providers.{provider_name}.*` format. Each provider is configured separately (example for an Azure provider):
    ```properties
    # Required properties
    providers.azure.issuer=your_issuer
    providers.azure.jwk-set-uri=your_jwk_set_uri
    providers.azure.audiences=your_audiences

    # Optional properties
    providers.azure.aliases=your_aliases
    # Azure-specific
    providers.azure.role-claims=your_role_claims
    providers.azure.allowed-roles=ConfigAdmin,admin
    ```
  - Enable with: `config.rest.security.mode=oidc` (see the example request after this list)
  - Supported providers: azure, keycloak, auth0, okta, cognito
- See Configuration Documentation for complete provider configuration details.
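As a quick illustration of how clients authenticate in each mode (a sketch using the health endpoint from Getting Started; credentials and tokens are placeholders):

```bash
# Basic auth: use the username/password configured in application.yml.
curl -u your_username:your_password \
  "http://localhost:8080/api/v1/health"

# OIDC: ACCESS_TOKEN must be a JWT issued by one of the configured identity providers.
curl -H "Authorization: Bearer $ACCESS_TOKEN" \
  "http://localhost:8080/api/v1/health"
```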
Keycloak can be used as a simple IDP replacement for local test/development. Please refer to the Keycloak setup guide for more information.
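For a throwaway local IDP, Keycloak can be started in dev mode with Docker; the version pin, port, and admin credentials below are arbitrary placeholders, and the Keycloak setup guide remains the authoritative reference:

```bash
# Run Keycloak in dev mode on port 8081 with a temporary admin account.
docker run -p 8081:8080 \
  -e KEYCLOAK_ADMIN=admin \
  -e KEYCLOAK_ADMIN_PASSWORD=admin \
  quay.io/keycloak/keycloak:24.0 start-dev
```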
We welcome contributions! Please see our Contributing Guide for details.
For detailed information on how to contribute, see the full contributing documentation from the parent project.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
