Commit a199518

JinLee794, Jin Lee (HLS US SE), pablosalvador10, annaquincy-msft, Copilot
authored
Introducing evaluations framework in beta, agent dev assets, and deployment fixes/enhancements (#30)
* feat: enhance azd environment variable handling with error checks and local state support * fix: update foundry account and project naming conventions for consistency * Syncinc to Azure Samples (#95) * Delete samples/labs/dev/leadership_phrases.txt * Update version and SKU name in staging params * Change version for text-embedding-3-large model Updated the version of the text-embedding-3-large model. * Update main.tfvars.staging.json * Update communication.tf * feat: Enhance status envelope with optional label and update frontend to derive WS URL - Added optional `label` parameter to `make_status_envelope` function in `envelopes.py` to allow custom labels in status messages. - Updated `entrypoint.sh` to derive WebSocket URL from `BACKEND_URL` or use `WS_URL` if provided, replacing placeholders in frontend assets. - Upgraded `js-yaml` and `vite` dependencies in `package.json` and `package-lock.json`. - Enhanced `App.jsx` to format event type labels and summarize event data for better user experience. - Introduced new demo scenarios in `DemoScenariosWidget.jsx` to showcase Microsoft Copilot Studio integration and ACS call routing. - Added tests for call transfer events in `test_acs_events_handlers.py` to ensure correct envelope broadcasting for transfer accepted and failed events. - Created a new Jupyter notebook for custom speech model demonstration in `12-custom-speech-model.ipynb`. - Updated Terraform parameters to include a new text embedding model in `main.tfvars.dev.json`. * refactor: Comment out unused email communication service domain resource * refactor: Comment out unused Azure email communication service resources * feat: Enhance event handling and UI components - Added new utility functions for formatting event types and summarizing event data in App.jsx. - Improved ChatBubble component to display event messages with formatted labels and timestamps. - Updated DemoScenariosWidget to include new scenarios and enhanced filtering options based on tags. 
- Introduced websocket URL derivation in postprovision.sh for better backend integration. - Added tests for call transfer events in test_acs_events_handlers.py to ensure proper envelope broadcasting. - Updated package.json to include js-yaml and upgraded vite version. * add value * feat: Enhance distributed session handling and improve PayPal agent interactions - Implement distributed session bus using Redis for cross-replica session routing in connection manager. - Add methods for publishing session envelopes to Redis channels. - Introduce confirmation context for call center transfers to ensure explicit user consent. - Update PayPal agent templates to clarify authentication and routing guidelines. - Enhance real-time voice app to manage relay WebSocket connections and handle session updates more effectively. - Improve error handling and logging for distributed session delivery and Redis interactions. - Refactor session envelope handling in frontend to accommodate new event types and improve user experience. 
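The WebSocket URL derivation mentioned above for `entrypoint.sh` and `postprovision.sh` (use `WS_URL` if provided, otherwise derive it from `BACKEND_URL`) can be sketched roughly as follows. This is only an illustration in Python, not the shell logic from the commit; the function name `derive_ws_url` and the http-to-ws scheme mapping are assumptions.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def derive_ws_url(backend_url: str, ws_url: Optional[str] = None) -> str:
    """Return ws_url when it is set, else swap the scheme of backend_url.

    Hypothetical helper: the real scripts may handle more cases
    (placeholders, trailing paths, missing variables).
    """
    if ws_url:
        # An explicit WS_URL always wins over the derived value.
        return ws_url
    parts = urlsplit(backend_url)
    # Assumed mapping: https -> wss, http -> ws; other schemes are left unchanged.
    scheme = {"https": "wss", "http": "ws"}.get(parts.scheme, parts.scheme)
    return urlunsplit((scheme, parts.netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(derive_ws_url("https://api.example.com"))
    print(derive_ws_url("http://localhost:8010/api"))
```

Keeping the override check first means a deployment can always pin the WebSocket endpoint explicitly without touching the backend URL.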
* feat: Enhance status tone metadata and improve chat bubble styling * feat: Implement background task handling for MFA delivery and improve greeting messages in handoff processes * feat: Enhance call escalation process with detailed transfer context and improve PayPal agent handoff scenarios * feat: Implement retry mechanism for browser session ID resolution in media streaming * feat: Enhance session management and greeting handling across various components * fixing session mapping for acs calls * add value * add value * adding test file * Adding agents and templates for credit card recommendation and fee dispute agents * add value * Enhance audio transcription settings across agents and adjust logging levels for better debugging * Enhance audio transcription settings across agents and adjust logging levels for better debugging * add value * add value * Implement Azure Voice Live service integration and enhance Terraform configurations for voice model deployments * add value * Add Azure Voice Live model configuration and outputs * fixing voicelive chat sequence on the ui * fixing voicelive chat sequence on the ui * fixing voicelive chat sequence on the ui * fixing voicelive chat sequence on the ui * remove sensitive contact information and unused transfer agency client data * feat: Introduce Agent Consolidation Plan with YAML-driven architecture - Added a comprehensive proposal for consolidating agent architecture in `apps/rtagent/backend/src/agents/`. - Established key goals including single source of truth for agent definitions, auto-discovery, and unified tool registry. - Analyzed current architecture and identified pain points such as manual handoff registration and duplicate tool registries. - Proposed a new solution architecture featuring enhanced YAML schema, auto-discovery engine, and unified tool registry. - Detailed implementation roadmap divided into phases for gradual migration and integration. 
- Included backward compatibility strategy to ensure existing agents function without modification. - Provided extensive documentation on YAML schema, CLI tool usage, and migration checklist. * Refactor speech cascade handler and routing for browser communication - Updated speech cascade handler to prioritize `on_greeting` callback over `on_tts_request` for greeting events. - Added `queue_user_text` method to `SpeechCascadeHandler` for queuing user text input. - Changed routing from `/realtime` to `/browser` for browser communication endpoints. - Modified orchestration logic to ensure TTS responses are sent with blocking behavior to prevent overlap. - Introduced WebSocket helper functions for better organization and clarity in messaging. - Enhanced connection manager to handle Redis pubsub reconnections on credential expiration. - Updated frontend components to reflect routing changes for browser communication. - Adjusted tests to align with the new browser routing and functionality. - Commented out live metrics enabling condition in telemetry configuration for future consideration. * feat(telemetry): add decorators for tracing LLM, dependency, speech, and ACS calls - Introduced decorators for OpenTelemetry instrumentation of LLM, dependency, speech, and ACS calls. - Implemented a context manager for tracking conversation turns with detailed metrics. - Added helper functions for recording GenAI and speech metrics. - Enhanced span attributes for Azure Application Insights visualization. * Remove telemetry configuration module (telemetry_config_v2.py) to streamline codebase and eliminate unused functionality. * feat: Enhance telemetry and tracing for CosmosDB and latency tool - Added OpenTelemetry tracing to CosmosDB operations with a decorator for latency tracking. - Integrated tracing spans in the LatencyTool for better observability in Application Insights. - Updated telemetry configuration to suppress noisy logs and added new attributes for speech cascade metrics. 
- Created unit tests for SessionAgentManager, covering configuration management, override resolution, handoff management, and persistence. - Removed outdated endpoints review document. * feat: Add useBackendHealth hook for backend health checks and integrate with readiness, agents, and health endpoints test: Implement integration tests for VoiceLive Session Agent Manager, covering agent resolution, handoff mapping, and runtime modifications * WARNING!!!! MAJOR REFACTOR COMMIT - Removed the VoiceLive SDK integration module from the backend. - Added a new AgentTopologyPanel component to the frontend for displaying agent inventory and connections. - Integrated the AgentTopologyPanel into the main application layout. - Updated the BackendIndicator to include agent count and selection functionality. - Enhanced the ConversationControls with a fixed view switcher for better accessibility. - Improved the useBackendHealth hook to handle various agent data structures. - Updated styles for better responsiveness and visual consistency across components. - Modified utility functions to format agent inventory data correctly. - Adjusted import paths in orchestrators and tests to reflect the new backend structure. * feat: Enhance agent handoff process and response handling; refactor UI components for improved usability * feat: Update change notes for v2/speech-orchestration-and-monitoring branch; highlight major features, improvements, and new agents * refactor: Remove Unified Agent Configuration Module; streamline agent management and improve code organization * feat: Enhance ProfileDetailsPanel with resizable functionality and UI improvements - Added resizable panel feature to ProfileDetailsPanel, allowing users to adjust width dynamically. - Updated panel styling for improved aesthetics, including a gradient background and adjusted borders. - Enhanced scrollbar visibility and overflow handling for better user experience. 
refactor: Simplify GraphListView filter logic - Removed default selection logic for filters in GraphListView, allowing users to start with no filters applied. - Cleaned up useEffect dependencies for better performance and clarity. docs: Introduce Backend Voice & Agents Architecture documentation - Added comprehensive documentation outlining the architecture of backend voice and agent modules. - Detailed separation of concerns between voice transport and agent business logic. - Included data flow diagrams and module responsibilities for clarity. docs: Create Handoff Logic Inventory for better understanding of handoff processes - Documented the handoff logic across backend voice and agent modules. - Established a single source of truth for handoff mappings and protocols. - Summarized cleanup phases and their impact on the codebase. fix: Update logging to safely handle span names - Modified TraceLogFilter to safely retrieve span names, preventing attribute errors with NonRecordingSpan. fix: Adjust telemetry configuration to capture all loggers - Changed logger_name default to an empty string in TelemetryConfig to capture all loggers. * feat: Implement context-aware greeting rendering in VoiceLive agent; enhance session management and logging * feat: Refactor agent configuration and voice handling; streamline agent switching and TTS integration * feat: Enhance Agent Details Panel and Session Management - Added sessionAgentConfig prop to AgentDetailsPanel for dynamic agent configuration display. - Implemented logic to show agent name, description, tools, and model/voice details based on session configuration. - Introduced a new PanelCard in AgentDetailsPanel to display session agent configuration, including model, voice, and prompt preview. - Updated App component to fetch session agent configuration on agent panel visibility and manage agent creation/updating. 
- Added validation for TTS client initialization in dedicated_tts_pool.py to ensure clients are ready before use. - Enhanced on_demand_pool.py to validate cached resources and remove invalid ones. - Improved error logging in text_to_speech.py to include detailed initialization failure information and added is_ready property for synthesizer readiness check. * Refactor code structure for improved readability and maintainability * feat: Enhance MemoManager with background persistence and lifecycle management - Added support for background persistence in MemoManager, allowing non-blocking state saving to Redis. - Implemented task deduplication to cancel previous persistence tasks when a new one is initiated. - Removed unused auto-refresh functionality and related attributes from MemoManager. - Updated tests to verify new persistence behavior and ensure proper task management. - Enhanced error handling and logging for background persistence operations. * feat: Add Connection Warmup Analysis document for Azure Speech & OpenAI optimization * feat(session): enhance session ID management and URL parameter support - Added `pickSessionIdFromUrl` function to extract session ID from URL parameters. - Updated `getOrCreateSessionId` to allow session ID restoration from URL. - Refactored `setSessionId` for better logging and session management. - Improved `createNewSessionId` to utilize `setSessionId`. docs(api): restructure API documentation for clarity and completeness - Organized API endpoints into categories: Health & Monitoring, Call Management, Media Streaming, Browser Conversations, Session Metrics, Agent Builder, Demo Environment, and TTS Health. - Added detailed descriptions and examples for each endpoint. - Included new sections for interactive API documentation and WebSocket endpoints. docs(api-reference): update WebSocket message types and endpoint details - Clarified message types for incoming audio data and control messages. 
- Updated WebSocket endpoint URLs and query parameters for browser conversations and dashboard relay. docs(architecture): refine agent architecture diagrams for clarity - Adjusted diagrams to improve readability and understanding of the agent framework and orchestration. fix(architecture): correct orchestration mode comparison table - Updated ratings for Azure Speech voices and simplicity of setup in the orchestration comparison table. docs(getting-started): add demo guide and enhance onboarding experience - Introduced a new demo guide to facilitate user onboarding and provide structured paths for different user levels. - Enhanced the getting started guide with tips and recommended paths for new users. feat(aoai): implement OpenAI connection warmup to reduce latency - Added `warm_openai_connection` function to pre-establish OpenAI connection and reduce cold-start latency on first call. feat(speech): implement token warmup for Speech API to minimize latency - Added `warm_token` method in `SpeechTokenManager` to pre-fetch tokens during startup, reducing latency on first API call. * feat(healthcare): Implement Nurse Triage Agent with symptom assessment and routing capabilities - Introduced a comprehensive voice agent for healthcare triage. - Added agent configuration and prompt templates for patient interaction. - Developed healthcare tools for patient verification, clinical knowledge search, and symptom urgency assessment. - Integrated routing logic for scheduling appointments and emergency transfers. - Enhanced documentation with demo scenarios and testing instructions. * feat: Implement logging utility and session management - Added a logger utility to manage console logging levels and filtering. - Created session management functions to handle session IDs, including retrieval from URL and session creation. - Developed styles for the frontend components to ensure consistent UI design. 
- Configured Vite for the frontend build process with proper asset handling and environment variable support. - Introduced scripts for starting the backend and frontend development servers, including Azure Dev Tunnel hosting. * feat: Simplify agent handoff process by refining context management and removing redundant data collection * feat: Enhance agent handoff process by managing conversation history and user context * feat: Enhance message handling by persisting tool calls and results as JSON for conversation continuity * feat: Implement silent handoff protocol across agents to enhance user experience and streamline transitions * feat: Add Azure App Configuration module with RBAC and Key Vault integration - Implemented main resource for Azure App Configuration in Terraform. - Added outputs for App Configuration details including ID, name, and endpoint. - Defined variables for App Configuration module, including identity and Key Vault integration. - Updated main Terraform outputs to include App Configuration details. - Enhanced error handling in Azure OpenAI client for missing endpoint configuration. - Improved Redis manager to handle port configuration with better error messaging. - Updated requirements to include Azure App Configuration SDKs. * first code clean up * enabling oidc * Refactor code structure and remove redundant sections for improved readability and maintainability * add value * add value * feat: Add managed certificate and domain registration modules - Introduced `managed-cert-example.bicep` for example usage of managed certificate deployment. - Created `managed-cert.bicep` to handle App Service Domain registration and managed SSL certificate generation. - Implemented `role-assignment.bicep` for managing role assignments with support for built-in and custom roles. - Added `windows-vm.bicep` for deploying a Windows VM as a jumphost with necessary networking components. 
- Developed `peer-virtual-networks.bicep` for establishing peering between virtual networks. - Implemented `private-dns-zone.bicep` for creating and linking private DNS zones to virtual networks. - Created `private-endpoint.bicep` for deploying private endpoints with DNS zone integration. - Added `vnet.bicep` for creating virtual networks with associated subnets and network security groups. - Updated `types.bicep` with new types for model deployment, role assignments, and network configurations. - Developed `secret.bicep` for managing secrets in Azure Key Vault. - Created `network.bicep` for orchestrating network resources including virtual networks and subnets. * fix: Update default location parameter in create_storage function for clarity * feat: Extract AZURE_LOCATION from environment-specific tfvars file if not set * feat: Implement location resolution with fallback chain in preprovision script * fix: Update Dockerfile to install runtime dependencies and mitigate vulnerabilities * chore: Update CHANGELOG for version 1.5.0 release and remove changenotes.md; enable remote builds in azure.yaml; enhance terraform initialization script with location prompts * feat: Update launch configuration and scripts to use virtual environment with uv; enhance README for deployment clarity * further deployment cleanup, docs update/tweaks, adding more todos * removing unused dependency in src/herlpers.py * refactor: Update architecture diagram in README for clarity and consistency in orchestration modes * add value * Refactor Terraform configuration: - Update main.tf to adjust foundry account and project naming conventions. - Remove feature flags and keys from appconfig module as they are now managed externally. - Clean up variables.tf by removing unused variables and updating descriptions. - Delete provider configuration file as it is no longer needed. - Change default application name from "rtaudioagent" to "artagent" and adjust related settings. 
- Modify connection settings and pool sizes for improved performance. * feat: Enhance Azure Voice Live integration and refactor configuration management * last changes * feat: Add app configuration bootstrap to initialize environment variables * Enhance configuration loading with .env.local support and update documentation * fix voicelive output attributes * add * Refactor agent paths and update documentation for agent discovery and configuration * Add Insurance Voice Agent Scenario documentation and update navigation - Introduced a comprehensive guide for the Insurance Customer Service Scenario, detailing the security-focused multi-agent voice system for claims processing, fraud detection, and policy management. - Updated mkdocs.yml to include the new Insurance documentation in the Industry Solutions section. * Add integration proposal for Spec-Driven Development methodology in ARTVoice * add value * Enhance Terraform configuration and scripts for Voice Live integration - Update Dockerfile to install dependencies and set up virtual environment. - Modify initialize-terraform.sh and local-dev-setup.sh for improved script handling. - Refactor sync-appconfig.sh to streamline key-value imports and feature flag management. - Add provider.conf.json generation for remote state backend configuration. - Update main.tf and outputs.tf to support new Voice Live model deployments. - Introduce voice_live_location and voice_live_model_deployments variables in variables.tf. * feat: Add Concierge agent configuration and prompts for banking scenarios - Introduced a new YAML configuration for the Concierge agent, defining its voice, model, session, and tool configurations. - Created a comprehensive prompt file for the Concierge agent, detailing voice and language settings, identity and trust guidelines, and operational modes. 
- Implemented scenario orchestration analysis to address issues with agent initialization and fallback logic, ensuring the correct agent is set for banking scenarios. - Renamed orchestration.yaml to scenario.yaml for consistency in scenario loading. - Updated default start agent to BankingConcierge and added validation for agent existence at startup. * feat: Enhance scenario loading to support orchestration.yaml naming convention * feat: Implement scenario-based handoff map resolution for orchestrator configuration * cicd test for azd deploy * feat: Update audio handling and documentation dependencies for improved installation and error handling * feat: Refactor app configuration handling to prioritize .env.local overrides and improve environment variable management * feat: Revise documentation deployment workflow to enhance dependency management and streamline build process * modified docs workflow * feat: Add site_dir configuration to mkdocs.yml for improved site structure * feat: Allow mkdocs build to proceed with warnings by removing --strict flag * fix: Update health check endpoint in postprovision script to use correct API path * refactor: Remove outdated AZD deployment workflow and update documentation links for clarity * fix: Ensure principal_id logging does not fail and handle local_state retrieval correctly * refactor: Simplify state key handling in provider configuration by using environment name * fix: Skip null values when loading static parameters from tfvars file to use Terraform defaults * fix: Use coalesce function for location assignment in storage account resource * refactor: Remove unused backend API public URL variable and related validation * refactor: Remove unused backend API public URL and source phone number from environment parameter files * improvements flow * fix: Implement auto-selection and timeout for user input in setup scripts * add value * fix: Update naming conventions for foundry account and project variables in locals * fix: 
Update name from rtaudioagent to artaudioagent in environment parameter files * fix: Update name from rtaudioagent to artaudioagent in environment parameter files * fix: Update documentation URLs to reflect new repository location * feat: Enhance API documentation and tagging for better clarity and organization * docs: Update documentation links and improve clarity across various guides * refactor: replace deploy-azd workflow with reusable template and remove redundant summary job - Updated the deployment workflow name to "Deploy to Azure". - Replaced the usage of the old deploy-azd.yml with a new reusable template _template-deploy-azd.yml. - Removed the deployment summary job and its associated steps to streamline the workflow. * fix: Add run-name to the Azure deployment workflow for better clarity * fix: Update condition for output extraction in deployment workflow * fix: Update GitHub token to use secrets for enhanced security * feat: Add optional GitHub PAT secret and enhance environment variable handling for Azure deployment * adding rg as env var set at the gh env level * fix: Add emoji to workflow names for better visibility * feat: Update documentation workflow name and enhance README with deployment badges * fix: Update README layout and enhance navigation links for better user experience * fix: Restore header for ARTVoice Accelerator Framework in README * add value * fix: Update README layout for improved clarity and navigation * Enhance provisioning scripts and documentation - Updated postprovision.sh to clarify phone number provisioning steps and added guidance for obtaining a phone number via Azure Portal. - Modified preprovision.sh to include preflight checks for tools, authentication, and providers before proceeding with provisioning. - Added jq as a prerequisite in the getting-started documentation and provided installation instructions for various platforms. 
- Created a new TODO-deployfixes.md file to document common issues encountered during deployment sessions, including resolutions for Docker errors, jq installation, and subscription registration. - Expanded troubleshooting.md with detailed solutions for common deployment and provisioning issues, including authentication mismatches, Docker errors, jq command not found, and ACS phone number prompts. - Updated variables.tf to improve the description of the voice_live_location variable, including a link to supported Azure regions. * feat: Update branch triggers in workflow to include feat/troubleshooting-enhancements * fix(ci): simplify test-azd-hooks workflow tests and run in parallel - Remove fragile grep-based function extraction that caused syntax errors - Run lint, linux, macos, windows tests in parallel (no dependencies) - Trigger on all pushes to main/staging (remove path filters for push) - Simplify backend configuration test to avoid function sourcing issues * feat: Add troubleshooting steps for "bad interpreter" errors and enhance post-provisioning instructions for phone number configuration * feat: Add preprovision hook execution to Linux, macOS, and Windows test jobs in CI workflow * feat: Enhance AZD hook testing with postprovision execution and Azure CLI setup * feat: Update test job names for clarity and enhance preflight checks for CI mode * feat: Update preflight checks to conditionally include Docker in CI mode and log its status * feat: Add Dev Container testing for AZD hooks with environment validation and summary reporting * feat: Enhance deployment scripts with pre/post-provisioning hooks and Azure CLI extension checks * feat: Add troubleshooting guidance for MkDocs module errors and update dev dependencies in uv.lock * feat: Update Azure deployment workflows and normalize container memory formats * feat: Add troubleshooting guidance for Terraform state lock errors and provide remote/local fix options * feat: Remove outdated troubleshooting 
documentation for deployment issues * Apply suggestion from @Copilot Co-authored-by: Copilot <[email protected]> * Apply suggestion from @Copilot Co-authored-by: Copilot <[email protected]> * Update .github/workflows/test-azd-hooks.yml Co-authored-by: Copilot <[email protected]> * feat: Implement TTS Streaming Latency Analysis and Optimization Plan - Added a comprehensive document outlining the critical latency issues in TTS playback within the Speech Cascade architecture. - Identified root causes including processing loop deadlock, sentence buffering delays, queue-based event processing, and full synthesis before streaming. - Proposed a multi-phase optimization strategy to address identified issues, including: - Phase 0: Fix processing loop deadlock by creating a dedicated TTS processing task. - Phase 1: Reduce sentence buffer threshold for earlier TTS chunk dispatch. - Phase 2: Implement parallel TTS prefetching to synthesize the next sentence while streaming. - Phase 3: Enable streaming TTS synthesis to stream audio while synthesizing. - Phase 4: Achieve full pipeline parallelism for LLM to TTS to WebSocket streaming. - Created a detailed test implementation plan with metrics and success criteria to validate improvements. test: Add unit tests for HandoffService - Created unit tests for the HandoffService, covering handoff detection, target resolution, and handoff resolution methods. - Implemented tests for greeting selection and context building to ensure proper functionality. - Added tests for the HandoffResolution dataclass to verify properties and default values. * feat: Add Scenario Builder component and integrate with RealTimeVoiceApp - Introduced ScenarioBuilder component for visual orchestration of agent flows. - Implemented drag-and-drop functionality for agents and handoff configuration. - Added buttons in RealTimeVoiceApp for accessing Agent and Scenario Builders. - Enhanced state management for agent scenarios, including creation and updates. 
- Integrated new handoff editor for configuring agent interactions. * Refactor code structure for improved readability and maintainability * Add error handling for Redis connection issues and implement unit tests for HandoffService - Enhanced AzureRedisManager to handle RedisClusterException and OSError during client connection attempts. - Introduced comprehensive unit tests for HandoffService, covering handoff detection, target resolution, handoff resolution, greeting selection, and context building. - Added tests for HandoffResolution dataclass to ensure correct property behavior and default values. * Enhance LiveOrchestrator to handle context-only session updates without UI broadcasts * Refactor LiveOrchestrator to prevent duplicate UI updates by omitting redundant session_updated broadcasts during context-only updates. * Refactor environment variable assignment in deploy workflow for clarity * Refactor tests and dependencies following module renaming and API changes - Removed pytest-twisted from dev dependencies in pyproject.toml and uv.lock. - Updated conftest.py to mock configuration and Azure OpenAI client for tests. - Skipped tests in test_acs_media_lifecycle.py, test_acs_media_lifecycle_memory.py, and test_acs_simple.py due to dependencies on removed/renamed modules. - Adjusted imports in test_artagent_wshelpers.py for orchestrator path change. - Skipped tests in test_call_transfer_service.py due to API changes in toolstore. - Updated datetime usage in test_demo_env_phrase_bias.py to use UTC. - Modified websocket endpoint assertions in test_realtime.py to reflect new paths. - Added new test file test_voice_handler_components.py for voice handler components. * Add comprehensive tests for VoiceLive handler and orchestrator memory management - Implement tests to verify cleanup functionality in LiveOrchestrator. - Ensure proper registration and unregistration of orchestrators in the registry. - Test background task tracking and cleanup mechanisms. 
- Validate greeting task cancellation during orchestrator cleanup. - Introduce memory leak detection tests to prevent unbounded growth in orchestrator registry. - Verify user message history deque is properly bounded and cleared on cleanup. - Add scenario update tests to ensure correct agent management during updates. - Optimize hot path functions to ensure non-blocking behavior during network calls. * feat: Enhance AgentBuilder with consistent field names and improved UI elements * Refactor logging levels from info to debug in connection manager, warmable pool, Redis manager, speech auth manager, speech recognizer, and text-to-speech modules for improved log verbosity control. Remove outdated greeting context tests and add comprehensive scenario orchestration contract tests to ensure functional contracts are preserved during refactoring. Update session agent manager tests to use set comparison for agent listing to avoid dict ordering issues. * feat: Add predefined handoff condition patterns to enhance scenario orchestration * add value * feat(metrics): Introduce shared metrics factory for lazy initialization - Added `metrics_factory.py` to provide a common infrastructure for OpenTelemetry metrics. - Implemented `LazyMeter`, `LazyHistogram`, and `LazyCounter` for lazy initialization of metrics. - Updated `speech_cascade/metrics.py` to utilize the new shared metrics factory, simplifying metric initialization. - Refactored `voicelive/metrics.py` to use the shared factory for consistent metric handling. - Enhanced orchestrator classes in `speech_cascade/orchestrator.py` and `voicelive/orchestrator.py` to cache orchestrator configurations, improving performance and reducing redundant calls. - Introduced utility functions for building common metric attributes, ensuring consistency across metrics. 
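The lazy-initialization idea behind the shared metrics factory described above (`LazyMeter`, `LazyHistogram`, `LazyCounter`) might look roughly like the sketch below. It is not the code from `metrics_factory.py`: the class names `Lazy` and `LazyCounterSketch` and the `create_counter` callable are hypothetical, and the real module presumably wraps OpenTelemetry instruments rather than an arbitrary factory.

```python
from typing import Any, Callable, Dict, Generic, Optional, TypeVar

T = TypeVar("T")


class Lazy(Generic[T]):
    """Build a value on first access and reuse it afterwards."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._value: Optional[T] = None
        self._built = False

    def get(self) -> T:
        if not self._built:
            # The factory runs only once, on the first call to get().
            self._value = self._factory()
            self._built = True
        return self._value  # type: ignore[return-value]


class LazyCounterSketch:
    """Hypothetical counter wrapper that defers instrument creation.

    `create_counter` stands in for whatever creates the real instrument;
    it is an assumption made for this sketch.
    """

    def __init__(self, name: str, create_counter: Callable[[str], Any]) -> None:
        self._lazy = Lazy(lambda: create_counter(name))

    def add(self, amount: int, attributes: Optional[Dict[str, str]] = None) -> None:
        # The underlying counter is created here, on first use, not at import time.
        self._lazy.get().add(amount, attributes or {})
```

Deferring creation this way lets modules declare metrics at import time without requiring the telemetry provider to be configured yet.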
* feat: Consolidate handoff logic into a unified HandoffService for consistent behavior across orchestrators and enhance documentation * fix: Simplify environment determination logic in deployment workflow * add value * feat: Add user flow screenshots and enhance documentation for guided agent setup * feat: Enhance scenario testing instructions for clarity and user guidance * fix: Correct image paths in quickstart guide for accurate rendering * feat: Add initial agent builder and template selection screenshots to quickstart guide * feat: Add demo profile creation steps and related images to quickstart guide * feat: Implement EasyAuth configuration script and integrate into post-provisioning process * refactor: Remove backend IP restrictions configuration and related outputs * Added non qualifying rush response to ensure clear model behavior * updated order so confirmation statement is in the correct spot * add value * add value * chore: Remove unused workflow images for demo profiles * fix: Update demo profile creation images in quickstart guide * fix: Update home screen image in quickstart guide * fix: Update home screen and scenario images in quickstart guide * add value * add value * add value * add value * add value * add * add value * art * add opentelemetry import for tracing support in TTS module * refactor: update LiveOrchestrator to enhance user message history management and improve handoff context * Refactor TTS Playback and Voice Handling - Consolidated TTS playback logic into a unified class for speech cascade. - Removed deprecated VoiceSessionContext and related compatibility shims. - Enhanced error handling during tool initialization and event handler registration. - Updated model configuration handling in UnifiedAgent to prioritize mode-specific settings. - Improved logging for TTS synthesis and streaming processes. - Added new handoff tool registration for dynamic routing. 
* refactor: streamline EasyAuth enabling process in CI mode and improve interactive prompts * refactor: enhance EasyAuth interactive prompts and streamline user choices * refactor: enhance run-name logic for Azure deployment workflow * fix: update environment logic for pull_request events in Azure deployment workflow * refactor: update preprovision hook execution and streamline backend configuration * feat: add context variable support for handoffs and enhance UI for variable mapping * feat: enhance TTS processing by adding text sanitization and sentence boundary detection (#11) Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * feat(telemetry): consolidate to OpenTelemetry and establish proper hierarchy (#14) Infrastructure Changes: - Delete 6 obsolete latency_tool implementations (~2200 lines) - Install SessionContextSpanProcessor for automatic session correlation - Replace LatencyTool with @trace_speech decorators in legacy paths - Remove latency_tool field from VoiceSessionContext Speech Services & Dependencies: - Add @trace_speech for STT partial/final transcripts with attributes - Add TTS attributes: voice, output_format, language, audio_size_bytes - Standardize ACS and Redis span attributes with OTel conventions - Add voice_session root SERVER span in media/browser endpoints Orchestrator & Token Tracking: - Add tool execution and agent handoff observability spans - Fix token tracking to use actual API usage data (not estimates) - Update Azure OpenAI API to 2024-10-01-preview - Add session metadata timestamps to MemoManager Benefits: - Single source of truth (ConversationTurnSpan + OTel) - Complete E2E traces in Application Insights - Accurate cost tracking and token visibility - ~2300 lines of dead code removed Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * feat(telemetry): consolidate to OpenTelemetry and establish proper hierarchy (#15) Infrastructure Changes: - Delete 6 obsolete latency_tool implementations (~2200 lines) - Install 
SessionContextSpanProcessor for automatic session correlation - Replace LatencyTool with @trace_speech decorators in legacy paths - Remove latency_tool field from VoiceSessionContext Speech Services & Dependencies: - Add @trace_speech for STT partial/final transcripts with attributes - Add TTS attributes: voice, output_format, language, audio_size_bytes - Standardize ACS and Redis span attributes with OTel conventions - Add voice_session root SERVER span in media/browser endpoints Orchestrator & Token Tracking: - Add tool execution and agent handoff observability spans - Fix token tracking to use actual API usage data (not estimates) - Update Azure OpenAI API to 2024-10-01-preview - Add session metadata timestamps to MemoManager Benefits: - Single source of truth (ConversationTurnSpan + OTel) - Complete E2E traces in Application Insights - Accurate cost tracking and token visibility - ~2300 lines of dead code removed Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * feat(telemetry): consolidate to OpenTelemetry and establish proper hierarchy (#13) Infrastructure Changes: - Delete 6 obsolete latency_tool implementations (~2200 lines) - Install SessionContextSpanProcessor for automatic session correlation - Replace LatencyTool with @trace_speech decorators in legacy paths - Remove latency_tool field from VoiceSessionContext Speech Services & Dependencies: - Add @trace_speech for STT partial/final transcripts with attributes - Add TTS attributes: voice, output_format, language, audio_size_bytes - Standardize ACS and Redis span attributes with OTel conventions - Add voice_session root SERVER span in media/browser endpoints Orchestrator & Token Tracking: - Add tool execution and agent handoff observability spans - Fix token tracking to use actual API usage data (not estimates) - Update Azure OpenAI API to 2024-10-01-preview - Add session metadata timestamps to MemoManager Benefits: - Single source of truth (ConversationTurnSpan + OTel) - Complete E2E traces in 
Application Insights - Accurate cost tracking and token visibility - ~2300 lines of dead code removed Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * feat(telemetry): consolidate to OpenTelemetry and establish proper hierarchy (#12) Infrastructure Changes: - Delete 6 obsolete latency_tool implementations (~2200 lines) - Install SessionContextSpanProcessor for automatic session correlation - Replace LatencyTool with @trace_speech decorators in legacy paths - Remove latency_tool field from VoiceSessionContext Speech Services & Dependencies: - Add @trace_speech for STT partial/final transcripts with attributes - Add TTS attributes: voice, output_format, language, audio_size_bytes - Standardize ACS and Redis span attributes with OTel conventions - Add voice_session root SERVER span in media/browser endpoints Orchestrator & Token Tracking: - Add tool execution and agent handoff observability spans - Fix token tracking to use actual API usage data (not estimates) - Update Azure OpenAI API to 2024-10-01-preview - Add session metadata timestamps to MemoManager Benefits: - Single source of truth (ConversationTurnSpan + OTel) - Complete E2E traces in Application Insights - Accurate cost tracking and token visibility - ~2300 lines of dead code removed Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * feat: Responses API Infrastructure & Dual Model Configuration (#16) * feat: enhance azd environment variable handling with error checks and local state support * fix: update foundry account and project naming conventions for consistency * feat: add Responses API infrastructure and dual model configuration **Infrastructure Changes:** - Add UnifiedResponse dataclass for dual endpoint support - Implement _should_use_responses_endpoint() routing logic - Add _prepare_responses_params() and _prepare_chat_params() methods - Update generate_response() to route between /chat/completions and /responses **Model Configuration:** - Add cascade_model and voicelive_model 
fields to AgentConfig - Add get_model_for_mode() with support for 'cascade', 'media', 'voicelive', 'realtime' aliases - Add Responses API fields: endpoint_preference, verbosity, min_p, typical_p, reasoning_effort, include_reasoning, max_completion_tokens - Update ModelConfigSchema in agent_builder API **Tests:** - Add test_generate_response_respects_responses_config - Add test_generate_response_respects_chat_config - Add TestUnifiedAgentGetModelForMode test suite This PR provides the foundation for Responses API support without changing orchestrator behavior. * fix: update project version to 2.0.0-beta in pyproject.toml --------- Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * feat: Orchestrator Integration + Optimizations (#17) * feat: enhance azd environment variable handling with error checks and local state support * fix: update foundry account and project naming conventions for consistency * feat: add Responses API infrastructure and dual model configuration **Infrastructure Changes:** - Add UnifiedResponse dataclass for dual endpoint support - Implement _should_use_responses_endpoint() routing logic - Add _prepare_responses_params() and _prepare_chat_params() methods - Update generate_response() to route between /chat/completions and /responses **Model Configuration:** - Add cascade_model and voicelive_model fields to AgentConfig - Add get_model_for_mode() with support for 'cascade', 'media', 'voicelive', 'realtime' aliases - Add Responses API fields: endpoint_preference, verbosity, min_p, typical_p, reasoning_effort, include_reasoning, max_completion_tokens - Update ModelConfigSchema in agent_builder API **Tests:** - Add test_generate_response_respects_responses_config - Add test_generate_response_respects_chat_config - Add TestUnifiedAgentGetModelForMode test suite This PR provides the foundation for Responses API support without changing orchestrator behavior. 
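The mode-to-model lookup described above might look roughly like the sketch below. The commit message names `cascade_model`, `voicelive_model`, `get_model_for_mode()`, and the four aliases; everything else is an assumption made for illustration: the class names `ModelConfigSketch` and `AgentConfigSketch`, the `deployment_id` field, the grouping of `'media'` with `'cascade'` and `'realtime'` with `'voicelive'`, and the fallback to a shared default `model`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in for a model configuration entry."""

    deployment_id: str


@dataclass
class AgentConfigSketch:
    """Hypothetical agent config with optional mode-specific models."""

    model: ModelConfigSketch
    cascade_model: Optional[ModelConfigSketch] = None
    voicelive_model: Optional[ModelConfigSketch] = None

    def get_model_for_mode(self, mode: str) -> ModelConfigSketch:
        key = mode.strip().lower()
        # Assumed alias grouping: 'media' behaves like 'cascade',
        # 'realtime' behaves like 'voicelive'.
        if key in ("cascade", "media"):
            specific = self.cascade_model
        elif key in ("voicelive", "realtime"):
            specific = self.voicelive_model
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        # Assumed fallback: use the shared default when no override is set.
        return specific if specific is not None else self.model


config = AgentConfigSketch(
    model=ModelConfigSketch("default-deployment"),
    voicelive_model=ModelConfigSketch("realtime-deployment"),
)
```

Resolving the model in one method keeps the orchestrators free of per-mode branching: each one passes its own mode string and receives either the override or the default.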
* feat: integrate Responses API in orchestrators and add optimizations **Cascade Orchestrator:** - Update model selection to use agent.get_model_for_mode('cascade') - Integrate Responses API routing based on endpoint_preference - Add error handling for unsupported parameters - Extract TTS processing into separate tts_processor module **VoiceLive Orchestrator:** - Update to use agent.get_model_for_mode('voicelive') - Add registry cleanup to prevent unbounded growth - Improve memory management and stale orchestrator cleanup - Extract DTMF processing into separate dtmf_processor module **Tests:** - Add test_cascade_orchestrator_entry_points - Add test_cascade_llm_processing - Add test_dtmf_processor Depends on: PR #1 (Responses API Infrastructure) --------- Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * feat: Evaluation Framework + Frontend UI (#18) * feat: enhance azd environment variable handling with error checks and local state support * fix: update foundry account and project naming conventions for consistency * feat: add Responses API infrastructure and dual model configuration **Infrastructure Changes:** - Add UnifiedResponse dataclass for dual endpoint support - Implement _should_use_responses_endpoint() routing logic - Add _prepare_responses_params() and _prepare_chat_params() methods - Update generate_response() to route between /chat/completions and /responses **Model Configuration:** - Add cascade_model and voicelive_model fields to AgentConfig - Add get_model_for_mode() with support for 'cascade', 'media', 'voicelive', 'realtime' aliases - Add Responses API fields: endpoint_preference, verbosity, min_p, typical_p, reasoning_effort, include_reasoning, max_completion_tokens - Update ModelConfigSchema in agent_builder API **Tests:** - Add test_generate_response_respects_responses_config - Add test_generate_response_respects_chat_config - Add TestUnifiedAgentGetModelForMode test suite This PR provides the foundation for Responses API support 
without changing orchestrator behavior. * feat: add evaluation framework and frontend UI for Responses API **Evaluation Framework:** - Add EventRecorder with git commit SHA tracking - Add API-aware scoring with budget adjustments for verbosity - Add scenario runner for automated testing - Add CLI for running evaluations - Add validate_phases.py for phase-based validation - Add wrappers for endpoint detection **Frontend UI:** - Add cascade_model and voicelive_model selectors in Agent Builder - Add Responses API endpoint preference dropdown - Add conditional fields for verbosity, reasoning_effort, etc. - Update ScenarioBuilder with model configuration options - Display API version fields **Documentation:** - Add docs/testing/model-evaluation.md - Add evaluation playground Jupyter notebook Depends on: PR #1 (Responses API Infrastructure) --------- Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * Cleaning up lifecycle management logic into dedicated structure, keep main.py clean (#19) Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * feat: voice handler refactoring and MediaHandler migration Major refactoring of voice processing architecture: Core Voice Changes: - Implement new VoiceHandler as primary entry point for voice sessions - Delete deprecated speech_cascade/tts.py (652 lines removed) - Consolidate TTS functionality into voice/tts/playback.py - Enhance CascadeOrchestrator with improved turn management - Add VoiceSessionContext for clean dependency injection API & Integration: - Migrate /api/v1/browser/conversation to VoiceHandler - Migrate /api/v1/media/stream to VoiceHandler - Create MediaHandler→VoiceHandler compatibility alias - Update media_handler.py for backward compatibility Infrastructure: - Improve telemetry with Azure-style span naming - Enhance ACS helpers with better session management - Update session terminator for lifecycle management - Add orchestration improvements for unified agents Configuration & Samples: - Update auth agent 
and insurance scenario configs - Add handoff tool enhancements with context variables - Update gpt_flow sample for new patterns Frontend: - Refactor App.jsx for improved voice handling UI Testing & Documentation: - Add test_voice_handler_compat.py for backward compatibility - Add MEDIAHANDLER_MIGRATION.md tracking document This change maintains full backward compatibility while establishing the foundation for cleaner voice processing patterns going forward. Closes #[TBD] * Enhance logging and user prompts in preflight and pre-provisioning scripts (#20) - Updated logging functions in preflight-checks.sh, ssl-preprovision.sh, sync-appconfig.sh, postprovision.sh, and preprovision.sh for consistent output formatting. - Improved user prompts for SSL certificate configuration and Azure Entra group creation in ssl-preprovision.sh and postprovision.sh. - Added color-coded success, warning, and error messages for better visibility. - Modified the handling of environment variables in postprovision.sh to ensure updates are made without overwriting existing values. - Updated Terraform configurations to manage app configuration and cognitive account settings with soft delete options. 
Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * feat: voice handler refactoring and MediaHandler migration (#21) Major refactoring of voice processing architecture: Core Voice Changes: - Implement new VoiceHandler as primary entry point for voice sessions - Delete deprecated speech_cascade/tts.py (652 lines removed) - Consolidate TTS functionality into voice/tts/playback.py - Enhance CascadeOrchestrator with improved turn management - Add VoiceSessionContext for clean dependency injection API & Integration: - Migrate /api/v1/browser/conversation to VoiceHandler - Migrate /api/v1/media/stream to VoiceHandler - Create MediaHandler→VoiceHandler compatibility alias - Update media_handler.py for backward compatibility Infrastructure: - Improve telemetry with Azure-style span naming - Enhance ACS helpers with better session management - Update session terminator for lifecycle management - Add orchestration improvements for unified agents Configuration & Samples: - Update auth agent and insurance scenario configs - Add handoff tool enhancements with context variables - Update gpt_flow sample for new patterns Frontend: - Refactor App.jsx for improved voice handling UI Testing & Documentation: - Add test_voice_handler_compat.py for backward compatibility - Add MEDIAHANDLER_MIGRATION.md tracking document This change maintains full backward compatibility while establishing the foundation for cleaner voice processing patterns going forward. 
Closes #[TBD] Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * enhanced the scenariobuilder with flowy (#22) * docs: add comprehensive voice processing architecture documentation Add complete documentation for the voice processing architecture: New Documentation: - docs/architecture/voice/README.md - Comprehensive voice architecture guide * VoiceHandler overview and usage patterns * TTS playback and text processing * Speech cascade pipeline documentation * Audio specifications for browser and ACS transports * Testing guidelines with actual test file references * Troubleshooting guide for common issues - apps/artagent/backend/voice/README.md - Developer quick reference * Directory structure and module organization * Quick start examples * Common tasks and patterns * File location guide * Testing commands Documentation Updates: - docs/mkdocs.yml - Add voice architecture to navigation - docs/operations/troubleshooting.md - Add voice-specific troubleshooting Key Improvements: - Fixed mkdocs formatting for proper list rendering - Updated all test references to match actual test files: * test_voice_handler_components.py * test_voice_handler_compat.py * test_cascade_orchestrator_entry_points.py * test_cascade_llm_processing.py - Verified all script references (quick_test.sh, test_orchestrator.py) - Added prerequisites for running tests with dev dependencies - Included both basic and advanced testing examples All file paths and examples have been verified against the actual codebase. Related to #[TBD] * Add custom styles for Flowy flowchart integration with agent blocks * feat: Enhance output port visibility logic in ScenarioGraphCanvas * feat: Add expandable full prompt view for source agent in HandoffEditorDialog --------- Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * Refactor ACS logging and add default orchestration scenario - Removed info-level logging for ACS configuration details to reduce verbosity. 
- Changed some logging statements to debug level for better log management. - Updated peer.service attribute in telemetry to use "azure-communication-services". - Introduced a new orchestration.yaml file defining a default customer service scenario with multiple agents and handoff configurations. * Refactor ACS logging and add default orchestration scenario (#23) - Removed info-level logging for ACS configuration details to reduce verbosity. - Changed some logging statements to debug level for better log management. - Updated peer.service attribute in telemetry to use "azure-communication-services". - Introduced a new orchestration.yaml file defining a default customer service scenario with multiple agents and handoff configurations. Co-authored-by: Jin Lee (HLS US SE) <[email protected]> * Enhance logging functions to use log_plain for consistency and clarity in local development setup script * Disable view toggle buttons for chat/graph/timeline in ConversationControls * Add panning functionality to ScenarioGraphCanvas and reset button * Update CHANGELOG.md for 2.0.0-beta.1 release: add new features, enhancements, fixes, and infrastructure changes * feat: Add mkdocs-mermaid-zoom dependency and update locust load test scripts - Added mkdocs-mermaid-zoom to pyproject.toml and uv.lock for enhanced diagram support in documentation. - Enhanced locustfile.acs_media.py with rate limit detection and error handling improvements. - Introduced locustfile.browser_conversation.py for testing browser-based voice conversation endpoints. - Improved metrics naming conventions for clarity in load testing results. 
* feat: Update Voice Live readiness status to use event envelope format --------- Co-authored-by: Jin Lee <[email protected]> Co-authored-by: Jin Lee (HLS US SE) <[email protected]> Co-authored-by: Anna Quincy <[email protected]> Co-authored-by: Jin Lee <[email protected]> Co-authored-by: Copilot <[email protected]> * enhancement: infra docs readme update (#100) * Update version and SKU name in staging params * Change version for text-embedding-3-large model Updated the version of the text-embedding-3-large model. * Update main.tfvars.staging.json * Update communication.tf * feat: Enhance status envelope with optional label and update frontend to derive WS URL - Added optional `label` parameter to `make_status_envelope` function in `envelopes.py` to allow custom labels in status messages. - Updated `entrypoint.sh` to derive WebSocket URL from `BACKEND_URL` or use `WS_URL` if provided, replacing placeholders in frontend assets. - Upgraded `js-yaml` and `vite` dependencies in `package.json` and `package-lock.json`. - Enhanced `App.jsx` to format event type labels and summarize event data for better user experience. - Introduced new demo scenarios in `DemoScenariosWidget.jsx` to showcase Microsoft Copilot Studio integration and ACS call routing. - Added tests for call transfer events in `test_acs_events_handlers.py` to ensure correct envelope broadcasting for transfer accepted and failed events. - Created a new Jupyter notebook for custom speech model demonstration in `12-custom-speech-model.ipynb`. - Updated Terraform parameters to include a new text embedding model in `main.tfvars.dev.json`. * refactor: Comment out unused email communication service domain resource * refactor: Comment out unused Azure email communication service resources * feat: Enhance event handling and UI components - Added new utility functions for formatting event types and summarizing event data in App.jsx. 
- Improved ChatBubble component to display event messages with formatted labels and timestamps. - Updated DemoScenariosWidget to include new scenarios and enhanced filtering options based on tags. - Introduced websocket URL derivation in postprovision.sh for better backend integration. - Added tests for call transfer events in test_acs_events_handlers.py to ensure proper envelope broadcasting. - Updated package.json to include js-yaml and upgraded vite version. * add value * feat: Enhance distributed session handling and improve PayPal agent interactions - Implement distributed session bus using Redis for cross-replica session routing in connection manager. - Add methods for publishing session envelopes to Redis channels. - Introduce confirmation context for call center transfers to ensure explicit user consent. - Update PayPal agent templates to clarify authentication and routing guidelines. - Enhance real-time voice app to manage relay WebSocket connections and handle session updates more effectively. - Improve error handling and logging for distributed session delivery and Redis interactions. - Refactor session envelope handling in frontend to accommodate new event types and improve user experience. 
* feat: Enhance status tone metadata and improve chat bubble styling * feat: Implement background task handling for MFA delivery and improve greeting messages in handoff processes * feat: Enhance call escalation process with detailed transfer context and improve PayPal agent handoff scenarios * feat: Implement retry mechanism for browser session ID resolution in media streaming * feat: Enhance session management and greeting handling across various components * fixing session mapping for acs calls * add value * add value * adding test file * Adding agents and templates for credit card recommendation and fee dispute agents * add value * Enhance audio transcription settings across agents and adjust logging levels for better debugging * Enhance audio transcription settings across agents and adjust logging levels for better debugging * add value * add value * Implement Azure Voice Live service integration and enhance Terraform configurations for voice model deployments * add value * Add Azure Voice Live model configuration and outputs * fixing voicelive chat sequence on the ui * fixing voicelive chat sequence on the ui * fixing voicelive chat sequence on the ui * fixing voicelive chat sequence on the ui * remove sensitive contact information and unused transfer agency client data * feat: Introduce Agent Consolidation Plan with YAML-driven architecture - Added a comprehensive proposal for consolidating agent architecture in `apps/rtagent/backend/src/agents/`. - Established key goals including single source of truth for agent definitions, auto-discovery, and unified tool registry. - Analyzed current architecture and identified pain points such as manual handoff registration and duplicate tool registries. - Proposed a new solution architecture featuring enhanced YAML schema, auto-discovery engine, and unified tool registry. - Detailed implementation roadmap divided into phases for gradual migration and integration. 
- Included backward compatibility strategy to ensure existing agents function without modification. - Provided extensive documentation on YAML schema, CLI tool usage, and migration checklist. * Refactor speech cascade handler and routing for browser communication - Updated speech cascade handler to prioritize `on_greeting` callback over `on_tts_request` for greeting events. - Added `queue_user_text` method to `SpeechCascadeHandler` for queuing user text input. - Changed routing from `/realtime` to `/browser` for browser communication endpoints. - Modified orchestration logic to ensure TTS responses are sent with blocking behavior to prevent overlap. - Introduced WebSocket helper functions for better organization and clarity in messaging. - Enhanced connection manager to handle Redis pubsub reconnections on credential expiration. - Updated frontend components to reflect routing changes for browser communication. - Adjusted tests to align with the new browser routing and functionality. - Commented out live metrics enabling condition in telemetry configuration for future consideration. * feat(telemetry): add decorators for tracing LLM, dependency, speech, and ACS calls - Introduced , , , and decorators for OpenTelemetry instrumentation. - Implemented context manager for tracking conversation turns with detailed metrics. - Added helper functions for recording GenAI and speech metrics. - Enhanced span attributes for Azure Application Insights visualization. * Remove telemetry configuration module (telemetry_config_v2.py) to streamline codebase and eliminate unused functionality. * feat: Enhance telemetry and tracing for CosmosDB and latency tool - Added OpenTelemetry tracing to CosmosDB operations with a decorator for latency tracking. - Integrated tracing spans in the LatencyTool for better observability in Application Insights. - Updated telemetry configuration to suppress noisy logs and added new attributes for speech cascade metrics. 
- Created unit tests for SessionAgentManager, covering configuration management, override resolution, handoff management, and persistence. - Removed outdated endpoints review document. * feat: Add useBackendHealth hook for backend health checks and integrate with readiness, agents, and health endpoints test: Implement integration tests for VoiceLive Session Agent Manager, covering agent resolution, handoff mapping, and runtime modifications * WARNING!!!! MAJOR REFACTOR COMMIT - Removed the VoiceLive SDK integration module from the backend. - Added a new AgentTopologyPanel component to the frontend for displaying agent inventory and connections. - Integrated the AgentTopologyPanel into the main application layout. - Updated the BackendIndicator to include agent count and selection functionality. - Enhanced the ConversationControls with a fixed view switcher for better accessibility. - Improved the useBackendHealth hook to handle various agent data structures. - Updated styles for better responsiveness and visual consistency across components. - Modified utility functions to format agent inventory data correctly. - Adjusted import paths in orchestrators and tests to reflect the new backend structure. * feat: Enhance agent handoff process and response handling; refactor UI components for improved usability * feat: Update change notes for v2/speech-orchestration-and-monitoring branch; highlight major features, improvements, and new agents * refactor: Remove Unified Agent Configuration Module; streamline agent management and improve code organization * feat: Enhance ProfileDetailsPanel with resizable functionality and UI improvements - Added resizable panel feature to ProfileDetailsPanel, allowing users to adjust width dynamically. - Updated panel styling for improved aesthetics, including a gradient background and adjusted borders. - Enhanced scrollbar visibility and overflow handling for better user experience. 
refactor: Simplify GraphListView filter logic - Removed default selection logic for filters in GraphListView, allowing users to start with no filters applied. - Cleaned up useEffect dependencies for better performance and clarity. docs: Introduce Backend Voice & Agents Architecture documentation - Added comprehensive documentation outlining the architecture of backend voice and agent modules. - Detailed separation of concerns between voice transport and agent business logic. - Included data flow diagrams and module responsibilities for clarity. docs: Create Handoff Logic Inventory for better understanding of handoff processes - Documented the handoff logic across backend voice and agent modules. - Established a single source of truth for handoff mappings and protocols. - Summarized cleanup phases and their impact on the codebase. fix: Update logging to safely handle span names - Modified TraceLogFilter to safely retrieve span names, preventing attribute errors with NonRecordingSpan. fix: Adjust telemetry configuration to capture all loggers - Changed logger_name default to an empty string in TelemetryConfig to capture all loggers. * feat: Implement context-aware greeting rendering in VoiceLive agent; enhance session management and logging * feat: Refactor agent configuration and voice handling; streamline agent switching and TTS integration * feat: Enhance Agent Details Panel and Session Management - Added sessionAgentConfig prop to AgentDetailsPanel for dynamic agent configuration display. - Implemented logic to show agent name, description, tools, and model/voice details based on session configuration. - Introduced a new PanelCard in AgentDetailsPanel to display session agent configuration, including model, voice, and prompt preview. - Updated App component to fetch session agent configuration on agent panel visibility and manage agent creation/updating. 
- Added validation for TTS client initialization in dedicated_tts_pool.py to ensure clients are ready before use. - Enhanced on_demand_pool.py to validate cached resources and remove invalid ones. - Improved error logging in text_to_speech.py to include detailed initialization failure information and added is_ready property for synthesizer readiness check. * Refactor code structure for improved readability and maintainability * feat: Enhance MemoManager with background persistence and lifecycle management - Added support for background persistence in MemoManager, allowing non-blocking state saving to Redis. - Implemented task deduplication to cancel previous persistence tasks when a new one is initiated. - Removed unused auto-refresh functionality and related attributes from MemoManager. - Updated tests to verify new persistence behavior and ensure proper task management. - Enhanced error handling and logging for background persistence operations. * feat: Add Connection Warmup Analysis document for Azure Speech & OpenAI optimization * feat(session): enhance session ID management and URL parameter support - Added `pickSessionIdFromUrl` function to extract session ID from URL parameters. - Updated `getOrCreateSessionId` to allow session ID restoration from URL. - Refactored `setSessionId` for better logging and session management. - Improved `createNewSessionId` to utilize `setSessionId`. docs(api): restructure API documentation for clarity and completeness - Organized API endpoints into categories: Health & Monitoring, Call Management, Media Streaming, Browser Conversations, Session Metrics, …
1 parent eebb209 commit a199518

54 files changed

Lines changed: 13261 additions & 809 deletions


.gitattributes

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
2+
# Use bd merge for beads JSONL files
3+
.beads/issues.jsonl merge=beads

.github/copilot-instructions.md

Lines changed: 39 additions & 83 deletions
@@ -1,83 +1,39 @@
1-
Developer: # 🦠 Copilot Developer Guide for Real-Time Voice Apps (Python 3.11, FastAPI, Azure)
2-
3-
---
4-
5-
## 🚀 Overview
6-
Develop Python 3.11 code for a **low-latency, real-time voice application** utilizing the following technologies:
7-
8-
- **FastAPI**
9-
- **Azure Communication Services** (Call Automation & Media Streaming)
10-
- **Azure Speech** (Speech-to-Text/Text-to-Speech)
11-
- **Azure OpenAI**
12-
13-
Begin every significant task by outlining a concise checklist (3–7 bullets) of conceptual steps before proceeding; keep these at a high level. Focus on clarity, manageable increments, and code simplicity. Avoid unnecessary abstraction or complexity. Prioritize practical, focused updates over clever or intricate changes.
14-
15-
---
16-
17-
## 📄 General Principles
18-
- **Readability & Simplicity:** Write clear, maintainable code that is easy to understand.
19-
- **Simplicity First:** Choose the simplest working solution. Avoid over-engineering and premature optimization.
20-
- **Incremental Development:** Implement small, meaningful changes rather than large, complex modifications.
21-
- **Modular Design:** Separate infrastructure, backend logic, and user experience layers.
22-
- **Asynchronous Endpoints:** Define all HTTP and WebSocket handlers as `async` functions.
23-
- **Schemas:** Use `pydantic.BaseModel` for all request and response definitions.
24-
- **Dependency Injection:** Use FastAPI `Depends` for managing sessions, authentication, and Redis clients.
25-
- **Configuration Management:** Store secrets and configuration in environment variables or a `.env` file.
26-
- **Structured Logging:** Output logs in JSON format, including `correlation ID`, `callConnectionId`, etc.
27-
- **Avoid Blocking I/O:** Do not use global state; manage resource lifecycles in scoped containers.
28-
29-
---
30-
31-
## 🔎 Tracing & Application Instrumentation
32-
- **OpenTelemetry:** Instrument all code using OpenTelemetry (OTEL). Set `service.name` and `service.instance.id` on the `TracerProvider` resource.
33-
- **Span Kinds:** Use `SERVER` for inbound HTTP/WS handlers, `CLIENT` for outbound requests, and `INTERNAL` for local processing activities.
34-
- **Context Propagation:** Employ the W3C `traceparent` header for HTTP/WS and span links for inter-process activities.
35-
- **Root Traces:** Create one per `callConnectionId`, including `rt.call.connection_id` and `rt.session.id` as attributes.
36-
- **Span Volume:** Limit span creation; generate one session span for STT (with events) and, optionally, one per VAD segment. **Do not create spans per audio frame.**
37-
- **Semantic Attributes:** Apply attributes like `peer.service`, `net.peer.name`, `http.request.method`, `server.address`, and `network.protocol.name="websocket"`.
38-
- **Error Reporting:** If errors occur, set span status to `ERROR` and attach an event with `error.type` and `error.message`.
39-
40-
After each tool invocation or code edit, validate the result in 1–2 lines. If the output does not meet expectations, self-correct and retry as needed before proceeding.
41-
42-
---
43-
44-
## 🏗️ App Structure & Dependency Management
45-
- **No Client Attachment:** Refrain from storing clients on `Request` or `WebSocket` objects.
46-
- **Typed AppContainer:** Define protocols for Redis, Speech, and Azure OpenAI; attach these to `app.state` and provide access via FastAPI dependencies.
47-
- **WebSocket Dependency Injection:** Inject dependencies with `container_from_ws(ws)`; do not access `ws.app.state.*` directly.
48-
49-
---
50-
51-
## 📞 Azure Communication Services (ACS) Best Practices
52-
- **Call Connection ID:** Treat `callConnectionId` as a correlation token, not a secret; prefer passing via headers or message bodies.
53-
- **Spans for Media Operations:** Use `SERVER` spans for WebSocket accept operations, `CLIENT` spans for ACS control commands (answer, play, stop, hangup).
54-
55-
---
56-
57-
## ✨ Code Style Guide
58-
- **Small, Focused Functions:** Use explicit timeouts on `await` statements; avoid blocking event loops. Make code changes as minimal, reviewable increments.
59-
- **Favor Clarity:** Regularly review solutions to eliminate unnecessary complexity. Avoid deep inheritance, extra abstraction, or unused patterns.
60-
- **Background Tasks:** Use `asyncio.create_task` and manage background task lifecycles appropriately.
61-
- **Docstrings:** Always include descriptions of function inputs, outputs, and latency concerns.
62-
- **Unit Testing:** Support testability by faking/mocking Redis, Speech, and AOAI via Protocols. Ensure your code can be unit tested.
63-
64-
When editing code:
65-
1. Clearly state your assumptions.
66-
2. Create or run minimal, relevant tests when possible.
67-
3. Produce reviewable diffs and adhere to project standards.
68-
If tests cannot be run, note that they are speculative and provide instructions for local validation.
69-
70-
After code changes, always verify the edits against expected behavior and prepare to self-correct if validation fails.
71-
72-
---
73-
74-
## 🚫 Strictly Prohibited
75-
- Creating spans per audio chunk.
76-
- Using global singletons.
77-
- Adding `service.name` or `span.kind` attributes to spans manually.
78-
79-
---
80-
81-
> **Tip:** Use code blocks, lists, and semantic section headers to maximize clarity and increase inferencing accuracy. Default to plain text unless markdown is requested; use code fences for code and backticks for identifiers.
82-
83-
---
1+
# Copilot Guide: Real-Time Voice Apps (Python 3.11, FastAPI, Azure)
2+
3+
## Stack
4+
- **FastAPI** + **Pydantic** for APIs
5+
- **Azure Communication Services** (Call Automation, Media Streaming)
6+
- **Azure Speech** (TTS/STT), **Azure OpenAI**
7+
- **Redis** for session state, **Cosmos DB** for persistence
8+
9+
## Core Principles
10+
- **Simplicity First:** Choose the simplest working solution. No over-engineering.
11+
- **Reuse Before Create:** Check `src/`, `utils/`, `config/` before writing new code.
12+
- **Async Everything:** All HTTP/WebSocket handlers must be `async`.
13+
- **No Wrappers:** Do not create adapter/facade/manager classes around existing services.
14+
- **No New Dependencies:** Do not add pip packages without explicit approval.
15+
16+
## Key Modules to Reuse
17+
| Need | Use |
18+
|------|-----|
19+
| Logging | `from utils.ml_logging import get_logger` |
20+
| Configuration | `from config.settings import X` (not `os.getenv`) |
21+
| Redis | `src/redis/manager.py` |
22+
| Azure OpenAI | `src/aoai/` |
23+
| Speech TTS/STT | `src/speech/` |
24+
| Agents | `registries/agentstore/base.py` → `UnifiedAgent` |
25+
| Tools | `registries/toolstore/` |
26+
| Pydantic models | Extend `api/v1/models/base.py` → `BaseModel` |
27+
28+
## Anti-Patterns (Never Do)
29+
- Global singletons or module-level mutable state
30+
- Factory classes when a function suffices
31+
- Abstract base classes for single implementations
32+
- `logging.getLogger()` — use `get_logger(__name__)`
33+
- `os.getenv()` — import from `config/settings.py`
34+
- `requests` library — use `aiohttp` or `httpx`
35+
36+
## Telemetry
37+
- Use OpenTelemetry with W3C `traceparent` propagation
38+
- One trace per `callConnectionId`, not per audio frame
39+
- Span kinds: `SERVER` (inbound), `CLIENT` (outbound), `INTERNAL` (local)
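
The telemetry rules above rely on W3C `traceparent` propagation. As a rough illustration of what that header carries, here is a minimal standard-library sketch. The helper names (`new_traceparent`, `parse_traceparent`, `child_traceparent`) are hypothetical and are not part of this commit; in a real service the OpenTelemetry propagators would normally handle this step.

```python
from __future__ import annotations

import re
import secrets

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Build a fresh traceparent value with random IDs (hypothetical helper)."""
    trace_id = secrets.token_hex(16)  # 16 bytes -> 32 hex characters
    parent_id = secrets.token_hex(8)  # 8 bytes -> 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{parent_id}-{flags}"


def parse_traceparent(value: str) -> dict[str, str] | None:
    """Return the header's fields, or None if the value is malformed."""
    match = _TRACEPARENT_RE.match(value.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero trace or parent IDs are invalid.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {"trace_id": trace_id, "parent_id": parent_id, "flags": flags}


def child_traceparent(incoming: str) -> str:
    """Keep the caller's trace ID but mint a new parent ID for an outbound call."""
    fields = parse_traceparent(incoming)
    if fields is None:
        return new_traceparent()  # start a new trace when the input is unusable
    return f"00-{fields['trace_id']}-{secrets.token_hex(8)}-{fields['flags']}"
```

Reusing the trace ID while replacing the parent ID is what lets every span for one call share a single trace.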
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
---
2+
applyTo: "**/api/**/*.py,**/endpoints/**/*.py,**/handlers/**/*.py"
3+
---
4+
5+
# API Endpoint Standards
6+
7+
## Router Setup
8+
```python
9+
from fastapi import APIRouter, Request, HTTPException, Depends
10+
from utils.ml_logging import get_logger
11+
12+
router = APIRouter()
13+
logger = get_logger(__name__)
14+
```
15+
16+
## Endpoint Pattern
17+
```python
18+
@router.get("/path", response_model=ResponseSchema, tags=["Category"])
19+
async def endpoint_name(request: Request) -> ResponseSchema:
20+
"""Brief description of what this endpoint does."""
21+
# Access shared resources via request.app.state
22+
# Return Pydantic model, not dict
23+
```
24+
25+
## Request/Response
26+
- Use Pydantic schemas from `api/v1/schemas/` for all request/response models
27+
- Extend `BaseModel` from `api/v1/models/base.py` for new models
28+
- Never return raw dicts — always use response_model
29+
30+
## Dependency Injection
31+
- Use `Depends()` for auth, sessions, clients
32+
- Access app state: `request.app.state.redis_client`
33+
- WebSocket: use `container_from_ws(ws)`, not direct state access
34+
35+
## Error Responses
36+
```python
37+
# Standard error format
38+
raise HTTPException(status_code=404, detail="Resource not found")
39+
40+
# With logging
41+
logger.error(f"Failed to fetch resource {id}: {e}")
42+
raise HTTPException(status_code=500, detail=str(e))
43+
```
44+
45+
## Tags for OpenAPI
46+
- `Health` — health/readiness endpoints
47+
- `Calls` — call management
48+
- `Agents` — agent operations
49+
- `Voice` — voice/speech operations
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
1+
---
2+
applyTo: "**/*.py"
3+
---
4+
5+
# Python Code Standards
6+
7+
## Imports & Setup
8+
```python
9+
from __future__ import annotations
10+
from utils.ml_logging import get_logger
11+
12+
logger = get_logger(__name__)
13+
```
14+
15+
## Type Hints
16+
- Full type hints on all functions and methods
17+
- Use `X | None` instead of `Optional[X]`
18+
- Use `TypeVar` for generic classes
19+
20+
## Async Patterns
21+
- All I/O operations must be `async`
22+
- Use explicit timeouts: `await asyncio.wait_for(coro, timeout=5.0)`
23+
- Background tasks: `asyncio.create_task()` with proper lifecycle management
24+
- Never use blocking `requests` — use `aiohttp` or `httpx`
25+
26+
## Error Handling
27+
- Use `HTTPException` for API errors with appropriate status codes
28+
- Always log errors before raising: `logger.error(f"...: {e}")`
29+
- Use `tenacity` for retries (already in deps), not custom retry logic
30+
31+
## Functions Over Classes
32+
```python
33+
# ✅ Prefer
34+
async def process_item(item: Item) -> Result:
35+
return Result(data=item.transform())
36+
37+
# ❌ Avoid unnecessary abstraction
38+
class ItemProcessor:
39+
async def process(self, item: Item) -> Result:
40+
return Result(data=item.transform())
41+
```
42+
43+
## Docstrings
44+
- Include for public functions
45+
- Describe inputs, outputs, and any latency concerns
46+
- Keep concise — one-liner if behavior is obvious
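
The async rules in the standards above (explicit timeouts and managed background tasks) might look roughly like the following. This is a sketch under assumptions, not code from this commit: `_slow_io`, `fetch_with_timeout`, `spawn`, and `shutdown` are hypothetical names, and the task set is passed in explicitly to avoid module-level mutable state.

```python
from __future__ import annotations

import asyncio
from typing import Any, Coroutine


async def _slow_io(delay: float) -> str:
    """Stand-in for a network call (hypothetical)."""
    await asyncio.sleep(delay)
    return "ok"


async def fetch_with_timeout(delay: float, timeout: float) -> str:
    """Await the simulated I/O call, giving up after `timeout` seconds."""
    try:
        return await asyncio.wait_for(_slow_io(delay), timeout=timeout)
    except asyncio.TimeoutError:
        return "timed out"


def spawn(coro: Coroutine[Any, Any, Any], tasks: set[asyncio.Task]) -> asyncio.Task:
    """Start a background task and hold a reference until it finishes."""
    task = asyncio.create_task(coro)
    tasks.add(task)  # keep a strong reference so the task is not collected early
    task.add_done_callback(tasks.discard)  # drop the reference on completion
    return task


async def shutdown(tasks: set[asyncio.Task]) -> None:
    """Cancel outstanding background tasks and wait for them to settle."""
    pending = list(tasks)
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)


async def main() -> tuple[str, str, int]:
    tasks: set[asyncio.Task] = set()
    spawn(_slow_io(10.0), tasks)  # long-running work, cancelled at shutdown
    fast = await fetch_with_timeout(0.01, timeout=1.0)
    slow = await fetch_with_timeout(1.0, timeout=0.01)
    await shutdown(tasks)
    await asyncio.sleep(0)  # give done-callbacks a chance to run
    return fast, slow, len(tasks)
```

Running `asyncio.run(main())` exercises both the fast path and the timeout path, then cancels the leftover background task.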

.github/skills/README.md

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
1+
# Skills Directory
2+
3+
Task-oriented skills for AI assistants working with this codebase.
4+
5+
## Philosophy
6+
7+
**Code is documentation.** Skills provide focused task guidance; the codebase itself is the reference. Agents should explore existing patterns via:
8+
9+
- `registries/toolstore/registry.py` - Tool registration patterns
10+
- `registries/agentstore/base.py` - Agent schema and helpers
11+
- `registries/agentstore/*/agent.yaml` - Agent configuration examples
12+
13+
## Available Skills
14+
15+
| Skill | Description |
16+
| ----- | ----------- |
17+
| `add-component` | Add React component with Material UI |
18+
| `add-endpoint` | Add FastAPI endpoint |
19+
| `add-evaluation` | Create evaluation scenario |
20+
| `add-message-handler` | Handle new WebSocket message type |
21+
| `add-tool` | Add tool to agent registry |
22+
| `add-voice-handler` | Add voice module feature |
23+
| `create-agent` | Create complete agent with tools |
24+
25+
## Skill Conventions
26+
27+
Each skill lives in `.github/skills/{skill-name}/SKILL.md` with frontmatter:
28+
29+
```yaml
30+
---
31+
name: skill-name
32+
description: Brief task description
33+
---
34+
```
35+
36+
**Supported attributes:** `name`, `description`, `compatibility`, `license`, `metadata`
37+
38+
## Naming
39+
40+
- `add-*` for creating new items (add-endpoint, add-tool)
41+
- `create-*` for complex multi-file creation (create-agent)
42+
- Verb-noun pattern, lowercase, hyphenated
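
The frontmatter and naming conventions above could be checked with a small script along these lines. This is only a sketch: `parse_frontmatter` and `check_skill` are hypothetical helpers, the parser handles flat `key: value` pairs only (not full YAML), and treating `name` and `description` as required is an assumption drawn from the example frontmatter.

```python
from __future__ import annotations

import re

SUPPORTED_ATTRIBUTES = {"name", "description", "compatibility", "license", "metadata"}
# Lowercase, hyphenated, at least two words (e.g. add-tool, create-agent).
NAME_RE = re.compile(r"^[a-z]+(-[a-z]+)+$")


def parse_frontmatter(text: str) -> dict[str, str]:
    """Extract top-level `key: value` pairs from a leading `---` block."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        if line[:1].isspace():
            continue  # skip nested values, e.g. entries under `metadata:`
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter found


def check_skill(text: str) -> list[str]:
    """Return a list of convention problems; an empty list means none were found."""
    fields = parse_frontmatter(text)
    problems: list[str] = []
    for required in ("name", "description"):  # assumed to be required
        if not fields.get(required):
            problems.append(f"missing {required}")
    unknown = sorted(set(fields) - SUPPORTED_ATTRIBUTES)
    problems.extend(f"unsupported attribute: {key}" for key in unknown)
    name = fields.get("name", "")
    if name and not NAME_RE.match(name):
        problems.append(f"name not lowercase-hyphenated: {name}")
    return problems
```

A check like this could run over each `.github/skills/{skill-name}/SKILL.md` file before review, with a full YAML parser swapped in if nested attributes need validating.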
