4 changes: 4 additions & 0 deletions .github/workflows/docker/compose/third_parties-compose.yaml
@@ -117,3 +117,7 @@ services:
    build:
      dockerfile: comps/third_parties/sglang/src/Dockerfile
    image: ${REGISTRY:-opea}/sglang:${TAG:-latest}
  funasr-paraformer:
    build:
      dockerfile: comps/third_parties/funasr/src/Dockerfile
    image: ${REGISTRY:-opea}/funasr-paraformer:${TAG:-latest}
10 changes: 10 additions & 0 deletions comps/asr/deployment/docker_compose/compose.yaml
@@ -3,6 +3,7 @@

include:
- ../../../third_parties/whisper/deployment/docker_compose/compose.yaml
- ../../../third_parties/funasr/deployment/docker_compose/compose.yaml

services:
  asr:
@@ -33,6 +34,15 @@ services:
    depends_on:
      whisper-gaudi-service:
        condition: service_healthy
  asr-funasr-paraformer:
    extends: asr
    container_name: asr-funasr-paraformer-service
    environment:
      ASR_COMPONENT_NAME: ${ASR_COMPONENT_NAME:-OPEA_PARAFORMER_ASR}
      ENABLE_MCP: ${ENABLE_MCP:-False}
    depends_on:
      funasr-paraformer-service:
        condition: service_healthy

networks:
  default:
9 changes: 6 additions & 3 deletions comps/asr/src/README.md
@@ -12,14 +12,16 @@ ASR (Audio-Speech-Recognition) microservice helps users convert speech to text.

- **ASR Server**: This microservice is responsible for converting speech audio into text. It receives an audio file as input and returns the transcribed text, enabling downstream applications such as conversational bots to process spoken language. The ASR server supports deployment on both CPU and HPU platforms.
- **Whisper Server**: This microservice is responsible for converting speech audio into text using the Whisper model. It exposes an API endpoint that accepts audio files and returns the transcribed text, supporting both CPU and HPU deployments. The Whisper server acts as the backend for ASR functionality in the overall architecture.
- **FunASR Paraformer Server**: This microservice is responsible for converting speech audio into text using the Paraformer model with the FunASR toolkit. Similar to the Whisper Server, it exposes an API endpoint that accepts audio files and returns the transcribed text, supporting CPU deployments. The FunASR Paraformer server acts as the backend for ASR functionality in the overall architecture.

## Deployment Options

For detailed, step-by-step instructions on how to deploy the ASR microservice using Docker Compose on different Intel platforms, please refer to the deployment guide. The guide contains all necessary steps, including building images, configuring the environment, and running the service.

| Platform | Deployment Method | Link |
| ----------------- | ----------------- | ---------------------------------------------------------- |
| Intel Xeon/Gaudi2 | Docker Compose | [Deployment Guide](../deployment/docker_compose/README.md) |
| Platform | Deployment Method | Link |
| ----------------- | ----------------- | ------------------------------------------ |
| Intel Xeon/Gaudi2 | Docker Compose | [Deployment Guide](./README_whisper.md) |
| Intel Core | Docker Compose | [Deployment Guide](./README_paraformer.md) |

## Validated Configurations

@@ -28,3 +30,4 @@ The following configurations have been validated for the ASR microservice.
| **Deploy Method** | **Core Models** | **Platform** |
| ----------------- | --------------- | ----------------- |
| Docker Compose | Whisper | Intel Xeon/Gaudi2 |
| Docker Compose | Paraformer | Intel Core |
167 changes: 167 additions & 0 deletions comps/asr/src/README_paraformer.md
@@ -0,0 +1,167 @@
# Deploying ASR Service

This document provides a comprehensive guide to deploying the ASR microservice pipeline with the Paraformer model on Intel platforms.

**Note:** This is an alternative to the [Whisper ASR service](./README.md). The Paraformer model supports both English and Mandarin audio input and, empirically, performs better on Mandarin than on English.

## Table of contents

- [🚀 1. Quick Start with Docker Compose](#-1-quick-start-with-docker-compose): The recommended method for a fast and easy setup.
- [🚀 2. Manual Step-by-Step Deployment (Advanced)](#-2-manual-step-by-step-deployment-advanced): For users who want to build and run each container individually.
- [🚀 3. Start Microservice with Python](#-3-start-microservice-with-python): For users who prefer to run the ASR microservice directly with Python scripts.

## 🚀 1. Quick Start with Docker Compose

This method uses Docker Compose to start all necessary services with a single command. It is the fastest and easiest way to get the service running.

### 1.1. Access the Code

Clone the repository and navigate to the deployment directory:

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps/comps/asr/deployment/docker_compose
```

### 1.2. Deploy the Service

Choose the command corresponding to your target platform.

```bash
export ip_address=$(hostname -I | awk '{print $1}')
export ASR_ENDPOINT=http://$ip_address:7066
export no_proxy=localhost,$no_proxy
```

- **For Intel® Core™ CPU:**
```bash
docker compose -f ../docker_compose/compose.yaml up funasr-paraformer-service asr-funasr-paraformer -d
```
  **Note:** It might take some time for `funasr-paraformer-service` to become ready, depending on how long the model download takes in your network environment. If it fails to start with the error `dependency failed to start: container funasr-paraformer-service is unhealthy`, try increasing the healthcheck `retries` value in `GenAIComps/comps/third_parties/funasr/deployment/docker_compose/compose.yaml`.
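For reference, the healthcheck section of that compose file looks like the fragment below; raising `retries` gives a slow model download more time before the container is marked unhealthy (the value 120 here is only illustrative):

```yaml
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:7066/health"]
  interval: 10s
  timeout: 6s
  retries: 120 # illustrative: doubled from the shipped value of 60
```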

### 1.3. Validate the Service

Once the containers are running, you can validate the service. **Note:** Run these commands from the root of the `GenAIComps` repository.

```bash
# Test
wget https://github.com/intel/intel-extension-for-transformers/raw/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav
curl http://localhost:9099/v1/audio/transcriptions \
-H "Content-Type: multipart/form-data" \
-F file="@./sample.wav" \
-F model="paraformer-zh"
```
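The same validation request can be issued from Python. Below is a minimal sketch assuming the service listens on port 9099 as configured above; the helper name `transcribe` is illustrative, while the endpoint path and form fields mirror the curl command:

```python
import requests


def transcribe(wav_path: str, endpoint: str = "http://localhost:9099", model: str = "paraformer-zh") -> str:
    """Post a WAV file to the ASR transcription endpoint and return the recognized text."""
    with open(wav_path, "rb") as f:
        # Multipart upload matching `-F file=@...` and `-F model=...` in the curl command
        files = {"file": (wav_path, f, "audio/wav")}
        resp = requests.post(f"{endpoint}/v1/audio/transcriptions", files=files, data={"model": model})
    resp.raise_for_status()
    return resp.json()["text"]


# Example usage (requires the service to be running):
# print(transcribe("sample.wav"))
```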

### 1.4. Clean Up the Deployment

To stop and remove the containers, run the following command from the `comps/asr/deployment/docker_compose` directory:

```bash
docker compose down
```

---

## 🚀 2. Manual Step-by-Step Deployment (Advanced)

This section provides detailed instructions for building the Docker images and running each microservice container individually.

### 2.1. Clone the Repository

If you haven't already, clone the repository and navigate to the root directory:

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```

### 2.2. Build the Docker Images

#### 2.2.1. Build FunASR Paraformer Server Image

- **For Intel® Core™ CPU:**
```bash
docker build -t opea/funasr-paraformer:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/funasr/src/Dockerfile .
```

#### 2.2.2. Build ASR Service Image

```bash
docker build -t opea/asr:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/src/Dockerfile .
```

### 2.3. Start FunASR Paraformer and ASR Service

#### 2.3.1. Start FunASR Paraformer Server

- Core CPU

```bash
docker run -p 7066:7066 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy opea/funasr-paraformer:latest
```

#### 2.3.2. Start ASR Service

```bash
ip_address=$(hostname -I | awk '{print $1}')

docker run -d -p 9099:9099 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e ASR_ENDPOINT=http://$ip_address:7066 opea/asr:latest
```

### 2.4. Validate the Service

After starting both containers, test the ASR service endpoint. Make sure you are in the root directory of the `GenAIComps` repository.

```bash
# Use curl or python

# curl
wget https://github.com/intel/intel-extension-for-transformers/raw/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav
curl http://localhost:9099/v1/audio/transcriptions \
-H "Content-Type: multipart/form-data" \
-F file="@./sample.wav" \
-F model="paraformer-zh"

# python
python check_asr_server.py
```

### 2.5. Clean Up the Deployment

To stop and remove the containers you started manually, use the `docker stop` and `docker rm` commands.

- **For Intel® Core™ CPU:**
```bash
docker stop funasr-paraformer-service asr-funasr-paraformer-service
docker rm funasr-paraformer-service asr-funasr-paraformer-service
```

## 🚀 3. Start Microservice with Python

To start the ASR microservice with Python, first install the required Python packages.

### 3.1. Install Requirements

```bash
pip install -r requirements-cpu.txt
```

### 3.2. Start FunASR Paraformer Service/Test

- Core CPU

```bash
cd comps/third_parties/funasr/src
nohup python funasr_server.py --device=cpu &
python check_funasr_server.py
```

Note: please make sure port 7066 is not occupied by another service. Otherwise, free it with a command such as `npx kill-port 7066`.

### 3.3. Start ASR Service/Test

```bash
cd ../../../asr/src
python opea_asr_microservice.py
python check_asr_server.py
```
@@ -2,7 +2,7 @@

This document provides a comprehensive guide to deploying the ASR microservice pipeline on Intel platforms.

This guide covers two deployment methods:
## Table of contents

- [🚀 1. Quick Start with Docker Compose](#-1-quick-start-with-docker-compose): The recommended method for a fast and easy setup.
- [🚀 2. Manual Step-by-Step Deployment (Advanced)](#-2-manual-step-by-step-deployment-advanced): For users who want to build and run each container individually.
93 changes: 93 additions & 0 deletions comps/asr/src/integrations/funasr_paraformer.py
@@ -0,0 +1,93 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import asyncio
import os
from typing import List, Union

import requests
from fastapi import File, Form, UploadFile

from comps import CustomLogger, OpeaComponent, OpeaComponentRegistry, ServiceType
from comps.cores.proto.api_protocol import AudioTranscriptionResponse

logger = CustomLogger("opea_paraformer")
logflag = os.getenv("LOGFLAG", False)


@OpeaComponentRegistry.register("OPEA_PARAFORMER_ASR")
class OpeaParaformerAsr(OpeaComponent):
    """A specialized ASR (Automatic Speech Recognition) component derived from OpeaComponent for FunASR Paraformer ASR services.

    Attributes:
        model_name (str): The name of the ASR model used.
    """

    def __init__(self, name: str, description: str, config: dict = None):
        super().__init__(name, ServiceType.ASR.name.lower(), description, config)
        self.base_url = os.getenv("ASR_ENDPOINT", "http://localhost:7066")
        health_status = self.check_health()
        if not health_status:
            logger.error("OpeaParaformerAsr health check failed.")

    async def invoke(
        self,
        file: Union[str, UploadFile],  # accept a base64 string or an UploadFile
        model: str = Form("paraformer-zh"),
        language: str = Form("english"),
        prompt: str = Form(None),
        response_format: str = Form("json"),
        temperature: float = Form(0),
        timestamp_granularities: List[str] = Form(None),
    ) -> AudioTranscriptionResponse:
        """Invoke the ASR service to generate a transcription for the provided input."""
        if isinstance(file, str):
            data = {"audio": file}
            # Send the base64-encoded audio to the server
            response = await asyncio.to_thread(
                requests.post,
                f"{self.base_url}/v1/asr",
                json=data,
            )
            res = response.json()["asr_result"]
            return AudioTranscriptionResponse(text=res)
        else:
            # Read the uploaded file
            file_contents = await file.read()

            # Prepare the files and data
            files = {
                "file": (file.filename, file_contents, file.content_type),
            }
            data = {
                "model": model,
                "language": language,
                "prompt": prompt,
                "response_format": response_format,
                "temperature": temperature,
                "timestamp_granularities": timestamp_granularities,
            }

            # Send the file and model to the server
            response = await asyncio.to_thread(
                requests.post, f"{self.base_url}/v1/audio/transcriptions", files=files, data=data
            )
            res = response.json()["text"]
            return AudioTranscriptionResponse(text=res)

    def check_health(self) -> bool:
        """Checks the health of the ASR service.

        Returns:
            bool: True if the service is reachable and healthy, False otherwise.
        """
        try:
            response = requests.get(f"{self.base_url}/health", timeout=5)
            return response.status_code == 200
        except Exception as e:
            # Handle connection errors, timeouts, etc.
            logger.error(f"Health check failed: {e}")
            return False
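Besides multipart uploads, the component above also accepts a base64-encoded audio string, which it forwards to the backend's `/v1/asr` route and reads the `asr_result` field from the response. A minimal sketch of exercising that path directly; the helper name `transcribe_base64` is illustrative, while the route, payload key, and response key are taken from the code above:

```python
import base64

import requests


def transcribe_base64(wav_path: str, endpoint: str = "http://localhost:7066") -> str:
    """Base64-encode a WAV file and post it to the FunASR server's /v1/asr route."""
    with open(wav_path, "rb") as f:
        b64_audio = base64.b64encode(f.read()).decode("utf-8")
    # The backend expects {"audio": <base64 string>} and returns {"asr_result": <text>}
    resp = requests.post(f"{endpoint}/v1/asr", json={"audio": b64_audio})
    resp.raise_for_status()
    return resp.json()["asr_result"]


# Example usage (requires the FunASR server to be running):
# print(transcribe_base64("sample.wav"))
```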
1 change: 1 addition & 0 deletions comps/asr/src/opea_asr_microservice.py
@@ -6,6 +6,7 @@
from typing import List, Union

from fastapi import File, Form, UploadFile
from integrations.funasr_paraformer import OpeaParaformerAsr
from integrations.whisper import OpeaWhisperAsr

from comps import (
@@ -0,0 +1,23 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
  funasr-paraformer-service:
    image: ${REGISTRY:-opea}/funasr-paraformer:${TAG:-latest}
    container_name: funasr-paraformer-service
    ports:
      - ${FUNASR_PARAFORMER_PORT:-7066}:7066
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7066/health"]
      interval: 10s
      timeout: 6s
      retries: 60
    # mount a host directory to cache models if needed; make sure it exists before starting the container
    # volumes:
    #   - /home/user/.cache:/home/user/.cache:rw
31 changes: 31 additions & 0 deletions comps/third_parties/funasr/src/Dockerfile
@@ -0,0 +1,31 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM python:3.11-slim

RUN useradd -m -s /bin/bash user && \
mkdir -p /home/user && \
chown -R user /home/user/

# Set environment variables
ENV LANG=en_US.UTF-8
ARG ARCH=cpu

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends --fix-missing \
curl \
ffmpeg

COPY --chown=user:user comps /home/user/comps

ARG uvpip='uv pip install --system --no-cache-dir'
RUN pip install --no-cache-dir --upgrade pip setuptools uv && \
$uvpip torch torchaudio --index-url https://download.pytorch.org/whl/cpu ; \
$uvpip -r /home/user/comps/third_parties/funasr/src/requirements-cpu.txt


ENV PYTHONPATH=$PYTHONPATH:/home/user
USER user
WORKDIR /home/user/comps/third_parties/funasr/src

ENTRYPOINT ["python", "funasr_server.py", "--device", "cpu"]
2 changes: 2 additions & 0 deletions comps/third_parties/funasr/src/__init__.py
@@ -0,0 +1,2 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0