
Commit f4f4da2

add AudioQnA readme with supported model (#689)
* add readme with supported model
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* add explanation

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 1e47444 commit f4f4da2


1 file changed: +34 -0 lines changed


AudioQnA/README.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
# AudioQnA Application

AudioQnA is an example that demonstrates how to integrate Generative AI (GenAI) models to perform question answering (QnA) on audio input and return spoken responses. It converts the audio input to text using Automatic Speech Recognition (ASR), generates an answer to the user's query with a language model, and then converts that answer back to speech using Text-to-Speech (TTS).
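
The three stages described above can be sketched conceptually as follows. This is only an illustration of the data flow, not the AudioQnA service API: the function `answer_audio_question` and the stand-in stages `fake_asr`, `fake_llm`, and `fake_tts` are hypothetical names introduced here so the sketch runs without any model or service.

```python
from typing import Callable


def answer_audio_question(
    audio: bytes,
    asr: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """Chain the stages in order: speech -> text -> answer -> speech."""
    question = asr(audio)   # ASR: transcribe the spoken question
    answer = llm(question)  # LLM: generate a text answer
    return tts(answer)      # TTS: synthesize the answer as audio


# Hypothetical stand-ins (assumptions for illustration only).
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(question: str) -> str:
    return f"You asked: {question}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


reply = answer_audio_question(b"hello", fake_asr, fake_llm, fake_tts)
```

In a real deployment, each stage would presumably be a call to the corresponding deployed service rather than a local function.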

## Deploy AudioQnA Service

The AudioQnA service can be deployed on either Intel Gaudi2 or Intel Xeon Scalable processors.

### Deploy AudioQnA on Gaudi

Refer to the [Gaudi Guide](./docker/gaudi/README.md) for instructions on deploying AudioQnA on Gaudi.

### Deploy AudioQnA on Xeon

Refer to the [Xeon Guide](./docker/xeon/README.md) for instructions on deploying AudioQnA on Xeon.

## Supported Models

### ASR

The default model is [openai/whisper-small](https://huggingface.co/openai/whisper-small). All other models in the Whisper family are also supported, such as `openai/whisper-large-v3`, `openai/whisper-medium`, `openai/whisper-base`, and `openai/whisper-tiny`.

To use a different model, edit `compose.yaml` and add a `command` line to the `whisper-service` definition that passes the name of the model you want to use:

```yml
services:
  whisper-service:
    ...
    command: --model_name_or_path openai/whisper-tiny
```

### TTS

The default model is [microsoft/SpeechT5](https://huggingface.co/microsoft/speecht5_tts). Replacing the TTS model is not currently supported. More models under a commercial license will be added in the future.
