Run
pip install -r requirements.txt
Python bindings for whisper.cpp with a simple Pythonic API on top of it.
- Install ffmpeg
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg
# on Arch Linux
sudo pacman -S ffmpeg
# on MacOS using Homebrew (https://brew.sh/)
brew install ffmpeg
# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg
# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg

- Once ffmpeg is installed, install pywhispercpp
pip install pywhispercpp

If you want to use the examples, you will need to install extra dependencies
pip install pywhispercpp[examples]

Or install the latest dev version from GitHub
pip install git+https://github.com/abdeladim-s/pywhispercpp

from pywhispercpp.model import Model
model = Model('base.en', n_threads=6)
segments = model.transcribe('file.mp3', speed_up=True)
for segment in segments:
    print(segment.text)

You can also assign a custom new_segment_callback:
from pywhispercpp.model import Model
model = Model('base.en', print_realtime=False, print_progress=False)
segments = model.transcribe('file.mp3', new_segment_callback=print)

- The ggml model will be downloaded automatically.
- You can pass any whisper.cpp parameter as a keyword argument to the Model class or to the transcribe function.
- The transcribe function accepts any media file (audio/video), in any format.
- Check the Model class documentation for more details.
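
The new_segment_callback shown above does not have to be print; any callable that accepts a segment should work. The sketch below collects segments into timestamped lines. It assumes each segment exposes t0, t1 (in units of 10 ms, as whisper.cpp reports them) and text; SegmentCollector and transcribe_with_collector are illustrative names, not part of pywhispercpp.

```python
from types import SimpleNamespace


class SegmentCollector:
    """Callable that stores each new segment as a timestamped line."""

    def __init__(self):
        self.lines = []

    def __call__(self, segment):
        # Assumption: t0 and t1 are in 10 ms units, so divide by 100 for seconds.
        start = segment.t0 / 100.0
        end = segment.t1 / 100.0
        self.lines.append(f"[{start:.2f}s -> {end:.2f}s] {segment.text.strip()}")


def transcribe_with_collector(path):
    """Hypothetical helper: run a transcription and return the collected lines."""
    # Imported here so the rest of the sketch runs without pywhispercpp installed.
    from pywhispercpp.model import Model

    collector = SegmentCollector()
    model = Model('base.en', print_realtime=False, print_progress=False)
    model.transcribe(path, new_segment_callback=collector)
    return collector.lines


# Try the callback with a stand-in segment, no model required.
collector = SegmentCollector()
collector(SimpleNamespace(t0=0, t1=250, text=" Hello world."))
print(collector.lines[0])
```

Using a callable object rather than a bare function keeps the collected lines attached to the callback, which is convenient when the result has to be shown in a UI afterwards.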
- If you encounter any issue with gr.Audio, uncomment the commented line and comment out the existing one.
- The existing works with gradio
- transcribe: This function takes the audio input, resamples it using the resample_to_16k function, and saves it to a temporary .wav file, which is deleted afterwards.
- resample_to_16k: This function resamples the audio to a 16 kHz sample rate.
- Main GitHub: https://github.com/abdeladim-s/pywhispercpp/
- Documentation: https://abdeladim-s.github.io/pywhispercpp/
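
The transcribe and resample_to_16k helpers described above can be approximated with the standard library alone. This is a minimal sketch, not the app's actual code: it assumes mono audio given as a list of floats in [-1.0, 1.0], uses plain linear interpolation instead of a proper resampling filter, and takes a hypothetical run_model callable in place of the real model call. save_temp_wav is an illustrative name.

```python
import os
import struct
import tempfile
import wave

TARGET_RATE = 16000  # whisper.cpp expects 16 kHz audio


def resample_to_16k(samples, orig_rate):
    """Linearly interpolate mono float samples to 16 kHz (sketch only)."""
    if orig_rate == TARGET_RATE or not samples:
        return list(samples)
    n_out = int(len(samples) * TARGET_RATE / orig_rate)
    step = orig_rate / TARGET_RATE
    out = []
    for i in range(n_out):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def save_temp_wav(samples, rate=TARGET_RATE):
    """Write mono 16-bit PCM to a temporary .wav file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    frames = struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(frames)
    return path


def transcribe(samples, orig_rate, run_model):
    """Resample, save to a temp file, run the model, then delete the file."""
    path = save_temp_wav(resample_to_16k(samples, orig_rate))
    try:
        # run_model is a placeholder, e.g. lambda p: model.transcribe(p)
        return run_model(path)
    finally:
        os.remove(path)
```

Deleting the file in a finally block means the temporary .wav is cleaned up even if the model call raises.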