Support Kokoro model #1653

Open
openvino-agent wants to merge 4 commits into huggingface:main from openvino-agent:support_kokoro

@openvino-agent openvino-agent commented Mar 27, 2026

What does this PR do?

Set up the environment:

pip install git+https://github.com/openvino-agent/optimum-intel.git@support_kokoro
pip install kokoro

Export the model:

optimum-cli export openvino -m hexgrad/Kokoro-82M ov_Kokoro-82M --trust-remote-code

Run inference:

import soundfile as sf
import torch
from optimum.intel import OVModelForTextToSpeechSeq2Seq

model_id = "hexgrad/Kokoro-82M"

# Load from the Hub and export to OpenVINO on the fly (call save_pretrained() to keep a local copy)
ov_pipe = OVModelForTextToSpeechSeq2Seq.from_pretrained(model_id, trust_remote_code=True)

# Preprocess text into model inputs (G2P + tokenization + voice loading)
inputs = ov_pipe.preprocess_input(
    text="Hello, this is a test of Kokoro text-to-speech using OpenVINO.",
    voice="af_heart",
    speed=1.0,
)

# Generate audio waveform
speech = ov_pipe.generate(**inputs)

sf.write("speech.wav", speech.numpy(), samplerate=24000)
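A quick way to sanity-check the saved file is to confirm its sample rate and duration. The sketch below is not part of this PR: it uses only the Python standard library and a synthetic 440 Hz tone as a stand-in for the generated waveform, and it assumes the output is mono float audio in [-1, 1] at 24 kHz, as the `sf.write` call above implies. The helper names `write_pcm16_wav` and `wav_info` are hypothetical.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 24000  # same value as the samplerate passed to sf.write above


def write_pcm16_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


def wav_info(path):
    """Return (sample_rate, n_frames, duration_seconds) for a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        n_frames = wav.getnframes()
    return rate, n_frames, n_frames / rate


# Stand-in for the model output (assumption): half a second of a 440 Hz tone.
tone = [
    0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE // 2)
]

path = os.path.join(tempfile.gettempdir(), "kokoro_check.wav")
write_pcm16_wav(path, tone)
rate, n_frames, duration = wav_info(path)
print(rate, n_frames, duration)  # 24000 12000 0.5
```

Calling `wav_info("speech.wav")` on the file written above should report a rate of 24000 if the export behaved as expected, provided `sf.write` produced 16-bit PCM, which is soundfile's default for WAV.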

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@openvino-agent openvino-agent marked this pull request as draft March 27, 2026 12:36
@openvino-agent openvino-agent marked this pull request as ready for review April 3, 2026 12:21