@simllll simllll commented Nov 11, 2025

Description

Fixes #819 (without the blingfire implementation).

Changes Made

  • Added an auto_mode flag (it defaults to false here, although in Python it defaults to true)
  • Added checks for the SentenceTokenizer
  • Added types for the SentenceTokenizer in the ElevenLabs implementation
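
To illustrate the kind of check described in the list above, here is a minimal sketch. It is not the plugin's actual code: the `Tokenizer` shape, `resolveOptions`, and the warning text are hypothetical names made up for this example, and the assumption that auto mode falls back to a sentence tokenizer when none is given is mine.

```typescript
// Hypothetical, simplified types for illustration only; not the plugin's real API.
type TokenizerKind = 'word' | 'sentence';

interface Tokenizer {
  kind: TokenizerKind;
  tokenize(text: string): string[];
}

const wordTokenizer: Tokenizer = {
  kind: 'word',
  tokenize: (text) => text.split(/\s+/).filter((t) => t.length > 0),
};

const sentenceTokenizer: Tokenizer = {
  kind: 'sentence',
  // Naive split on sentence-ending punctuation; a real tokenizer is more careful.
  tokenize: (text) =>
    (text.match(/[^.!?]+[.!?]*/g) ?? [])
      .map((s) => s.trim())
      .filter((s) => s.length > 0),
};

interface ResolvedOptions {
  autoMode: boolean;
  tokenizer: Tokenizer;
  warnings: string[];
}

function resolveOptions(
  opts: { autoMode?: boolean; tokenizer?: Tokenizer } = {},
): ResolvedOptions {
  // Default mirrors the PR description: auto mode is off unless requested.
  const autoMode = opts.autoMode ?? false;
  const warnings: string[] = [];

  // Assumption: with auto mode on, full sentences are the sensible default unit.
  const tokenizer = opts.tokenizer ?? (autoMode ? sentenceTokenizer : wordTokenizer);

  // The "check for SentenceTokenizer": warn when auto mode gets word-level chunks.
  if (autoMode && tokenizer.kind !== 'sentence') {
    warnings.push('autoMode expects full sentences; consider a sentence tokenizer');
  }
  return { autoMode, tokenizer, warnings };
}
```

Collecting warnings in the return value rather than logging keeps the sketch free of any logger dependency; a real implementation would more likely log the warning directly.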

Pre-Review Checklist

  • Build passes: All builds (lint, typecheck, tests) pass locally
  • AI-generated code reviewed: Removed unnecessary comments and ensured code quality
  • Changes explained: All changes are properly documented and justified above
  • Scope appropriate: All changes relate to the PR title, or explanations provided for why they're included

Testing

  • Tested manually on a local setup

changeset-bot bot commented Nov 11, 2025

🦋 Changeset detected

Latest commit: 19a21a4

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name | Type
@livekit/agents-plugin-elevenlabs | Patch


@simllll simllll marked this pull request as draft November 11, 2025 14:15

simllll commented Nov 11, 2025

WARN (285528): skipping user input, current speech generation cannot be interrupted when using the "flush" parameter.

I tried mirroring the Python implementation, so I also added the flush parameter when using a sentence tokenizer with auto mode. However, adding "flush" breaks it somehow. I guess I'm missing something that the Python version has and the TypeScript version doesn't.
I'm removing the flush parameter again for now; without it, everything seems to work fine.
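
For readers unfamiliar with the parameter, the following is a rough sketch of what "adding flush" to an outgoing text chunk could look like. `buildTextMessage` is a hypothetical helper, not code from this PR. The `text` and `flush` fields and the trailing-space convention follow my reading of the ElevenLabs WebSocket input-streaming documentation and should be verified against it.

```typescript
// Shape of a single text chunk sent over the TTS WebSocket (assumed, see lead-in).
interface TextMessage {
  text: string;
  flush?: boolean;
}

// Hypothetical helper: serialize one chunk, optionally asking the server to
// generate audio for everything buffered so far.
function buildTextMessage(chunk: string, flush = false): string {
  // Chunks are sent with a trailing space so consecutive chunks do not run together.
  const message: TextMessage = { text: chunk.endsWith(' ') ? chunk : `${chunk} ` };
  if (flush) {
    // Only include the field when requested, so the default payload is unchanged.
    message.flush = true;
  }
  return JSON.stringify(message);
}
```

Leaving the field out entirely when `flush` is false matches the state described above, where the parameter was removed again.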

@simllll simllll marked this pull request as ready for review November 11, 2025 14:21

simllll commented Nov 11, 2025

Follow-up on "flush: true".

The reason this actually broke things is that the LiveKit implementation currently does not use the multi-stream input API; it starts a new WebSocket on each turn.

The Python version uses this: https://elevenlabs.io/docs/api-reference/text-to-speech/v-1-text-to-speech-voice-id-multi-stream-input, and there we also need the flush parameter. I made a quick implementation; it seems to be portable from Python but needs more work.
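
As a rough illustration of the difference, here is a sketch of one long-lived connection carrying several turns, each in its own context, instead of one WebSocket per turn. This is not the quick implementation mentioned above. `MultiContextSession` and its methods are hypothetical, and the `context_id`, `flush`, and `close_context` field names are assumptions based on my reading of the linked multi-stream-input documentation. The `send` callback stands in for a real socket so the sketch opens no connection.

```typescript
type Outgoing = Record<string, unknown>;

// Hypothetical wrapper around a single shared connection.
class MultiContextSession {
  private nextId = 0;
  private readonly send: (payload: string) => void;

  constructor(send: (payload: string) => void) {
    this.send = send;
  }

  // Begin a new turn on the existing connection and return its context id.
  startContext(): string {
    const contextId = `ctx_${this.nextId++}`;
    // Assumed convention: a single space initializes the context.
    this.emit({ text: ' ', context_id: contextId });
    return contextId;
  }

  // Send one sentence for a given turn; flush forces generation of buffered text.
  pushText(contextId: string, sentence: string, flush = false): void {
    const message: Outgoing = { text: `${sentence} `, context_id: contextId };
    if (flush) message.flush = true;
    this.emit(message);
  }

  // End a turn without closing the shared connection.
  closeContext(contextId: string): void {
    this.emit({ context_id: contextId, close_context: true });
  }

  private emit(message: Outgoing): void {
    this.send(JSON.stringify(message));
  }
}

// Usage: two turns share the same session; only the context id changes.
const outbox: string[] = [];
const session = new MultiContextSession((payload) => outbox.push(payload));
const first = session.startContext();
session.pushText(first, 'Hello there.', true);
session.closeContext(first);
const second = session.startContext();
session.pushText(second, 'How can I help?', true);
```

In this shape, flush applies to one context on a connection that stays open, which is the situation the Python version is described as handling.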

@toubatbrian toubatbrian left a comment

Looks good! The inline comments are really nice.

@toubatbrian

> the python version uses this: https://elevenlabs.io/docs/api-reference/text-to-speech/v-1-text-to-speech-voice-id-multi-stream-input, there we also need the flush parameter. I made a quick implementation, it seems to be portable from python but needs more work.

Thanks for digging into this! Let me know if you need help!

@toubatbrian toubatbrian merged commit daadcb4 into livekit:main Nov 12, 2025
5 checks passed
@github-actions github-actions bot mentioned this pull request Nov 12, 2025

Development

Successfully merging this pull request may close these issues.

elevenlabs autoMode and SentenceTokenizer

2 participants