feat: summarization chunking #1004

Open
Ada-lave wants to merge 5 commits into thewh1teagle:main from Ada-lave:vs-summarization-chunking
Conversation


Ada-lave (Contributor) commented Mar 9, 2026

Summary

Long recordings (1h+) can produce 50–200K characters of transcript text, which exceeds the context window of most local models (e.g. Ollama llama3.2 with an 8K-token context fits roughly 10K characters of Russian). Previously the entire transcript was sent in a single LLM request with no length check, causing silent failures or truncated summaries.

This PR implements a Map-Reduce summarization strategy:

  • If the transcript fits within the configured limit → single request, zero change to existing behavior
  • If the transcript is too long → split into segment-boundary chunks, summarize each sequentially with rolling context (each chunk receives the previous partial summary as context), then synthesize all partial summaries into a single coherent result
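
The segment-boundary splitting step can be sketched as follows. This is an illustrative version, not the PR's actual code: `Segment` and `chunkBySegments` are hypothetical names, and the real implementation in `src/lib/llm/chunking.ts` may differ.

```typescript
// Hypothetical sketch: pack transcript segments into chunks of at most
// maxChars characters, never splitting inside a segment.
interface Segment {
  text: string;
}

function chunkBySegments(segments: Segment[], maxChars: number): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const seg of segments) {
    // Start a new chunk when adding this segment would exceed the limit.
    if (current.length > 0 && current.length + seg.text.length > maxChars) {
      chunks.push(current);
      current = "";
    }
    current += (current ? "\n" : "") + seg.text;
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Splitting on segment boundaries (rather than at a fixed character offset) keeps each chunk made of whole utterances, so no sentence is cut mid-way.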

Changes

  • src/lib/llm/chunking.ts (new) — core chunking logic: segment-boundary splitting, rolling-context chunk prompts, synthesis prompt, summarizeWithChunking() function with optional onProgress callback
  • src/lib/llm/index.ts — added maxInputChars?: number field to LlmConfig
  • src/lib/config.ts — added llmDefaultMaxInputChars = 24_000 (≈ 6000 tokens, safe for small local models)
  • src/pages/home/view-model.ts — replaced both llm.ask() call sites (auto-summarize + re-summarize) with summarizeWithChunking(); toast now shows live progress: "Summarizing part 2 of 4..." → "Merging summaries..."
  • src/pages/batch/view-model.tsx — same replacement for batch mode
  • src/components/params.tsx — added Max Input Characters input in LLM settings
  • locales/en-US/common.json, locales/ru-RU/common.json — added translation keys for the new setting and progress messages

How it works

Transcript (100K chars)
    ├─ ≤ 24K → single request  [unchanged behavior]
    └─ > 24K → chunk by segment boundaries
          ├─ Chunk 1 ─────────────────────► Partial 1
          ├─ Chunk 2 + [ctx: Partial 1] ──► Partial 2
          ├─ Chunk 3 + [ctx: Partial 2] ──► Partial 3
          └─ Synthesize [P1 + P2 + P3] ──► Final summary
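
The loop above can be sketched in TypeScript. This is a simplified illustration of the map-reduce flow under stated assumptions: `ask` stands in for the real LLM call, the prompt strings are placeholders, and the actual signature of summarizeWithChunking() in the PR may differ.

```typescript
// Sketch of sequential chunk summarization with rolling context:
// each chunk's prompt carries the previous partial summary, and a final
// synthesis request merges all partials into one result.
type Ask = (prompt: string) => Promise<string>;

async function summarizeWithChunkingSketch(
  chunks: string[],
  ask: Ask,
  onProgress?: (done: number, total: number) => void,
): Promise<string> {
  // Single chunk: one request, unchanged behavior.
  if (chunks.length === 1) return ask(`Summarize:\n${chunks[0]}`);

  const partials: string[] = [];
  let prevSummary = "";
  for (let i = 0; i < chunks.length; i++) {
    onProgress?.(i + 1, chunks.length); // e.g. "Summarizing part 2 of 4..."
    const ctx = prevSummary ? `Context so far:\n${prevSummary}\n\n` : "";
    prevSummary = await ask(`${ctx}Summarize this part:\n${chunks[i]}`);
    partials.push(prevSummary);
  }

  // Reduce step: synthesize all partial summaries into one coherent result.
  return ask(`Merge these partial summaries into one:\n${partials.join("\n---\n")}`);
}
```

Note that the chunks are processed sequentially, not in parallel: the rolling context requires each partial summary before the next chunk's prompt can be built.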

Configuration

Default maxInputChars = 24000. For models with an 8K context window, recommended values are 10000–12000 for Russian and 16000–18000 for English.
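
For reference, the config addition might look like this. The `maxInputChars` field name and the 24_000 default come from the PR description; the other fields of LlmConfig are illustrative assumptions, not the repo's actual shape.

```typescript
// Illustrative sketch of the LlmConfig extension (src/lib/llm/index.ts).
// Only maxInputChars is from this PR; model/baseUrl are assumed fields.
interface LlmConfig {
  model: string;
  baseUrl?: string;
  maxInputChars?: number; // falls back to llmDefaultMaxInputChars = 24_000
}

const llmDefaultMaxInputChars = 24_000;

// Example: tightening the limit for a small local model on Russian text.
const config: LlmConfig = {
  model: "llama3.2",
  maxInputChars: 12_000,
};

const effectiveLimit = config.maxInputChars ?? llmDefaultMaxInputChars;
```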

Closes #999

@Ada-lave Ada-lave changed the title Summarization chunking feat: summarization chunking Mar 12, 2026


Linked issue: feat: Add a chunking to summarize the text