feat: summarization chunking #1004

Open
Ada-lave wants to merge 5 commits into thewh1teagle:main from Ada-lave:vs-summarization-chunking
Conversation


Ada-lave (Contributor) commented Mar 9, 2026

Summary

Long recordings (1h+) can produce 50–200K characters of transcript text, which exceeds the context window of most local models (e.g. Ollama llama3.2 with an 8K-token context fits roughly 10K characters of Russian). Previously the entire transcript was sent in a single LLM request with no length check, causing silent failures or truncated summaries.

This PR implements a Map-Reduce summarization strategy:

  • If the transcript fits within the configured limit → single request, zero change to existing behavior
  • If the transcript is too long → split into segment-boundary chunks, summarize each sequentially with rolling context (each chunk receives the previous partial summary as context), then synthesize all partial summaries into a single coherent result
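
The segment-boundary splitting step can be sketched as follows. This is an illustrative version, not the PR's actual code: `Segment` and `chunkBySegments` are hypothetical names, and the real implementation in `src/lib/llm/chunking.ts` may differ.

```typescript
// Hypothetical sketch: pack transcript segments into chunks of at most
// maxChars characters, never splitting inside a segment.
interface Segment {
  text: string;
}

function chunkBySegments(segments: Segment[], maxChars: number): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const seg of segments) {
    // Start a new chunk when adding this segment would exceed the limit.
    if (current.length > 0 && current.length + seg.text.length > maxChars) {
      chunks.push(current);
      current = "";
    }
    current += (current ? "\n" : "") + seg.text;
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Splitting on segment boundaries (rather than at a fixed character offset) keeps each chunk made of whole utterances, so no sentence is cut mid-way.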

Changes

  • src/lib/llm/chunking.ts (new) — core chunking logic: segment-boundary splitting, rolling-context chunk prompts, synthesis prompt, summarizeWithChunking() function with optional onProgress callback
  • src/lib/llm/index.ts — added maxInputChars?: number field to LlmConfig
  • src/lib/config.ts — added llmDefaultMaxInputChars = 24_000 (≈ 6000 tokens, safe for small local models)
  • src/pages/home/view-model.ts — replaced both llm.ask() call sites (auto-summarize + re-summarize) with summarizeWithChunking(); toast now shows live progress: "Summarizing part 2 of 4..." → "Merging summaries..."
  • src/pages/batch/view-model.tsx — same replacement for batch mode
  • src/components/params.tsx — added Max Input Characters input in LLM settings
  • locales/en-US/common.json, locales/ru-RU/common.json — added translation keys for the new setting and progress messages

How it works

Transcript (100K chars)
    ├─ ≤ 24K → single request  [unchanged behavior]
    └─ > 24K → chunk by segment boundaries
          ├─ Chunk 1 ─────────────────────► Partial 1
          ├─ Chunk 2 + [ctx: Partial 1] ──► Partial 2
          ├─ Chunk 3 + [ctx: Partial 2] ──► Partial 3
          └─ Synthesize [P1 + P2 + P3] ──► Final summary
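
The loop above can be sketched in TypeScript. This is a simplified illustration of the map-reduce flow under stated assumptions: `ask` stands in for the real LLM call, the prompt strings are placeholders, and the actual signature of summarizeWithChunking() in the PR may differ.

```typescript
// Sketch of sequential chunk summarization with rolling context:
// each chunk's prompt carries the previous partial summary, and a final
// synthesis request merges all partials into one result.
type Ask = (prompt: string) => Promise<string>;

async function summarizeWithChunkingSketch(
  chunks: string[],
  ask: Ask,
  onProgress?: (done: number, total: number) => void,
): Promise<string> {
  // Single chunk: one request, unchanged behavior.
  if (chunks.length === 1) return ask(`Summarize:\n${chunks[0]}`);

  const partials: string[] = [];
  let prevSummary = "";
  for (let i = 0; i < chunks.length; i++) {
    onProgress?.(i + 1, chunks.length); // e.g. "Summarizing part 2 of 4..."
    const ctx = prevSummary ? `Context so far:\n${prevSummary}\n\n` : "";
    prevSummary = await ask(`${ctx}Summarize this part:\n${chunks[i]}`);
    partials.push(prevSummary);
  }

  // Reduce step: synthesize all partial summaries into one coherent result.
  return ask(`Merge these partial summaries into one:\n${partials.join("\n---\n")}`);
}
```

Note that the chunks are processed sequentially, not in parallel: the rolling context requires each partial summary before the next chunk's prompt can be built.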

Configuration

Default maxInputChars = 24000. For models with an 8K context window, recommended values are 10000–12000 for Russian and 16000–18000 for English.
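
For reference, the config addition might look like this. The `maxInputChars` field name and the 24_000 default come from the PR description; the other fields of LlmConfig are illustrative assumptions, not the repo's actual shape.

```typescript
// Illustrative sketch of the LlmConfig extension (src/lib/llm/index.ts).
// Only maxInputChars is from this PR; model/baseUrl are assumed fields.
interface LlmConfig {
  model: string;
  baseUrl?: string;
  maxInputChars?: number; // falls back to llmDefaultMaxInputChars = 24_000
}

const llmDefaultMaxInputChars = 24_000;

// Example: tightening the limit for a small local model on Russian text.
const config: LlmConfig = {
  model: "llama3.2",
  maxInputChars: 12_000,
};

const effectiveLimit = config.maxInputChars ?? llmDefaultMaxInputChars;
```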

Closes #999

@Ada-lave Ada-lave changed the title Summarization chunking feat: summarization chunking Mar 12, 2026


Linked issue: feat: Add a chunking to summarize the text