Token summary table shows misleading column categories — split into 4 API token fields and rename "input" to "uncached" #604

@Trecek

Description

Problem

The token summary table appended to PRs (and shown in terminal/compact views) displays 3 token columns (input | output | cached) but the Claude API returns 4 mutually exclusive token categories. The current display is technically correct per API semantics but deeply misleading:

  • "input" column shows only input_tokens — the uncached delta after the last cache breakpoint (typically 30–58 tokens for cached sessions). Users expect "input" to mean total context consumed.
  • "cached" column silently sums cache_read_input_tokens + cache_creation_input_tokens together, hiding the cost distinction between cache reads (0.1x billing) and cache writes (1.25x billing).
  • No "total effective input" exists — readers cannot see the actual context size (input + cache_read + cache_creation).

Example from PR #598: input=152, output=86.9k, cached=6.0M — the "152" looks like a bug but is correct (152 uncached tokens across 5 pipeline steps). The real total context was ~6M tokens.
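The arithmetic behind that example can be sketched as follows. The four field names are the real API usage fields cited above; the split of the 6.0M "cached" figure into read vs. write portions is assumed here purely for illustration:

```python
# The four mutually exclusive token categories returned in the API "usage" block.
# The cache_read / cache_creation split below is illustrative, not from PR #598.
usage = {
    "input_tokens": 152,                     # uncached delta after the last cache breakpoint
    "output_tokens": 86_900,                 # generated tokens
    "cache_read_input_tokens": 5_900_000,    # cache hits, billed at 0.1x
    "cache_creation_input_tokens": 100_000,  # cache writes, billed at 1.25x
}

# "Total effective input" = everything the model actually read as context.
total_effective_input = (
    usage["input_tokens"]
    + usage["cache_read_input_tokens"]
    + usage["cache_creation_input_tokens"]
)
print(total_effective_input)  # ~6M, matching the PR #598 example
```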

Root Cause

The display/formatting layer collapses 4 API fields into 3 columns. The extraction and storage pipeline correctly preserves all 4 fields end-to-end — this is purely a formatting issue.

Affected Files

There are three independent implementations of the table format that all need updating:

  1. src/autoskillit/pipeline/telemetry_fmt.py — Canonical formatter

    • _TOKEN_COLUMNS tuple (line ~13-20): Terminal column definitions — 3 token columns
    • format_token_table() (line ~67): Hardcoded markdown header | Step | input | output | cached | count | time |
    • format_token_table() (line ~74-75): Collapses cached = cache_read + cache_creation
    • format_token_table_terminal() (line ~126-129): Same collapse for terminal output
    • format_compact_kv() (line ~179-181): Same collapse for compact key-value format
  2. src/autoskillit/hooks/token_summary_appender.py — Stdlib-only PostToolUse hook

    • _format_table() (line ~240-277): Independent inline copy of table generation
    • _humanize() (line ~87-97): Inline copy of number formatter
    • Must remain stdlib-only (no autoskillit imports)
  3. src/autoskillit/hooks/pretty_output.py — Stdlib-only PostToolUse hook

    • _fmt_get_token_summary() (line ~255-258): Compact format with same collapse
    • _fmt_run_skill() (line ~119): Only shows cache_read_input_tokens, silently drops cache_creation_input_tokens
    • Must remain stdlib-only (no autoskillit imports)
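The number formatter that both hooks inline is explicitly out of scope for this change; for context, a stdlib-only humanizer of the kind described might look like this (an illustrative sketch, not the actual `_humanize` implementation):

```python
def humanize(n: int) -> str:
    """Render a token count compactly: 152 -> '152', 86900 -> '86.9k', 6000000 -> '6.0M'."""
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    if n >= 1_000:
        return f"{n / 1_000:.1f}k"
    return str(n)
```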

Required Changes

1. Rename "input" to "uncached" across all three formatters

  • Markdown header: | Step | uncached | output | cache_read | cache_write | count | time |
  • Terminal columns: Update _TOKEN_COLUMNS labels
  • Compact KV: Update in: prefix to uncached: (or abbreviated uc:)

2. Split "cached" into "cache_read" and "cache_write" columns

  • Map cache_read_input_tokens → "cache_read" column
  • Map cache_creation_input_tokens → "cache_write" column
  • This matches what the pipeline-summary SKILL.md template already does (it defines a 6-column table with separate Cache Write and Cache Read columns)
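Taken together, changes 1 and 2 amount to a field-to-column mapping like the following (a sketch under the column names proposed above; the helper name is hypothetical, and the real formatters render rows from stored telemetry entries rather than raw dicts):

```python
def format_row(step: str, usage: dict, count: int, time_s: str) -> str:
    """Render one markdown row, keeping the four API token fields in separate columns."""
    cells = [
        step,
        str(usage["input_tokens"]),                  # -> "uncached" column
        str(usage["output_tokens"]),                 # -> "output" column
        str(usage["cache_read_input_tokens"]),       # -> "cache_read" column
        str(usage["cache_creation_input_tokens"]),   # -> "cache_write" column
        str(count),
        time_s,
    ]
    return "| " + " | ".join(cells) + " |"

HEADER = "| Step | uncached | output | cache_read | cache_write | count | time |"
```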

3. Update the golden snapshot test

  • tests/pipeline/test_telemetry_formatter.py::TestFormatTokenTable::test_snapshot has a hardcoded expected table string that must be updated for the new columns
  • Other tests in that file checking column presence may need updates

4. Update the hook parity test

  • tests/infra/test_pretty_output.py (test 1g, line ~1221) enforces parity between pretty_output._fmt_tokens and telemetry_fmt._humanize — may need updating if the format strings change

5. Ensure token_summary_appender test reflects new columns

  • tests/infra/test_token_summary_appender.py::test_tsa5_matching_sessions_formats_table_and_edits_pr verifies the PR body contains a token table — must match new column format
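As a hedged illustration of the kind of check the updated tests need to encode (the real tests compare full golden strings; this helper and header constant are assumptions for the sketch):

```python
NEW_HEADER = "| Step | uncached | output | cache_read | cache_write | count | time |"

def header_has_new_columns(header: str) -> bool:
    """True when the rename/split landed and the old column labels are gone."""
    wanted = ("uncached", "cache_read", "cache_write")
    stale = ("| input |", "| cached |")
    return all(c in header for c in wanted) and not any(s in header for s in stale)
```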

Design Notes

  • The pipeline-summary SKILL.md template already uses the correct 4-category column layout — this change aligns the programmatic formatters with that existing precedent
  • TokenEntry dataclass already has all 4 fields — no data model changes needed
  • DefaultTokenLog.record() already accumulates all 4 fields — no accumulation changes needed
  • extract_token_usage() already extracts all 4 fields — no extraction changes needed
  • The _humanize() / _fmt_tokens number formatting function is correct and unchanged
  • All three formatter files must remain consistent (canonical + two stdlib-only hook copies)
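For reference, the data model the notes describe might look like the following (an illustrative sketch of a TokenEntry-style dataclass; the field names mirror the API usage fields, but the exact class shape is assumed):

```python
from dataclasses import dataclass

@dataclass
class TokenEntry:
    # All four API usage categories are already preserved end-to-end;
    # only the display layer collapses them into three columns today.
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_input_tokens: int = 0
    cache_creation_input_tokens: int = 0
```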

Investigation Report

Full investigation: .autoskillit/temp/investigate/investigation_token_summary_pr598_2026-04-04_120000.md

Requirements

FMT — Token Table Column Formatting

  • REQ-FMT-001: The markdown token table must display 4 token columns (uncached, output, cache_read, cache_write) alongside the existing count and time columns.
  • REQ-FMT-002: The column formerly labeled "input" must be renamed to "uncached" and map exclusively to the input_tokens API field.
  • REQ-FMT-003: The column formerly labeled "cached" must be split into "cache_read" (mapping to cache_read_input_tokens) and "cache_write" (mapping to cache_creation_input_tokens).
  • REQ-FMT-004: The terminal-format token table (format_token_table_terminal) must use the same column structure as the markdown table.
  • REQ-FMT-005: The compact key-value format (format_compact_kv) must display all 4 token categories with distinct prefixes.
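A sketch of what the compact key-value line could look like with four distinct prefixes (the out:/cr:/cw: abbreviations are assumptions for illustration; the issue itself only suggests uc: for uncached):

```python
def format_compact_kv_sketch(usage: dict) -> str:
    """Compact one-line summary keeping all four token categories distinct."""
    return (
        f"uc:{usage['input_tokens']} "
        f"out:{usage['output_tokens']} "
        f"cr:{usage['cache_read_input_tokens']} "
        f"cw:{usage['cache_creation_input_tokens']}"
    )
```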

HOOK — Stdlib-Only Hook Consistency

  • REQ-HOOK-001: The token_summary_appender.py hook must produce the same column structure as the canonical telemetry_fmt.py formatter.
  • REQ-HOOK-002: The pretty_output.py hook must display all 4 token categories in both _fmt_get_token_summary and _fmt_run_skill output.
  • REQ-HOOK-003: Both hook implementations must remain stdlib-only (no autoskillit imports).

TEST — Test Suite Updates

  • REQ-TEST-001: The golden snapshot test in test_telemetry_formatter.py must be updated to assert the new column structure.
  • REQ-TEST-002: The hook parity test in test_pretty_output.py must continue to enforce formatting consistency between hook and canonical implementations.
  • REQ-TEST-003: The test_token_summary_appender.py integration test must validate the new column format in the PR body output.

Metadata

Assignees

No one assigned

Labels

  • enhancement (New feature or request)
  • recipe:implementation (Route: proceed directly to implementation)
  • staged (Implementation staged and waiting for promotion to main)
