Add integration test for full pipeline with LLM cache fixtures #16
Merged
nicpottier merged 1 commit into main on Feb 11, 2026
Refactor cli.ts to extract core pipeline orchestration into a reusable runPipeline() function in pipeline.ts. Both the CLI and integration tests call this same function, eliminating pipeline logic duplication. The integration test runs the full pipeline against raven.pdf (pages 1-3) using pre-populated LLM cache fixtures for reproducible, fast execution (~5s with no API calls). Cache files are git-tracked; regenerate by deleting fixtures/raven-cache/ and rerunning with OPENAI_API_KEY set. All 272 tests pass; integration test validates complete pipeline: PDF extraction, metadata extraction, text classification, image classification, page sectioning, and web rendering.
Summary
Refactored the CLI to extract core pipeline orchestration into a reusable runPipeline() function, eliminating logic duplication between the CLI and tests. The CLI is now a thin wrapper that translates progress events to terminal UI.
Created an integration test that runs the full pipeline against raven.pdf (pages 1-3) with pre-populated LLM cache fixtures for reproducible, fast execution (~5 seconds with no API calls).
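The split described above, one orchestration function that reports progress through events and a CLI that only renders them, might look roughly like the sketch below. Only the names runPipeline() and RunPipelineOptions come from this PR; the option fields, the ProgressEvent shape, the step names, and cliMain are assumptions made for illustration, and the step bodies are stubs.

```typescript
// Hypothetical shapes: the PR names runPipeline() and RunPipelineOptions,
// but every field, event, and step name below is an illustrative assumption.
type ProgressEvent =
  | { type: "step-start"; step: string }
  | { type: "step-done"; step: string };

interface RunPipelineOptions {
  pdfPath: string;
  startPage: number;
  endPage: number;
  cacheDir: string;
  // Callers decide how to present progress; the pipeline only emits events.
  onProgress?: (event: ProgressEvent) => void;
}

interface PipelineResult {
  completedSteps: string[];
}

// Step labels loosely follow the stages listed in the PR description.
const STEPS: string[] = [
  "pdf-extraction",
  "metadata-extraction",
  "text-classification",
  "image-classification",
  "page-sectioning",
  "web-rendering",
];

async function runPipeline(options: RunPipelineOptions): Promise<PipelineResult> {
  const completedSteps: string[] = [];
  for (const step of STEPS) {
    options.onProgress?.({ type: "step-start", step });
    // Stub: a real step would read the PDF or call the (cached) LLM here.
    await Promise.resolve();
    completedSteps.push(step);
    options.onProgress?.({ type: "step-done", step });
  }
  return { completedSteps };
}

// CLI side (hypothetical): a thin wrapper that turns events into terminal output.
async function cliMain(pdfPath: string): Promise<PipelineResult> {
  return runPipeline({
    pdfPath,
    startPage: 1,
    endPage: 3,
    cacheDir: ".cache",
    onProgress: (event) => {
      if (event.type === "step-done") {
        console.log(`done: ${event.step}`);
      }
    },
  });
}
```

An integration test can call the same runPipeline() with its own onProgress callback (or none) and assert on the result, so nothing in the test depends on terminal rendering.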
Details
runPipeline() and RunPipelineOptions interface
runPipeline and RunPipelineOptions
Cache regeneration is simple: delete fixtures/raven-cache/ and rerun the test with OPENAI_API_KEY set.
Tests
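All 272 tests pass. The integration test confirms all pipeline steps complete successfully: PDF extraction, metadata extraction, text classification, image classification, page sectioning, and web rendering.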
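The reason pre-populated cache fixtures make the test fast and reproducible can be illustrated with a small replay cache. This is a sketch under assumptions, not the project's cache: CachedLlm, its fields, and the use of the prompt text as the cache key are hypothetical, the cache is an in-memory Map rather than files under fixtures/raven-cache/, and the model call is synchronous to keep the example short.

```typescript
// Hypothetical replay cache. The PR only states that LLM responses are
// cached in git-tracked fixtures; this class and its API are assumptions.
type ModelCall = (prompt: string) => string;

class CachedLlm {
  apiCalls = 0;
  private readonly cache: Map<string, string>;
  private readonly callModel: ModelCall | undefined;

  // callModel is undefined when no API access is available (e.g. no API key).
  constructor(cache: Map<string, string>, callModel: ModelCall | undefined) {
    this.cache = cache;
    this.callModel = callModel;
  }

  complete(prompt: string): string {
    const hit = this.cache.get(prompt);
    if (hit !== undefined) {
      return hit; // Replay: no network call, same answer every run.
    }
    if (this.callModel === undefined) {
      // Failing loudly on a miss keeps the test from silently depending on the API.
      throw new Error(`cache miss with no API access: ${prompt}`);
    }
    this.apiCalls += 1;
    const response = this.callModel(prompt);
    this.cache.set(prompt, response); // Regeneration: record for later runs.
    return response;
  }
}
```

With fixtures present, every lookup is a hit and no API call is made. Starting from an empty cache with a model call available corresponds to the regeneration path: each distinct prompt is sent once and recorded.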