Add activity rendering and direct PDF raster extraction by nicpottier · Pull Request #20 · unicef/adt-studio

nicpottier · 2026-02-12T03:30:33Z

Summary

- Activity rendering: Add LLM-driven rendering for 7 interactive activity types (multiple choice, fill-in-the-blank, fill-in-a-table, matching, sorting, true/false, open-ended) with Liquid prompt templates and automatic answer generation via a second LLM call
- PDF raster extraction: Replace SVG-based raster image extraction with direct extraction from PDF XObject dictionaries via mupdf, including recursion into Form XObjects for nested images, CMYK JPEG color conversion, and deduplication by object number
- Schema fixes: Change activity answers LLM schema from z.record() to array of {id, value} for OpenAI structured output compatibility; add config validation for answer_prompt only on activity render types
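The recursion and deduplication described for PDF raster extraction can be sketched as follows. This is only an illustration over a hypothetical in-memory model (`XObject`, `collectImages`, and the sample object numbers are all made up for this example); it does not use the mupdf API and is not the code from this PR.

```typescript
// Hypothetical in-memory model of PDF XObjects; NOT the mupdf API.
type XObject =
  | { kind: "image"; objNum: number; colorSpace: "rgb" | "cmyk" | "gray" }
  | { kind: "form"; objNum: number; children: XObject[] };

interface FoundImage {
  objNum: number;
  colorSpace: string;
}

// Walk the XObjects of a page, recursing into Form XObjects, and return
// each image once, keyed by its PDF object number.
function collectImages(roots: XObject[]): FoundImage[] {
  const seen = new Set<number>();
  const found: FoundImage[] = [];

  const visit = (obj: XObject): void => {
    // Skip anything already visited: dedupes shared images and also
    // protects against a Form XObject that references itself.
    if (seen.has(obj.objNum)) return;
    seen.add(obj.objNum);

    if (obj.kind === "image") {
      found.push({ objNum: obj.objNum, colorSpace: obj.colorSpace });
    } else {
      obj.children.forEach((child) => visit(child));
    }
  };

  roots.forEach((root) => visit(root));
  return found;
}

// Example: the same image (object 12) is used directly on the page and
// again inside a Form XObject, next to a nested image (object 31).
const logo: XObject = { kind: "image", objNum: 12, colorSpace: "cmyk" };
const pageXObjects: XObject[] = [
  logo,
  {
    kind: "form",
    objNum: 20,
    children: [logo, { kind: "image", objNum: 31, colorSpace: "rgb" }],
  },
];

const images = collectImages(pageXObjects);
console.log(images.map((img) => img.objNum)); // [ 12, 31 ]
```

Keying on the object number rather than on pixel content keeps the check cheap, and images flagged as CMYK could then be routed to a color conversion step before storage.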

Test plan

- All 281 tests pass (pnpm test)
- TypeScript strict mode passes (pnpm typecheck)
- Manual verification: raven.pdf images extract correctly
- Manual verification: ancient_egypt.pdf CMYK JPEGs display with correct colors
- Manual verification: ancient_egypt pages 6-7 extract nested Form XObject images
- Manual verification: cuaderno3.pdf extracts correctly

…jects

Activity rendering:
- Add Liquid prompt templates for 7 activity types with answer variants
- Extend config with activity render strategies and per-section model resolution
- Add HTML validation support for activity-generated IDs
- Update web-rendering types for activity sections

PDF raster extraction:
- Replace SVG-based raster image extraction with direct PDF object extraction
- Recurse into Form XObjects to find nested images
- Handle CMYK JPEG color conversion via mupdf
- Deduplicate images by PDF object number
- Support JPEG format throughout storage and API layers

OpenAI structured outputs don't support z.record() (additionalProperties). Change activityAnswersLLMSchema from record to array of {id, value} objects, and convert back to record in render-llm.ts for storage. Update all answer prompt templates to show the new array format.
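A minimal sketch of the array-to-record round trip described above, written in plain TypeScript without zod so it runs on its own. The names `AnswerEntry`, `answersToRecord`, and `recordToAnswers` are hypothetical, and the assumption that answer values are strings is mine; the actual `activityAnswersLLMSchema` and the conversion in render-llm.ts may differ.

```typescript
// Shape the LLM is asked to return: an array of {id, value} pairs,
// which avoids open-ended object keys (additionalProperties).
// Assumption: values are strings; the real schema may allow other types.
interface AnswerEntry {
  id: string;
  value: string;
}

// Convert the LLM-facing array back to a record for storage.
// If an id appears more than once, the last entry wins.
function answersToRecord(entries: AnswerEntry[]): Record<string, string> {
  const record: Record<string, string> = {};
  for (const { id, value } of entries) {
    record[id] = value;
  }
  return record;
}

// Inverse direction, e.g. for showing stored answers in a prompt example.
function recordToAnswers(record: Record<string, string>): AnswerEntry[] {
  return Object.entries(record).map(([id, value]) => ({ id, value }));
}

const llmOutput: AnswerEntry[] = [
  { id: "q1", value: "true" },
  { id: "q2", value: "false" },
];

const stored = answersToRecord(llmOutput);
console.log(stored); // { q1: 'true', q2: 'false' }
```

Keeping the record as the storage format means only the LLM boundary has to know about the array shape.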

…e prompt

- Handle .jpeg extension (not just .jpg) in image serving route
- Add superRefine validation: answer_prompt only allowed for activity render type
- Fix true/false prompt: radio values use "true"/"false", shared data-activity-item per question
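The first two fixes above can be illustrated with a small dependency-free sketch. It assumes a config object with a `type` field and an optional `answer_prompt` field; the field name `type`, the helper names, and the returned messages are hypothetical. In the PR the rule is attached through zod's `superRefine`, which is not reproduced here.

```typescript
// Hypothetical config shape: only `answer_prompt` and the "activity"
// render type come from the PR text; everything else is assumed.
interface RenderConfig {
  type: string;
  answer_prompt?: string;
}

// Return a list of validation issues (empty when the config is valid).
// Rule: answer_prompt may only be set when the render type is "activity".
function validateRenderConfig(cfg: RenderConfig): string[] {
  const issues: string[] = [];
  if (cfg.answer_prompt !== undefined && cfg.type !== "activity") {
    issues.push(
      `answer_prompt is only allowed for the "activity" render type (got "${cfg.type}")`
    );
  }
  return issues;
}

// Map an image filename to a content type, accepting both .jpg and .jpeg.
function imageContentType(filename: string): string | undefined {
  const dot = filename.lastIndexOf(".");
  if (dot < 0) return undefined; // no extension at all
  const ext = filename.slice(dot + 1).toLowerCase();
  if (ext === "jpg" || ext === "jpeg") return "image/jpeg";
  if (ext === "png") return "image/png";
  return undefined;
}

console.log(validateRenderConfig({ type: "activity", answer_prompt: "answers" })); // []
console.log(imageContentType("page-6-img-1.jpeg")); // image/jpeg
```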

Resolve conflicts in config.ts (BookFormat/LayoutType from main + activity RenderType from branch) and books.ts (new routes from main + updated image serving from branch). Fix books.test.ts to use new buffer/format image fields.

nicpottier added 5 commits February 11, 2026 17:47

Merge main into nicpottier/activity-rendering

eb88c15

Regenerate pipeline integration test LLM cache fixtures

5cda993

nicpottier merged commit 90222be into main Feb 12, 2026
1 check passed

nicpottier deleted the nicpottier/activity-rendering branch February 12, 2026 03:41

nicpottier merged 5 commits into main from nicpottier/activity-rendering
