Code2Video: A Code-centric Paradigm for Educational Video Generation
Yanzhe Chen*, Kevin Qinghong Lin*, Mike Zheng Shou
Show Lab @ National University of Singapore
📄 Paper | 🤗 Daily Paper | 🤗 Dataset | 🌐 Project Website | 💬 X (Twitter)
code2video_light.mp4
| Learning Topic | Veo3 | Wan2.2 | Code2Video (Ours) |
|---|---|---|---|
| Hanoi Problem | | | |
| Large Language Model | | | |
| Pure Fourier Series | | | |
Any contributions are welcome!
- [2025.11.25] Our Code2Video has reached 1000 stars!
- [2025.11.06] We optimized `requirements.txt`, reducing installation time by 80-90%. Thanks to daxiongshu!
- [2025.10.11] Due to issues with IconFinder, we have updated the Code2Video auto-collected icons at MMMC as a temporary alternative.
- [2025.10.06] We updated the ground-truth human-made videos and metadata for the MMMC dataset.
- [2025.10.03] Thanks @_akhaliq for sharing our work on Twitter!
- [2025.10.02] We released the arXiv paper, code, and dataset.
- [2025.09.22] Code2Video was accepted to the Deep Learning for Code (DL4C) Workshop at NeurIPS 2025.
Code2Video is an agentic, code-centric framework that generates high-quality educational videos from knowledge points.
Unlike pixel-based text-to-video models, our approach leverages executable Manim code to ensure clarity, coherence, and reproducibility.
Key Features:
- 🎬 Code-Centric Paradigm — executable code as the unified medium for both temporal sequencing and spatial organization of educational videos.
- 🤖 Modular Tri-Agent Design — Planner (storyboard expansion), Coder (debuggable code synthesis), and Critic (layout refinement with anchors) work together for structured generation.
- 📚 MMMC Benchmark — the first benchmark for code-driven video generation, covering 117 curated learning topics inspired by 3Blue1Brown, spanning diverse areas.
- 🧪 Multi-Dimensional Evaluation — systematic assessment on efficiency, aesthetics, and end-to-end knowledge transfer.
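For intuition about the code-centric paradigm, here is a toy Manim Community scene (illustrative only, not actual Code2Video output) showing how a few lines of executable code fix both the spatial layout and the temporal order of a clip:

```python
from manim import Scene, Square, Circle, Text, Create, Transform, FadeIn, UP

class SquareToCircle(Scene):
    """Toy scene: the code itself specifies what appears where and in what order."""
    def construct(self):
        title = Text("Square to Circle").to_edge(UP)   # spatial anchor at the top edge
        square = Square()                               # starting shape
        self.play(FadeIn(title))                        # temporal step 1
        self.play(Create(square))                       # temporal step 2
        self.play(Transform(square, Circle()))          # temporal step 3
        self.wait(1)
```

Rendering it with `manim -pql toy_scene.py SquareToCircle` produces exactly the same clip every time, which is the clarity and reproducibility property the pipeline builds on.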
```bash
cd src/
pip install -r requirements.txt
```

Refer to the official installation guide for Manim Community v0.19.0 to set up the rendering environment correctly.
Fill in your API credentials in api_config.json.
- LLM API:
  - Required for the Planner & Coder.
  - Best Manim code quality is achieved with Claude-4-Opus.
- VLM API:
  - Required for the Critic.
  - Provide a Gemini API key for layout and aesthetics optimization.
  - Best quality is achieved with gemini-2.5-pro-preview-05-06.
- Visual Assets API:
  - To enrich videos with icons, set `ICONFINDER_API_KEY` from IconFinder.
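If you want to sanity-check your credentials programmatically, a minimal sketch is below. The field names (`llm_api_key`, `gemini_api_key`, `ICONFINDER_API_KEY`) are assumptions for illustration; match them to the actual template in `api_config.json`:

```python
import json
from pathlib import Path

CONFIG_PATH = Path("api_config.json")

def load_api_config(path: Path = CONFIG_PATH) -> dict:
    """Read API credentials and warn about empty fields.

    The key names below are hypothetical -- check the api_config.json
    template shipped in src/ for the real schema.
    """
    with path.open("r", encoding="utf-8") as f:
        config = json.load(f)
    for key in ("llm_api_key", "gemini_api_key", "ICONFINDER_API_KEY"):
        if not config.get(key):
            print(f"[warn] '{key}' is empty -- the corresponding step may be skipped.")
    return config

if __name__ == "__main__":
    cfg = load_api_config()
    print(f"Loaded {len(cfg)} credential fields from {CONFIG_PATH}")
```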
We provide two shell scripts for different generation modes:
Script: run_agent_single.sh
Generates a video from a single knowledge point specified in the script.
```bash
sh run_agent_single.sh --knowledge_point "Linear transformations and matrices"
```

Important parameters inside run_agent_single.sh:

- `API`: specify which LLM to use.
- `FOLDER_PREFIX`: output folder prefix (e.g., `TEST-single`).
- `KNOWLEDGE_POINT`: target concept, e.g., `"Linear transformations and matrices"`.
Script: run_agent.sh
Runs all (or a subset of) learning topics defined in long_video_topics_list.json.
```bash
sh run_agent.sh
```

Important parameters inside run_agent.sh:

- `API`: specify which LLM to use.
- `FOLDER_PREFIX`: name prefix for saving output folders (e.g., `TEST-LIST`).
- `MAX_CONCEPTS`: number of concepts to include (`-1` means all).
- `PARALLEL_GROUP_NUM`: number of groups to run in parallel.
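As a rough illustration of how `MAX_CONCEPTS` and `PARALLEL_GROUP_NUM` interact, the sketch below assumes `long_video_topics_list.json` is a flat JSON list of topic strings; the actual file layout and the script's grouping strategy may differ:

```python
import json

def select_and_group_topics(path="json_files/long_video_topics_list.json",
                            max_concepts=-1, parallel_group_num=4):
    """Keep the first `max_concepts` topics (-1 = all) and split them
    round-robin into `parallel_group_num` groups for parallel runs."""
    with open(path, "r", encoding="utf-8") as f:
        topics = json.load(f)  # assumption: a flat list of topic strings
    if max_concepts != -1:
        topics = topics[:max_concepts]
    return [topics[i::parallel_group_num] for i in range(parallel_group_num)]

if __name__ == "__main__":
    for i, group in enumerate(select_and_group_topics(max_concepts=8, parallel_group_num=2)):
        print(f"group {i}: {group}")
```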
A suggested directory structure:

```text
src/
├── agent.py
├── run_agent.sh
├── run_agent_single.sh
├── api_config.json
├── ...
│
├── assets/
│   ├── icons/        # downloaded visual assets cache (via IconFinder API)
│   └── reference/    # reference images
│
├── json_files/       # JSON-based topic lists & metadata
├── prompts/          # prompt templates for LLM calls
└── CASES/            # generated cases, organized by FOLDER_PREFIX
    ├── TEST-LIST/    # example multi-topic generation results
    └── TEST-single/  # example single-topic generation results
```
We evaluate along three complementary dimensions:

- Knowledge Transfer (TeachQuiz): `python3 eval_TQ.py`
- Aesthetic & Structural Quality (AES): `python3 eval_AES.py`
- Efficiency Metrics (during video creation): token usage and execution time (see the logging sketch after this list).
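A minimal sketch of how such efficiency numbers can be collected around a single LLM call, assuming an OpenAI-compatible Python client purely for illustration (the actual agents may use different SDKs and models):

```python
import time
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set; any OpenAI-compatible endpoint works

def timed_completion(prompt: str, model: str = "gpt-4o"):
    """Return the completion text plus wall-clock time and total token usage."""
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model,  # placeholder model name for illustration
        messages=[{"role": "user", "content": prompt}],
    )
    elapsed = time.perf_counter() - start
    return resp.choices[0].message.content, elapsed, resp.usage.total_tokens
```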
👉 More data and evaluation scripts are available at: HuggingFace: MMMC Benchmark
- Video data is sourced from the official 3Blue1Brown lessons by Grant Sanderson. These videos represent the upper bound of clarity and aesthetics in educational video design and inform our evaluation metrics.
- We thank all the Show Lab @ NUS members for support!
- This project builds upon open-source contributions from Manim Community and the broader AI research ecosystem.
- High-quality visual assets (icons) are provided by IconFinder and Icons8 and are used to enrich the educational videos.
If you find our work useful, please cite:
```bibtex
@misc{code2video,
      title={Code2Video: A Code-centric Paradigm for Educational Video Generation},
      author={Yanzhe Chen and Kevin Qinghong Lin and Mike Zheng Shou},
      year={2025},
      eprint={2510.01174},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.01174},
}
```

If you like our project, please give us a star ⭐ on GitHub for the latest updates!