# Lecture Builder Agent

A local-first, multi-agent system that generates high‑quality, university‑grade lecture and workshop outlines (with supporting materials) from a single topic prompt. Agents are defined with Pydantic‑AI models and coordinated by a custom Python orchestrator. The system integrates OpenAI o4‑mini/o3 models, web search via Tavily, and a React‑based UX. Full state, citations, logs, and intermediates persist in SQLite or Postgres. Observability is handled by Logfire. Exports include Markdown, DOCX, and PDF.
- Multi-Agent Workflow: Researcher, Planner, Learning Advisor, Content Weaver, Editor, Final Reviewer and Exporter nodes coordinated by a lightweight custom orchestrator.
- Streaming UI: Token-level draft streaming with diff highlights; action/reasoning log streaming via SSE.
- Robust Citations: Tavily search, citation metadata stored in SQLite, Creative Commons and university domain filtering.
- Local-First: Operates offline using cached corpora and fallback to local dense retrieval.
- Flexible Session Model: Lectures and workshops share a composable schema with optional pedagogical styles and learning methods.
- Progressive Document Graph: Research results, modules, and slide details are captured in a DAG so individual nodes can be revised without regenerating the whole session.
- Flexible Exports: Markdown (canonical), DOCX (python-docx), PDF (WeasyPrint), with cover page, TOC, and bibliography.
- Audit & Governance: Immutable action logs, SHA‑256 state hashes, role‑based access, database encryption.
The system comprises:
- Custom Orchestrator: Manages typed `State` objects through nodes and edges defined with Pydantic models; handles checkpointing in SQLite.
- Agents:
  - Researcher-Web: Executes parallel Tavily searches, dedupes and ranks sources.
  - Curriculum Planner: Defines learning objectives and module structure.
  - Learning Advisor: Converts pedagogical intent into Bloom-aligned lesson plans.
  - Content Weaver: Generates slide decks and assessments from lesson plans.
  - Editor: Reviews narrative coherence and learning goals.
  - Final Reviewer: Scores overall consistency and completeness.
  - Exporter: Renders final deliverables.
- Web UX: React + Primer, SSE-driven, with panels for document, log, sources, and controls.
- Storage Layer: SQLite or Postgres for state, logs, and citations via a repository abstraction.
- Export Pipeline: Pandoc-ready Markdown, python-docx, WeasyPrint PDF.
Agents emit events over several orchestrator channels that the frontend can subscribe to via Server-Sent Events:
- `messages` – token-level content such as LLM outputs.
- `debug` – diagnostic information and warnings.
- `values` – structured state snapshots.
- `updates` – citation and progress updates.
All stream connections require a short-lived JWT passed as a query parameter.
Fetch a token from `GET /stream/token` and connect using `/stream/<channel>?token=<JWT>` or `/stream/<workspace>/<channel>?token=<JWT>`.
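For example, a script can consume the `messages` channel like this (a minimal sketch using `httpx`, which is an assumption; the `{"token": ...}` response shape for `GET /stream/token` is also assumed):

```python
# Sketch: subscribe to the `messages` SSE channel with a short-lived token.
# httpx and the token-response key "token" are assumptions, not documented API.
import asyncio
import httpx

async def stream_messages(base_url: str, jwt: str) -> None:
    async with httpx.AsyncClient(base_url=base_url, timeout=None) as client:
        resp = await client.get("/stream/token", headers={"Authorization": f"Bearer {jwt}"})
        token = resp.json()["token"]  # assumed response shape
        async with client.stream("GET", f"/stream/messages?token={token}") as events:
            async for line in events.aiter_lines():
                if line.startswith("data:"):
                    print(line.removeprefix("data:").strip())

# asyncio.run(stream_messages("http://localhost:8000", dev_jwt))
```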
Prerequisites:

- Python 3.11 or later
- Node.js 18+ (for the frontend)
- `poetry` (recommended) or `pipenv`
- OpenAI API key
- Tavily API key (optional)
- Clone the repository:

  ```bash
  git clone https://github.com/your-org/lecture-builder-agent.git
  cd lecture-builder-agent
  ```

- Install backend dependencies:

  ```bash
  poetry install
  ```

- Install frontend dependencies:

  ```bash
  cd frontend
  npm install
  ```

- (Optional) Build the frontend locally:

  ```bash
  ./scripts/build_frontend.sh
  ```

  Docker images build the frontend automatically, so this step is only required when running the app directly on your host. The command generates static assets in `frontend/dist` that the FastAPI server serves.
Configuration is managed with pydantic-settings and sourced from environment variables (e.g. a `.env` file):
```bash
cp .env.example .env
# Edit .env:
# OPENAI_API_KEY=sk-...
# TAVILY_API_KEY=...
# LOGFIRE_API_KEY=...
# LOGFIRE_PROJECT=...
# MODEL=openai:o4-mini
# DATA_DIR=./workspace
# OFFLINE_MODE=false
# ENABLE_TRACING=true
# ALLOWLIST_DOMAINS=["wikipedia.org",".edu",".gov"]
# ALERT_WEBHOOK_URL=https://example.com/hook
```

For a single-command startup that builds the frontend, waits for the database, applies migrations, and serves the app with built assets, use Docker Compose:

```bash
docker compose up --build
```

Then open your browser at http://localhost:8000.
The frontend seeds a development JWT (role: user) in localStorage so API
requests work without a login flow:
```js
localStorage.setItem(
  "jwt",
  "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJyb2xlIjoidXNlciJ9.hM2ErG_O2e4d9YiCeVoTbiRDoo4ziDiIPfDFE40GUlg",
);
```

Replace the token with one signed using your `JWT_SECRET` for custom setups.
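An equivalent token can be minted in Python, for example with PyJWT (a sketch; the library choice is an assumption, and any HS256-capable JWT library works):

```python
# Sketch: sign a development token with your JWT_SECRET (PyJWT assumed).
import jwt

token = jwt.encode({"role": "user"}, "your-jwt-secret", algorithm="HS256")
print(token)  # paste into localStorage as shown above
```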
To run the services directly on your host for development:
- Start the backend (FastAPI + custom orchestrator):

  ```bash
  ./scripts/run.sh [--offline]
  ```

  This helper invokes Uvicorn with:

  ```bash
  poetry run uvicorn web.main:create_app --reload
  ```

- Start the frontend:

  ```bash
  cd frontend
  npm run dev
  ```

  The Vite dev server proxies `/api` and `/stream` requests to `http://localhost:8000`, allowing the frontend to call those paths without specifying a host.

- Open your browser at `http://localhost:3000` and enter a topic to begin.
- Custom orchestrator implemented in `src/core/orchestrator.py`, leveraging Pydantic‑AI models for agent interfaces.
- Checkpointing handled in `src/core/checkpoint.py` with SQLite or Postgres backends.
- Edge policies enforce confidence thresholds and retry loops (a simplified policy is sketched after this list).
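A simplified illustration of such a policy (names like `EdgePolicy` and `min_confidence` are assumptions, not the project's actual API):

```python
# Illustrative only: retry a node until its result clears a confidence threshold.
from dataclasses import dataclass
from typing import Awaitable, Callable

@dataclass
class EdgePolicy:
    min_confidence: float = 0.8
    max_retries: int = 3

    async def run(self, node: Callable[[], Awaitable[tuple[object, float]]]) -> object:
        for _ in range(self.max_retries):
            result, confidence = await node()
            if confidence >= self.min_confidence:
                return result
        raise RuntimeError(f"confidence stayed below {self.min_confidence} after {self.max_retries} attempts")
```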
- SearchClient abstraction in `src/agents/researcher_web.py` supporting Tavily.
- Citation objects stored in the `state.citations` table.
- Filtering by domain allowlist and SPDX license checks (see the sketch after this list).
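The allowlist check might look roughly like this (a sketch; the real logic in `researcher_web.py` may differ):

```python
# Illustrative domain-allowlist filter mirroring the ALLOWLIST_DOMAINS setting.
from urllib.parse import urlparse

ALLOWLIST = ["wikipedia.org", ".edu", ".gov"]

def is_allowed(url: str) -> bool:
    host = urlparse(url).hostname or ""
    # Naive suffix match; production code should anchor on dot boundaries.
    return any(host == domain or host.endswith(domain) for domain in ALLOWLIST)

assert is_allowed("https://en.wikipedia.org/wiki/Quantum_computing")
assert not is_allowed("https://example.com/blog")
```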
- OpenAIFunctionCaller uses the function-call pattern to produce outline JSON.
- Schema enforcement in `schemas/outline.json`.
- Token streaming via the orchestrator's `messages` channel.
- Editor records narrative and objective feedback.
- FinalReviewer outputs a `QAReport` object with an overall score (a hypothetical shape is sketched after this list).
- Integration tests in `tests/quality/`.
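The report might have roughly this shape (hypothetical; only the overall score is documented):

```python
# Hypothetical QAReport shape; fields other than the overall score are assumptions.
from pydantic import BaseModel, Field

class QAReport(BaseModel):
    overall_score: float = Field(ge=0.0, le=1.0)  # assumed 0–1 scale
    issues: list[str] = []
```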
- SQLite schema managed in `src/persistence/`.
- Parquet blobs for document versions.
- Optional Postgres: swap `storage/sqlite.py` with `storage/postgres.py`.
- React app (`frontend/src/`): panels for Document, Log, Sources.
- SSE client in `frontend/src/services/stream.ts`.
- Diff highlighting via `diff-match-patch`.
- Primer CSS provides the base styling for components and layout.
- Command palette (Cmd/Ctrl+K) for quick Run, Retry, and Export actions.
- Markdown: generated directly from the outline JSON → Markdown converter.
- DOCX: generated via `src/export/docx_exporter.py` (a minimal python-docx sketch follows this list).
- PDF: headless WeasyPrint configured in `src/export/pdf_exporter.py`.
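As a rough illustration of the python-docx step (the outline keys used here are assumptions; the real exporter is more elaborate):

```python
# Sketch: render a minimal outline dict to DOCX with python-docx.
from docx import Document

def export_docx(outline: dict, path: str) -> None:
    doc = Document()
    doc.add_heading(outline["title"], level=0)     # cover title
    for module in outline.get("modules", []):      # assumed key
        doc.add_heading(module["title"], level=1)
        for point in module.get("points", []):     # assumed key
            doc.add_paragraph(point, style="List Bullet")
    doc.save(path)
```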
All endpoints except `/healthz` and `/readyz` are namespaced under `/api` and require a JWT via the `Authorization: Bearer <token>` header. Tokens are signed using `JWT_SECRET` and validated with the HS256 algorithm by default.
| Code | Description |
|---|---|
| 401 | Missing token or failed signature check |
| 403 | Token valid but caller lacks required role |
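A request therefore looks like this (a sketch; the endpoint path shown is illustrative, not a documented route):

```python
# Sketch: authenticated API call; 401/403 surface via raise_for_status().
import httpx

token = "...signed with JWT_SECRET..."
resp = httpx.get(
    "http://localhost:8000/api/sessions",  # hypothetical endpoint
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
```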
- Researcher collects web snippets and extracts keywords.
- Planner turns research into learning outcomes and pedagogical intent.
- Learning Advisor expands topics into lesson plans aligned with Bloom's taxonomy.
- Content Weaver drafts slides, visualization notes, and speaker notes.
- Editor reviews the material and may trigger targeted rewrites.
- Final Reviewer scores overall consistency and completeness.

Each stage updates the `DocumentDAG` so specific nodes, such as a slide's copy or visualization notes, can be refined without regenerating the entire session.
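A highly simplified picture of that pattern (`DocumentDAG` internals and `revise_node` are assumptions, not the project's API):

```python
# Illustrative only: revise one node's content without regenerating the rest.
from dataclasses import dataclass, field

@dataclass
class DocNode:
    content: str
    parents: list[str] = field(default_factory=list)  # edges to upstream nodes

@dataclass
class DocumentDAG:
    nodes: dict[str, DocNode] = field(default_factory=dict)

    def revise_node(self, node_id: str, new_content: str) -> None:
        # Only the targeted node changes; downstream nodes can be re-run selectively.
        self.nodes[node_id].content = new_content
```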
Run the full pipeline programmatically:

```python
from core.orchestrator import GraphOrchestrator, build_main_flow
from core.state import State

state = State(prompt="Explain quantum computing basics")
orch = GraphOrchestrator(build_main_flow())
await orch.run(state)
```

Construct a typed session model directly:

```python
from agents.models import WeaveResult

model = WeaveResult(
    title="Intro to AI",
    learning_objectives=["Define artificial intelligence"],
    duration_min=60,
    session_type="workshop",
    pedagogical_styles=["flipped"],
    learning_methods=["case study"],
)
```

- SQLite (default): single `workspace.db` file in `DATA_DIR`.
- Postgres: set `DATABASE_URL` and install `psycopg2`; update config.
Schema changes are managed with Alembic. After pulling new code, apply migrations to your workspace database:
```bash
alembic upgrade head
```

To create a new migration after modifying models:

```bash
alembic revision --autogenerate -m "add new table"
```

The command reads `alembic.ini` and writes migration scripts to `migrations/versions/`.
Run the CLI to generate lecture material while persisting state to the workspace database. Each invocation creates a new workspace identified by a slug of the topic and timestamp:
```bash
poetry run python -m cli.generate_lecture "Intro to Quantum"
```

The resulting workspace data is stored in `DATA_DIR/workspace.db`, and exports are written alongside the generated Markdown file.
All runtime configuration is supplied via environment variables or a `.env` file. Secret tokens (API keys, project identifiers, etc.) must be sourced from your secret manager and never hard-coded. The legacy LangSmith variables have been replaced by Logfire's settings.
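The settings class likely resembles the standard pydantic-settings idiom (a sketch; the project's actual class and field set may differ):

```python
# Sketch of the pydantic-settings pattern; fields mirror the table below.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    openai_api_key: str
    tavily_api_key: str | None = None
    model: str = "openai:o4-mini"
    data_dir: str
    offline_mode: bool = False

settings = Settings()  # reads the environment, then .env
```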
| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | API key for OpenAI | (required) |
| `TAVILY_API_KEY` | API key for Tavily search | |
| `LOGFIRE_API_KEY` | API key for Logfire | |
| `LOGFIRE_PROJECT` | Logfire project identifier | |
| `MODEL` | LLM provider and model (e.g. `openai:o4-mini`) | `openai:o4-mini` |
| `DATA_DIR` | Path for SQLite DB, cache, logs | (required) |
| `FRONTEND_DIST` | Directory containing built frontend assets | `frontend/dist` |
| `DATABASE_URL` | SQLAlchemy connection string | `sqlite:///${DATA_DIR}/workspace.db` |
| `OFFLINE_MODE` | Run without external network calls | `false` |
| `ENABLE_TRACING` | Enable Logfire tracing instrumentation | `true` |
| `ALLOWLIST_DOMAINS` | JSON list of citation-allowed domains | `["wikipedia.org", ".edu", ".gov"]` |
| `ALERT_WEBHOOK_URL` | Optional webhook for alert notifications | |
| `JWT_SECRET` | HMAC secret for signing JWTs | (required) |
| `JWT_ALGORITHM` | JWT signing algorithm | `HS256` |
- Logfire: Handles structured JSON logs and spans. Use `core.logging.get_logger(job_id, user_id)` to bind contextual identifiers for correlation (usage sketched after this list).
- OpenTelemetry metrics track request counts, active SSE clients, and export durations, exposed at `/metrics` for Prometheus scraping.
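Typical usage of the contextual logger (a sketch; the event name and extra fields are illustrative, not documented):

```python
# Sketch: bind job/user identifiers so log lines correlate across agents.
from core.logging import get_logger

logger = get_logger("job-123", "user-456")
logger.info("export completed", extra={"format": "pdf"})  # illustrative event
```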
- Unit tests: `pytest` in `tests/`.
- Integration tests: mock orchestrator runs in CI.
- Performance tests: `k6` scripts in `performance/`.
- Benchmarks: `scripts/benchmark_pipeline.py` compares the current pipeline against the previous implementation to surface regressions.
- CLI run output: `./scripts/cli.sh "<topic>" [--portfolio <portfolio> ...]` writes the generated lecture in Markdown to `run_output_<portfolio>.md` for each portfolio.
- Accessibility: Lighthouse and axe-core audits configured in the CI pipeline.
- Metrics exposed via OpenTelemetry at `/metrics`.
- Alerts: configurable thresholds for latency, error rates, and unsupported-claim rate.
- Audit: verify hash-chain integrity using CLI tools (a simplified check is sketched below).
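Conceptually, the integrity check recomputes each entry's hash from its payload plus the previous hash (a sketch; the actual log schema and CLI are project-specific assumptions):

```python
# Illustrative hash-chain verification over an append-only action log.
import hashlib
import json

def verify_chain(entries: list[dict]) -> bool:
    prev_hash = ""
    for entry in entries:
        payload = json.dumps(entry["action"], sort_keys=True) + prev_hash
        digest = hashlib.sha256(payload.encode()).hexdigest()
        if digest != entry["hash"]:
            return False  # tampering detected at this entry
        prev_hash = digest
    return True
```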
Basic JWT authentication is in place for remote deployments; requests sent to `http://localhost` or `http://127.0.0.1` bypass authentication entirely. Role-based authorization is planned for v2.
Refer to ROADMAP.md for sprint plans and milestones.
We welcome contributions! Please review the documentation listed below before submitting changes.
This project is licensed under the MIT License.
- DESIGN.md — Detailed design decisions and component diagrams
- ARCHITECTURE.md — Data model, state graph definitions, and sequence diagrams
- SECURITY.md — Security posture, secrets management, and encryption options
- TEST_PLAN.md — Test cases, performance benchmarks, and QA checklist
- GOVERNANCE.md — SLOs, metrics dashboard configuration, and audit procedures
- CLI_REFERENCE.md — Commands for running, testing, and maintenance tasks
- MIGRATION_GUIDE.md — Rationale and API changes for the new orchestrator and models