Lecture Builder Agent

A local-first, multi-agent system that generates high‑quality, university‑grade lecture and workshop outlines (with supporting materials) from a single topic prompt. Agents are defined with Pydantic‑AI models and coordinated by a custom Python orchestrator. The system integrates OpenAI o4‑mini/o3 models, web search via Tavily, and a React‑based UX. Full state, citations, logs, and intermediates persist in SQLite or Postgres. Observability is handled by Logfire. Exports include Markdown, DOCX, and PDF.


Key Features

  • Multi-Agent Workflow: Researcher, Planner, Learning Advisor, Content Weaver, Editor, Final Reviewer and Exporter nodes coordinated by a lightweight custom orchestrator.
  • Streaming UI: Token-level draft streaming with diff highlights; action/reasoning log streaming via SSE.
  • Robust Citations: Tavily search, citation metadata stored in SQLite, Creative Commons and university domain filtering.
  • Local-First: Operates offline using cached corpora and fallback to local dense retrieval.
  • Flexible Session Model: Lectures and workshops share a composable schema with optional pedagogical styles and learning methods.
  • Progressive Document Graph: Research results, modules, and slide details are captured in a DAG so individual nodes can be revised without regenerating the whole session.
  • Flexible Exports: Markdown (canonical), DOCX (python-docx), PDF (WeasyPrint), with cover page, TOC, and bibliography.
  • Audit & Governance: Immutable action logs, SHA‑256 state hashes, role‑based access, database encryption.

Architecture Overview

The system comprises:

  1. Custom Orchestrator: Manages typed State objects through nodes and edges defined with Pydantic models. Handles checkpointing in SQLite.

  2. Agents:

    • Researcher-Web: Executes parallel Tavily searches, dedupes and ranks sources.
    • Curriculum Planner: Defines learning objectives and module structure.
    • Learning Advisor: Converts pedagogical intent into Bloom-aligned lesson plans.
    • Content Weaver: Generates slide decks and assessments from lesson plans.
    • Editor: Reviews narrative coherence and learning goals.
    • Final Reviewer: Scores overall consistency and completeness.
    • Exporter: Renders final deliverables.
  3. Web UX: React + Primer, SSE-driven, panels for document, log, sources, and controls.

  4. Storage Layer: SQLite or Postgres for state, logs, citations via repository abstraction.

  5. Export Pipeline: Pandoc-ready Markdown, python-docx, WeasyPrint PDF.

Stream Channels

Agents emit events over several orchestrator channels that the frontend can subscribe to via Server-Sent Events:

  • messages – token-level content such as LLM outputs.
  • debug – diagnostic information and warnings.
  • values – structured state snapshots.
  • updates – citation and progress updates.

All stream connections require a short-lived JWT passed as a query parameter. Fetch a token from GET /stream/token and connect using /stream/<channel>?token=<JWT> or /stream/<workspace>/<channel>?token=<JWT>.


Getting Started

Prerequisites

  • Python 3.11 or later
  • Node.js 18+ (for frontend)
  • poetry (recommended) or pipenv
  • OpenAI API key
  • Tavily API key (optional)

Installation

  1. Clone the repository:

    git clone https://github.com/your-org/lecture-builder-agent.git
    cd lecture-builder-agent
  2. Backend dependencies:

    poetry install
  3. Frontend dependencies:

    cd frontend
    npm install
  4. (Optional) Build the frontend locally:

    ./scripts/build_frontend.sh

    Docker images build the frontend automatically, so this step is only required when running the app directly on your host. The command generates static assets in frontend/dist that the FastAPI server serves.

Configuration

Configuration is managed with pydantic-settings and sourced from environment variables (e.g. a .env file):

cp .env.example .env
# Edit .env:
# OPENAI_API_KEY=sk-...
# TAVILY_API_KEY=...
# LOGFIRE_API_KEY=...
# LOGFIRE_PROJECT=...
# MODEL=openai:o4-mini
# DATA_DIR=./workspace
# OFFLINE_MODE=false
# ENABLE_TRACING=true
# ALLOWLIST_DOMAINS=["wikipedia.org",".edu",".gov"]
# ALERT_WEBHOOK_URL=https://example.com/hook
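For illustration, a plain-Python stand-in shows how the boolean and JSON-valued variables above behave. The real app parses these with pydantic-settings; `load_settings` and its return shape are hypothetical.

```python
import json

def load_settings(env: dict[str, str]) -> dict:
    """Minimal stand-in for the pydantic-settings class: apply the
    documented defaults and parse JSON/boolean-valued variables."""
    return {
        "model": env.get("MODEL", "openai:o4-mini"),
        "offline_mode": env.get("OFFLINE_MODE", "false").lower() == "true",
        "enable_tracing": env.get("ENABLE_TRACING", "true").lower() == "true",
        "allowlist_domains": json.loads(
            env.get("ALLOWLIST_DOMAINS", '["wikipedia.org", ".edu", ".gov"]')
        ),
    }

settings = load_settings({"OFFLINE_MODE": "true"})
```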

Running Locally

For a single-command startup that builds the frontend, waits for the database, applies migrations, and serves the app with built assets, use Docker Compose:

docker compose up --build

Then open your browser at http://localhost:8000.

The frontend seeds a development JWT (role: user) in localStorage so API requests work without a login flow:

localStorage.setItem(
  "jwt",
  "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJyb2xlIjoidXNlciJ9.hM2ErG_O2e4d9YiCeVoTbiRDoo4ziDiIPfDFE40GUlg",
);

Replace the token with one signed using your JWT_SECRET for custom setups.

To run the services directly on your host for development:

  1. Start the backend (FastAPI + custom orchestrator):

    ./scripts/run.sh [--offline]

    This helper invokes Uvicorn with:

    poetry run uvicorn web.main:create_app --reload
  2. Start the frontend:

    cd frontend
    npm run dev

    The Vite dev server proxies /api and /stream requests to http://localhost:8000, allowing the frontend to call those paths without specifying a host.

  3. Open your browser at http://localhost:3000 and enter a topic to begin.


Component Breakdown

Orchestration

  • Custom orchestrator implemented in src/core/orchestrator.py and leveraging Pydantic‑AI models for agent interfaces.
  • Checkpointing handled in src/core/checkpoint.py with SQLite or Postgres backends.
  • Edge policies enforce confidence thresholds and retry loops.

Retrieval & Citation

  • SearchClient abstraction in src/agents/researcher_web.py supporting Tavily
  • Citation objects stored in state.citations table.
  • Filtering by domain allowlist and SPDX license checks.
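The domain allowlist can be sketched as a suffix/exact matcher. The semantics assumed here (entries starting with `.` match any host under that suffix; other entries match the domain and its subdomains) are inferred from the default list, and `is_allowed` is an illustrative helper, not the repository's actual function.

```python
from urllib.parse import urlparse

def is_allowed(url: str, allowlist: list[str]) -> bool:
    """Return True if the URL's host matches an allowlisted domain."""
    host = urlparse(url).hostname or ""
    for entry in allowlist:
        if entry.startswith("."):
            # Suffix entries like ".edu" match any host under that suffix.
            if host.endswith(entry):
                return True
        # Plain entries match the domain itself and its subdomains.
        elif host == entry or host.endswith("." + entry):
            return True
    return False
```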

Content Synthesis

  • OpenAIFunctionCaller utilizes function-call pattern for outline JSON.
  • Schema enforcement in schemas/outline.json.
  • Token streaming via the orchestrator's messages channel.
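Schema enforcement can be illustrated with a minimal required-keys check. The real pipeline validates against schemas/outline.json; the key names and the `validate_outline` helper here are assumptions for illustration.

```python
# Illustrative subset of outline keys, not the project's exact schema.
REQUIRED_OUTLINE_KEYS = {"title", "learning_objectives", "modules"}

def validate_outline(outline: dict) -> list[str]:
    """Return a list of validation errors (empty list means valid).
    A stand-in for full JSON Schema validation."""
    errors = [f"missing key: {k}" for k in sorted(REQUIRED_OUTLINE_KEYS - outline.keys())]
    if not errors and not outline["modules"]:
        errors.append("modules must be non-empty")
    return errors
```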

Quality Control

  • Editor records narrative and objective feedback.
  • FinalReviewer outputs a QAReport object with an overall score.
  • Integration tests in tests/quality/.
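A hypothetical shape for the QAReport object might look like the sketch below; the field names, scoring rule, and the 0.8 threshold are assumptions, not the repository's definitions.

```python
from dataclasses import dataclass, field

@dataclass
class QAReport:
    """Hypothetical FinalReviewer output: per-dimension scores in [0, 1]
    averaged into an overall score that gates export."""
    scores: dict[str, float] = field(default_factory=dict)

    @property
    def overall(self) -> float:
        return sum(self.scores.values()) / len(self.scores) if self.scores else 0.0

    def passes(self, threshold: float = 0.8) -> bool:
        return self.overall >= threshold

report = QAReport(scores={"consistency": 0.9, "completeness": 0.8})
```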

Persistence & Versioning

  • SQLite schema managed in src/persistence/.
  • Parquet blobs for document versions.
  • Optional Postgres: swap storage/sqlite.py with storage/postgres.py.

Frontend UX

  • React app (frontend/src/): Panels for Document, Log, Sources.
  • SSE client in frontend/src/services/stream.ts.
  • Diff highlighting via diff-match-patch.
  • Primer CSS provides the base styling for components and layout.
  • Command palette (Cmd/Ctrl+K) for quick Run, Retry, and Export actions.

Exporters

  • Markdown: direct from outline JSON → Markdown converter.
  • DOCX: generated via src/export/docx_exporter.py.
  • PDF: headless WeasyPrint configured in src/export/pdf_exporter.py.
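The outline-JSON-to-Markdown step (the canonical export) can be sketched as follows; the outline keys used are illustrative, not the project's exact schema.

```python
def outline_to_markdown(outline: dict) -> str:
    """Render an outline dict as Markdown: title, objectives, then one
    section per module with its slides as bullets."""
    lines = [f"# {outline['title']}", "", "## Learning Objectives"]
    lines += [f"- {obj}" for obj in outline.get("learning_objectives", [])]
    for module in outline.get("modules", []):
        lines += ["", f"## {module['name']}"]
        lines += [f"- {slide}" for slide in module.get("slides", [])]
    return "\n".join(lines) + "\n"

md = outline_to_markdown({
    "title": "Intro to AI",
    "learning_objectives": ["Define artificial intelligence"],
    "modules": [{"name": "History", "slides": ["Dartmouth workshop"]}],
})
```

The same Markdown string would then feed the DOCX and PDF exporters.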

API

All endpoints, except /healthz and /readyz, are namespaced under /api and require a JWT via the Authorization: Bearer <token> header. Tokens are signed using JWT_SECRET and validated with the HS256 algorithm by default.

Code | Description
401  | Missing token or failed signature check
403  | Token valid but caller lacks required role

Authoring Process

  1. Researcher collects web snippets and extracts keywords.
  2. Planner turns research into learning outcomes and pedagogical intent.
  3. Learning Advisor expands topics into lesson plans aligned with Bloom's taxonomy.
  4. Content Weaver drafts slides, visualization notes, and speaker notes.
  5. Editor reviews the material and may trigger targeted rewrites.
  6. Final Check scores overall consistency and completeness.

Each stage updates the DocumentDAG so specific nodes—such as a slide's copy or visualization notes—can be refined without regenerating the entire session.
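Selective regeneration over the DocumentDAG can be sketched as a dirty-marking pass: invalidating one node marks only that node and its descendants as stale. The class and method names here are hypothetical.

```python
from collections import defaultdict

class DocumentDAGSketch:
    """Minimal sketch of selective regeneration in a document DAG."""

    def __init__(self) -> None:
        self.children: dict[str, list[str]] = defaultdict(list)
        self.stale: set[str] = set()

    def add_edge(self, parent: str, child: str) -> None:
        self.children[parent].append(child)

    def invalidate(self, node: str) -> set[str]:
        """Mark `node` and everything downstream for regeneration."""
        stack, seen = [node], set()
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            stack.extend(self.children[n])
        self.stale |= seen
        return seen

dag = DocumentDAGSketch()
dag.add_edge("research", "module-1")
dag.add_edge("module-1", "slide-1")
dag.add_edge("module-1", "slide-2")
```

Revising one slide's copy would invalidate just that slide; upstream research and sibling modules stay untouched.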

Examples

Invoking the Orchestrator

import asyncio

from core.orchestrator import GraphOrchestrator, build_main_flow
from core.state import State

state = State(prompt="Explain quantum computing basics")
orch = GraphOrchestrator(build_main_flow())
asyncio.run(orch.run(state))  # orch.run is a coroutine, so drive it with an event loop

Working with Pydantic Models

from agents.models import WeaveResult

model = WeaveResult(
    title="Intro to AI",
    learning_objectives=["Define artificial intelligence"],
    duration_min=60,
    session_type="workshop",
    pedagogical_styles=["flipped"],
    learning_methods=["case study"],
)

Storage Options

  • SQLite (default): single workspace.db file in DATA_DIR.
  • Postgres: set DATABASE_URL and install psycopg2; update config.

Database Migrations

Schema changes are managed with Alembic. After pulling new code, apply migrations to your workspace database:

alembic upgrade head

To create a new migration after modifying models:

alembic revision --autogenerate -m "add new table"

The command reads alembic.ini and writes migration scripts to migrations/versions/.

Command-line Lecture Generation

Run the CLI to generate lecture material while persisting state to the workspace database. Each invocation creates a new workspace identified by a slug of the topic and timestamp:

poetry run python -m cli.generate_lecture "Intro to Quantum"

The resulting workspace data is stored in DATA_DIR/workspace.db and exports are written alongside the generated Markdown file.


Configuration & Environment Variables

All runtime configuration is supplied via environment variables or a .env file. Secret tokens (API keys, project identifiers, etc.) must be sourced from your secret manager and never hard-coded. The legacy LangSmith variables have been replaced by Logfire's settings.

Variable          | Description                                 | Default
OPENAI_API_KEY    | API key for OpenAI                          | (required)
TAVILY_API_KEY    | API key for Tavily search                   |
LOGFIRE_API_KEY   | API key for Logfire                         |
LOGFIRE_PROJECT   | Logfire project identifier                  |
MODEL             | LLM provider and model                      | openai:o4-mini
DATA_DIR          | Path for SQLite DB, cache, logs             | (required)
FRONTEND_DIST     | Directory containing built frontend assets  | frontend/dist
DATABASE_URL      | SQLAlchemy connection string                | sqlite:///${DATA_DIR}/workspace.db
OFFLINE_MODE      | Run without external network calls          | false
ENABLE_TRACING    | Enable Logfire tracing instrumentation      | true
ALLOWLIST_DOMAINS | JSON list of citation-allowed domains       | ["wikipedia.org", ".edu", ".gov"]
ALERT_WEBHOOK_URL | Optional webhook for alert notifications    |
JWT_SECRET        | HMAC secret for signing JWTs                | (required)
JWT_ALGORITHM     | JWT signing algorithm                       | HS256

Logging & Tracing

  • Logfire: Handles structured JSON logs and spans. Use core.logging.get_logger(job_id, user_id) to bind contextual identifiers for correlation. OpenTelemetry metrics track request counts, active SSE clients, and export durations, exposed at /metrics for Prometheus scraping.
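The binding pattern behind core.logging.get_logger can be sketched with a stdlib LoggerAdapter. The real helper routes through Logfire; this only shows how contextual identifiers get attached to every record.

```python
import logging

def get_logger(job_id: str, user_id: str) -> logging.LoggerAdapter:
    """Bind job/user ids so every record carries correlation fields."""
    base = logging.getLogger("agentic-demo")
    return logging.LoggerAdapter(base, {"job_id": job_id, "user_id": user_id})

log = get_logger("job-42", "user-7")
log.info("export finished")  # record gains job_id/user_id attributes via the adapter
```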

Testing & QA

  • Unit tests: pytest in tests/
  • Integration tests: mock orchestrator runs in CI.
  • Performance tests: k6 scripts in performance/
  • Benchmarks: scripts/benchmark_pipeline.py compares the current pipeline against the previous implementation to surface regressions.
  • CLI run output: ./scripts/cli.sh "<topic>" [--portfolio <portfolio> ...] writes the generated lecture in Markdown to run_output_<portfolio>.md for each portfolio.
  • Accessibility: Lighthouse and axe-core audits configured in CI pipeline.

Operational Governance

  • Metrics exposed via OpenTelemetry at /metrics.
  • Alerts: configurable thresholds for latency, error rates, unsupported-claim rate.
  • Audit: verify hash chain integrity using CLI tools.
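Hash-chain verification of the action log can be sketched as follows. The entry format and genesis value are assumptions; the repository's SHA-256 chaining may differ in detail, but the property is the same: tampering with any entry breaks every later hash.

```python
import hashlib
import json

def entry_hash(prev_hash: str, entry: dict) -> str:
    """Chain each action-log entry to its predecessor via SHA-256."""
    payload = json.dumps(entry, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()

def verify_chain(entries: list[dict], hashes: list[str], genesis: str = "0" * 64) -> bool:
    """Recompute the chain and compare against the stored hashes."""
    prev = genesis
    for entry, expected in zip(entries, hashes):
        prev = entry_hash(prev, entry)
        if prev != expected:
            return False
    return True

# Build a small valid chain for demonstration.
entries = [{"action": "run", "node": "Researcher"}, {"action": "export", "fmt": "pdf"}]
hashes, prev = [], "0" * 64
for e in entries:
    prev = entry_hash(prev, e)
    hashes.append(prev)
```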

Roadmap & Next Steps

Basic JWT authentication is in place for remote deployments; requests sent to http://localhost or http://127.0.0.1 bypass authentication entirely. Role-based authorization is planned for v2.

Refer to ROADMAP.md for sprint plans and milestones.


Contributing

We welcome contributions! Please review the project's contributing guidelines before opening a pull request.


License

This project is licensed under the MIT License.


Additional Documentation

  • DESIGN.md — Detailed design decisions and component diagrams
  • ARCHITECTURE.md — Data model, state graph definitions, and sequence diagrams
  • SECURITY.md — Security posture, secrets management, and encryption options
  • TEST_PLAN.md — Test cases, performance benchmarks, and QA checklist
  • GOVERNANCE.md — SLOs, metrics dashboard configuration, and audit procedures
  • CLI_REFERENCE.md — Commands for running, testing, and maintenance tasks
  • MIGRATION_GUIDE.md — Rationale and API changes for the new orchestrator and models
