javiramos1/recipe-agent

Image-Based Recipe Recommendation Service

Learning & Demonstration Project showcasing comprehensive agentic application patterns.

The goal of this project is to test the capabilities of agentic AI using high-level frameworks like Agno AI. To learn low-level details like implementing ReACT, tool calling, and RAG from scratch, see this repo: https://github.com/javiramos1/hybrid-rag-system

A simple recipe recommendation problem intentionally solved with production-grade patterns to demonstrate how to build professional GenAI systems. Shows all key capabilities: memory management, knowledge bases, tool calling, structured outputs, tracing, testing, and the complete development lifecycle.

Overview

This service demonstrates best practices in agentic system design by leveraging AgentOS as the complete runtime, Agno Agent as the orchestrator, and Gemini Vision API for ingredient detection.

⚠️ Intentional Complexity: Recipe recommendation is a deliberately simple problem, one normally solved with a single API call. The complexity here is intentional: the goal is to showcase production-grade architecture patterns (memory, tools, retries, observability, testing) on a familiar domain.

We also use Spoonacular MCP to demonstrate how to connect to production external systems (APIs, databases, services) via the MCP protocol—but LLMs are fully capable of generating recipes independently. The choice to use MCP here illustrates best practices for integrating external data sources in agentic systems.

What You'll Learn:

  • Agentic Architecture: Stateful applications with automatic memory, preference extraction, and multi-turn conversations
  • Tool Integration: Internal @tool functions + external MCP services (Model Context Protocol) for connecting to production systems
  • Production Patterns: Exponential backoff retries, guardrails, rate limiting, error recovery, Pydantic schema validation
  • Observability: Distributed tracing with OpenTelemetry, structured logging, performance metrics
  • Quality Assurance: Unit tests, integration tests, and Agno Evals Framework for multi-dimensional evaluation
  • Complete Lifecycle: Requirements → Design → Implementation → Testing → Monitoring → Iteration
  • Agentic RAG: You can choose between adding history, memory, knowledge, etc. to every call (classical RAG) and giving the LLM a tool so it decides when to query memory, history, or knowledge (agentic RAG). The first option is simpler and faster (no extra round trips); the second is more flexible and powerful, but slower and more expensive.

Key Capabilities

  • 📸 Image-Based Ingredient Detection - Upload ingredient photos, automatically extract ingredients using Gemini vision API
  • 🍳 Recipe Recommendations - Get personalized recipes based on detected ingredients and preferences
  • 💬 Conversational Memory - Multi-turn conversations with automatic preference tracking (dietary restrictions, cuisines, meal types)
  • 🎯 Domain-Focused - Specialized for recipes only, with guardrails preventing off-topic requests
  • 🔄 Session Management - Persistent conversation history and user preferences across sessions
  • 📊 Structured Output - All responses validated with Pydantic schemas

Features

Core Agent Features

  • Exponential Backoff Retries (3 attempts, 2s→4s→8s delays) - Handles transient failures and rate limits
  • Structured Output - Pydantic schema validation for type-safe responses
  • Multi-Model Support - Configurable LLM selection (Gemini Flash/Pro, Claude, GPT)
  • Image Compression - Automatic image compression (JPEG quality optimization) reduces upload size by ~70% while maintaining quality for ingredient detection
  • Automatic Memory - User preferences and conversation history (with compression)
  • Session Persistence - SQLite/PostgreSQL support for production deployments
  • Tool Integration - MCP protocol + internal tools with pre/post hooks and guardrails
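The retry behavior listed above (3 attempts, doubling delays) can be sketched with a small helper. This is an illustrative stand-in, not the project's actual implementation; `max_retries` and `base_delay` mirror the MAX_RETRIES and DELAY_BETWEEN_RETRIES settings documented in the configuration table.

```python
import time

def retry_with_backoff(fn, max_retries=3, base_delay=2.0, sleep=time.sleep):
    """Call fn(); on failure wait base_delay, 2*base_delay, ... then retry."""
    delay = base_delay
    for attempt in range(1, max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # out of attempts, surface the error
            sleep(delay)
            delay *= 2  # exponential backoff: 2s -> 4s -> 8s
```

With the defaults this gives the 2s → 4s delay sequence before the third and final attempt.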

Observability & Quality

  • Distributed Tracing - AgentOS built-in OpenTelemetry integration for execution visibility
  • Structured Logging - Python logging with debug/info/warning levels (no sensitive data)
  • Unit Tests - 150+ tests for models, config, ingredients, MCP, tracing
  • Integration Tests - E2E evaluation tests with Agno Evals Framework (accuracy, reliability, performance)
  • REST API Tests - 10 endpoint tests covering session management, file uploads, error handling

Data & Knowledge

  • Knowledge Base - LanceDB vector store with SentenceTransformer embeddings (no API cost)
  • Semantic Search - Troubleshooting notes, errors, and other findings stored and searchable for agent learning
  • Error Tracking - API errors (402 quota, 429 rate limit) documented for diagnostics

System Design

  • MCP Tools - Spoonacular recipe API with custom retry logic and connection validation (optional)
  • Internal Tools - Ingredient detection tool (image→ingredients with confidence scores)
  • Pre/Post Hooks - Image processing, guardrails, metadata injection, troubleshooting tracking
  • Templated Prompts - Configurable system instructions with dynamic parameters (MAX_RECIPES, MIN_CONFIDENCE, etc)
  • AgentOS Integration - REST API + Web UI + chat interface provided out-of-the-box
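As a rough sketch of how the internal detection tool's confidence scores might be filtered against MIN_INGREDIENT_CONFIDENCE (the `(name, confidence)` tuple format is an assumption for illustration, not the tool's actual return type):

```python
MIN_INGREDIENT_CONFIDENCE = 0.7  # mirrors the env var default

def filter_ingredients(detections, threshold=MIN_INGREDIENT_CONFIDENCE):
    """Keep only ingredient names whose confidence meets the threshold."""
    return [name for name, conf in detections if conf >= threshold]
```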

Architecture Overview

graph TD
    A["User Request<br/>(Image or Text)"] --> B["Pre-Hook<br/>Ingredient Detection"]
    B --> C["Agno Agent<br/>Orchestrator"]
    C --> D["Spoonacular MCP<br/>Recipe Search - optional"]
    D --> E["Recipe Response<br/>with Preferences"]
    
    C -.->|Session Memory| F["SQLite/PostgreSQL<br/>Database"]
    F -.->|Load History| C
    
    C -.->|Search & Learn| G["LanceDB<br/>Knowledge Base"]
    G -.->|Troubleshooting| C

Development Process

This project demonstrates a modern AI-powered SDLC using GitHub Copilot throughout the entire development lifecycle:

  • 🔍 Discovery (ChatGPT) - Gathered requirements and selected Agno AI framework
  • 📋 Design (Claude Sonnet 4.5) - Created comprehensive PRD, architecture, and implementation plan
  • 💻 Implementation (Claude Haiku 4.5) - Built application incrementally using task specifications
  • Testing & Refinement - Iterative improvements with unit, integration, and evaluation tests
  • 📚 Documentation - Maintained living documentation across all phases

See .docs/PROCESS.md for detailed workflow documentation.


Quick Start

Prerequisites

Setup

# 1. Clone and initialize
git clone <repo-url>
cd recipe-agent

# 2. Setup dependencies and environment
make setup

# 3. Edit .env with your API keys
# Required:
#   GEMINI_API_KEY=your_key_here
# Optional (only needed if USE_SPOONACULAR=true, which is the default):
#   SPOONACULAR_API_KEY=your_key_here
# For internal LLM-only mode (no Spoonacular), skip SPOONACULAR_API_KEY and set:
#   USE_SPOONACULAR=false
nano .env

# 4. Start the application
make dev

# Or run single queries without keeping server running:
make query Q="What can I make with chicken and rice?"

The service will start at:

  • Backend API: http://localhost:7777
  • OpenAPI Docs: http://localhost:7777/docs
  • Agno OS Platform: https://os.agno.com (recommended UI - connect local agent)

Tech Stack

Component Technology Purpose
Runtime AgentOS Complete application backend (REST API, orchestration, tracing)
Orchestrator Agno Agent Stateful agent with memory, retries, and tool routing
Vision API Gemini (3-flash-preview) Ingredient detection from images
Recipe Search Spoonacular MCP 50K+ verified recipes via external service
Database SQLite (dev) / PostgreSQL (prod) Session storage and memory
Knowledge Base LanceDB Vector database for semantic search, multi-modal embeddings, and agent learning
Validation Pydantic v2 Input/output schema validation
Testing pytest + Agno Evals Unit, integration, and evaluation testing

Setup Instructions

1. Install Dependencies

make setup

This will:

  • Install all Python dependencies from requirements.txt
  • Create .env file from .env.example (if not already present)
  • Display instructions for adding API keys

2. Configure Environment

Edit .env and add your API keys:

# Required - Get from Google Cloud Console
GEMINI_API_KEY=your_gemini_key_here

# Optional - Get from spoonacular.com/food-api or rapidapi.com
SPOONACULAR_API_KEY=your_spoonacular_key_here

# Optional - Defaults shown
# Main recipe recommendation model
GEMINI_MODEL=gemini-3-flash-preview

# Image detection model (can be different from GEMINI_MODEL)
# Use gemini-3-pro-preview for better accuracy on complex images
IMAGE_DETECTION_MODEL=gemini-3-flash-preview

PORT=7777
MAX_HISTORY=3
MAX_RECIPES=3
MAX_IMAGE_SIZE_MB=5
MIN_INGREDIENT_CONFIDENCE=0.7
COMPRESS_IMG_THRESHOLD_KB=300
IMAGE_DETECTION_MODE=pre-hook
LOG_LEVEL=INFO
OUTPUT_FORMAT=json
# DATABASE_URL=postgresql://user:pass@localhost:5432/recipe_service
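A minimal sketch of how config.py might load these values with typed defaults (the helper names are illustrative, not the project's actual API):

```python
import os

def env_int(name, default):
    """Read an integer env var, falling back to a default."""
    return int(os.getenv(name, default))

def env_float(name, default):
    """Read a float env var, falling back to a default."""
    return float(os.getenv(name, default))

def load_config():
    """Collect the tunables shown above into one dict."""
    return {
        "port": env_int("PORT", 7777),
        "max_history": env_int("MAX_HISTORY", 3),
        "max_recipes": env_int("MAX_RECIPES", 3),
        "max_image_size_mb": env_int("MAX_IMAGE_SIZE_MB", 5),
        "min_ingredient_confidence": env_float("MIN_INGREDIENT_CONFIDENCE", 0.7),
    }
```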

Getting Spoonacular API Key (Optional)

Only required if using Spoonacular for recipe search (when USE_SPOONACULAR=true, which is the default).

If you want to use internal LLM knowledge instead, skip this step and set USE_SPOONACULAR=false.

To get a Spoonacular key:

  1. Visit spoonacular.com/food-api
  2. Click "Get API Key" button
  3. Sign up or log in with email
  4. Your API key appears on the dashboard
  5. Copy to .env as SPOONACULAR_API_KEY

Understanding Spoonacular Quotas (Spoonacular Mode Only)

Applies when USE_SPOONACULAR=true.

Free Plan:

  • API Calls: 100/day
  • Cost: $0/month
  • Throttle: 1 request/second

Quota Status:

  • Check remaining quota on the Spoonacular dashboard
  • When exceeded: API returns 402 Payment Required error
  • Daily quota resets at 00:00 UTC
  • If quota exceeded, switch to internal LLM mode: USE_SPOONACULAR=false make dev

3. Start Development Server

make dev

The application will start at:

  • Backend API: http://localhost:7777
  • OpenAPI Docs: http://localhost:7777/docs

Using the Web UI with Agno OS Platform

Quick Start with Agno OS

  1. Create Free Account: Visit os.agno.com and sign up (free tier available)
  2. Start Backend: Run make dev in terminal
  3. Connect Agent: In Agno OS platform → Click team dropdown → "Add new OS" → Select "Local" → Enter endpoint http://localhost:7777
  4. Chat in Platform: UI automatically adapts to your agent's input schema. Enter an image URL or base64 data.
  5. View Traces: Agno OS platform shows execution traces, tool calls, and reasoning

Why Agno OS Platform?

  • Schema-Aware UI: Forms and inputs automatically adapt to your agent's request schema
  • Execution Traces: See every step of ingredient detection and recipe search
  • Session Persistence: Multi-turn conversations with automatic memory management
  • Evaluations: Built-in framework for testing agent quality
  • No Hosting Needed: Connect local agents instantly; data stays on your machine

Key Features in Agno OS

Feature Description
Dynamic UI Forms Input forms automatically generated from your agent's schema
Execution Traces Full visibility: agent thinking, tool calls, LLM reasoning
Session History Persistent multi-turn conversations
Image Support Upload images directly (base64 conversion handled)
Tool Inspection See which MCP tools were called and their outputs
Performance Metrics Response times, token usage, execution breakdown

Image Requirements:

  • Max Size: 5MB per image
  • Formats: JPEG, PNG, WebP, GIF
  • Best Practice: Clear, well-lit photos of actual food items

Development Workflow

Start Server

make dev     # Development mode (debug log level, JSON output format)
make debug   # Full debug mode (log level + agent debug + debug level 2, JSON output)
make run     # Production mode (markdown output format for UI rendering)
make stop    # Stop running server

Output Formats:

  • make dev and make debug: JSON - Full RecipeResponse structure (useful for testing/debugging)
  • make run: Markdown - Extracted response field only (optimized for Agno Agent UI rendering)

Access Points:

  • Agno OS Platform: https://os.agno.com → Connect local agent → View UI, traces, and evaluations
  • REST API: http://localhost:7777/agents/recipe-recommendation-agent/runs (AgentOS API)
  • API Documentation: http://localhost:7777/docs (Swagger UI)

Run Ad Hoc Queries

Execute single queries without starting the full server:

make query Q="What can I make with chicken and rice?"
make query Q="Show me vegetarian recipes"

The query command initializes the agent and executes one-off requests from the CLI. Useful for testing or integration with shell scripts.

Debug Mode

Development Server:

  • make dev - Starts with debug log level (LOG_LEVEL=DEBUG)
  • make debug - Full debug mode with agent debugging enabled:
    • LOG_LEVEL=DEBUG - Structured logging at debug level
    • DEBUG_MODE=1 - Enables Agno agent debug mode for detailed execution traces
    • DEBUG_LEVEL=2 - Maximum verbosity for agent reasoning and tool calls

Debug output shows:

  • 📞 Tool Calls - Which MCP tools were called and their arguments
  • 💬 LLM Communication - Complete system prompt, user input, and model output
  • ⏱️ Performance Metrics - Token counts, execution time, tokens/second throughput
  • 🔄 Session Management - Memory operations and preference tracking
  • 🎯 Pre-Hooks - Ingredient detection from images

Run Tests

make test       # Unit tests (fast, isolated)
make int-tests  # REST API integration tests (starts app automatically, no pre-setup required)
make eval       # Integration evaluation tests (requires API keys)

Code Quality

make lint      # Run Ruff and Flake8 linters (checks code style and unused imports)
make format    # Fix formatting issues with Ruff (auto-fixes whitespace, line length, etc.)

Linting checks for:

  • Unused imports (F401)
  • Code style violations (E, W, F codes)
  • Missing placeholders in f-strings (F541)
  • Syntax errors

Formatting automatically fixes:

  • Trailing whitespace
  • Blank lines with whitespace
  • Line length (max 120 chars)
  • Import organization

Note: make zip automatically runs make lint before creating the archive to ensure code quality.

Clean Cache

make clean   # Remove __pycache__, .pyc, pytest cache

Usage Examples

Web UI via Agno OS Platform (Recommended)

  1. Start backend: make dev
  2. Go to https://os.agno.com (free account available)
  3. Add new OS: Team dropdown → "Add new OS" → Local → http://localhost:7777
  4. Chat: UI auto-adapts to agent schema, view traces and executions

REST API (Programmatic)

Text Query

curl -X POST http://localhost:7777/agents/recipe-recommendation-agent/runs \
  -d 'message={"message": "I have tomatoes and basil", "session_id": "user-123"}'

Image Upload (Base64)

curl -X POST http://localhost:7777/agents/recipe-recommendation-agent/runs \
  -d "message={\"images\": \"data:image/jpeg;base64,...\"}"
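For programmatic clients, the base64 payload used in the curl examples can be built in Python like this (only the payload construction is shown; sending it with an HTTP client is left out):

```python
import base64
import json

def build_image_payload(image_bytes, mime="image/jpeg", session_id=None):
    """Encode raw image bytes as the data-URI string the examples above use."""
    data_uri = f"data:{mime};base64," + base64.b64encode(image_bytes).decode("ascii")
    payload = {"images": data_uri}
    if session_id:
        payload["session_id"] = session_id
    return json.dumps(payload)
```

The returned string would be sent as the `message` form field, matching the curl calls above.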

Multi-Turn Conversation (Preferences Persisted)

# Turn 1: Express preference
curl -X POST http://localhost:7777/agents/recipe-recommendation-agent/runs \
  -d "message={\"message\": \"I am vegetarian\", \"session_id\": \"user-123\"}"

# Turn 2: Agent remembers vegetarian preference
curl -X POST http://localhost:7777/agents/recipe-recommendation-agent/runs \
  -d "message={\"message\": \"I have potatoes and garlic\", \"session_id\": \"user-123\"}"

Configuration

Environment Variables

Variable Type Default Description
GEMINI_API_KEY string required Google Gemini API key (vision model)
USE_SPOONACULAR bool true Enable Spoonacular MCP for external recipe API. When false, uses internal LLM knowledge to generate recipes
SPOONACULAR_API_KEY string required if USE_SPOONACULAR=true Spoonacular recipe API key (only needed when Spoonacular mode is enabled)
GEMINI_MODEL string gemini-3-flash-preview Main recipe recommendation model. Options: gemini-3-flash-preview, gemini-3-pro-preview
IMAGE_DETECTION_MODEL string gemini-2.5-flash-lite Image detection model (independent from main model). Options: gemini-2.5-flash-lite, gemini-3-flash-preview, gemini-3-pro-preview. Use gemini-3-pro-preview for better accuracy on complex images
AGENT_MNGT_MODEL string gemini-2.5-flash-lite Agent management model for memories, summaries, compression, and learning. Uses smaller model to reduce API costs for background operations (98% cheaper)
TEMPERATURE float 0.3 Response randomness (0.0=deterministic, 1.0=creative). For recipes: 0.3 balances consistency with variety
MAX_OUTPUT_TOKENS int 8192 Maximum response length. For multiple recipes: 8192 supports full response with 10 recipes including instructions (Gemini supports up to 65,536)
THINKING_LEVEL string None Extended reasoning level: None (fastest), low (balanced), high (thorough). Recipe recommendations work well with None
Agent Context Awareness
ADD_DATETIME_TO_CONTEXT bool true Include current date/time in agent context for time-aware recipes ("quick weeknight dinners", seasonal recipes)
TIMEZONE_IDENTIFIER string Etc/UTC Timezone for datetime context (TZ Database format: America/New_York, Europe/London, etc.)
ADD_LOCATION_TO_CONTEXT bool false Include user location for geo-aware recipes (requires user permission, privacy consideration)
PORT int 7777 Server port
MAX_HISTORY int 3 Conversation turns to keep in memory
MAX_RECIPES int 3 Maximum recipes to return per request
MAX_IMAGE_SIZE_MB int 5 Maximum image upload size
MIN_INGREDIENT_CONFIDENCE float 0.7 Confidence threshold for detected ingredients (0.0-1.0)
COMPRESS_IMG_THRESHOLD_KB int 300 Compress images larger than this threshold (KB); smaller images are left as-is. Set to 0 to disable compression
IMAGE_DETECTION_MODE string pre-hook Detection mode: pre-hook (fast) or tool (agent-controlled)
TOOL_CALL_LIMIT int 12 Maximum tool calls per agent request (prevents excessive API usage)
OUTPUT_FORMAT string json Response format: json or markdown
LOG_LEVEL string INFO Logging level: DEBUG, INFO, WARNING, ERROR
DATABASE_URL string optional PostgreSQL connection (SQLite default)
ENABLE_TRACING bool true Enable distributed tracing
TRACING_DB_TYPE string sqlite Tracing database type: sqlite or postgres
TRACING_DB_FILE string agno_traces.db Path for SQLite tracing database
Agent Retry Configuration
MAX_RETRIES int 3 Number of retry attempts for transient API failures
DELAY_BETWEEN_RETRIES int 2 Initial delay in seconds between retries (doubles each retry with exponential backoff)
EXPONENTIAL_BACKOFF bool true Enable exponential backoff (2s → 4s → 8s) for rate limit handling
Agent Memory & Context
ADD_HISTORY_TO_CONTEXT bool true Include conversation history in LLM context for coherence (local database operation)
READ_TOOL_CALL_HISTORY bool false Give LLM access to previous tool calls (avoid redundancy with automatic history, local operation)
UPDATE_KNOWLEDGE bool true Allow LLM to add learnings to knowledge base (local vector database operation)
READ_CHAT_HISTORY bool false Provide dedicated tool for LLM to query chat history (avoid redundancy with automatic history, local operation)
ENABLE_USER_MEMORIES bool true Store and track user preferences for personalization (requires extra LLM API calls for extraction)
ENABLE_SESSION_SUMMARIES bool false Auto-summarize sessions for context compression (requires extra LLM API calls for generation)
COMPRESS_TOOL_RESULTS bool true Compress tool outputs to reduce context size (uses cost-optimized MEMORY_MODEL)
SEARCH_KNOWLEDGE bool true Give LLM ability to search knowledge base during reasoning
SEARCH_SESSION_HISTORY bool true Enable searching across multiple past sessions for long-term preference tracking (local database operation)
NUM_HISTORY_SESSIONS int 2 Number of past sessions to include in history search (keep low 2-3 to avoid context bloat and performance impact)
Agent Performance & Debugging
CACHE_SESSION bool false Cache agent session in memory for faster access (single-server only, not for distributed systems)
DEBUG_MODE bool false Enable detailed logging and system message inspection (development only, see compiled system message)
Learning Machine
ENABLE_LEARNING bool true Enable Learning Machine for dynamic user profile and recipe insight extraction
LEARNING_MODE string AGENTIC How agent learns: ALWAYS (auto), AGENTIC (agent-controlled, recommended), PROPOSE (user-approved)

Design Note on Knowledge vs LearnedKnowledge:

  • Knowledge (SEARCH_KNOWLEDGE): For external documents/FAQs (read-only, static). Disabled by default.
  • LearnedKnowledge (ENABLE_LEARNING): For dynamic insights from conversations (agent-driven, evolving). Enabled by default.
  • ⚠️ WARNING: Using both is overkill for recipe agents. Choose one: static documents (Knowledge) OR dynamic learning (LearnedKnowledge). Do NOT enable both.
  • For this project: LearnedKnowledge preferred because user preferences and recipe learnings are dynamic and change over conversations.

Learning Strategy: Uses AGENTIC mode by default - agent receives tools and decides when to save insights (user preferences, recipe recommendations, cooking techniques). Learnings saved to global namespace (shared insights benefit all users). Uses cost-optimized MEMORY_MODEL for extraction.

Memory Strategy: Uses Option 1 - Automatic Context for optimal balance of conversational continuity and performance. Recent history is automatically included while avoiding redundant on-demand history tools that could bloat token usage. Session history search (last 2 sessions by default) provides long-term preference context without excessive context overhead. Session summaries are disabled by default to minimize LLM API costs - enable only for long conversations requiring context compression. Memory operations (user memories, session summaries) and tool result compression use a separate cost-optimized model (MEMORY_MODEL) to reduce API costs by up to 98% for background operations.

Observability & Tracing

Built-in tracing with OpenTelemetry. View traces in Agno OS Platform:

# Start backend
make dev

# Connect to https://os.agno.com
# 1. Create account (free tier available)
# 2. Add new OS → Local → http://localhost:7777
# 3. View traces in /traces tab

Disable Tracing (if needed):

ENABLE_TRACING=false

Database Configuration

Development (Default): SQLite file: agno.db (auto-created)

Production (PostgreSQL):

DATABASE_URL=postgresql://user:password@localhost:5432/recipe_service

Clear sessions, memory and knowledge base: make clear-memories

Operating Modes

The agent supports two operating modes:

Mode 1: Spoonacular MCP (Default - USE_SPOONACULAR=true)

  • Uses external Spoonacular recipe API for recipe search and details
  • Two-step process: search recipes → get full details on request
  • Requires SPOONACULAR_API_KEY environment variable
  • System instructions focus on tool-calling with recipe search semantics
  • Ideal when you want access to a large recipe database with verified accuracy

Mode 2: Internal LLM Knowledge (USE_SPOONACULAR=false)

  • Generates recipes from Gemini's internal culinary knowledge
  • No external API calls for recipe generation
  • Does NOT require SPOONACULAR_API_KEY
  • System instructions focus on direct recipe generation and culinary reasoning
  • Ideal for cost reduction, privacy, or when external API access is not available
  • Still uses Gemini vision API for ingredient detection from images

Switching Modes:

# Use Spoonacular MCP (default)
USE_SPOONACULAR=true SPOONACULAR_API_KEY=your_key make dev

# Use internal LLM knowledge (no Spoonacular key needed)
USE_SPOONACULAR=false make dev

Both modes maintain the same API interface and response format. The system prompts automatically adapt to each mode - when using internal knowledge, the LLM never sees Spoonacular-related instructions.

Architecture

Data Flow

  1. Request → User sends message and/or image
  2. Pre-Hook → Images processed through Gemini vision API to extract ingredients
  3. Agent → Agno Agent routes to recipe tools with extracted ingredients + preferences
  4. Recipe Generation → Spoonacular MCP called (if enabled) or LLM generates from knowledge
  5. Response → Agent synthesizes human-friendly response with recipe details
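The five steps can be condensed into a toy pipeline. The `detect` and `search` callables are stubs standing in for the Gemini vision call and the Spoonacular MCP (or LLM generation) step; none of this is the project's real code.

```python
def handle_request(message, images=None, detect=None, search=None):
    """Steps 1-5 above: pre-hook detection, recipe search, synthesized response."""
    ingredients = []
    if images:                        # step 2: pre-hook runs BEFORE the agent
        for img in images:
            ingredients.extend(detect(img))
    recipes = search(ingredients)     # step 4: MCP search or LLM generation
    return {                          # step 5: human-friendly response
        "response": f"Found {len(recipes)} recipe(s) for "
                    f"{', '.join(ingredients) or message}",
        "recipes": recipes,
        "ingredients": ingredients,
    }
```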

Session Management

  • Each conversation has a unique session_id
  • Agent automatically maintains chat history (last N turns, configurable)
  • User preferences extracted and persisted per session
  • Preferences applied to subsequent requests without re-stating
  • Sessions survive application restarts (stored in database)
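A stripped-down, in-memory version of this bookkeeping (the real service persists sessions to SQLite/PostgreSQL and extracts preferences with an LLM) might look like:

```python
from collections import defaultdict

MAX_HISTORY = 3  # conversation turns kept per session, as configured above

class SessionStore:
    """Track per-session history and preferences, trimming to MAX_HISTORY turns."""

    def __init__(self, max_history=MAX_HISTORY):
        self.max_history = max_history
        self.history = defaultdict(list)      # session_id -> [(user, agent), ...]
        self.preferences = defaultdict(dict)  # session_id -> {"diet": ..., ...}

    def add_turn(self, session_id, user_msg, agent_msg):
        turns = self.history[session_id]
        turns.append((user_msg, agent_msg))
        del turns[:-self.max_history]  # keep only the most recent turns

    def set_preference(self, session_id, key, value):
        self.preferences[session_id][key] = value
```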

Key Design Decisions

1. Pre-Hook Pattern (images processed BEFORE agent executes)

  • Eliminates extra LLM round-trip
  • Keeps ingredients as text (not raw bytes) in chat history
  • Faster responses overall
  • Configurable via IMAGE_DETECTION_MODE environment variable

2. Flexible Ingredient Detection (Pre-Hook vs. Tool)

  • IMAGE_DETECTION_MODE=pre-hook (default): Fast, processes before agent, no extra LLM call
  • IMAGE_DETECTION_MODE=tool: Agent control, visible tool call, agent decides when/if to call
  • Same core detection code used for both modes
  • Switch modes via environment variable only (no code changes needed)

3. Two-Step Recipe Process

  • Search recipes (ingredients + filters) via Spoonacular MCP
  • Get full recipe details (prevents hallucination)
  • All responses grounded in actual data

4. System Instructions Over Code

  • Agent behavior defined declaratively (not hard-coded)
  • Domain boundaries, preference extraction in prompts.py
  • Easy to modify behavior without code changes

Module Responsibilities

  • app.py: Orchestration (minimal ~50 lines)
  • agent.py: Agent initialization (factory pattern)
  • prompts.py: Behavior definition (system instructions)
  • hooks.py: Pre-hook configuration (factory pattern)
  • config.py: Environment and validation
  • logger.py: Structured logging
  • models.py: Data validation (Pydantic)
  • ingredients.py: Image processing (core functions)
  • mcp_tools/spoonacular.py: MCP initialization

Database & Storage

Development (Default):

  • SQLite + LanceDB (file-based, zero setup)
  • Database file: agno.db

Production (Optional):

  • PostgreSQL + pgvector (set DATABASE_URL)

Knowledge Base

Agent maintains a searchable LanceDB knowledge base - a high-performance vector database optimized for multi-modal AI applications:

LanceDB Capabilities:

  • Vector Search: Semantic similarity search using embeddings for natural language queries
  • Multi-Modal Support: Stores and searches across text, images, and structured data
  • High Performance: Columnar storage with SIMD-accelerated vector operations
  • ACID Transactions: Reliable data consistency with full transactional support
  • Zero-Copy Operations: Memory-efficient data access without unnecessary copying
  • Integration: Native Python API with SentenceTransformer embeddings (no API costs)

Automatically Stored:

  • Failed recipe searches (queries, failure reasons)
  • API errors and retries with resolution patterns
  • User preferences and dietary patterns over time
  • Recipe combinations that work/don't work with success rates
  • Ingredient detection accuracy and edge cases
  • Troubleshooting findings and system improvements

Used For:

  • Avoiding repeated failed searches through pattern recognition
  • Context-aware recommendations based on historical user behavior
  • Improved ingredient detection accuracy over time
  • Debugging search patterns and API reliability issues
  • Learning from user feedback and preference evolution

Search Enabled: Agent can search knowledge base for similar past issues before attempting new searches (configurable via search_knowledge=True in agent config).
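LanceDB and SentenceTransformer handle the real embedding and retrieval; the underlying idea, ranking stored findings by cosine similarity to the query vector, can be shown with a toy bag-of-words "embedding" standing in for real sentence embeddings:

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: word counts (a stand-in for SentenceTransformer vectors)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search_knowledge(query, documents, top_k=1):
    """Return the top_k stored findings most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]
```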

Design Decisions: Features Not Implemented

Skills (Not Used - Intentionally Simple)

  • Agno Skills provide progressive capability discovery via lazy-loaded instruction packages
  • Decision: Unnecessary for recipe agents. Our system instructions are concise and sufficient. Skills shine in complex multi-domain agents (code review, debugging, documentation) with reusable knowledge across teams. Recipe recommendations don't need this complexity.

Culture (Not Used - Overkill for Single Agent)

  • Agno Culture provides shared organizational knowledge/principles for multi-agent systems
  • Decision: Designed for teams of agents that learn from each other. Single agent systems don't benefit. Better alternatives: system instructions (declarative) or LearnedKnowledge (dynamic).

API Reference

POST /api/agents/chat

Send a message and/or image to get recipe recommendations.

Request Schema (ChatMessage):

{
  "message": "string (optional) - Natural language query like 'I have chicken and rice' (1-2000 chars, optional if images provided)",
  "images": "string or string[] (optional) - Image URL(s) or base64-encoded image(s) (optional if message provided)"
}

Notes:

  • Either message or images must be provided
  • If only images provided, defaults to: "What can I cook with these ingredients?"
  • Images can be URLs (http://, https://) or base64-encoded strings
  • Max 10 images per request
  • Automatically extracts ingredients from images using Gemini vision API

Response Schema (RecipeResponse):

{
  "response": "string - LLM-generated conversational response",
  "recipes": [
    {
      "title": "string - Recipe name",
      "description": "string (optional) - Brief description",
      "ingredients": "string[] - Ingredient list with quantities",
      "instructions": "string[] - Step-by-step cooking instructions",
      "prep_time_min": "integer - Preparation time in minutes (0-1440)",
      "cook_time_min": "integer - Cooking time in minutes (0-1440)",
      "source_url": "string (optional) - URL to original recipe"
    }
  ],
  "ingredients": "string[] (optional) - Detected/provided ingredients",
  "preferences": "object (optional) - User preferences (diet, cuisine, meal_type, intolerances)",
  "session_id": "string (optional) - Session identifier for continuity",
  "run_id": "string (optional) - Unique ID for this execution",
  "execution_time_ms": "integer - Total execution time in milliseconds"
}
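The real models use Pydantic v2; as an illustration only, a dataclass approximation of a single recipe entry, with the 0-1440 minute bounds enforced, conveys the shape:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Recipe:
    """Mirrors one entry of RecipeResponse.recipes (Pydantic in the real code)."""
    title: str
    ingredients: List[str]
    instructions: List[str]
    prep_time_min: int = 0
    cook_time_min: int = 0
    description: Optional[str] = None
    source_url: Optional[str] = None

    def __post_init__(self):
        # Enforce the 0-1440 minute ranges from the schema above.
        if not 0 <= self.prep_time_min <= 1440:
            raise ValueError("prep_time_min must be 0-1440")
        if not 0 <= self.cook_time_min <= 1440:
            raise ValueError("cook_time_min must be 0-1440")
```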

Error Responses:

  • 400 Bad Request - Missing both message and images, or malformed JSON
  • 413 Payload Too Large - Image exceeds MAX_IMAGE_SIZE_MB
  • 422 Unprocessable Entity - Off-topic request (guardrail enforced)
  • 500 Internal Server Error - Unexpected system error

Testing

Unit Tests (Fast, Isolated)

make test

Tests:

  • Configuration loading and validation
  • Pydantic model validation
  • Schema serialization/deserialization

Integration Evals (Agno Evals Framework)

# Terminal 1: Start AgentOS
make dev

# Terminal 2: Run evaluations
make eval

Uses Agno evals framework (AgentOS built-in evaluation system) for multi-dimensional testing:

  • AccuracyEval: Ingredient detection accuracy using LLM-as-judge
  • AgentAsJudgeEval: Recipe quality, preference persistence, guardrails, session isolation
  • ReliabilityEval: Correct tool sequence (find_recipes_by_ingredients → get_recipe_information)
  • PerformanceEval: Response time under 5 seconds

Coverage: 8 comprehensive eval tests covering all dimensions.

Viewing Results in UI:

  1. Keep AgentOS running (make dev)
  2. Run evaluations (make eval)
  3. Connect os.agno.com to http://localhost:7777
  4. View eval results in the "Evaluations" tab
  5. Results are persisted in tmp/recipe_agent_sessions.db (shared with agent)

Note: Requires GEMINI_API_KEY. SPOONACULAR_API_KEY is optional (only needed if USE_SPOONACULAR=true). Internet connection required for API calls.

REST API Integration Tests

# Start app in one terminal
make dev

# In another terminal, run REST API tests
make int-tests

Tests REST API endpoints directly using httpx client. Validates:

  • Successful requests: HTTP 200 with proper response schema
  • Session management: Preference persistence across requests, session isolation
  • Image handling: Base64 image upload and ingredient extraction
  • Error handling: Missing fields (400), invalid JSON (400), off-topic (422), oversized image (413)
  • Response validation: RecipeResponse schema compliance
  • Rapid requests: Handles sequential requests without resource exhaustion

Coverage: 10 test functions covering all HTTP status codes and edge cases.

Note: Requires running app (make dev in separate terminal). Tests connect to http://localhost:7777.

Troubleshooting

"Spoonacular MCP unreachable" on startup

Problem: Application fails to start with MCP connection error (when USE_SPOONACULAR=true).

Solution:

  1. Verify SPOONACULAR_API_KEY is set and valid
  2. Check internet connection
  3. Test API key: curl "https://api.spoonacular.com/recipes/1/information?apiKey=YOUR_KEY"
  4. If you don't have a Spoonacular key, use internal LLM mode: USE_SPOONACULAR=false make dev
  5. If the key is valid, the MCP server may be temporarily unavailable; try again later

"Spoonacular API key missing" error when starting app

Problem: Application fails because SPOONACULAR_API_KEY is not set and USE_SPOONACULAR=true.

Solution:

  • Option 1: Provide Spoonacular key: SPOONACULAR_API_KEY=your_key make dev
  • Option 2: Switch to internal LLM mode: USE_SPOONACULAR=false make dev
    • No API key needed, uses Gemini's internal knowledge for recipe generation
    • Still requires GEMINI_API_KEY for ingredient detection from images
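For reference, the two modes differ only in these environment variables (a hypothetical `.env` fragment with placeholder values; see `.env.example` for the full template):

```
# Mode A: Spoonacular MCP for recipe search
GEMINI_API_KEY=your_gemini_key
USE_SPOONACULAR=true
SPOONACULAR_API_KEY=your_spoonacular_key

# Mode B: internal LLM mode (no Spoonacular key needed)
GEMINI_API_KEY=your_gemini_key
USE_SPOONACULAR=false
```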

"API key invalid" errors during requests

Problem: Requests fail with authentication errors.

Solution:

  1. Verify .env file has correct keys (no extra spaces)
  2. Regenerate API keys from their respective consoles
  3. Check key hasn't expired or been revoked
  4. Restart application after updating keys

"Image too large" error

Problem: Image upload fails with 413 error.

Solution:

  1. Check image size: ls -lh path/to/image.jpg
  2. Resize if needed (ImageMagick): convert image.jpg -resize 1024x1024 image-resized.jpg
  3. Adjust threshold if needed: Set MAX_IMAGE_SIZE_MB=10 in .env
  4. Restart application
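Note that base64 encoding inflates a payload by roughly 33%, so an image just under the limit on disk can still be rejected after encoding. A quick way to estimate the effective upload size (an illustrative helper; whether the server applies the limit pre- or post-encoding is an assumption to verify against the code):

```python
import base64

def encoded_size_mb(image_bytes: bytes) -> float:
    """Size in MB of the base64-encoded payload, i.e. what actually
    travels over the wire when the image is uploaded."""
    return len(base64.b64encode(image_bytes)) / (1024 * 1024)
```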

"No ingredients detected" despite uploading image

Problem: Image uploaded but no ingredients extracted.

Solution:

  1. Verify image shows clear, recognizable food items
  2. Lower the confidence threshold: decrease MIN_INGREDIENT_CONFIDENCE in .env
  3. Try different image format (JPEG vs PNG)
  4. Check image is not too dark, blurry, or obscured

Port already in use

Problem: "Address already in use" error on startup.

Solution:

  1. Change port: PORT=8888 make dev
  2. Or kill process using port: lsof -ti:7777 | xargs kill -9

Database errors

Problem: SQLite or PostgreSQL connection errors.

Solution:

  1. For SQLite: Run make clear-memories
  2. For PostgreSQL: Verify DATABASE_URL is correct
  3. Check database server is running
  4. Ensure database user has create/read/write permissions

Project Structure

recipe-agent/
├── app.py                 # AgentOS entry point (~50 lines, minimal orchestration)
├── requirements.txt       # Python dependencies
├── .env.example           # Configuration template
├── Makefile               # Development commands (setup, dev, test, etc.)
├── README.md              # This file
│
├── src/                   # Application source code (organized by responsibility)
│   ├── utils/
│   │   ├── __init__.py
│   │   ├── config.py      # Environment configuration and validation
│   │   └── logger.py      # Structured logging infrastructure
│   │
│   ├── models/
│   │   ├── __init__.py
│   │   └── models.py      # Pydantic schemas (RecipeRequest, RecipeResponse)
│   │
│   ├── agents/
│   │   ├── __init__.py
│   │   └── agent.py       # Agent factory function (initialize_recipe_agent)
│   │
│   ├── prompts/
│   │   ├── __init__.py
│   │   └── prompts.py     # System instructions (SYSTEM_INSTRUCTIONS constant)
│   │
│   ├── hooks/
│   │   ├── __init__.py
│   │   └── hooks.py       # Pre-hooks factory (get_pre_hooks)
│   │
│   └── mcp_tools/
│       ├── __init__.py
│       ├── ingredients.py # Ingredient detection (core functions + pre-hook/tool)
│       └── spoonacular.py # SpoonacularMCP class (MCP initialization with retry logic)
│
├── tests/
│   ├── unit/              # Unit tests (isolated, no external APIs)
│   │   ├── test_config.py
│   │   ├── test_models.py
│   │   ├── test_logger.py
│   │   ├── test_ingredients.py
│   │   ├── test_mcp.py
│   │   └── test_app.py
│   └── integration/       # Integration tests (evals + REST API)
│       ├── conftest.py     # Pytest fixtures and configuration
│       ├── test_eval.py    # Agno evals framework (8 comprehensive tests)
│       └── test_integration.py # REST API endpoint tests (13 tests)
│
├── images/                # Sample test images
│   ├── sample_vegetables.jpg
│   ├── sample_fruits.jpg
│   └── sample_pantry.jpg
│
└── .docs/                 # Documentation
    ├── PRD.md             # Product requirements
    ├── DESIGN.md          # Technical design (factory pattern, architecture)
    └── IMPLEMENTATION_PLAN.md  # Task breakdown

Development Guidelines

Code Quality

  • Type Hints: All functions have type annotations
  • Docstrings: All public functions documented
  • Error Handling: Graceful error messages, no stack traces in API responses
  • Logging: Structured logging with no sensitive data (keys, images)
  • Testing: Unit tests for all new code, integration tests for flows

Adding New Features

To add a new recipe tool:

  1. Create MCPTools instance with new command
  2. Update system instructions to explain when to use it
  3. Add integration test for new tool
  4. Update README with new capability

To modify agent behavior:

  1. Edit system instructions in src/prompts/prompts.py (SYSTEM_INSTRUCTIONS constant)
  2. No code logic needed (instructions guide behavior)
  3. Test with different user inputs
  4. Update README if behavior changes significantly

To add new preferences:

  1. Update the RecipeRequest model in src/models/models.py
  2. Update system instructions to extract new preference
  3. Preference automatically tracked per session
  4. Add test case for new preference
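Step 1 above might look like the following, assuming a Pydantic `RecipeRequest` roughly shaped like the one in `src/models/models.py` (the existing fields here are simplified, and `max_cook_time_minutes` is a hypothetical new preference):

```python
from typing import Optional
from pydantic import BaseModel

class RecipeRequest(BaseModel):
    """Simplified sketch of the request schema; the real fields
    live in src/models/models.py."""
    message: Optional[str] = None
    images: Optional[list[str]] = None  # base64-encoded images
    # New preference added in step 1 (hypothetical field name):
    max_cook_time_minutes: Optional[int] = None
```

With the schema updated and the system instructions told to extract the new field, the agent's session memory picks the preference up without further code changes.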

License

This is a project demonstrating production-quality GenAI system design.


Questions or Issues? See Troubleshooting section above.

About

Simple AI Agent to demonstrate Agno AI capabilities
