Python SDK for real-time video AI inference using the Overshoot Media Gateway API.
- Real-time video streaming via WebRTC
- AI inference on video frames with configurable prompts
- Action detection with built-in storage and querying
- Stream relay for external video sources (mobile apps, WebSocket feeds)
- Cross-platform camera support using OpenCV (Windows, macOS, Linux)
- Frame preprocessing with custom callbacks (OpenCV, MediaPipe, etc.)
- Multiple video sources: camera or video files
- Structured output with JSON schema support
- Async/await support with context managers
pip install aiohttp aiortc opencv-python numpy

Optional dependencies:
pip install mediapipe # For hand/pose/face detection preprocessing
pip install python-dotenv # For .env file support

| Component | Description |
|---|---|
RealtimeVision |
Stream from local camera/video to Overshoot |
OvershootStreamRelay |
Relay frames from external sources to Overshoot |
ActionDetector |
High-level action detection with storage |
ActionStore |
Thread-safe storage for detected actions |
OvershootHttpClient |
Low-level HTTP client |
import asyncio
from overshoot import RealtimeVision, RealtimeVisionConfig
async def main():
config = RealtimeVisionConfig(
api_url="https://cluster1.overshoot.ai/api/v0.2",
api_key="your-api-key",
prompt="Describe what you see in this video",
on_result=lambda r: print(f"Result: {r.result}"),
)
async with RealtimeVision(config) as vision:
await asyncio.sleep(60) # Stream for 60 seconds
asyncio.run(main())

Detect specific actions in video with automatic storage and querying.
import asyncio
from overshoot import ActionDetector
async def main():
detector = ActionDetector(
api_url="https://cluster1.overshoot.ai/api/v0.2",
api_key="your-api-key",
actions=["waving hand", "thumbs up", "pointing", "clapping"],
min_confidence=0.6,
on_action=lambda a: print(f"Detected: {a.action} ({a.confidence:.0%})"),
)
# Start real-time detection
await detector.start()
await asyncio.sleep(60)
await detector.stop()
# Query detected actions
waves = detector.get_actions(action="waving hand")
recent = detector.get_actions(last_seconds=10)
high_conf = detector.get_actions(min_confidence=0.8)
# Get summary
print(detector.summary()) # {"waving hand": 5, "thumbs up": 2}
# Export results
detector.export("results.json", format="json")
detector.export("results.csv", format="csv")
asyncio.run(main())

Analyze a pre-recorded video file instead of a live stream:

await detector.analyze_video("path/to/video.mp4")
for action in detector.actions:
print(f"[{action.timestamp:.1f}s] {action.action} ({action.confidence:.0%})")

Thread-safe storage for detected actions with querying and export.
from overshoot import ActionStore, DetectedAction
store = ActionStore()
# Add actions
store.add(DetectedAction("waving", timestamp=1.5, frame_number=45, confidence=0.92))
store.add(DetectedAction("thumbs up", timestamp=3.2, frame_number=96, confidence=0.85))
# Query with filters
all_waves = store.get_actions(action="waving")
time_range = store.get_actions(start_time=1.0, end_time=5.0)
high_conf = store.get_actions(min_confidence=0.9)
recent = store.get_actions(last_seconds=10)
# Combine filters
combined = store.get_actions(action="waving", min_confidence=0.8)
# Summary statistics
print(store.summary()) # {"waving": 1, "thumbs up": 1}
print(len(store)) # 2
# Export
store.export_json("actions.json")
store.export_csv("actions.csv")

@dataclass
class DetectedAction:
action: str # Action name (e.g., "waving hand")
timestamp: float # Seconds from stream start
frame_number: int # Frame index
confidence: float # 0.0 to 1.0
duration: Optional[float] = None
    metadata: dict = field(default_factory=dict)  # Latency info, custom data (dataclasses require default_factory for mutable defaults)

Relay video frames from external sources (mobile apps, WebSocket feeds) to Overshoot.
import asyncio
from overshoot import OvershootStreamRelay
async def main():
relay = OvershootStreamRelay(
api_url="https://cluster1.overshoot.ai/api/v0.2",
api_key="your-api-key",
prompt="Describe what you see",
on_result=lambda r: print(r["result"]),
)
await relay.start()
# Push frames from your source (RGB24 numpy array)
relay.push_frame(frame_data, timestamp=time.time())
await relay.stop()
asyncio.run(main())

Combine StreamRelay with ActionStore for action detection on external video sources:
import asyncio
import json
from overshoot import OvershootStreamRelay, ActionStore, DetectedAction
store = ActionStore()
def on_result(result):
"""Parse inference results and add to store."""
if result.get("type") != "inference":
return
data = json.loads(result["result"])
for action_data in data.get("detected_actions", []):
if action_data.get("detected") and action_data.get("confidence", 0) >= 0.6:
store.add(DetectedAction(
action=action_data["action"],
timestamp=result.get("timestamp", 0),
frame_number=0,
confidence=action_data["confidence"],
))
# Action detection schema
output_schema = {
"type": "object",
"properties": {
"detected_actions": {
"type": "array",
"items": {
"type": "object",
"properties": {
"action": {"type": "string", "enum": ["waving", "thumbs up"]},
"confidence": {"type": "number"},
"detected": {"type": "boolean"},
},
"required": ["action", "confidence", "detected"],
},
},
},
}
relay = OvershootStreamRelay(
api_url="https://cluster1.overshoot.ai/api/v0.2",
api_key="your-api-key",
prompt="Detect waving or thumbs up gestures",
on_result=on_result,
output_schema=output_schema,
)
async with relay:
# Push frames...
pass
# Query results
print(store.summary())
store.export_json("results.json")

| Parameter | Type | Default | Description |
|---|---|---|---|
api_url |
str |
Required | API endpoint URL |
api_key |
str |
Required | Your API key |
prompt |
str |
Required | AI prompt |
on_result |
Callable |
Required | Result callback |
on_error |
Callable |
None | Error callback |
model |
str |
"gemini-2.0-flash" |
Model name |
backend |
str |
"gemini" |
Backend name |
output_schema |
dict |
None | JSON schema for structured output |
width |
int |
640 | Frame width |
height |
int |
480 | Frame height |
fps |
int |
15 | Frames per second |
sampling_ratio |
float |
0.8 | Frame sampling ratio |
clip_length_seconds |
float |
0.5 | Clip duration |
delay_seconds |
float |
0.5 | Processing delay |
Live camera with action detection overlay:
python demo_action_detector.py

Features:
- Live camera preview with action overlay
- Detected actions displayed with confidence bars
- Press S for store summary, E to export, Q to quit
StreamRelay with ActionStore integration:
python demo_stream_relay_actions.py

Features:
- Push frames via StreamRelay
- Action detection with structured output
- Results stored in ActionStore
- Live overlay display
from overshoot import CameraSource
source = CameraSource() # Default camera
source = CameraSource(device_index=1) # Second camera
source = CameraSource(frame_processor=fn) # With preprocessing

from overshoot import VideoFileSource
source = VideoFileSource(file_path="/path/to/video.mp4")

Process frames before streaming using the frame_processor callback.
import cv2
from datetime import datetime
def add_timestamp(frame):
timestamp = datetime.now().strftime("%H:%M:%S")
cv2.putText(frame, timestamp, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
return frame
source = CameraSource(frame_processor=add_timestamp)

import cv2
import mediapipe as mp
mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils
hands = mp_hands.Hands()
def add_hand_landmarks(frame):
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
results = hands.process(rgb)
if results.multi_hand_landmarks:
for hand in results.multi_hand_landmarks:
mp_draw.draw_landmarks(frame, hand, mp_hands.HAND_CONNECTIONS)
return frame
source = CameraSource(frame_processor=add_hand_landmarks)

| Parameter | Type | Required | Description |
|---|---|---|---|
api_url |
str |
Yes | API endpoint URL |
api_key |
str |
Yes | Your API key |
prompt |
str |
Yes | AI prompt for video analysis |
on_result |
Callable |
Yes | Callback for inference results |
source |
StreamSource |
No | Video source (camera or file) |
backend |
str |
No | "overshoot" or "gemini" |
model |
str |
No | Model name |
output_schema |
dict |
No | JSON schema for structured output |
on_error |
Callable |
No | Error callback |
processing |
dict |
No | Processing parameters |
debug |
bool |
No | Enable debug logging |
config = RealtimeVisionConfig(
# ... required params ...
processing={
"sampling_ratio": 0.1, # Frame sampling (0.0-1.0)
"fps": 30, # Frames per second (1-120)
"clip_length_seconds": 1.0, # Clip duration (0.1-60.0)
"delay_seconds": 1.0, # Processing delay (0.0-60.0)
},
)

Use JSON schema for structured responses:
config = RealtimeVisionConfig(
# ... required params ...
output_schema={
"type": "object",
"properties": {
"objects": {
"type": "array",
"items": {"type": "string"}
},
"count": {"type": "integer"}
}
},
)

from overshoot import (
ApiError,
ValidationError,
UnauthorizedError,
NotFoundError,
ServerError,
NetworkError,
)
try:
async with RealtimeVision(config) as vision:
await asyncio.sleep(60)
except UnauthorizedError:
print("Invalid API key")
except ValidationError as e:
print(f"Invalid configuration: {e.message}")
print(f"Details: {e.details}")
except NetworkError as e:
print(f"Connection failed: {e}")
except ApiError as e:
print(f"API error {e.status_code}: {e.message}")

| Method | Description |
|---|---|
start() |
Start the video stream |
stop() |
Stop the stream and release resources |
update_prompt(prompt) |
Update the AI prompt |
submit_feedback(rating, category, feedback) |
Submit feedback |
get_stream_id() |
Get current stream ID |
is_active() |
Check if stream is running |
| Method | Description |
|---|---|
start() |
Start real-time detection from camera |
stop() |
Stop detection |
analyze_video(path) |
Analyze a video file |
get_actions(**filters) |
Query detected actions |
summary() |
Get action counts by type |
export(path, format) |
Export to JSON or CSV |
clear() |
Clear stored actions |
| Method | Description |
|---|---|
add(action) |
Add a detected action |
add_many(actions) |
Add multiple actions |
get_actions(**filters) |
Query with filters |
summary() |
Get counts by action type |
export_json(path) |
Export to JSON |
export_csv(path) |
Export to CSV |
clear() |
Remove all actions |
| Method | Description |
|---|---|
start() |
Start the relay |
stop() |
Stop and release resources |
push_frame(frame, timestamp) |
Push a frame to stream |
update_prompt(prompt) |
Update the AI prompt |
overshoot/
├── __init__.py # Public API exports
├── constants.py # Default values and limits
├── exceptions.py # Error classes
├── types.py # Data classes
├── http_client.py # Low-level HTTP client
├── realtime_vision.py # RealtimeVision + OpenCVCameraTrack
├── stream_relay.py # OvershootStreamRelay
├── action_detector.py # ActionDetector
└── action_store.py # ActionStore
- Python 3.9+
- aiohttp
- aiortc
- opencv-python
- numpy
Optional:
- mediapipe (for ML preprocessing)
- python-dotenv (for .env support)
MIT