Azure Content Understanding

A collection of Jupyter notebooks demonstrating the capabilities of Azure Content Understanding (GA version, API 2025-11-01). These demos show how to process and analyze documents, images, videos, and audio files, using generative AI to extract structured data from unstructured content.

🎯 Overview

Azure Content Understanding is a Generally Available (GA) Azure AI service that uses generative AI to transform unstructured content into structured, searchable data. This repository contains practical, ready-to-run notebooks that demonstrate various capabilities of the service across different content types.

🔍 What is Azure Content Understanding?

Azure Content Understanding in Foundry Tools is an AI service available as part of the Microsoft Foundry Resource in Azure. It processes and ingests content of many types:

Documents (PDF, DOCX, XLSX, images)
Videos (MP4, MOV, AVI)
Audio (WAV, MP3, M4A)
Images (JPG, PNG, TIFF)

The service offers a streamlined process to reason over large amounts of unstructured data, accelerating time-to-value by generating structured output that can be integrated into automation and analytical workflows.

✨ Key Features

Content Extraction

Document Processing: OCR, layout analysis, table recognition, and structural element detection
Video Analysis: Frame extraction, shot detection, speech-to-text transcription
Audio Processing: Speech-to-text transcription with high accuracy
Image Analysis: Visual content understanding and data extraction

Generative Capabilities

Field Extraction: Define custom schemas to extract specific fields from any content type
Classification: Categorize content into up to 200 categories with integrated classification
Content Summarization: Generate summaries and insights from extracted content
Face Description: Generate textual descriptions of faces in video and image content (with proper authorization)
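
As a rough illustration of the field extraction idea, the sketch below models a custom schema as a plain Python dictionary and checks it locally before it would be sent to the service. The field names (`VendorName`, `InvoiceTotal`), the `fieldSchema`/`fields` layout, the list of allowed types, and the helper `validate_field_schema` are all assumptions made for illustration; the exact request format should be taken from the service documentation and the notebooks.

```python
from typing import Any, Dict, List

# Hypothetical set of field types; the real service may accept a different set.
ALLOWED_TYPES = {"string", "number", "date", "boolean", "array", "object"}

# Hypothetical schema layout, used only to illustrate the idea of a custom schema.
invoice_schema: Dict[str, Any] = {
    "description": "Sample invoice analyzer",
    "fieldSchema": {
        "fields": {
            "VendorName": {"type": "string", "description": "Name of the vendor"},
            "InvoiceTotal": {"type": "number", "description": "Total amount due"},
        }
    },
}


def validate_field_schema(schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the schema (empty when none are found)."""
    problems: List[str] = []
    fields = schema.get("fieldSchema", {}).get("fields", {})
    if not fields:
        problems.append("schema defines no fields")
    for name, spec in fields.items():
        if spec.get("type") not in ALLOWED_TYPES:
            problems.append(f"{name}: unsupported type {spec.get('type')!r}")
        if not spec.get("description"):
            problems.append(f"{name}: missing description")
    return problems


print(validate_field_schema(invoice_schema))
```

Checking the schema locally first can surface typos in field definitions before a request is made.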

Enterprise Features (GA)

Microsoft Entra ID authentication
Managed identities support
Customer-managed keys
Virtual networks and private endpoints
Transparent pricing model
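
The two authentication styles (API key and Microsoft Entra ID) can be sketched as a small helper that builds request headers. This is a minimal sketch, not the notebooks' own code: `build_auth_headers` is a hypothetical helper, and it assumes the token has already been obtained elsewhere, for example with `DefaultAzureCredential` from the `azure-identity` package, which is not imported here so that the block stays standard-library only.

```python
from typing import Dict, Optional


def build_auth_headers(
    api_key: Optional[str] = None,
    bearer_token: Optional[str] = None,
) -> Dict[str, str]:
    """Build HTTP headers for either key-based or Entra ID (token) authentication.

    A bearer token takes precedence when both values are supplied.
    """
    if bearer_token:
        # Token assumed to come from Microsoft Entra ID, e.g. via azure-identity:
        # DefaultAzureCredential().get_token("https://cognitiveservices.azure.com/.default").token
        return {"Authorization": f"Bearer {bearer_token}"}
    if api_key:
        # Header name used by Azure AI services for key-based authentication.
        return {"Ocp-Apim-Subscription-Key": api_key}
    raise ValueError("Provide either api_key or bearer_token")


# Placeholder values only; never hard-code real credentials.
print(build_auth_headers(api_key="<your-api-key>"))
```

Preferring the token when both are present keeps key-based access as a fallback for local experiments.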

📚 Python Notebooks

Notebook	Description
Managing analyzers	Learn how to create, configure, and manage analyzers for different content types.
Field extraction	Demonstrates extracting predefined fields from documents using built-in schemas.
Custom field extraction	Shows how to define and extract custom fields tailored to your business needs.
Classifier	Explains how to classify content into categories using integrated classification APIs.
Document content extraction	Focuses on OCR, layout analysis, and table recognition for multi-page documents.
Audio extraction	Covers speech-to-text transcription and audio content analysis with speaker identification.
Video content extraction	Demonstrates video frame extraction, scene detection, and speech transcription from video.

📦 Prerequisites

Before running these notebooks, ensure you have:

Azure Subscription
- An active Azure subscription (Create one for free)
Azure AI Foundry Resource
- Create an Azure AI Foundry resource
- Note your endpoint URL and API key
Model Deployments (Required for prebuilt analyzers)
- Deploy the GPT-4.1-mini and GPT-4.1 models
- Deploy the text-embedding-3-large model
- See deployment documentation
Role Assignment
- Grant yourself the Cognitive Services User role on the resource
- This is required even if you're the resource owner
Python Environment
- Python 3.8 or higher
- Jupyter Notebook or JupyterLab
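
Before opening a notebook, it can help to confirm that the endpoint and key are available and that the Python version is recent enough. The sketch below reads a dotenv-style file using only the standard library. The variable names `AZURE_AI_ENDPOINT` and `AZURE_AI_API_KEY` and both helper functions are hypothetical; use whatever names your own environment file defines.

```python
import sys
import tempfile
from pathlib import Path
from typing import Dict, List, Sequence


def load_env_file(path: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_prerequisites(
    settings: Dict[str, str],
    required: Sequence[str] = ("AZURE_AI_ENDPOINT", "AZURE_AI_API_KEY"),  # hypothetical names
) -> List[str]:
    """Return the names of missing prerequisites (empty when all are present)."""
    missing = [name for name in required if not settings.get(name)]
    if sys.version_info < (3, 8):
        missing.append("Python 3.8 or higher")
    return missing


# Demonstration with a temporary file that holds placeholder values.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / "sample.env"
    env_path.write_text(
        '# sample settings\nAZURE_AI_ENDPOINT="https://example.invalid/"\nAZURE_AI_API_KEY=placeholder\n',
        encoding="utf-8",
    )
    sample_settings = load_env_file(str(env_path))

print(check_prerequisites(sample_settings))
```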

📓 Notebooks Overview

This repository contains demonstration notebooks covering:

Document Analysis

Document field extraction with custom schemas
Layout analysis and table extraction
Multi-page document processing
Classification and routing
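
Classification and routing can be pictured as a lookup from a predicted category to a handler. The sketch below is a generic illustration under stated assumptions: the category names, the handlers, and `make_router` are hypothetical, and the category string is assumed to have already been read from an analysis result.

```python
from typing import Callable, Dict

MAX_CATEGORIES = 200  # category limit stated in the Key Features section

Handler = Callable[[str], str]


def make_router(handlers: Dict[str, Handler], default: Handler) -> Callable[[str, str], str]:
    """Return a function that sends a document to the handler for its category."""
    if len(handlers) > MAX_CATEGORIES:
        raise ValueError(f"at most {MAX_CATEGORIES} categories are supported")

    def route(category: str, document_id: str) -> str:
        # Unknown categories fall back to the default handler.
        return handlers.get(category, default)(document_id)

    return route


# Hypothetical categories and handlers.
route = make_router(
    {
        "invoice": lambda doc: f"{doc} -> accounts payable",
        "contract": lambda doc: f"{doc} -> legal review",
    },
    default=lambda doc: f"{doc} -> manual triage",
)

print(route("invoice", "doc-001"))
print(route("unknown", "doc-002"))
```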

Image Processing

Image content extraction
Visual question answering
Figure detection and analysis
Object and text recognition

Video Analysis

Video frame extraction
Scene detection and segmentation
Speech transcription from video
Visual content summarization

Audio Processing

Audio transcription
Speaker identification
Audio content analysis

Advanced Scenarios

Multi-modal content analysis
RAG (Retrieval-Augmented Generation) integration
Batch processing workflows
Custom analyzer creation

💼 Use Cases

Azure Content Understanding is ideal for:

Financial Services: Tax document processing, mortgage application analysis
Healthcare: Medical record extraction and analysis
Legal: Contract review and clause extraction
Manufacturing: Quality control and defect detection
Retail: Inventory management and shelf analysis
Media: Content cataloging and metadata extraction
Analytics & Reporting: Enhanced business intelligence from unstructured data

📌 API Version

These notebooks use the GA API version: 2025-11-01

This is the Generally Available version with production-ready features, enterprise security, and enhanced capabilities compared to previous preview versions.
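
To show where the API version enters a request, the sketch below builds a request URL with `api-version=2025-11-01` as a query parameter. The path segment `/contentunderstanding/analyzers/{analyzer_id}:analyze` is an assumption based on the shape of the preview REST API, and `build_analyze_url` is a hypothetical helper; confirm the exact route against the GA REST reference before relying on it.

```python
from urllib.parse import quote, urlencode

API_VERSION = "2025-11-01"  # GA API version used by these notebooks


def build_analyze_url(endpoint: str, analyzer_id: str, api_version: str = API_VERSION) -> str:
    """Build an analyze URL; the path layout is an assumption, not a confirmed route."""
    base = endpoint.rstrip("/")
    path = f"/contentunderstanding/analyzers/{quote(analyzer_id, safe='')}:analyze"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"


# Placeholder endpoint and analyzer name.
print(build_analyze_url("https://example.cognitiveservices.azure.com/", "my-analyzer"))
```

Keeping the version in one constant makes it easier to update every request when moving between API versions.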

Migration from Preview

If you're migrating from preview API versions (2024-12-01-preview or 2025-05-01-preview), refer to the migration guide.

Breaking Changes from Preview

Managed capacity for preview models is retired (bring-your-own model deployments are now required)
Dedicated classifier APIs deprecated (now integrated in analyzer API)
Video segmentation unified with classification capabilities

📚 Resources

Documentation

Related Repositories

Responsible AI

Responsible AI Guidelines

👤 Author

Serge Retkowsky

Platform	Link
GitHub	https://github.com/retkowsky
LinkedIn	https://www.linkedin.com/in/serger/
YouTube	https://www.youtube.com/@serge1840/videos
Medium	https://medium.com/@sergems18
Role	AI & APPS Global Black Belt @ Microsoft France

Last Updated: 02-December-2025

For questions, issues, or feedback, please open an issue in this repository or contact through the channels above.

