TheDataMaestros/datametronome

🎵 DataMetronome

Real-time Data Quality & Anomaly Detection Platform

Python 3.13 License: MIT PostgreSQL

🚀 What is DataMetronome?

DataMetronome is an open-source, community-driven platform that provides real-time data quality monitoring, anomaly detection, and comprehensive analytics. Built with modern Python technologies, it's designed to help data engineers, DevOps teams, and data scientists ensure their data pipelines are healthy and reliable.

✨ Key Features

🔌 High-Performance DataPulse Connectors

  • asyncpg - Lightning-fast async PostgreSQL driver
  • psycopg3 - Modern, feature-rich PostgreSQL connector
  • SQLAlchemy - ORM integration with async support
  • UUID optimization - Distributed system ready
  • Connection pooling - Enterprise-grade performance

🤖 ML-Powered Anomaly Detection

  • Isolation Forest algorithm for statistical outliers
  • Real-time monitoring of data quality metrics
  • Statistical analysis with configurable thresholds
  • Pattern recognition across multiple data sources
  • Automated alerting for data quality issues
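
As a rough illustration of the "statistical analysis with configurable thresholds" idea, the sketch below flags outliers with a modified z-score based on the median absolute deviation (MAD). It is a generic technique, not DataMetronome's detection engine (which the list above describes as using Isolation Forest); the function name `mad_outliers`, the default threshold of 3.5, and the sample data are assumptions made for this example.

```python
from statistics import median


def mad_outliers(values: list[float], threshold: float = 3.5) -> list[int]:
    """Return indices of values whose modified z-score exceeds `threshold`.

    The median and the median absolute deviation (MAD) are less distorted by
    the outliers themselves than the mean and standard deviation.
    """
    if not values:
        return []
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half of the values equal the median, so there is no
        # spread to scale by; report nothing rather than divide by zero.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard
    # z-score for normally distributed data.
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


# Hypothetical metric: rows loaded per hour by a pipeline.
row_counts = [1020, 998, 1005, 1011, 987, 1003, 4500, 1009]
print(mad_outliers(row_counts))  # -> [6]
```

Lowering `threshold` makes the check more sensitive; raising it reduces false alarms at the cost of missing milder deviations.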

📊 Beautiful Interactive Dashboard

  • Modern web UI built for real-time monitoring
  • Responsive layouts tuned for analysts and SREs
  • Chart.js visualizations for trends, anomalies, and forecasting
  • Interactive drilldowns across clefs, staves, and incident timelines
  • Dark/light themes with professional styling out of the box

🏗️ Modern Architecture

  • Modular design - Easy to extend and customize
  • Async-first - High-performance, non-blocking operations
  • Clean interfaces - Simple, consistent APIs
  • Standalone testing - Each datapulse has comprehensive, independent tests
  • Docker support - Easy deployment and testing
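
To make the "async-first" point concrete, here is a minimal sketch of running several I/O-bound checks concurrently with `asyncio.gather`. The check names and the `run_check` / `run_all_checks` helpers are hypothetical and only simulate waiting with `asyncio.sleep`; they are not part of the DataMetronome API.

```python
import asyncio


async def run_check(name: str, delay: float) -> tuple[str, str]:
    """Stand-in for an I/O-bound quality check (for example a database query)."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return name, "ok"


async def run_all_checks() -> dict[str, str]:
    # gather() schedules all checks concurrently, so the total wait is roughly
    # the slowest check rather than the sum of all delays. Results come back
    # in the order the awaitables were passed in.
    results = await asyncio.gather(
        run_check("row_count", 0.02),
        run_check("null_ratio", 0.01),
        run_check("freshness", 0.03),
    )
    return dict(results)


if __name__ == "__main__":
    print(asyncio.run(run_all_checks()))
```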

🖼️ Visual Showcase

📊 Web Dashboard

DataMetronome Dashboard

UI experience showcasing trends, anomaly insights, and clef status

🎯 See it in action!

cd ui-nuxt
npm install
npm run dev

🎯 Perfect For

  • Data Engineers - Build robust, monitored data pipelines
  • DevOps Teams - Monitor data infrastructure health
  • Data Scientists - Ensure data quality for ML models
  • Startups - Get enterprise-grade tools on a budget
  • Open Source Contributors - Extend and improve the platform
  • Enterprise Teams - Deploy in production environments

🚀 Quick Start

Prerequisites

  • Python 3.13+
  • Docker and Docker Compose
  • uv package manager

1. Clone the Repository

git clone https://github.com/datametronome/datametronome.git
cd datametronome

2. Start Test Infrastructure

docker-compose -f docker-compose.test.yml up -d

3. Install Python Dependencies

uv pip install -e ./datametronome/pulse/core
uv pip install -e ./datametronome/pulse/postgres

4. Launch the UI

cd ui-nuxt
npm install
npm run dev

Once the dev server is running, the dashboard is available at http://localhost:3000 with full anomaly detection capabilities.

🧪 Testing Architecture

DataMetronome uses a standalone testing approach where each datapulse contains its own comprehensive test suite. This allows you to:

  • Test independently - Each datapulse can be tested without the entire ecosystem
  • Plug in and out - Easily add/remove datapulses as needed
  • Deploy separately - Each datapulse can be a standalone package
  • Maintain independently - Isolated dependencies and test coverage

Quick Testing Examples

# Test the core datapulse
cd datametronome/pulse/core
make install && make test

# Test the PostgreSQL datapulse (AsyncPG)
cd datametronome/pulse/postgres
make install && make test

# Test the PostgreSQL datapulse (Psycopg3)
cd datametronome/pulse/postgres-psycopg3
make install && make test

# Test the PostgreSQL datapulse (SQLAlchemy)
cd datametronome/pulse/postgres-sqlalchemy
make install && make test

For detailed testing information, see TESTING_ARCHITECTURE.md.

🌟 Showcase: What You'll See

📱 Dashboard Overview

The DataMetronome dashboard is organized into five tabs:

📊 Overview Tab

  • Real-time system health metrics
  • Data quality score with beautiful visualizations
  • Key performance indicators and statistics
  • Professional metric cards with gradients

🚨 Anomalies Tab

  • Live anomaly detection from PostgreSQL
  • Statistical analysis of data quality issues
  • Detailed breakdown by table and issue type
  • Actionable insights for immediate remediation

ML Anomalies Tab

  • Machine learning powered detection using Isolation Forest
  • Advanced outlier detection for numerical data
  • Interactive visualizations showing normal vs anomalous patterns
  • ML performance metrics and confidence scores

📈 Trends & Patterns Tab ⭐ NEW!

  • Data Distribution Analysis - Histograms with anomaly highlighting
  • Time Series Analysis - User registrations and orders over time
  • Correlation Analysis - Age vs order amount relationships with trend lines
  • Anomaly Pattern Analysis - Heatmaps and trend analysis over time

🔍 Investigation Tab

  • Custom SQL queries for deep data exploration
  • Data profiling tools for comprehensive table analysis
  • Sample data viewing for quick insights
  • Interactive data exploration capabilities
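
As a sketch of what column-level data profiling can look like, the example below counts rows, NULLs, and distinct values for one column. It uses an in-memory SQLite database from the standard library as a stand-in for PostgreSQL, and the `profile_column` helper, the `users` table, and its contents are assumptions made for illustration only.

```python
import sqlite3


def profile_column(conn: sqlite3.Connection, table: str, column: str) -> dict[str, int]:
    """Return row, NULL, and distinct-value counts for one column.

    `table` and `column` are interpolated into the SQL text, so they must come
    from trusted code, never from user input.
    """
    query = f"""
        SELECT COUNT(*),
               COUNT(*) - COUNT("{column}"),
               COUNT(DISTINCT "{column}")
        FROM "{table}"
    """
    rows, nulls, distinct = conn.execute(query).fetchone()
    return {"rows": rows, "nulls": nulls, "distinct": distinct}


# In-memory SQLite stands in for the monitored database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, age INTEGER)")
conn.executemany("INSERT INTO users (age) VALUES (?)", [(31,), (None,), (31,), (47,)])
print(profile_column(conn, "users", "age"))  # -> {'rows': 4, 'nulls': 1, 'distinct': 2}
```

`COUNT(column)` skips NULLs while `COUNT(*)` does not, which is why their difference gives the NULL count.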

🎨 Visualization Features

  • Interactive Histograms with anomaly highlighting
  • Time Series Charts with trend analysis
  • Scatter Plots with correlation analysis and trend lines
  • Heatmaps for anomaly distribution patterns
  • Real-time Metrics with professional styling
  • Responsive Design that works on any device

🔧 Technical Architecture

Core Components

  • DataPulse Core - Abstract interfaces and base classes
  • PostgreSQL Connectors - High-performance database drivers
  • Anomaly Detection Engine - Statistical + ML algorithms
  • Web Dashboard - Dedicated operational console
  • API Layer - FastAPI backend for integrations
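
To illustrate the "abstract interfaces and base classes" idea behind DataPulse Core, here is a minimal sketch of a connector interface with one concrete implementation. The names `DataPulseConnector` and `SQLiteConnector` and their methods are hypothetical; the real DataPulse Core interface may look quite different. SQLite is used only because it ships with Python.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Any


class DataPulseConnector(ABC):
    """Hypothetical base class; the real DataPulse Core interface may differ."""

    @abstractmethod
    def connect(self) -> None: ...

    @abstractmethod
    def query(self, sql: str, params: tuple[Any, ...] = ()) -> list[tuple[Any, ...]]: ...

    @abstractmethod
    def close(self) -> None: ...


class SQLiteConnector(DataPulseConnector):
    """Illustrative implementation backed by the standard-library sqlite3 module."""

    def __init__(self, path: str = ":memory:") -> None:
        self._path = path
        self._conn = None  # set by connect()

    def connect(self) -> None:
        self._conn = sqlite3.connect(self._path)

    def query(self, sql: str, params: tuple[Any, ...] = ()) -> list[tuple[Any, ...]]:
        if self._conn is None:
            raise RuntimeError("connect() must be called before query()")
        return self._conn.execute(sql, params).fetchall()

    def close(self) -> None:
        if self._conn is not None:
            self._conn.close()
            self._conn = None


connector = SQLiteConnector()
connector.connect()
print(connector.query("SELECT ? + ?", (2, 3)))  # -> [(5,)]
connector.close()
```

Code written against the base class does not need to know which driver sits underneath, which is what makes connectors swappable.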

Technology Stack

  • Language: Python 3.13 (latest features)
  • Database: PostgreSQL 15+ with UUID extensions
  • ML Framework: scikit-learn for anomaly detection
  • Frontend: SPA web application
  • Charts: Chart.js for interactive visualizations
  • Containerization: Docker for easy deployment
  • Package Management: uv for fast dependency resolution

Architecture Overview

graph TB
    subgraph "DataMetronome Platform"
        A[📊 UI Dashboard] --> B[🔌 DataPulse Connectors]
        B --> C[📈 PostgreSQL Database]
        B --> D[🤖 Anomaly Detection Engine]
        D --> E[📊 ML Algorithms]
        D --> F[📈 Statistical Analysis]
        A --> G[📱 Real-time Monitoring]
        G --> H[🚨 Alert System]
    end

    subgraph "Data Sources"
        I[🗄️ PostgreSQL]
        J[📊 SQLite]
        K[🔗 Custom Connectors]
    end

    C --> I
    B --> J
    B --> K

    style A fill:#ff6b6b
    style D fill:#4ecdc4
    style E fill:#45b7d1
    style F fill:#96ceb4

Performance Highlights

  • asyncpg inserts about 2.3x faster than the SQLAlchemy connector (see Performance Benchmarks)
  • Real-time monitoring with sub-second response
  • Scalable architecture for enterprise workloads
  • Optimized UUID handling for distributed systems

📊 Performance Benchmarks

Our benchmarks compare the three PostgreSQL connectors on inserts and queries:

Insert Performance (Records/Second)

  • asyncpg: 34,981 records/sec (🥇 Winner)
  • SQLAlchemy: 15,137 records/sec
  • psycopg3: 1,615 records/sec

Query Performance (Queries/Second)

  • psycopg3: 788 queries/sec (🥇 Winner)
  • asyncpg: 515 queries/sec
  • SQLAlchemy: 451 queries/sec
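
To put the figures above in relative terms, the short sketch below divides the best throughput in each category by every connector's throughput. The numbers are copied from the lists above; the `speedups` helper is just an illustration and not part of the project.

```python
# Throughput figures copied from the benchmark lists above.
insert_rps = {"asyncpg": 34_981, "SQLAlchemy": 15_137, "psycopg3": 1_615}
query_qps = {"psycopg3": 788, "asyncpg": 515, "SQLAlchemy": 451}


def speedups(results: dict[str, int]) -> dict[str, float]:
    """Express the fastest entry's throughput as a multiple of each entry's."""
    best = max(results.values())
    return {name: round(best / value, 2) for name, value in results.items()}


print(speedups(insert_rps))  # -> {'asyncpg': 1.0, 'SQLAlchemy': 2.31, 'psycopg3': 21.66}
print(speedups(query_qps))   # -> {'psycopg3': 1.0, 'asyncpg': 1.53, 'SQLAlchemy': 1.75}
```

In these results asyncpg leads on inserts while psycopg3 leads on queries, so the better choice depends on whether a workload is write-heavy or read-heavy.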

🤝 Get Involved

For Contributors

  • ⭐ Star the repository on GitHub
  • 🐛 Report bugs and request features
  • 💻 Contribute code and documentation
  • 💬 Join discussions in our community

For Users

  • 📚 Read the documentation
  • 🚀 Try the quick start guide
  • 🎯 Explore the dashboard features
  • 🔧 Customize for your use case

📚 Documentation

Core Documentation

Additional Resources

🏆 Why Choose DataMetronome?

For Data Engineers

  • Proactive monitoring - Catch issues before they become problems
  • Real-time insights - Immediate visibility into data health
  • Easy integration - Works with existing PostgreSQL databases
  • Extensible platform - Add custom anomaly detection rules

For DevOps Teams

  • Infrastructure monitoring - Track database health and performance
  • Automated alerting - Get notified of data quality issues
  • Performance metrics - Monitor query performance and bottlenecks
  • Easy deployment - Docker support for containerized environments

For Data Scientists

  • Data quality assurance - Ensure ML models have clean data
  • Anomaly detection - Identify outliers and data drift
  • Statistical analysis - Built-in statistical tools and visualizations
  • ML integration - Use our algorithms or integrate your own

📈 Roadmap

  • Q1 2024 ✅ - Core DataPulse connectors, basic anomaly detection, UI prototype
  • Q2 2024 🔄 - Advanced ML algorithms, real-time streaming, alert system
  • Q3 2024 📋 - Multi-database support, advanced analytics, API integrations
  • Q4 2024 📋 - Community features, plugin system, advanced reporting

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📧 Contact


🎵 DataMetronome - Making data quality better for everyone!

Built with ❤️ by the open source community
