@chernishev
Contributor

Desbordante is an open-source data profiler focused on complex patterns such as numerical association rules, differential dependencies, denial constraints, and more. The pip package lets you discover and validate patterns, inspect where they fail, and combine discovered patterns with other Python libraries (including machine learning ones) to build ad-hoc data-quality workflows. These patterns can help with data deduplication, schema matching, anomaly detection, data understanding, hypothesis generation, and more. You can also extract and enforce complex integrity constraints.

Key features:

  • Can discover and validate complex patterns in data
  • Supports tabular, transactional and graph data types
  • High-performance: the core is implemented in C++ with an emphasis on speed
  • Provides explanations: it can pinpoint why a given pattern fails
  • Each supported pattern comes with usage examples
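
To make "validate a pattern and pinpoint why it fails" concrete, here is a minimal plain-Python sketch of checking one denial constraint by brute force. It does not use Desbordante's API, and it is not how Desbordante works internally; the table, the column names, and the `violates` / `find_violations` helpers are all hypothetical and exist only for illustration.

```python
from itertools import permutations

# Hypothetical toy table; column names and values are made up for illustration.
rows = [
    {"state": "NY", "salary": 5000, "tax": 500},
    {"state": "NY", "salary": 6000, "tax": 400},  # earns more, pays less than row 0
    {"state": "WA", "salary": 5500, "tax": 300},
]

def violates(t1, t2):
    """True if the ordered pair (t1, t2) matches the forbidden combination:
    same state, t1 earns more than t2, yet t1 pays less tax."""
    return (
        t1["state"] == t2["state"]
        and t1["salary"] > t2["salary"]
        and t1["tax"] < t2["tax"]
    )

def find_violations(table):
    """Return index pairs (i, j) of rows that jointly violate the constraint."""
    return [
        (i, j)
        for (i, t1), (j, t2) in permutations(enumerate(table), 2)
        if violates(t1, t2)
    ]

violations = find_violations(rows)
print(violations)
```

The returned index pairs are the "explanation": they identify exactly which rows break the constraint. This naive check compares every ordered pair of rows, so it is only suitable for small tables.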

What's the difference between this Python project and similar ones?

Traditional profilers (e.g., YData Profiling) focus on basic statistics (min/max, NULL counts, distinct values, correlations) and generally don't support complex patterns. Desbordante is the only tool aimed specifically at discovering those complex patterns.

@YuzeHao2023 YuzeHao2023 left a comment

lgtm

@chernishev
Contributor Author

bump @vinta

@vinta vinta merged commit 5454e85 into vinta:master Nov 20, 2025
@chernishev
Contributor Author

Thank you!
