@yan-3005 commented on Nov 15, 2025

Describe your changes:

Fixes #24424

Timed Cleanup
Requires the Flowable batch tables.
The timed cleanup runs every week on Sunday at 12 am and cleans up completed histories that are more than 7 days old.
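
To illustrate the retention rule above, here is a minimal sketch of the 7-day cutoff check. The class and method names (`RetentionSketch`, `isEligible`) are hypothetical and not taken from this PR, and UTC is assumed only for the sake of the example.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper; names are illustrative and not part of this PR.
class RetentionSketch {
    // Retention window described above: completed histories older than 7 days.
    static final Duration RETENTION = Duration.ofDays(7);

    // An instance is eligible only if it has finished (endTime != null)
    // and its end time lies before now minus the retention window.
    static boolean isEligible(Instant endTime, Instant now) {
        if (endTime == null) {
            return false; // not completed, so not a candidate
        }
        return endTime.isBefore(now.minus(RETENTION));
    }

    public static void main(String[] args) {
        // A Sunday at 12 am (UTC assumed for this example).
        Instant now = Instant.parse("2025-11-16T00:00:00Z");
        System.out.println(isEligible(Instant.parse("2025-11-01T10:00:00Z"), now));
        System.out.println(isEligible(Instant.parse("2025-11-14T10:00:00Z"), now));
        System.out.println(isEligible(null, now));
    }
}
```

An instance that ended exactly 7 days before the run is kept in this sketch, since the comparison is strict.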

Flowable Ops Behavior:
Command Parameters

cleanup-flowable-history --delete --runtime-batch-size=1000 --history-batch-size=1000 --cleanup-all

Parameters:

  - --delete (default: false): Without this flag the command runs in dry-run mode and only reports what would be cleaned; actual cleanup happens only when it is set
  - --runtime-batch-size (default: 1000): Batch size for cleaning runtime instances
  - --history-batch-size (default: 1000): Batch size for cleaning historic instances
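
The defaults listed above can be sketched as a small option holder. This is only an illustration under assumptions: `CleanupOptions` and its hand-rolled parsing are hypothetical and do not reflect how the actual command parses its arguments. `--cleanup-all` is not modeled because its semantics are not described here.

```java
// Hypothetical option holder; the real command's parsing is not shown in this PR description.
class CleanupOptions {
    boolean delete = false;        // dry-run unless --delete is passed
    int runtimeBatchSize = 1000;   // default for --runtime-batch-size
    int historyBatchSize = 1000;   // default for --history-batch-size

    static CleanupOptions parse(String[] args) {
        final String runtimePrefix = "--runtime-batch-size=";
        final String historyPrefix = "--history-batch-size=";
        CleanupOptions options = new CleanupOptions();
        for (String arg : args) {
            if (arg.equals("--delete")) {
                options.delete = true;
            } else if (arg.startsWith(runtimePrefix)) {
                options.runtimeBatchSize = Integer.parseInt(arg.substring(runtimePrefix.length()));
            } else if (arg.startsWith(historyPrefix)) {
                options.historyBatchSize = Integer.parseInt(arg.substring(historyPrefix.length()));
            }
            // Any other flag (for example --cleanup-all) is ignored by this sketch.
        }
        return options;
    }
}
```

With no arguments, `parse` yields a dry run with both batch sizes at 1000, matching the documented defaults.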

Workflow Analysis & Cleanup Logic

  1. Workflow Discovery Phase
  • Scans all OpenMetadata workflow definitions to understand trigger types
  • Retrieves all Flowable process definitions (deployed workflows)
  • Creates mapping between workflow names and their trigger types
  • Groups process definitions by key to identify versions
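
The grouping step can be sketched as follows. `DefinitionInfo` is a hypothetical stand-in for a Flowable `ProcessDefinition` (which exposes `getId()`, `getKey()` and `getVersion()`), and `VersionGrouping` is an illustrative name, not a class from this PR. The sketch assumes the highest version number per key is the latest one.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Flowable process definition (id, key, version).
class DefinitionInfo {
    final String id;
    final String key;
    final int version;

    DefinitionInfo(String id, String key, int version) {
        this.id = id;
        this.key = key;
        this.version = version;
    }
}

class VersionGrouping {
    // Group definitions by key, preserving encounter order.
    static Map<String, List<DefinitionInfo>> groupByKey(List<DefinitionInfo> definitions) {
        Map<String, List<DefinitionInfo>> byKey = new LinkedHashMap<>();
        for (DefinitionInfo d : definitions) {
            byKey.computeIfAbsent(d.key, k -> new ArrayList<>()).add(d);
        }
        return byKey;
    }

    // Assumption: the highest version number within a key is the latest version.
    static DefinitionInfo latest(List<DefinitionInfo> versions) {
        return versions.stream()
                .max(Comparator.comparingInt((DefinitionInfo d) -> d.version))
                .orElseThrow();
    }

    // Every definition that is not the latest for its key counts as an old version.
    static List<String> oldVersionIds(List<DefinitionInfo> definitions) {
        List<String> ids = new ArrayList<>();
        for (List<DefinitionInfo> versions : groupByKey(definitions).values()) {
            DefinitionInfo newest = latest(versions);
            for (DefinitionInfo d : versions) {
                if (d != newest) {
                    ids.add(d.id);
                }
            }
        }
        return ids;
    }
}
```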
  2. Version Management
  • Old Versions: Non-latest versions of each workflow → Full cleanup (deployments + history)
  • Latest Versions: Current versions → History-only cleanup (preserves active deployments)
  3. Trigger-Based Cleanup Strategy

The cleanup behavior depends on the workflow's trigger type:

Periodic Batch Entity Workflows (periodicBatchEntity), e.g. TableEntityCertificationWorkflow

  • ✅ Runtime instances cleanup
  • ✅ Historic instances cleanup
  • ✅ Deployment cleanup
  • Reason: These run periodically, so old data is safe to clean

Event Based Entity Workflows (eventBasedEntity), e.g. GlossaryApprovalWorkflow

  • ❌ Runtime instances cleanup
  • ✅ Historic instances cleanup
  • ❌ Deployment cleanup
  • Reason: Event-driven, deployments must remain active

NoOp Workflows (noOp), e.g. AutoPilotWorkflow

  • ❌ Runtime instances cleanup
  • ✅ Historic instances cleanup
  • ❌ Deployment cleanup

Unknown Trigger Types:

  • ❌ Runtime instances cleanup
  • ✅ Historic instances cleanup
  • ❌ Deployment cleanup
  • Safety-first approach
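
The matrix above can be captured as a small lookup table. This is a sketch only: `TriggerPolicy` and `forTrigger` are hypothetical names, and the actual implementation in this PR may be structured differently. Any trigger type that is not recognized falls back to the history-only policy.

```java
// Hypothetical policy table mirroring the trigger matrix above; not the PR's actual classes.
enum TriggerPolicy {
    PERIODIC_BATCH_ENTITY("periodicBatchEntity", true, true, true),
    EVENT_BASED_ENTITY("eventBasedEntity", false, true, false),
    NO_OP("noOp", false, true, false),
    UNKNOWN(null, false, true, false); // safety-first fallback: history only

    final String triggerType;
    final boolean cleanRuntime;
    final boolean cleanHistory;
    final boolean cleanDeployments;

    TriggerPolicy(String triggerType, boolean cleanRuntime, boolean cleanHistory, boolean cleanDeployments) {
        this.triggerType = triggerType;
        this.cleanRuntime = cleanRuntime;
        this.cleanHistory = cleanHistory;
        this.cleanDeployments = cleanDeployments;
    }

    // Returns the matching policy, or UNKNOWN for null or unrecognized trigger types.
    static TriggerPolicy forTrigger(String triggerType) {
        for (TriggerPolicy policy : values()) {
            if (policy.triggerType != null && policy.triggerType.equals(triggerType)) {
                return policy;
            }
        }
        return UNKNOWN;
    }
}
```

Keeping the fallback as an explicit `UNKNOWN` constant means a newly added trigger type is never subject to runtime or deployment cleanup until someone opts it in.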

Notes:

  • Dry-run by default: Shows what would be cleaned without making changes
  • Batch processing: Prevents memory issues with large datasets
  • Per-definition cleanup: Precise targeting using process definition IDs
  • Trigger-aware logic: Respects workflow types (never breaks event-based workflows)
  • Latest version protection: Never removes active deployment versions
  • Comprehensive logging: Detailed progress and error reporting
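
The dry-run and batching notes above can be illustrated with a short sketch. `BatchCleanupSketch` and the `deleter` callback are hypothetical; in the real command the deletion would go through Flowable services, which are not shown here.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration of batched cleanup with a dry-run switch.
class BatchCleanupSketch {
    // Walks the ids in batches of at most batchSize. In dry-run mode nothing is
    // passed to the deleter; the return value is the number of ids covered either way.
    static int cleanup(List<String> ids, int batchSize, boolean dryRun, Consumer<List<String>> deleter) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int covered = 0;
        for (int from = 0; from < ids.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ids.size());
            List<String> batch = ids.subList(from, to);
            if (!dryRun) {
                deleter.accept(batch); // only mutate when --delete was requested
            }
            covered += batch.size();
        }
        return covered;
    }
}
```

Because the dry-run path walks the same batches as the real run, the count it reports matches what an actual cleanup over the same ids would cover.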

Output for Dry Run
[Two screenshots: dry-run output]

Output for Actual Cleanup
[Screenshot: actual cleanup output]

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

@github-actions

TypeScript types have been updated based on the JSON schema changes in the PR

@github-actions bot requested a review from a team as a code owner on November 15, 2025, 11:05
@yan-3005 changed the title from "Flowable History Timed Cleanup and Ops Command for cleanup" to "Feat: #24424 Flowable History Timed Cleanup and Ops Command for cleanup" on Nov 18, 2025

Labels

backend, safe to test

Development

Successfully merging this pull request may close these issues.

Flowable History Cleanup Ops command and Timed Flowable History Cleanup

3 participants