Add single-tenant workflow cache mode and thread workflow_version_id across the stack #2031

Merged
alexnorell merged 2 commits into main from
feature/single-tenant-workflow-cache-and-version-id-support
Feb 24, 2026


@alexnorell alexnorell commented Feb 24, 2026

What does this PR do?

Two related changes that improve offline / single-tenant deployments and complete workflow_version_id support across the stack, plus a fix for a CodeQL alert.

1. Single-tenant workflow cache (SINGLE_TENANT_WORKFLOW_CACHE)

The workflow file cache currently embeds an MD5 hash of the API key in the filename ({workflow_id}_{api_key_hash}.json). This makes it impossible to pre-populate the cache and run fully offline without an API key, since the hash changes (or becomes None).

Setting SINGLE_TENANT_WORKFLOW_CACHE=true drops the API key hash from cache filenames, producing {workflow_id}.json (or {workflow_id}_v{version}.json when a version is pinned). The existing path-traversal protections (sanitize_path_segment + startswith check) remain in place.
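A minimal sketch of the filename scheme described above, not the code in inference/core/roboflow_api.py. The function `cache_file_path`, its parameters, the cache root, the per-workspace directory layout, and the `_` replacement character are all assumptions made for illustration; only the filename shapes and the sanitize-then-`startswith` idea come from the description.

```python
import hashlib
import os
import re
from typing import Optional

# Hypothetical cache root; the real location is configured elsewhere.
CACHE_DIR = "/tmp/cache/workflow"


def sanitize_path_segment(segment: str) -> str:
    # Assumed behaviour: replace anything outside [A-Za-z0-9_-] so a
    # segment cannot carry path separators or "..".
    return re.sub(r"[^A-Za-z0-9_-]", "_", segment)


def cache_file_path(
    workspace_id: str,
    workflow_id: str,
    api_key: Optional[str],
    workflow_version_id: Optional[str] = None,
    single_tenant: bool = False,
) -> str:
    workspace_dir = os.path.join(CACHE_DIR, sanitize_path_segment(workspace_id))
    name = sanitize_path_segment(workflow_id)
    if single_tenant:
        # No API key hash: the file can be pre-populated for offline use.
        if workflow_version_id is not None:
            name += f"_v{sanitize_path_segment(workflow_version_id)}"
    else:
        # Multi-tenant: keep a fingerprint of the API key in the filename.
        digest = hashlib.sha256(str(api_key).encode("utf-8")).hexdigest()
        name += f"_{digest}"
    path = os.path.abspath(os.path.join(workspace_dir, f"{name}.json"))
    # Second line of defence: refuse any path that escapes the cache root.
    if not path.startswith(os.path.abspath(CACHE_DIR) + os.sep):
        raise ValueError("cache path escapes the cache directory")
    return path


print(os.path.basename(cache_file_path("ws", "wf", None, single_tenant=True)))
print(os.path.basename(cache_file_path("ws", "wf", None, "3", single_tenant=True)))
```

In this sketch the single-tenant filename does not depend on the API key at all, which is what allows a pre-populated cache to be read with no key present.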

2. Thread workflow_version_id everywhere

PR #2022 added workflow_version_id to InferencePipeline and get_workflow_specification(), but the parameter was not yet exposed in the HTTP API, inference SDK, CLI, or WebRTC workers. This change threads it through:

  • HTTP API request models and handlers
  • InferenceHTTPClient SDK methods (run_workflow / infer_from_workflow)
  • All three CLI workflow commands (process_video, process_image, process_images_directory)
  • WebRTC / modal workers
  • File cache path (single-tenant mode) and ephemeral in-memory cache key
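The pattern behind this threading can be sketched as an optional parameter that defaults to None and is only forwarded when set. `build_workflow_request` and the payload field names other than `workflow_version_id` are hypothetical and do not reproduce the SDK or HTTP API code.

```python
from typing import Any, Dict, Optional


def build_workflow_request(
    api_key: Optional[str],
    inputs: Dict[str, Any],
    workflow_version_id: Optional[str] = None,
) -> Dict[str, Any]:
    # Hypothetical payload layout, for illustration only.
    payload: Dict[str, Any] = {"api_key": api_key, "inputs": inputs}
    # Forward the version only when the caller pinned one, so requests
    # from callers that never pass it look exactly as they did before.
    if workflow_version_id is not None:
        payload["workflow_version_id"] = workflow_version_id
    return payload


unpinned = build_workflow_request("example-key", {"image": "frame.jpg"})
pinned = build_workflow_request(
    "example-key", {"image": "frame.jpg"}, workflow_version_id="3"
)
print("workflow_version_id" in unpinned, pinned["workflow_version_id"])
```

Each layer (CLI, SDK, HTTP handler, worker) would accept the same optional argument and pass it one level down unchanged.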

3. Fix CodeQL py/weak-sensitive-data-hashing alert

Replaced hashlib.md5 with hashlib.sha256 for API key hashing in both the file-cache path (get_workflow_cache_file) and the ephemeral in-memory cache key (_prepare_workflow_response_cache_key). MD5 was only used as a cache-key fingerprint — not for password storage — but SHA-256 satisfies the CodeQL security scanner with no functional downside. The only side-effect is a one-time cache miss on deploy since the hash output changes.
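To illustrate why the swap causes a one-time cache miss, here is a small sketch using only `hashlib`. The helper names and the `workflow_id:fingerprint` key layout are hypothetical, and the sketch hashes the API key alone, which may differ from what the merged code hashes.

```python
import hashlib


def md5_fingerprint(api_key: str) -> str:
    # Old behaviour: 32 hex characters.
    return hashlib.md5(api_key.encode("utf-8")).hexdigest()


def sha256_fingerprint(api_key: str) -> str:
    # New behaviour: 64 hex characters.
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def cache_key(workflow_id: str, fingerprint: str) -> str:
    # Hypothetical key layout, for illustration only.
    return f"{workflow_id}:{fingerprint}"


# An entry written before the deploy, keyed with the MD5 fingerprint.
cache = {cache_key("my-workflow", md5_fingerprint("example-key")): {"cached": True}}

# After the deploy the same inputs produce a different key, so the first
# lookup misses and the entry is fetched and stored again under the new key.
new_key = cache_key("my-workflow", sha256_fingerprint("example-key"))
print(new_key in cache)  # False
```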

Type of Change

  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change that fixes an issue)

Testing

  • I have tested this change locally
  • Linters pass (black, isort, flake8)

Test details:

All new parameters default to None, preserving existing behaviour. When SINGLE_TENANT_WORKFLOW_CACHE is not set (default False), cache filenames keep today's format; the only difference is that the API key hash is now SHA-256 instead of MD5.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings or errors

Additional Context

This is a follow-up to #2022. No new dependencies or config changes beyond the optional SINGLE_TENANT_WORKFLOW_CACHE env var.

@alexnorell alexnorell force-pushed the feature/single-tenant-workflow-cache-and-version-id-support branch from 97779b7 to e255019 February 24, 2026 06:00
@alexnorell alexnorell force-pushed the feature/single-tenant-workflow-cache-and-version-id-support branch from 405a176 to aaa6168 February 24, 2026 06:09
@alexnorell alexnorell force-pushed the feature/single-tenant-workflow-cache-and-version-id-support branch from aaa6168 to 96347ab February 24, 2026 06:11
@alexnorell alexnorell commented

@dkosowski87 @PawelPeczek-Roboflow

I've already swapped this from an MD5 of just the API key to SHA-256 salted with the workspace.

I think the original intent of the workflow hashing feature was to fix an issue with workflow name collisions when using the hosted API. I'm not exactly excited that the API key is used as a hash key for the workflow, but I don't want to fully change the implementation here. I'll leave it up to someone else to decide whether we want to change it.


codeflash-ai bot commented Feb 24, 2026

⚡️ Codeflash found optimizations for this PR

📄 17% (0.17x) speedup for get_workflow_cache_file in inference/core/roboflow_api.py

⏱️ Runtime : 9.59 milliseconds → 8.16 milliseconds (best of 69 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch feature/single-tenant-workflow-cache-and-version-id-support).


dkosowski87 previously approved these changes Feb 24, 2026
This optimization achieves a **17% runtime improvement** by precompiling the regex pattern used in `sanitize_path_segment()`. 

**What changed:**
The regex pattern `r"[^A-Za-z0-9_-]"` is now compiled once at module load time as `_pattern = re.compile(...)` instead of being passed as a string to `re.sub()` on every call.

**Why this improves performance:**
In Python, `re.sub()` called with a string pattern does not recompile the regex every time: the `re` module keeps recently compiled patterns in an internal cache. Each call still pays for the cache lookup and argument handling before matching starts. Calling `.sub()` on an explicitly precompiled pattern skips that per-call lookup. The line profiler shows `sanitize_path_segment()` dropped from 12.6ms to 2.85ms total time (about 77% less time for that function alone), with per-hit time decreasing from 3,844ns to 872ns.
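A small sketch of the two variants, for comparison. The pattern is the one quoted above; the `_` replacement character, the function names, and the timing harness are assumptions, and the measured numbers will vary by machine and Python version.

```python
import re
import timeit

# Compiled once at import time, as the optimization describes.
_PATTERN = re.compile(r"[^A-Za-z0-9_-]")


def sanitize_with_string_pattern(segment: str) -> str:
    # Goes through the re module's pattern cache on every call.
    return re.sub(r"[^A-Za-z0-9_-]", "_", segment)


def sanitize_precompiled(segment: str) -> str:
    # Uses the already compiled pattern object directly.
    return _PATTERN.sub("_", segment)


def time_both(number: int = 20_000) -> tuple:
    # Returns (string-pattern seconds, precompiled seconds) for `number` calls.
    sample = "my-workflow/../v1"
    slow = timeit.timeit(lambda: sanitize_with_string_pattern(sample), number=number)
    fast = timeit.timeit(lambda: sanitize_precompiled(sample), number=number)
    return slow, fast


slow, fast = time_both()
print(f"string pattern: {slow:.4f}s, precompiled: {fast:.4f}s")
```

Both functions return identical output for the same input, so the change only affects per-call overhead.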

**Impact on workloads:**
The function references show `get_workflow_cache_file()` is called from three different cache operations (`cache_workflow_response`, `load_cached_workflow_response`, `delete_cached_workflow_response_if_exists`). These are in file system caching hot paths where workflows are repeatedly loaded, saved, and validated. Each call to `get_workflow_cache_file()` invokes `sanitize_path_segment()` 2-3 times (for workspace_id, workflow_id, and optionally version_id), multiplying the benefit.

**Test results demonstrate:**
- Multi-tenant operations with API keys show 13-26% speedup (where hashing dominates, sanitization still contributes)
- Single-tenant operations with version IDs show 20% speedup (sanitization is more prominent)
- Large-scale tests (1000 iterations) show consistent 15.5% improvement, confirming the optimization scales well
- All test cases pass with identical outputs, confirming correctness is preserved

This is particularly valuable in production environments where workflow definitions are frequently accessed, as the cumulative effect of faster path sanitization reduces latency across all caching operations.

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>

codeflash-ai bot commented Feb 24, 2026

This PR is now faster! 🚀 codeflash-ai[bot] accepted my code suggestion above.

@alexnorell alexnorell merged commit c9658d7 into main Feb 24, 2026
55 checks passed
@alexnorell alexnorell deleted the feature/single-tenant-workflow-cache-and-version-id-support branch February 24, 2026 18:03