Remove content length validation to allow for chunked responses #2015

Merged
dkosowski87 merged 8 commits into main from issue-1967/content-length-validation
Feb 18, 2026

@dkosowski87 dkosowski87 commented Feb 18, 2026

Context

Because of the Content-Length validation, we couldn't process chunked responses, which don't provide that header. This became evident when compression was introduced in the roboflow repo.

Summary

  • Removed Content-Length validation from Roboflow API GET requests and dropped the verify_content_length parameter from the request helpers and all callers.
  • Improved MD5 verification logging: when MD5 verification is enabled but the response has no x-goog-hash header, or has one whose value lacks the md5 part, a warning is logged with a sanitized request URL (API key redacted).
  • Added X-Allow-Chunked: true to all outbound Roboflow API request headers by default.

Changes

Content-Length validation removal

  • Removed the verify_content_length parameter from get_from_url() and _get_from_url() in inference/core/roboflow_api.py, and deleted the Content-Length validation block.
  • Updated all call sites that passed verify_content_length=True to use the new signatures:
    • get_roboflow_model_data, get_roboflow_instant_model_data in roboflow_api.py
    • get_from_url in inference/core/models/roboflow.py (environment and model weights)
    • get_from_url in inference/models/sam3/visual_segmentation.py and inference/models/sam3/segment_anything3.py
  • Adjusted tests: removed Content-Length from response mocks in test_get_roboflow_model_data_when_response_parsing_error_occurs and test_get_roboflow_model_data_when_valid_response_expected.

MD5 verification warning (no x-goog-hash)

  • When MD5_VERIFICATION_ENABLED is True and the response has no x-goog-hash header, a warning is logged including the request URL.
  • When MD5_VERIFICATION_ENABLED is True and the response has an x-goog-hash header whose md5 part is missing, a warning is logged including the request URL.
  • The URL is sanitized by dropping all query params and keeping only the scheme, netloc and path.
  • Tests added:
    • test_get_from_url_when_md5_verification_enabled_but_x_goog_hash_header_missing – warning is logged and the request still succeeds.
    • test_get_from_url_when_md5_verification_enabled_but_x_goog_hash_missing_does_not_log_api_key – URL contains api_key but the logged message does not contain the secret.
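
For illustration, the sanitization described above could look roughly like the sketch below. It is not the code from this PR: the function name `url_for_safe_logging` and the example URL are assumptions, and only standard-library `urllib.parse` calls are used.

```python
from urllib.parse import urlsplit, urlunsplit


def url_for_safe_logging(url: str) -> str:
    """Keep only scheme, netloc and path; drop the query string and fragment.

    Hypothetical helper, written for illustration only.
    """
    parts = urlsplit(url)
    # Empty strings for query and fragment remove them from the rebuilt URL.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Example URL is made up; the api_key value never reaches the log line.
print(url_for_safe_logging("https://api.example.com/models/42?api_key=SECRET&nocache=true"))
# -> https://api.example.com/models/42
```

Dropping the whole query string, rather than redacting a single parameter, avoids having to enumerate every sensitive param. Note that the netloc would still carry credentials if a URL embedded them as `user:password@host`; this sketch does not handle that case.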

New request header: X-Allow-Chunked

  • Introduced ALLOW_CHUNKED_RESPONSE_HEADER = "X-Allow-Chunked" and set it to "true" in build_roboflow_api_headers(), so every outbound Roboflow API request (GET and POST) sends this header by default.
  • All test_build_roboflow_api_headers_* tests and the get_roboflow_workspace_async header assertion were updated to expect this header.
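
A minimal sketch of the header default described above, for readers who want to see the shape of the change. The constant name and value come from this PR; the function name `build_headers`, its signature and the `X-Example` header are assumptions and do not mirror the real build_roboflow_api_headers().

```python
from typing import Dict, Optional

ALLOW_CHUNKED_RESPONSE_HEADER = "X-Allow-Chunked"


def build_headers(extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return request headers with the chunked-response opt-in always set.

    Hypothetical stand-in for the real header builder.
    """
    headers = dict(extra or {})  # copy so the caller's dict is not mutated
    # Assumption: the default is applied unconditionally, overriding any caller value.
    headers[ALLOW_CHUNKED_RESPONSE_HEADER] = "true"
    return headers


print(build_headers({"X-Example": "1"}))
```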

Testing

  • [x] pytest tests/inference/unit_tests/core/test_roboflow_api.py: all relevant tests updated or added; no new failures.

* Simplified API calls by removing the `verify_content_length` parameter from `_get_from_url` and related functions.
* Updated all instances in the codebase to reflect this change, ensuring consistent behavior across model data retrieval.
* Adjusted unit tests to remove unnecessary content length headers in mock responses.
* Added a warning log when the x-goog-hash header is missing during MD5 verification.
* Updated the logic to check for the presence of the x-goog-hash header before performing MD5 hash comparison.
* Added a unit test to verify behavior when the x-goog-hash header is absent, ensuring proper logging and response handling.
* Updated the logging behavior to ensure that the API key is not included in warning messages when the x-goog-hash header is missing during MD5 verification.
* Added a unit test to verify that the API key is not logged in such scenarios, enhancing security and privacy in API interactions.
* Introduced a new header, X-Allow-Chunked, to enable chunked responses from the Roboflow API.
* Updated the build_roboflow_api_headers function to include the new header in the request.
* Enhanced unit tests to verify the inclusion of the X-Allow-Chunked header in various scenarios.
* Updated error messages in the get_roboflow_workspace and get_roboflow_workspace_async functions to remove unnecessary f-strings, enhancing clarity.
* Modified exception handling in load_cached_workflow_response to catch specific exceptions, improving robustness.

CLAassistant commented Feb 18, 2026

CLA assistant check
All committers have signed the CLA.

if MD5_VERIFICATION_ENABLED:
    if "x-goog-hash" not in response.headers:
        safe_url = API_KEY_PATTERN.sub(deduct_api_key, wrap_url(url))
        logger.warning(

Collaborator

I was about to dismiss the alert, which would probably be fine, but maybe a better idea is to urlparse the URL and only log the scheme, host and path?

Contributor Author

Done. Unfortunately, the params contain more sensitive stuff than just the API_KEY.

* Introduced a new utility function, _url_for_safe_logging, to strip sensitive query parameters from URLs before logging.
* Updated logging behavior in the MD5 verification process to use the new utility, ensuring sensitive information is not exposed in logs.
* Added a unit test to verify that the logged URL path does not include query parameters, enhancing security in API interactions.
* Updated the logic in the _get_from_url function to check for the presence of the md5= part in the x-goog-hash header, adding a warning log if it is missing.
* Enhanced unit tests to cover scenarios where the x-goog-hash header is present but lacks the md5= part, ensuring proper logging behavior.
* This change improves the clarity of error messages related to MD5 verification, enhancing debugging and monitoring capabilities.
        md5_part = part.strip()[4:]
        break
if md5_part is not None:
    md5_from_header = base64.b64decode(md5_part)

Collaborator

Maybe a good idea would be to wrap this in try/except and fail with an MD5 verification error rather than a base64 error.

Contributor Author

Done. Thx.

* Added error handling for invalid base64 values in the x-goog-hash header during MD5 verification in the _get_from_url function.
* Introduced a specific exception to raise when the MD5 value is not valid base64, improving clarity in error reporting.
* Added a unit test to verify behavior when an invalid MD5 part is encountered, ensuring robust error handling in API interactions.
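
The md5 handling discussed in the review thread can be sketched as follows. This is an illustrative reconstruction, not the merged code: `MD5VerificationError`, `extract_md5_digest` and `verify_md5` are hypothetical names, and the header is assumed to be a comma-separated list such as `crc32c=...,md5=...`.

```python
import base64
import binascii
import hashlib
from typing import Optional


class MD5VerificationError(Exception):
    """Hypothetical error type; the real exception class is not shown in this PR page."""


def extract_md5_digest(x_goog_hash: str) -> Optional[bytes]:
    """Return the decoded md5 digest from the header value, or None if there is no md5 part."""
    for part in x_goog_hash.split(","):
        part = part.strip()
        if part.startswith("md5="):
            try:
                return base64.b64decode(part[4:], validate=True)
            except (binascii.Error, ValueError) as error:
                # Surface a verification failure instead of a raw base64 error.
                raise MD5VerificationError(
                    "md5 value in x-goog-hash is not valid base64"
                ) from error
    return None


def verify_md5(content: bytes, x_goog_hash: str) -> bool:
    """Return True if verified, False if the header has no md5 part; raise on mismatch."""
    expected = extract_md5_digest(x_goog_hash)
    if expected is None:
        return False  # a caller would log a warning here and skip verification
    if hashlib.md5(content).digest() != expected:
        raise MD5VerificationError("MD5 hash does not match the x-goog-hash header")
    return True
```

Returning `False` for a missing md5 part keeps the "warn and continue" behaviour separate from the two failure modes (invalid base64 and digest mismatch), which both raise.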
@dkosowski87 dkosowski87 merged commit 7a846d9 into main Feb 18, 2026
52 checks passed
@dkosowski87 dkosowski87 deleted the issue-1967/content-length-validation branch February 18, 2026 15:32
4 participants