From fae9e512619cebdb00b104c6ba5f41fbd3a2d0ee Mon Sep 17 00:00:00 2001
From: Rod Vagg
Date: Sat, 11 Oct 2025 14:41:33 +1100
Subject: [PATCH 1/3] docs(trustless): clarify streaming gateway behavior & response headers

- Add Accept vs format parameter precedence
- Clarify 404 vs streaming behavior for missing blocks
- Document Cache-Control: only-if-cached (412 response)
- Add X-Ipfs-Roots guidance for streaming gateways
- Document Etag implementation variance
- Add Content-Disposition filename guidance (.car vs .bin)
- Improve wording (CAR stream vs CAR file)
---
 src/http-gateways/path-gateway.md      | 10 ++++
 src/http-gateways/trustless-gateway.md | 75 ++++++++++++++++++++++++--
 2 files changed, 81 insertions(+), 4 deletions(-)

diff --git a/src/http-gateways/path-gateway.md b/src/http-gateways/path-gateway.md
index af4c46c7..a45712c9 100644
--- a/src/http-gateways/path-gateway.md
+++ b/src/http-gateways/path-gateway.md
@@ -332,6 +332,11 @@ Gateways MUST use 404 to signal that content is not available, particularly
 when the gateway is [non recursive](#recursive-vs-non-recursive-gateways), and only provides access to a known
 dataset, so that it can assess that the requested content is not part of it.
 
+NOTE: Gateways MUST return 404 for missing root blocks. However, for streaming
+responses (such as CAR), once HTTP 200 OK status is sent, gateways cannot
+change it. If a child block is missing during streaming, the gateway SHOULD
+terminate the stream. Clients MUST verify response completeness.
+
 ### `410` Gone
 
 Error to indicate that request was formally correct, but this specific Gateway
@@ -668,6 +673,11 @@ NOTE: while the first CID will change every time any article is changed,
 the last root (responsible for specific article or a subdirectory) may not
 change at all, allowing for smarter caching beyond what standard Etag offers.
 
+NOTE: Gateways that stream responses (e.g., CAR) without pre-resolving the
+entire path MAY only include the root CID for simple `/ipfs/{cid}` requests, or
+MAY omit this header for path requests where intermediate CIDs are not known
+when headers are sent.
+
 ### `X-Content-Type-Options` (response header)
 
 Optional, present in certain response types:
diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md
index 4ee45d43..caa5cd3a 100644
--- a/src/http-gateways/trustless-gateway.md
+++ b/src/http-gateways/trustless-gateway.md
@@ -39,7 +39,7 @@ The minimal implementation means:
 - for raw blocks:
   - data is requested by CID, only supported path is `/ipfs/{cid}`
   - no path traversal or recursive resolution
-- for CAR files:
+- for CARs:
   - the pathing behavior is identical to :cite[path-gateway]
 
 # HTTP API
@@ -123,6 +123,11 @@ A Client SHOULD include the `format` query parameter in the request URL, in
 addition to the `Accept` header. This provides the best interoperability and
 ensures consistent HTTP cache behavior across various gateway implementations.
 
+When both the `Accept` header and `format` parameter are present, a specific
+`Accept` value (e.g., `application/vnd.ipld.raw`) SHOULD take precedence over
+`format`. Wildcards (e.g., `*/*`, `application/*`) are not specific and do not
+take precedence (as specified in :cite[path-gateway]).
+
 :::
 
 ### :dfn[`dag-scope`] (request query parameter)
@@ -130,8 +135,7 @@ ensures consistent HTTP cache behavior across various gateway implementations.
 Optional, `dag-scope=(block|entity|all)` with default value `all`, only available for CAR requests.
 
 Describes the shape of the DAG fetched the terminus of the specified path whose blocks
-are included in the returned CAR file after the blocks required to traverse
-path segments.
+are included in the returned CAR stream after the blocks required to traverse path segments.
 
 - `block` - Only the root block at the end of the path is returned after blocks
   required to verify the specified path segments.
@@ -248,6 +252,34 @@ Below MUST be implemented **in addition** to "HTTP Response" of
 :cite[path-gateway], with special attention to the "Response Status Codes" and the
 "Recursive vs non-recursive gateways" sections.
 
+## Response Status Codes
+
+Trustless Gateways MUST follow the response status codes defined in :cite[path-gateway], including:
+
+### `404 Not Found`
+
+A Trustless Gateway MUST return `404 Not Found` when the **root block** (the CID in the request path) is not available in the gateway's storage.
+
+This applies to:
+- HEAD requests for any CID
+- GET requests for raw blocks (`application/vnd.ipld.raw`)
+- GET requests for CAR streams (`application/vnd.ipld.car`) when the root block is missing
+
+For non-recursive Trustless Gateways (such as those serving from a local block store), this definitively signals that the requested content is not part of the gateway's dataset.
+
+### Streaming and Missing Child Blocks
+
+For CAR responses, once a gateway begins streaming (after successfully loading the root block), it has committed to HTTP `200 OK`. If a child block is encountered as missing during DAG traversal:
+
+- The gateway SHOULD terminate the stream (potentially with an incomplete CAR)
+- Clients MUST verify CAR completeness and handle incomplete streams as retrieval failures
+
+This follows the streaming principle stated in the [`entity-bytes`](#entity-bytes-request-query-parameter) section above.
+
+### `500 Internal Server Error`
+
+A Trustless Gateway SHOULD return `500 Internal Server Error` only for genuine server errors, not for content unavailability. Examples include storage backend failures, resource exhaustion, or unexpected internal errors.
+
 ## Response Headers
 
 ### `Content-Type` (response header)
@@ -264,12 +296,47 @@ If a CAR stream was requested:
 MUST be returned and set to `attachment` to ensure requested bytes are not
 rendered by a web browser.
 
+When no custom `filename` is provided:
+- CAR responses should use `filename=".car"`
+- Raw block responses should use `filename=".bin"`
+
 ### `Content-Location` (response header)
 
 Same as in :cite[path-gateway], SHOULD be returned when Trustless Gateway
 supports more than a single response format and the `format` query parameter is
 missing or does not match well-known format from `Accept` header.
 
+### `Cache-Control: only-if-cached` (request header)
+
+Trustless gateways, particularly non-recursive ones serving from a local block
+store, are well-suited for :cite[path-gateway]'s `Cache-Control: only-if-cached`
+request header. When received, gateway SHOULD return HTTP 412 if the root block
+is not immediately available.
+
+### `X-Ipfs-Path` and `X-Ipfs-Roots` (response headers)
+
+See :cite[path-gateway] for definitions. Trustless gateways SHOULD return
+`X-Ipfs-Path`. For `X-Ipfs-Roots`, streaming gateways MAY only include the root
+CID or omit for path requests where intermediate CIDs are unknown when headers
+are sent.
+
+### `Etag` (response header)
+
+MUST be returned and follow the recommendations in :cite[path-gateway].
+
+:::note
+
+**Implementation Variance**: Etag generation for CAR responses is
+implementation-specific. Different gateways may generate different Etags for
+identical requests due to variations in what parameters are included (e.g.,
+`order`, `dups`) and how they are encoded in the Etag calculation.
+
+As a result, `If-None-Match` conditional requests may not work across different
+gateway implementations. Clients SHOULD NOT assume Etags are portable between
+gateways.
+
+:::
+
 # Block Responses (application/vnd.ipld.raw)
 
 An opaque bytes matching the requested block CID
@@ -466,7 +533,7 @@ that the endpoint corresponds to a trustless gateway.
 
 For block requests (signaled by `?format=raw` and `Accept: application/vnd.ipld.raw`), when supported, it MUST return `200 OK` and an empty body.
 
-For CAR requests (signaled by `?format=car` and `Accept: application/vnd.ipld.car`), when supported, it MUST return `200 OK` and a valid CAR file with CAR Header `roots` set to `bafkqaaa`. Identity block MAY be skipped in the CAR Data section.
+For CAR requests (signaled by `?format=car` and `Accept: application/vnd.ipld.car`), when supported, it MUST return `200 OK` and a valid CAR with CAR Header `roots` set to `bafkqaaa`. Identity block MAY be skipped in the CAR Data section.
 
 This specific identity CID is special for probing. Other random identity CIDs MAY not be handled.

From d5533c1bc2b4b23a0241b6f0897196ddec991371 Mon Sep 17 00:00:00 2001
From: Marcin Rataj
Date: Tue, 14 Oct 2025 00:06:32 +0200
Subject: [PATCH 2/3] docs: update spec metadata and fix cache-control placement

- move Cache-Control: only-if-cached to request headers section
- remove X-Ipfs-Path and X-Ipfs-Roots response headers section
- update editor affiliations and reorganize contributors
- update dates to 2025-10-13
---
 src/http-gateways/path-gateway.md      | 20 +++++++++--------
 src/http-gateways/trustless-gateway.md | 31 ++++++++++++--------------
 2 files changed, 25 insertions(+), 26 deletions(-)

diff --git a/src/http-gateways/path-gateway.md b/src/http-gateways/path-gateway.md
index a45712c9..d75b5d3c 100644
--- a/src/http-gateways/path-gateway.md
+++ b/src/http-gateways/path-gateway.md
@@ -4,15 +4,23 @@ description: >
   The comprehensive low-level HTTP Gateway enables the integration of IPFS
   resources into the HTTP stack through /ipfs and /ipns namespaces, supporting
   both deserialized and verifiable response types.
-date: 2024-04-17
+date: 2025-10-13
 maturity: reliable
 editors:
   - name: Marcin Rataj
     github: lidel
     url: https://lidel.org/
     affiliation:
-      name: Protocol Labs
-      url: https://protocol.ai/
+      name: Shipyard
+      url: https://ipshipyard.com
+former_editors:
+  - name: Henrique Dias
+    github: hacdias
+    url: https://hacdias.com/
+    affiliation:
+      name: Shipyard
+      url: https://ipshipyard.com
+thanks:
   - name: Adrian Lanzafame
     github: lanzafame
     affiliation:
@@ -28,12 +36,6 @@ editors:
     affiliation:
       name: Protocol Labs
       url: https://protocol.ai/
-  - name: Henrique Dias
-    github: hacdias
-    url: https://hacdias.com/
-    affiliation:
-      name: Protocol Labs
-      url: https://protocol.ai/
 xref:
   - url
   - trustless-gateway
diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md
index caa5cd3a..e892d915 100644
--- a/src/http-gateways/trustless-gateway.md
+++ b/src/http-gateways/trustless-gateway.md
@@ -4,7 +4,7 @@ description: >
   The minimal subset of HTTP Gateway response types facilitates data retrieval
   via CID and ensures integrity verification, all while eliminating the need to
   trust the gateway itself.
-date: 2025-03-06
+date: 2025-10-13
 maturity: reliable
 editors:
   - name: Marcin Rataj
@@ -12,13 +12,17 @@ editors:
     affiliation:
       name: Shipyard
       url: https://ipshipyard.com
-  - name: Henrique Dias
-    github: hacdias
   - name: Héctor Sanjuán
     github: hsanjuan
     affiliation:
       name: Shipyard
       url: https://ipshipyard.com
+former_editors:
+  - name: Henrique Dias
+    github: hacdias
+thanks:
+  - name: Rod Vagg
+    github: rvagg
 xref:
   - url
   - path-gateway
@@ -108,6 +112,13 @@ gateway implementations.
 
 :::
 
+### `Cache-Control: only-if-cached` (request header)
+
+Trustless gateways, particularly non-recursive ones serving from a local block
+store, are well-suited for :cite[path-gateway]'s `Cache-Control: only-if-cached`
+request header. When received, gateway SHOULD return HTTP 412 if the root block
+is not immediately available.
+
 ## Request Query Parameters
 
 ### :dfn[`format`] (request query parameter)
@@ -306,20 +317,6 @@ Same as in :cite[path-gateway], SHOULD be returned when Trustless Gateway
 supports more than a single response format and the `format` query parameter is
 missing or does not match well-known format from `Accept` header.
 
-### `Cache-Control: only-if-cached` (request header)
-
-Trustless gateways, particularly non-recursive ones serving from a local block
-store, are well-suited for :cite[path-gateway]'s `Cache-Control: only-if-cached`
-request header. When received, gateway SHOULD return HTTP 412 if the root block
-is not immediately available.
-
-### `X-Ipfs-Path` and `X-Ipfs-Roots` (response headers)
-
-See :cite[path-gateway] for definitions. Trustless gateways SHOULD return
-`X-Ipfs-Path`. For `X-Ipfs-Roots`, streaming gateways MAY only include the root
-CID or omit for path requests where intermediate CIDs are unknown when headers
-are sent.
-
 ### `Etag` (response header)
 
 MUST be returned and follow the recommendations in :cite[path-gateway].

From 0ba155e9d68b13c809aa78bd1143d2a990993748 Mon Sep 17 00:00:00 2001
From: Marcin Rataj
Date: Tue, 14 Oct 2025 00:15:21 +0200
Subject: [PATCH 3/3] docs(path-gateway): clarify X-Ipfs-Path and X-Ipfs-Roots usage

specify that X-Ipfs-Path and X-Ipfs-Roots headers should be returned
with deserialized responses, and may be omitted with trustless
response types (raw blocks and CAR)
---
 src/http-gateways/path-gateway.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/http-gateways/path-gateway.md b/src/http-gateways/path-gateway.md
index d75b5d3c..c207ae6d 100644
--- a/src/http-gateways/path-gateway.md
+++ b/src/http-gateways/path-gateway.md
@@ -646,6 +646,10 @@ Indicates the original, requested content path before any path resolution and tr
 
 Example: `X-Ipfs-Path: /ipns/k2..ul6/subdir/file.txt`
 
+This header SHOULD be returned with deserialized responses.
+Implementations MAY omit it with trustless response types
+(`application/vnd.ipld.raw` and `application/vnd.ipld.car`).
+
 ### `X-Ipfs-Roots` (response header)
 
 Used for HTTP caching.
@@ -675,6 +679,10 @@ NOTE: while the first CID will change every time any article is changed,
 the last root (responsible for specific article or a subdirectory) may not
 change at all, allowing for smarter caching beyond what standard Etag offers.
 
+This header SHOULD be returned with deserialized responses.
+Implementations MAY omit it with trustless response types
+(`application/vnd.ipld.raw` and `application/vnd.ipld.car`).
+
 NOTE: Gateways that stream responses (e.g., CAR) without pre-resolving the
 entire path MAY only include the root CID for simple `/ipfs/{cid}` requests, or
 MAY omit this header for path requests where intermediate CIDs are not known