Skip to content

Conversation

@hawkw
Copy link
Contributor

@hawkw hawkw commented Jun 2, 2023

Depends on #2418

The latest proxy-api release, v0.10.0, adds fields to the
OutboundPolicies API for configuring HTTP request timeouts, based on
the proposed changes to HTTPRoute in kubernetes-sigs/gateway-api#1997.
PR #2418 updates the proxy to depend on the new proxy-api release, and
implements the Rule.request_timeout field added to the API. However,
that branch does not add a timeout for the
RouteBackend.request_timeout field. This branch changes the proxy to
apply the backend request timeout when configured by the policy
controller.

This branch implements RouteBackend.request_timeout by adding an
additional timeout layer in the MatchedBackend stack. This applies the
per-backend timeout once a backend is selected for a route. I've also
added stack tests for the interaction between the request and backend
request timeouts.

Note that once retries are added to client policy stacks, it may be
necessary to move the backend request timeout to ensure it occurs
"below" retries, depending on where the retry middleware ends up being
located in the proxy stack.

@hawkw hawkw requested a review from olix0r June 2, 2023 20:07
@hawkw hawkw requested a review from a team as a code owner June 2, 2023 20:07
@hawkw hawkw self-assigned this Jun 2, 2023
@adleong
Copy link
Member

adleong commented Jun 2, 2023

My 2 cents is that we have some leeway in how we choose to implement this and we don't need to follow the spec to the letter. If it is easier for us to implement and maintain to put this timeout in the logical stack after a backend is selected, I think that would be okay. The important part is that the timeout happens after retries and after backend selection.

@hawkw hawkw requested a review from adleong June 13, 2023 15:54
Base automatically changed from eliza/client-policy-timeouts to main June 14, 2023 18:36
@hawkw hawkw force-pushed the eliza/backend-request-timeout branch from 69671c6 to 97e4fcc Compare June 15, 2023 14:38
@hawkw
Copy link
Contributor Author

hawkw commented Jun 15, 2023

@adleong I've updated this branch based on your feedback, removing the TimeoutFromRequest middleware. Now, the backend request timeout is applied in the policy logical stack's matched backend stack.

Copy link
Member

@mateiidavid mateiidavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I managed to work most of the code out and this looks good to me.

@hawkw hawkw merged commit 966306b into main Jun 15, 2023
@hawkw hawkw deleted the eliza/backend-request-timeout branch June 15, 2023 17:48
hawkw added a commit that referenced this pull request Jun 15, 2023
PRs #2418 and #2419 add per-route and per-backend request timeouts
configured by the `OutboundPolicies` API to the `MatchedRoute` and
`MatchedBackend` layers in the outbound `ClientPolicy` stack,
respectively. This means that — unlike in the `ServiceProfile` stack —
two separate request timeouts can be configured in `ClientPolicy`
stacks. However, because both the `MatchedRoute` and `MatchedBackend`
layers are in the HTTP logical stack, the errors emitted by both
timeouts will have a `LogicalError` as their most specific error
metadata, meaning that the log messages and `l5d-proxy-error` headers
recorded for these timeouts do not indicate whether the timeout that
failed the request was the route request timeout or the backend request
timeout.

In order to ensure this information is recorded and exposed to the user,
this branch adds two new error wrapper types, one of which enriches an
error with a `RouteRef`'s metadata, and one of which enriches an error
with a `BackendRef`'s metadata. The `MatchedRoute` stack now wraps all
errors with `RouteRef` metadata, and the `MatchedBackend` stack wraps
errors with `BackendRef` metadata. This way, when the route timeout
fails a request, the error will include the route metadata, while when
the backend request timeout fails a request, the error will include both
the route and backend metadata.

Adding these new error wrappers also has the additional side benefit of
adding this metadata to errors returned by filters, allowing users to
distinguish between errors emitted by a filter on a route rule and
errors emitted by a per-backend filter. Also, any other errors emitted
lower in the stack for requests that are handled by a client policy
stack will now also include this metadata, which seems generally useful.

Example errors, taken from a proxy unit test:

backend request:
```
logical service logical.test.svc.cluster.local:666: route httproute.test.timeout-route: backend service.test.test-svc:666: HTTP response timeout after 1s
```
route request:
```
logical service logical.test.svc.cluster.local:666: route httproute.test.timeout-route: HTTP response timeout after 2s
```
risingspiral pushed a commit to linkerd/linkerd2 that referenced this pull request Jun 15, 2023
PRs #2418 and #2419 add per-route and per-backend request timeouts
configured by the `OutboundPolicies` API to the `MatchedRoute` and
`MatchedBackend` layers in the outbound `ClientPolicy` stack,
respectively. This means that — unlike in the `ServiceProfile` stack —
two separate request timeouts can be configured in `ClientPolicy`
stacks. However, because both the `MatchedRoute` and `MatchedBackend`
layers are in the HTTP logical stack, the errors emitted by both
timeouts will have a `LogicalError` as their most specific error
metadata, meaning that the log messages and `l5d-proxy-error` headers
recorded for these timeouts do not indicate whether the timeout that
failed the request was the route request timeout or the backend request
timeout.

In order to ensure this information is recorded and exposed to the user,
this branch adds two new error wrapper types, one of which enriches an
error with a `RouteRef`'s metadata, and one of which enriches an error
with a `BackendRef`'s metadata. The `MatchedRoute` stack now wraps all
errors with `RouteRef` metadata, and the `MatchedBackend` stack wraps
errors with `BackendRef` metadata. This way, when the route timeout
fails a request, the error will include the route metadata, while when
the backend request timeout fails a request, the error will include both
the route and backend metadata.

Adding these new error wrappers also has the additional side benefit of
adding this metadata to errors returned by filters, allowing users to
distinguish between errors emitted by a filter on a route rule and
errors emitted by a per-backend filter. Also, any other errors emitted
lower in the stack for requests that are handled by a client policy
stack will now also include this metadata, which seems generally useful.

Example errors, taken from a proxy unit test:

backend request:
```
logical service logical.test.svc.cluster.local:666: route httproute.test.timeout-route: backend service.test.test-svc:666: HTTP response timeout after 1s
```
route request:
```
logical service logical.test.svc.cluster.local:666: route httproute.test.timeout-route: HTTP response timeout after 2s
```

---

* recover: remove unused `mut` (linkerd/linkerd2-proxy#2425)
* outbound: implement `OutboundPolicies` route request timeouts (linkerd/linkerd2-proxy#2418)
* build(deps): bump tj-actions/changed-files from 35.7.7 to 36.2.1 (linkerd/linkerd2-proxy#2427)
* outbound: implement `OutboundPolicies` backend request timeouts (linkerd/linkerd2-proxy#2419)
* outbound: add backend and route metadata to errors (linkerd/linkerd2-proxy#2428)

Signed-off-by: Eric Anderson <[email protected]>
risingspiral pushed a commit to linkerd/linkerd2 that referenced this pull request Jun 15, 2023
Updating proxy version on main

PRs #2418 and #2419 add per-route and per-backend request timeouts
configured by the `OutboundPolicies` API to the `MatchedRoute` and
`MatchedBackend` layers in the outbound `ClientPolicy` stack,
respectively. This means that — unlike in the `ServiceProfile` stack —
two separate request timeouts can be configured in `ClientPolicy`
stacks. However, because both the `MatchedRoute` and `MatchedBackend`
layers are in the HTTP logical stack, the errors emitted by both
timeouts will have a `LogicalError` as their most specific error
metadata, meaning that the log messages and `l5d-proxy-error` headers
recorded for these timeouts do not indicate whether the timeout that
failed the request was the route request timeout or the backend request
timeout.

In order to ensure this information is recorded and exposed to the user,
this branch adds two new error wrapper types, one of which enriches an
error with a `RouteRef`'s metadata, and one of which enriches an error
with a `BackendRef`'s metadata. The `MatchedRoute` stack now wraps all
errors with `RouteRef` metadata, and the `MatchedBackend` stack wraps
errors with `BackendRef` metadata. This way, when the route timeout
fails a request, the error will include the route metadata, while when
the backend request timeout fails a request, the error will include both
the route and backend metadata.

Adding these new error wrappers also has the additional side benefit of
adding this metadata to errors returned by filters, allowing users to
distinguish between errors emitted by a filter on a route rule and
errors emitted by a per-backend filter. Also, any other errors emitted
lower in the stack for requests that are handled by a client policy
stack will now also include this metadata, which seems generally useful.

Example errors, taken from a proxy unit test:

backend request:
```
logical service logical.test.svc.cluster.local:666: route httproute.test.timeout-route: backend service.test.test-svc:666: HTTP response timeout after 1s
```
route request:
```
logical service logical.test.svc.cluster.local:666: route httproute.test.timeout-route: HTTP response timeout after 2s
```

---

* recover: remove unused `mut` (linkerd/linkerd2-proxy#2425)
* outbound: implement `OutboundPolicies` route request timeouts (linkerd/linkerd2-proxy#2418)
* build(deps): bump tj-actions/changed-files from 35.7.7 to 36.2.1 (linkerd/linkerd2-proxy#2427)
* outbound: implement `OutboundPolicies` backend request timeouts (linkerd/linkerd2-proxy#2419)
* outbound: add backend and route metadata to errors (linkerd/linkerd2-proxy#2428)

Signed-off-by: Eric Anderson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants