Skip to content

Add support for gloo#179

Merged
stefanprodan merged 12 commits intofluxcd:masterfrom
yuval-k:gloo2
May 17, 2019
Merged

Add support for gloo#179
stefanprodan merged 12 commits intofluxcd:masterfrom
yuval-k:gloo2

Conversation

@yuval-k
Copy link
Contributor

@yuval-k yuval-k commented May 10, 2019

This PR adds gloo\flagger integration.

It uses the gloo UpstreamGroup object. The advantage of using this is that the routes need only be updated in one place, and it will apply to all routes that point to that upstream group.

@yuval-k yuval-k marked this pull request as ready for review May 15, 2019 00:01
@stefanprodan
Copy link
Member

stefanprodan commented May 15, 2019

@yuval-k I'm having a hard time testing this, the Gloo gateway proxy is buggy, most of the time the /metrics endpoint returns 503.

Prometheus fails to scrape Envoy, at most one call in ten will make it:

Screenshot 2019-05-15 at 12 56 10

Same issue from inside the pod:

kubectl -n gloo-system exec -it gateway-proxy-5fc9c76b84-tgfst sh

/ # wget -q -O - localhost:8081/metrics
wget: server returned error: HTTP/1.1 503 Service Unavailable

I've reproduce this on several clusters, my conclusion is that the metrics endpoint returns 503 on 90% of the request. Both v0.13.27 and v0.13.28 have this issue.

@yuval-k
Copy link
Contributor Author

yuval-k commented May 15, 2019

Weird! I'll look in to the failure in the /metrics - thank you for letting me know.

@yuval-k
Copy link
Contributor Author

yuval-k commented May 16, 2019

metrics issue should be solved (it was access to admin port in general)

@yuval-k
Copy link
Contributor Author

yuval-k commented May 16, 2019

not sure why the travis job failed - maybe because lyft/protoc-gen-validate is now envoyproxy/protoc-gen-validate?

@stefanprodan
Copy link
Member

stefanprodan commented May 16, 2019

I see that you’ve changed the metrics and the tests are failing. Run make test locally to see what’s failing.

@yuval-k
Copy link
Contributor Author

yuval-k commented May 16, 2019

good point!

@stefanprodan
Copy link
Member

@yuval-k there is one more test failing:

--- FAIL: TestGlooRouter_Sync (0.00s)
    gloo_test.go:39: Got RoutingRule Destinations 2 wanted 1

@yuval-k
Copy link
Contributor Author

yuval-k commented May 16, 2019

thanks for the pointer!

@stefanprodan
Copy link
Member

Is it possible to add support for A/B testing with headers and cookies filters? I've seen that Gloo has this capability but not with UpstreamGroup.

@yuval-k
Copy link
Contributor Author

yuval-k commented May 16, 2019

that's a good question.. here's the trade-off:
the goal of an UpstreamGroup is to separation of concerns; flagger owns the UpstreamGroup, and the operator owns the virtual service. so the operator only needs to route to the upstream group and flagger will take care of the rest. That way flagger doesn't need to edit the virtual service, and every route that points to the upstream group will be routed to the canary automatically.

So if we want to keep this separation of concerns, we would have to add header matching to upstream group. This will take time to implement in gloo, as now we'll have one route translating to multiple envoy routes, which is not something gloo currently supports.

An alternative would be to edit the VirtualService, but then there's a potential for error in understanding which route are own by flagger and which by the operator.

@stefanprodan
Copy link
Member

Right! I think it makes sense to use delegation as in allow Flagger to own an object that's referenced in the main VirtualService and not operate on the VirtualService at all.

@yuval-k
Copy link
Contributor Author

yuval-k commented May 16, 2019

I agree - I think upstream groups are the better way to go. the problem is that adding the matching to an UpstreamGroup is non-trivial, as it requires two routes on the envoy level; while the current gloo implementation maps one gloo route to one envoy route.

Copy link
Member

@stefanprodan stefanprodan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I'll write the install docs and a short tutorial

@stefanprodan stefanprodan merged commit 7b9df74 into fluxcd:master May 17, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants