-
Notifications
You must be signed in to change notification settings - Fork 4.1k
storage: propagate tracing info through Raft proposals #39431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
nvb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
though I'm still a little confused why we're favoring the tracing of entry application over the tracing of entry sequencing in the Raft log. I would expect that appending to the Raft log would be the more interesting place to inject tracing on followers because it's synchronous with the client write.
I guess it's a little harder to pull out tracing spans at that stage because we don't decode the Raft command. Still, I'm curious whether there are any plans to do something like that?
Reviewed 5 of 5 files at r1.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @andreimatei)
pkg/storage/replica_application_decoder.go, line 139 at r1 (raw file):
for it.init(&d.cmdBuf); it.Valid(); it.Next() { cmd := it.cur() parentCtx := ctx
Get rid of this variable.
pkg/storage/replica_application_decoder.go, line 142 at r1 (raw file):
if cmd.IsLocal() { parentCtx = cmd.proposal.ctx cmd.ctx, cmd.sp = tracing.ForkCtxSpan(parentCtx, "raft application")
Pull "raft application" into const opName = "raft application"
andreimatei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
I guess it's a little harder to pull out tracing spans at that stage because we don't decode the Raft command. Still, I'm curious whether there are any plans to do something like that?
Now there is :P
Once this merges, I'll file an issue for what you want.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
pkg/storage/replica_application_decoder.go, line 139 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Get rid of this variable.
done
pkg/storage/replica_application_decoder.go, line 142 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Pull "raft application" into
const opName = "raft application"
done
Build failed |
bd5451a to
85dda1b
Compare
andreimatei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey @nvanbenschoten and @ajwerner , check out the increase in proposal size showed by TestProposalOverhead. We all still cool on this? (I am)
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
50 bytes of overhead to 133 bytes per Raft entry is a major regression. We need a lot to justify that kind of change. I thought this was only going to hurt when tracing is enabled. |
andreimatei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this was only going to hurt when tracing is enabled.
yeah now that I think about it, that's the intention. Maybe getTraceData() needs to handle the noop span... I'll see.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
andreimatei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note to self: TestProposalOverhead fails because tracing is enabled for its purposes because nobody sets trace.debug.enable for the test's testContext.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
andreimatei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note to self: let's default trace.debug.enable to false and stop this nonsense
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
This patch propagates the SpanContext of the proposer to all the replicas, which then decode it and create a "follows from" child span for the application. The local replica does not use this, since it already had direct access to the proposer's ctx and span. Now tracing utilities will show us all the command applications for a proposal tied to each other and we'll profit. Release note: None
andreimatei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now trace.debug.enable defaults to false and the test correctly shows no overhead when tracing is off.
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
This patch propagates the SpanContext of the proposer to all the
replicas, which then decode it and create a "follows from" child span
for the application. The local replica does not use this, since it
already had direct access to the proposer's ctx and span.
Now tracing utilities will show us all the command applications for a
proposal tied to each other and we'll profit.
Release note: None