-
Notifications
You must be signed in to change notification settings - Fork 1.3k
⚠️ fix redundant values in log entries #2074
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -23,7 +23,6 @@ import ( | |||||
| "github.com/go-logr/logr" | ||||||
| metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" | ||||||
| "k8s.io/apimachinery/pkg/runtime/schema" | ||||||
| "k8s.io/klog/v2" | ||||||
|
|
||||||
| "sigs.k8s.io/controller-runtime/pkg/client" | ||||||
| "sigs.k8s.io/controller-runtime/pkg/client/apiutil" | ||||||
|
|
@@ -319,7 +318,6 @@ func (blder *Builder) doController(r reconcile.Reconciler) error { | |||||
| log := log | ||||||
| if req != nil { | ||||||
| log = log.WithValues( | ||||||
| gvk.Kind, klog.KRef(req.Namespace, req.Name), | ||||||
| "namespace", req.Namespace, "name", req.Name, | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
We should probably remove this line instead?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @vincepri thanks for your time to review this. If I understand correctly, your proposition would result in something like this: We would also have to remove this line. Otherwise we would still have redundant information in the log line. This would produce: In comparison my change: If you are using indexing software like Splunk or Elastic Search, it would be harder to write queries for your proposition, since the Kind ( |
||||||
| ) | ||||||
| } | ||||||
|
|
||||||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentional btw, we wanted to not break existing users who rely on these fields and allow it to just work for ppl who rely on the klog field naming. Is there any issue with the redundancy? Most log shippers allow to drop fields if you are concerned about the size.
This is a breaking change btw since it might break log indexing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, if we consider log output as part of the controller-runtime's API, this is a breaking change. But if we can agree, that for fixing a 'bug', it is allowed to introduce a breaking change, then let me elaborate why I think this is a bug:
Using a value as key is the main issue I have with the above log entry. I think this is an anti pattern in structured logging. Let's take http requests as analogy. To log http requests with status codes we probably would log something like:
and usually not:
The above code would be ok in my opinion, if we rewrite it to something like:
But if we change it, we could remove it altogether.
Additionally it is probably easier to create search queries with the fields
controllerKind,namespaceandnamethan with the klog field nameCertificatewhich changes for each kind.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the general sentiment to not put values into keys, but it has the advantage that you can search for logs of related to a given object type + name across controllers: #1826 (comment)
We basically had the same discussion we are having here on the issue before this was introduced
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you refer to this quote:
The query suggested by @sbueringer is still possible with:
And for just one controller:
So to summarize pros ✔️ and cons ❌:
If we merge this PR:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zbindenren I agree with your reasoning in principle, but the problem is that we want to do this the same way as upstream does so log queries will work for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akutz and @sbueringer were the ones who cared about the previous format, lets give them a chance to voice their opinion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, @pohly not Andrew
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the very late response, I have to get my notifications a bit more under control.
As stated above the current format was intended as I described in: #1826
Initially we used kind also with a first lower-case character but later on we realized that we can't do this correctly for arbitrary kind's:
So yes we are not 100% consistent with upstream, but as far as we can be.
Independent of consistency, my reason for wanting the current format is the following:
In ClusterAPI we have various controllers and they are all reconciling related objects.
With the current loggin we end up with logs like this: (re-ordered keys to make it easier to understand)
MachineSet controller:
{"Cluster":{"name":"quick-start-h0uf1w","namespace":"quick-start-u7zz7r"}, "MachineDeployment":{"name":"quick-start-h0uf1w-md-0","namespace":"quick-start-u7zz7r"}, "MachineSet":{"name":"quick-start-h0uf1w-md-0-6c856bb876","namespace":"quick-start-u7zz7r"},"ts":1673599028821.286,"caller":"machineset/machineset_controller.go:402","msg":"MachineSet is scaling up to 1 replicas by creating 1 machines","v":0,"controller":"machineset","controllerGroup":"cluster.x-k8s.io","controllerKind":"MachineSet","namespace":"quick-start-u7zz7r","name":"quick-start-h0uf1w-md-0-6c856bb876","reconcileID":"7f5cf231-61cb-4cb6-8d28-ac02d709d1ff","replicas":1,"machineCount":0}Cluster controller
{"Cluster":{"name":"clusterclass-changes-pbb08r","namespace":"clusterclass-changes-xo727h"},"ts":1673599028749.3906,"caller":"cluster/cluster_controller.go:187","msg":"Waiting for the topology to be generated","v":0,"controller":"cluster","controllerGroup":"cluster.x-k8s.io","controllerKind":"Cluster","namespace":"clusterclass-changes-xo727h","name":"clusterclass-changes-pbb08r","reconcileID":"bd128734-0120-4035-964b-e36d0624351a"}So essentially across controllers we are adding key/value pairs of the objects involved, like
{"Cluster":{"name":"clusterclass-changes-pbb08r","namespace":"clusterclass-changes-xo727h"},"MachineDeployment":{"name":"quick-start-h0uf1w-md-0","namespace":"quick-start-u7zz7r"}...This makes it very easy to query logs across controller, by just querying for log entries with
Cluster.name == <cluster-name>If we just would want to query the logs of a single controller you're format would work as well, but for cross-refencing it makes the queries very complicated in my opinion.
Regarding using values as keys
I don't know where that comes from and what the background is. Is it the same concern about cardinality with labels in metrics? Essentially if you use values as log keys you can end up with very high cardinality?
I'm not sure that this is true in this case, because the amount of kind's that we are reconciling shouldn't be that excessive to become a problem.
(also described here: #1826 (comment))
Regarding redundancy and easier to ready
I agree if folks are reading logs via kubectl less redundancy would indeed help. I think via Kibana / Grafana / .. it doesn't make a difference.
P.S. Just to make sure we're on the same page. We are discussing what the default / out-of-the box behavior should be. It is always possible to use different k/v pairs by setting
LogConstructor.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your answer.
Regarding your reason for wanting the current format
Your query could also be created without the redundant key like:
The above query would show the logs for multiple controllers. If you want to limit the controller you would need to do the following:
or
To me this doesn't look to complicated. Or do I misunderstand something?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
controllerKindis different for each controller so I can't usecontrollerKind=Clusterto filter for logs related to a Cluster across controllers.controllerKind=Cluster name=<cluster-name>would only match the logs from the Cluster controller but not the logs from the MachineSet controllerEssentially in the MachineSet controller the controllerKind is MachineSet and also name and namespace are from a MachineSet. In the Cluster controller the controllerKind is Cluster.