[translator/azurelogs] Fix missing data when ingesting logs without properties field #44266
Conversation
…les (#44267)

Description
This PR regenerates the `*_expected.yaml` testing files using `golden.WriteLogs(t, expectedFile, logs)` in `resourcelogs_to_logs_test.go`.

Link to tracking issue
There is no issue for this. I'm putting it in a separate PR so it is clearer that #44266 does not break the other existing log categories.
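For context, a rough sketch of what that regeneration step can look like with the `golden` helper; the helper function, the unmarshaler parameter, and the fixture paths are assumptions, not the actual test code.

```go
package azurelogs_test

import (
	"os"
	"testing"

	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/golden"
	"github.com/stretchr/testify/require"
	"go.opentelemetry.io/collector/pdata/plog"
)

// regenerateExpected re-unmarshals an input fixture and writes the result back
// to its *_expected.yaml golden file. The unmarshaler is passed in because its
// construction is package-specific.
func regenerateExpected(t *testing.T, u plog.Unmarshaler, inputFile, expectedFile string) {
	buf, err := os.ReadFile(inputFile)
	require.NoError(t, err)

	logs, err := u.UnmarshalLogs(buf)
	require.NoError(t, err)

	// Refresh the golden data that later test runs compare against.
	golden.WriteLogs(t, expectedFile, logs)
}
```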
- body:
    kvlistValue:
      values:
        - key: operation.name
Rerun resourcelogs_to_logs_test.go to generate all -expected.yaml files to make sure this PR does not introduce unexpected changes.
MikeGoldsmith left a comment
I'm a little hesitant to parse all unknown fields into properties in this receiver. I think this could lead to significant resource cost, especially with the map[string]any.
Instead, I'd prefer to capture the raw log strings when we fail to parse a record as an Azure log record, and then recommend that users add a transform processor or similar to extract the attributes onto the log message.
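To make that suggestion concrete, here is a rough sketch of the raw-string fallback; the struct fields and the function itself are illustrative, not the translator's actual code.

```go
package azurelogs

import (
	"encoding/json"

	"go.opentelemetry.io/collector/pdata/plog"
)

// azureLogRecord stands in for the translator's typed record; its fields are
// illustrative only.
type azureLogRecord struct {
	Time          string `json:"time"`
	Category      string `json:"category"`
	OperationName string `json:"operationName"`
}

// appendRecord is a sketch of the fallback: if a record does not parse as a
// known Azure log shape, keep the raw JSON string as the log body so a
// downstream transform processor can extract attributes later.
func appendRecord(raw json.RawMessage, dest plog.LogRecordSlice) {
	var rec azureLogRecord
	if err := json.Unmarshal(raw, &rec); err != nil {
		lr := dest.AppendEmpty()
		lr.Body().SetStr(string(raw)) // raw fallback, no attribute expansion
		return
	}

	lr := dest.AppendEmpty()
	lr.Attributes().PutStr("azure.category", rec.Category) // placeholder mapping
	lr.Body().SetStr(rec.OperationName)                    // placeholder mapping
}
```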
Records []json.RawMessage `json:"records"`
}
if err := json.Unmarshal(buf, &rawRecords); err == nil {
	for i := range azureLogs.Records
Could we use maps.Copy here?
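For reference, a minimal sketch of the `maps.Copy` alternative to a manual key-by-key copy; the function and variable names are illustrative, not the PR's actual identifiers.

```go
package azurelogs

import "maps"

// merge copies src's key/value pairs into dst, overwriting any existing keys.
// It is the standard-library (Go 1.21+) equivalent of the manual loop:
//
//	for k, v := range src {
//	    dst[k] = v
//	}
func merge(dst, src map[string]any) {
	maps.Copy(dst, src)
}
```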
return nil, nil
}

var raw map[string]any
This could get expensive if we're using any for the value.
I agree with this take. We should avoid these as much as possible; `any` is really bad for performance.
I agree with the comments from @MikeGoldsmith; we can't follow this approach, it really hurts performance. Just running the benchmark, we can see that:
$ benchstat old.txt new.txt
goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/translator/azurelogs
cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
UnmarshalLogs/1_record-16 2.758µ ± 33% 19.215µ ± 76% +596.83% (p=0.000 n=10)
UnmarshalLogs/100_record-16 180.7µ ± 10% 1153.8µ ± 12% +538.43% (p=0.000 n=10)
UnmarshalLogs/1000_record-16 1.814m ± 1% 11.383m ± 7% +527.37% (p=0.000 n=10)
geomean 96.70µ 631.9µ +553.51%
│ old.txt │ new.txt │
│ B/op │ B/op vs base │
UnmarshalLogs/1_record-16 1.689Ki ± 0% 8.735Ki ± 0% +417.02% (p=0.000 n=10)
UnmarshalLogs/100_record-16 136.1Ki ± 0% 819.4Ki ± 0% +502.17% (p=0.000 n=10)
UnmarshalLogs/1000_record-16 1.186Mi ± 0% 7.842Mi ± 0% +561.39% (p=0.000 n=10)
geomean 65.35Ki 385.9Ki +490.51%
│ old.txt │ new.txt │
│ allocs/op │ allocs/op vs base │
UnmarshalLogs/1_record-16 34.00 ± 0% 115.00 ± 0% +238.24% (p=0.000 n=10)
UnmarshalLogs/100_record-16 1.731k ± 0% 9.050k ± 0% +422.82% (p=0.000 n=10)
UnmarshalLogs/1000_record-16 17.04k ± 0% 90.10k ± 0% +428.70% (p=0.000 n=10)
geomean 1.001k 4.543k +353.87%
We need to have exact fields; we can't add the data to a new rawRecord field and then parse it there.
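To illustrate the difference being pointed at here, a hedged sketch contrasting the two decode strategies; the struct fields are made up and not the translator's real schema.

```go
package azurelogs

import "encoding/json"

// flowLogRecord is illustrative only; it is not the translator's real schema.
type flowLogRecord struct {
	Time          string `json:"time"`
	Category      string `json:"category"`
	OperationName string `json:"operationName"`
}

// decodeTyped decodes into exact fields: one struct, no per-value interface
// boxing.
func decodeTyped(buf []byte) (flowLogRecord, error) {
	var rec flowLogRecord
	err := json.Unmarshal(buf, &rec)
	return rec, err
}

// decodeGeneric decodes into map[string]any: every value is boxed and nested
// objects allocate fresh maps per record, which is what the benchmark above
// shows blowing up sec/op and allocs/op.
func decodeGeneric(buf []byte) (map[string]any, error) {
	var raw map[string]any
	err := json.Unmarshal(buf, &raw)
	return raw, err
}
```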
Description
This PR fixes the azurelogs translator when ingesting Azure logs without a `properties` field.

Link to tracking issue
Fixes #44222

Testing
In `pkg/translator/azurelogs/resourcelogs_to_logs_test.go` I added a unit test using a VNet flow log as an example (without a `properties` field). The expected output file is in `pkg/translator/azurelogs/testdata/azurevnetflowlog/valid_1_expected.yaml`. A sketch of such a golden-file comparison follows at the end of this description.

Documentation
N/A
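For completeness, a minimal sketch of what such a golden-file comparison can look like with the `golden` and `plogtest` helpers; the helper function and the way the unmarshaler is obtained are assumptions, not the PR's actual test code.

```go
package azurelogs_test

import (
	"os"
	"testing"

	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/golden"
	"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/pdatatest/plogtest"
	"github.com/stretchr/testify/require"
	"go.opentelemetry.io/collector/pdata/plog"
)

// compareAgainstGolden unmarshals an input fixture and checks it against the
// recorded *_expected.yaml file. The unmarshaler is passed in because its
// construction is package-specific.
func compareAgainstGolden(t *testing.T, u plog.Unmarshaler, inputFile, expectedFile string) {
	buf, err := os.ReadFile(inputFile)
	require.NoError(t, err)

	actual, err := u.UnmarshalLogs(buf)
	require.NoError(t, err)

	expected, err := golden.ReadLogs(expectedFile)
	require.NoError(t, err)
	require.NoError(t, plogtest.CompareLogs(expected, actual))
}
```

This would be called with the VNet flow log input fixture and `testdata/azurevnetflowlog/valid_1_expected.yaml`; the input file name itself is not shown in this excerpt.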