in_syslog: do batching to reduce ring buffer pressure under load [v4.0]#10636
Merged
The syslog input plugin previously processed and encoded records one by one,
resulting in frequent writes to the ring buffer when running in threaded mode. This
was particularly inefficient in high-load environments, where it could lead to
ring buffer saturation and increased retry pressure.
This change introduces encoder batching by resetting the encoder before parsing
each message and flushing the accumulated records after processing. The new behavior
improves throughput and reduces contention on the ring buffer.
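To illustrate the idea outside of the Fluent Bit code base, here is a minimal sketch of per-record flushing versus batched flushing. It is not the plugin's implementation: `RingBuffer`, `Encoder`, `ingest_per_record` and `ingest_batched` are hypothetical names, and the sketch assumes the batch boundary is one chunk of lines received from a connection.

```python
from dataclasses import dataclass, field


@dataclass
class RingBuffer:
    """Hypothetical stand-in for the input ring buffer; it only counts writes."""
    writes: int = 0
    chunks: list = field(default_factory=list)

    def write(self, payload: bytes) -> None:
        self.writes += 1
        self.chunks.append(payload)


class Encoder:
    """Hypothetical stand-in for a record encoder with an internal output buffer."""

    def __init__(self) -> None:
        self.buffer = bytearray()

    def reset(self) -> None:
        self.buffer.clear()

    def encode(self, record: str) -> None:
        self.buffer += record.encode("utf-8") + b"\n"


def ingest_per_record(lines, ring: RingBuffer) -> None:
    """Old pattern: encode one record, then write it to the ring buffer right away."""
    enc = Encoder()
    for line in lines:
        enc.reset()
        enc.encode(line)
        ring.write(bytes(enc.buffer))


def ingest_batched(lines, ring: RingBuffer) -> None:
    """Batched pattern: reset once, accumulate all records, flush once at the end."""
    enc = Encoder()
    enc.reset()
    for line in lines:
        enc.encode(line)
    if enc.buffer:  # nothing to flush for an empty chunk
        ring.write(bytes(enc.buffer))


if __name__ == "__main__":
    chunk = [f"<13>host app: message {i}" for i in range(300)]
    per_record, batched = RingBuffer(), RingBuffer()
    ingest_per_record(chunk, per_record)
    ingest_batched(chunk, batched)
    print("per-record writes:", per_record.writes)
    print("batched writes:   ", batched.writes)
```

Both paths hand the same bytes to the ring buffer; only the number of write operations differs, which is the source of the reduced contention described above.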
Before:
./flb-tcp-writer -c 100 -d ../../../fluent-bit/issues/sc-141555/2mb.json -p `pidof fluent-bit` -i 1000
records write (b) write secs | % cpu user (ms) sys (ms) Mem (bytes) Mem
-------- ---------- -------- ----- + ------ --------- -------- ----------- -------
100000 209000 204.10K 1.00 | 0.00 0 0 18350080 17.50M
200000 21109000 20.13M 1.01 | 61.64 590 30 29167616 27.82M
300000 42009000 40.06M 1.01 | 116.68 1110 70 52985856 50.53M
400000 62909000 59.99M 1.01 | 116.72 1140 40 53379072 50.91M
500000 83809000 79.93M 1.01 | 116.53 1140 40 53379072 50.91M
600000 104709000 99.86M 1.01 | 117.28 1140 50 53379072 50.91M
700000 125609000 119.79M 1.01 | 116.65 1150 30 53379072 50.91M
800000 146509000 139.72M 1.01 | 118.00 1140 50 53379072 50.91M
900000 167409000 159.65M 3.07 | 116.64 3450 130 53379072 50.91M
1000000 188309000 179.59M 3.40 | 116.79 3840 130 53379072 50.91M
0 0 0 b 1.00 | 117.99 1140 40 53379072 50.91M
0 0 0 b 1.00 | 115.98 1120 40 53379072 50.91M
0 0 0 b 1.00 | 117.99 1130 50 53379072 50.91M
0 0 0 b 1.00 | 115.99 1130 30 53379072 50.91M
0 0 0 b 1.00 | 116.99 1130 40 53379072 50.91M
0 0 0 b 1.00 | 117.98 1130 50 53379072 50.91M
0 0 0 b 1.00 | 115.98 1130 30 53379072 50.91M
0 0 0 b 1.00 | 115.99 1120 40 53379072 50.91M
0 0 0 b 1.00 | 116.98 1140 30 53641216 51.16M
0 0 0 b 1.00 | 57.97 560 20 53641216 51.16M
0 0 0 b 1.00 | 0.00 0 0 53641216 51.16M
0 0 0 b 1.00 | 0.00 0 0 53641216 51.16M
0 0 0 b 1.00 | 1.00 10 0 53641216 51.16M
0 0 0 b 1.00 | 0.00 0 0 53641216 51.16M
0 0 0 b 1.00 | 0.00 0 0 53641216 51.16M
0 0 0 b 1.00 | 0.00 0 0 53641216 51.16M
- Summary
- Process : fluent-bit
- PID : 1476802
- Elapsed Time: 27.55 seconds
- Avg Memory : 48.80M
- Avg CPU : 105.39%
- Avg Rate : 29.42M/sec
- Avg Records : 175.81K/sec
Ring buffer metrics (before)
curl -s http://127.0.0.1:2020/api/v2/metrics/prometheus|grep ring_buffer
# HELP fluentbit_input_ring_buffer_writes_total Number of ring buffer writes.
# TYPE fluentbit_input_ring_buffer_writes_total counter
fluentbit_input_ring_buffer_writes_total{name="syslog.0"} 4510000
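For repeated runs it can be handy to pull the counter out of the metrics text programmatically instead of using `grep`. The following is a small sketch that parses the output shown above; `ring_buffer_writes` is a hypothetical helper name, and the parser only handles the simple line shape seen here (metric name, labels in braces, value, optional timestamp).

```python
SAMPLE = """\
# HELP fluentbit_input_ring_buffer_writes_total Number of ring buffer writes.
# TYPE fluentbit_input_ring_buffer_writes_total counter
fluentbit_input_ring_buffer_writes_total{name="syslog.0"} 4510000
"""

METRIC_PREFIX = "fluentbit_input_ring_buffer_writes_total{"


def ring_buffer_writes(metrics_text: str, name: str) -> int:
    """Return the ring buffer write counter for the input instance `name`."""
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line.startswith(METRIC_PREFIX):
            continue  # skips HELP/TYPE comments and unrelated metrics
        labels, _, rest = line[len(METRIC_PREFIX):].partition("}")
        if f'name="{name}"' not in labels.split(","):
            continue
        # The first field after the labels is the value; a timestamp may follow.
        return int(float(rest.split()[0]))
    raise KeyError(f"no ring buffer write counter found for {name!r}")


if __name__ == "__main__":
    print(ring_buffer_writes(SAMPLE, "syslog.0"))
```

The text passed in would normally be the body returned by the `/api/v2/metrics/prometheus` endpoint used in the `curl` command above.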
After this patch
./flb-tcp-writer -c 100 -d ../../../fluent-bit/issues/sc-141555/2mb.json -p `pidof fluent-bit` -i 1000
records write (b) write secs | % cpu user (ms) sys (ms) Mem (bytes) Mem
-------- ---------- -------- ----- + ------ --------- -------- ----------- -------
100000 209000 204.10K 1.00 | 0.00 0 0 18313216 17.46M
200000 21109000 20.13M 1.01 | 48.72 460 30 40505344 38.63M
300000 42009000 40.06M 1.01 | 91.01 880 40 59801600 57.03M
400000 62909000 59.99M 1.01 | 100.73 990 30 58724352 56.00M
500000 83809000 79.93M 1.01 | 101.69 990 40 58740736 56.02M
600000 104709000 99.86M 1.02 | 99.49 980 30 59826176 57.05M
700000 125609000 119.79M 1.01 | 101.70 990 40 60760064 57.95M
800000 146509000 139.72M 1.01 | 101.04 980 40 61431808 58.59M
900000 167409000 159.65M 1.01 | 101.10 980 40 57200640 54.55M
1000000 188309000 179.59M 5.18 | 100.95 5060 170 59154432 56.41M
0 0 0 b 1.00 | 99.99 970 30 59170816 56.43M
0 0 0 b 1.00 | 101.98 980 40 59318272 56.57M
0 0 0 b 1.00 | 100.99 980 30 59207680 56.46M
0 0 0 b 1.00 | 100.98 980 30 59224064 56.48M
0 0 0 b 1.00 | 99.98 970 30 59228160 56.48M
0 0 0 b 1.00 | 101.98 980 40 59244544 56.50M
0 0 0 b 1.00 | 100.99 980 30 59125760 56.39M
0 0 0 b 1.00 | 68.98 670 20 61034496 58.21M
0 0 0 b 1.00 | 1.00 10 0 61034496 58.21M
0 0 0 b 1.00 | 0.00 0 0 61034496 58.21M
0 0 0 b 1.00 | 0.00 0 0 61034496 58.21M
0 0 0 b 1.00 | 0.00 0 0 61034496 58.21M
- Summary
- Process : fluent-bit
- PID : 1480874
- Elapsed Time: 23.27 seconds
- Avg Memory : 54.37M
- Avg CPU : 90.18%
- Avg Rate : 34.21M/sec
- Avg Records : 204.43K/sec
Ring buffer metrics (after)
curl -s http://127.0.0.1:2020/api/v2/metrics/prometheus|grep ring_buffer
# HELP fluentbit_input_ring_buffer_writes_total Number of ring buffer writes.
# TYPE fluentbit_input_ring_buffer_writes_total counter
fluentbit_input_ring_buffer_writes_total{name="syslog.0"} 15000
In short:
Less CPU (105.39% to 90.18% on average), fewer memory allocations and fewer writes to the ring buffer (4510000 to 15000) = higher performance (175.81K to 204.43K records/sec).
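The relative changes between the two runs can be worked out directly from the summary blocks and the ring buffer counters above. The snippet below is only arithmetic on those reported numbers; the dictionary keys and the `pct_change` helper are names chosen for this sketch.

```python
# Values copied from the "Summary" blocks and ring buffer metrics above.
before = {
    "elapsed_seconds": 27.55,
    "avg_memory_m": 48.80,
    "avg_cpu_percent": 105.39,
    "avg_rate_m_per_sec": 29.42,
    "avg_records_k_per_sec": 175.81,
    "ring_buffer_writes": 4_510_000,
}
after = {
    "elapsed_seconds": 23.27,
    "avg_memory_m": 54.37,
    "avg_cpu_percent": 90.18,
    "avg_rate_m_per_sec": 34.21,
    "avg_records_k_per_sec": 204.43,
    "ring_buffer_writes": 15_000,
}


def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


for key in before:
    change = pct_change(before[key], after[key])
    print(f"{key:24s} {before[key]:>10} -> {after[key]:>10}  ({change:+.2f}%)")

# How many times fewer ring buffer writes the patched run performed.
write_ratio = before["ring_buffer_writes"] / after["ring_buffer_writes"]
print(f"ring buffer writes reduced by a factor of about {write_ratio:.0f}")
```

Note that the average memory reported by the tool is higher in the patched run, so the memory claim in the summary presumably refers to the number of allocations rather than the resident footprint.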
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Fluent Bit is licensed under Apache 2.0. By submitting this pull request, I understand that this code will be released under the terms of that license.