Description
When IO threads are running, they use a busy-wait loop to wait for jobs. As a result, they run at 100% CPU all the time, and CPU utilization is not a useful metric for understanding their load. The problem is explained in this issue:
I propose a metric, expressed as a percentage, that shows how much of its time an IO thread actually spends doing useful work as opposed to busy-waiting.
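To illustrate the idea, here is a minimal sketch of how a busy-wait loop could account for useful time separately from spinning. It is written in Python for brevity and is not the server's implementation; `IOThreadStats`, `run_job`, and `poll_once` are hypothetical names introduced only for this example.

```python
import time
from collections import deque


class IOThreadStats:
    """Hypothetical per-thread counters: uptime and time spent on jobs."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._started = clock()
        self.useful_seconds = 0.0

    def run_job(self, job):
        # Only the time spent inside the job counts as useful work.
        begin = self._clock()
        try:
            return job()
        finally:
            self.useful_seconds += self._clock() - begin

    def uptime_seconds(self):
        return self._clock() - self._started


def poll_once(queue, stats):
    """One iteration of a busy-wait loop: run a job if one is queued."""
    if queue:
        stats.run_job(queue.popleft())
        return True
    # Nothing to do: the thread keeps spinning, which burns CPU
    # but adds nothing to useful_seconds.
    return False
```

The ratio `useful_seconds / uptime_seconds()` is then the fraction of time the thread did real work, even though the thread itself shows 100% CPU.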
We could add fields for IO threads under the "CPU" INFO section. The existing fields there report CPU time in seconds, as given by the getrusage() system library function. For example, we have the sys and user CPU time of the main thread:
used_cpu_sys_main_thread:123456.123456
used_cpu_user_main_thread:234567.123456
Since IO threads run at 100% CPU, there's not much point in presenting their CPU time. I suggest we instead report useful time (work time) in seconds. For that to be meaningful, we also need to report each IO thread's uptime, so the caller can compute a percentage over any time window they choose by comparing two samples.
uptime_io_thread_1:1234.56789
useful_io_thread_1:291.234567
uptime_io_thread_2:1234.56789
useful_io_thread_2:321.098765
...
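As a sketch of how a caller might use these fields, the following Python computes utilization over the window between two INFO samples. The field names are the ones proposed above and do not exist yet; `parse_info` and `io_thread_utilization` are hypothetical helpers written for this example.

```python
def parse_info(text):
    """Parse 'key:value' lines from INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def io_thread_utilization(earlier, later, thread_id):
    """Percentage of the window that the given IO thread spent on useful work."""
    up_key = f"uptime_io_thread_{thread_id}"
    useful_key = f"useful_io_thread_{thread_id}"
    d_uptime = float(later[up_key]) - float(earlier[up_key])
    d_useful = float(later[useful_key]) - float(earlier[useful_key])
    if d_uptime <= 0:
        raise ValueError("the later sample must have a larger uptime")
    return 100.0 * d_useful / d_uptime
```

For example, if the uptime of thread 1 advances by 10 seconds between two samples and its useful time advances by 6 seconds, the utilization over that window is 60%.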
Alternatively, we could present something simpler, such as a single INFO field with a percentage for each IO thread. For this, though, the server would need to compute the percentage over some specified time window.
io_threads_utilization:60%,39%,67%,61%
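One possible way to produce such a field is to sample each thread's counters periodically and compare the newest sample with one that is roughly a window old. The sketch below assumes that approach; the window length, the sampling interval, and the names `WindowedUtilization` and `format_field` are all assumptions made for illustration, not part of the proposal.

```python
from collections import deque


class WindowedUtilization:
    """Tracks (uptime, useful) samples for one IO thread over a time window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._samples = deque()  # oldest sample first

    def add_sample(self, uptime, useful):
        self._samples.append((uptime, useful))
        # Drop the oldest sample while the next one is still old enough
        # to cover the whole window.
        while (len(self._samples) > 2
               and uptime - self._samples[1][0] >= self.window_seconds):
            self._samples.popleft()

    def percent(self):
        if len(self._samples) < 2:
            return 0
        (up0, use0), (up1, use1) = self._samples[0], self._samples[-1]
        if up1 <= up0:
            return 0
        return round(100.0 * (use1 - use0) / (up1 - up0))


def format_field(trackers):
    """Render one percentage per IO thread in the proposed field format."""
    return "io_threads_utilization:" + ",".join(f"{t.percent()}%" for t in trackers)
```

The trade-off compared with the uptime/useful pair is that the window is chosen by the server rather than by the caller.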