cf186201 | 23-Jan-2025 |
Tomas Glozar <tglozar@redhat.com> |
rtla: Report missed event count
Print how many events were missed by trace buffer overflow in the main instance at the end of the run (for hist) or during the run (for top).
Cc: John Kacur <jkacur@
rtla: Report missed event count
Print how many events were missed by trace buffer overflow in the main instance at the end of the run (for hist) or during the run (for top).
Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250123142339.990300-5-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Tested-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
8ccd9d8b | 23-Jan-2025 |
Tomas Glozar <tglozar@redhat.com> |
rtla: Add function to report missed events
Add osnoise_report_missed_events to be used to report the number of missed events either during or after an osnoise or timerlat run. Also, display the perc
rtla: Add function to report missed events
Add osnoise_report_missed_events to be used to report the number of missed events either during or after an osnoise or timerlat run. Also, display the percentage of missed events compared to the total number of received events.
If an unknown number of missed events was reported during the run, the entire number of missed events is reported as unknown.
Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250123142339.990300-4-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
2aee44f7 | 23-Jan-2025 |
Tomas Glozar <tglozar@redhat.com> |
rtla: Count all processed events
Add a field processed_events to struct trace_instance and increment it in collect_registered_events, regardless of whether a handler is registered for the event.
Th
rtla: Count all processed events
Add a field processed_events to struct trace_instance and increment it in collect_registered_events, regardless of whether a handler is registered for the event.
The purpose is to calculate the percentage of events that were missed due to tracefs buffer overflow.
Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250123142339.990300-3-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
d6fcd28f | 23-Jan-2025 |
Tomas Glozar <tglozar@redhat.com> |
rtla: Count missed trace events
Add function collect_missed_events to trace.c to act as a callback for tracefs_follow_missed_events, summing the number of total missed events into a new field missin
rtla: Count missed trace events
Add function collect_missed_events to trace.c to act as a callback for tracefs_follow_missed_events, summing the number of total missed events into a new field missing_events of struct trace_instance.
In case record->missed_events is negative, trace->missed_events is set to UINT64_MAX to signify an unknown number of events was missed.
The callback is activated on initialization of the trace instance.
Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250123142339.990300-2-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
b91cfd9f | 15-Jan-2025 |
Costa Shulyupin <costa.shul@redhat.com> |
tools/rtla: Add osnoise_trace_is_off()
All of the users of trace_is_off() passes in &record->trace as the second parameter, where record is a pointer to a struct osnoise_tool. This record could be N
tools/rtla: Add osnoise_trace_is_off()
All of the users of trace_is_off() passes in &record->trace as the second parameter, where record is a pointer to a struct osnoise_tool. This record could be NULL and there is a hidden dependency that the trace field is the first field to allow &record->trace to work with a NULL record pointer.
In order to make this code a bit more robust, as record shouldn't be dereferenced if it is NULL, even if the code does work, create a new function called osnoise_trace_is_off() that takes the pointer to a struct osnoise_tool as its second parameter. This way it can properly test if it is NULL before it dereferences it.
The old function trace_is_off() is removed and the function osnoise_trace_is_off() is added into osnoise.c which is what the struct osnoise_tool is associated with.
Cc: John Kacur <jkacur@redhat.com> Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com> Cc: Eder Zulian <ezulian@redhat.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Tomas Glozar <tglozar@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250115180055.2136815-1-costa.shul@redhat.com Signed-off-by: Costa Shulyupin <costa.shul@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
217f0b1e | 07-Jan-2025 |
Tomas Glozar <tglozar@redhat.com> |
rtla/timerlat_top: Set OSNOISE_WORKLOAD for kernel threads
When using rtla timerlat with userspace threads (-u or -U), rtla disables the OSNOISE_WORKLOAD option in /sys/kernel/tracing/osnoise/option
rtla/timerlat_top: Set OSNOISE_WORKLOAD for kernel threads
When using rtla timerlat with userspace threads (-u or -U), rtla disables the OSNOISE_WORKLOAD option in /sys/kernel/tracing/osnoise/options. This option is not re-enabled in a subsequent run with kernel-space threads, leading to rtla collecting no results if the previous run exited abnormally:
$ rtla timerlat top -u ^\Quit (core dumped) $ rtla timerlat top -k -d 1s Timer Latency 0 00:00:01 | IRQ Timer Latency (us) | Thread Timer Latency (us) CPU COUNT | cur min avg max | cur min avg max
The issue persists until OSNOISE_WORKLOAD is set manually by running: $ echo OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options
Set OSNOISE_WORKLOAD when running rtla with kernel-space threads if available to fix the issue.
Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250107144823.239782-4-tglozar@redhat.com Fixes: cdca4f4e5e8e ("rtla/timerlat_top: Add timerlat user-space support") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
d8d86617 | 07-Jan-2025 |
Tomas Glozar <tglozar@redhat.com> |
rtla/timerlat_hist: Set OSNOISE_WORKLOAD for kernel threads
When using rtla timerlat with userspace threads (-u or -U), rtla disables the OSNOISE_WORKLOAD option in /sys/kernel/tracing/osnoise/optio
rtla/timerlat_hist: Set OSNOISE_WORKLOAD for kernel threads
When using rtla timerlat with userspace threads (-u or -U), rtla disables the OSNOISE_WORKLOAD option in /sys/kernel/tracing/osnoise/options. This option is not re-enabled in a subsequent run with kernel-space threads, leading to rtla collecting no results if the previous run exited abnormally:
$ rtla timerlat hist -u ^\Quit (core dumped) $ rtla timerlat hist -k -d 1s Index over: count: min: avg: max: ALL: IRQ Thr Usr count: 0 0 0 min: - - - avg: - - - max: - - -
The issue persists until OSNOISE_WORKLOAD is set manually by running: $ echo OSNOISE_WORKLOAD > /sys/kernel/tracing/osnoise/options
Set OSNOISE_WORKLOAD when running rtla with kernel-space threads if available to fix the issue.
Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250107144823.239782-3-tglozar@redhat.com Fixes: ed774f7481fa ("rtla/timerlat_hist: Add timerlat user-space support") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
80d3ba1c | 07-Jan-2025 |
Tomas Glozar <tglozar@redhat.com> |
rtla/osnoise: Distinguish missing workload option
osnoise_set_workload returns -1 for both missing OSNOISE_WORKLOAD option and failure in setting the option.
Return -1 for missing and -2 for failur
rtla/osnoise: Distinguish missing workload option
osnoise_set_workload returns -1 for both missing OSNOISE_WORKLOAD option and failure in setting the option.
Return -1 for missing and -2 for failure to distinguish them.
Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/20250107144823.239782-2-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
80967b35 | 16-Jan-2025 |
Tomas Glozar <tglozar@redhat.com> |
rtla/timerlat_top: Abort event processing on second signal
If either SIGINT is received twice, or after a SIGALRM (that is, after timerlat was supposed to stop), abort processing events currently le
rtla/timerlat_top: Abort event processing on second signal
If either SIGINT is received twice, or after a SIGALRM (that is, after timerlat was supposed to stop), abort processing events currently left in the tracefs buffer and exit immediately.
This allows the user to exit rtla without waiting for processing all events, should that take longer than wanted, at the cost of not processing all samples.
Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250116144931.649593-6-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
d6899e56 | 16-Jan-2025 |
Tomas Glozar <tglozar@redhat.com> |
rtla/timerlat_hist: Abort event processing on second signal
If either SIGINT is received twice, or after a SIGALRM (that is, after timerlat was supposed to stop), abort processing events currently l
rtla/timerlat_hist: Abort event processing on second signal
If either SIGINT is received twice, or after a SIGALRM (that is, after timerlat was supposed to stop), abort processing events currently left in the tracefs buffer and exit immediately.
This allows the user to exit rtla without waiting for processing all events, should that take longer than wanted, at the cost of not processing all samples.
Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250116144931.649593-5-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
a4dfce75 | 16-Jan-2025 |
Tomas Glozar <tglozar@redhat.com> |
rtla/timerlat_top: Stop timerlat tracer on signal
Currently, when either SIGINT from the user or SIGALRM from the duration timer is caught by rtla-timerlat, stop_tracing is set to break out of the m
rtla/timerlat_top: Stop timerlat tracer on signal
Currently, when either SIGINT from the user or SIGALRM from the duration timer is caught by rtla-timerlat, stop_tracing is set to break out of the main loop. This is not sufficient for cases where the timerlat tracer is producing more data than rtla can consume, since in that case, rtla is looping indefinitely inside tracefs_iterate_raw_events, never reaches the check of stop_tracing and hangs.
In addition to setting stop_tracing, also stop the timerlat tracer on received signal (SIGINT or SIGALRM). This will stop new samples so that the existing samples may be processed and tracefs_iterate_raw_events eventually exits.
Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250116144931.649593-4-tglozar@redhat.com Fixes: a828cd18bc4a ("rtla: Add timerlat tool and timelart top mode") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
c73cab9d | 16-Jan-2025 |
Tomas Glozar <tglozar@redhat.com> |
rtla/timerlat_hist: Stop timerlat tracer on signal
Currently, when either SIGINT from the user or SIGALRM from the duration timer is caught by rtla-timerlat, stop_tracing is set to break out of the
rtla/timerlat_hist: Stop timerlat tracer on signal
Currently, when either SIGINT from the user or SIGALRM from the duration timer is caught by rtla-timerlat, stop_tracing is set to break out of the main loop. This is not sufficient for cases where the timerlat tracer is producing more data than rtla can consume, since in that case, rtla is looping indefinitely inside tracefs_iterate_raw_events, never reaches the check of stop_tracing and hangs.
In addition to setting stop_tracing, also stop the timerlat tracer on received signal (SIGINT or SIGALRM). This will stop new samples so that the existing samples may be processed and tracefs_iterate_raw_events eventually exits.
Cc: stable@vger.kernel.org Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Gabriele Monaco <gmonaco@redhat.com> Link: https://lore.kernel.org/20250116144931.649593-3-tglozar@redhat.com Fixes: 1eeb6328e8b3 ("rtla/timerlat: Add timerlat hist mode") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
fcbc60d7 | 21-Oct-2024 |
Tomas Glozar <tglozar@redhat.com> |
rtla/timerlat: Do not set params->user_workload with -U
Since commit fb9e90a67ee9 ("rtla/timerlat: Make user-space threads the default"), rtla-timerlat has been defaulting to params->user_workload i
rtla/timerlat: Do not set params->user_workload with -U
Since commit fb9e90a67ee9 ("rtla/timerlat: Make user-space threads the default"), rtla-timerlat has been defaulting to params->user_workload if neither that or params->kernel_workload is set. This has unintentionally made -U, which sets only params->user_hist/top but not params->user_workload, to behave like -u unless -k is set, preventing the user from running a custom workload.
Example: $ rtla timerlat hist -U -c 0 & [1] 7413 $ python sample/timerlat_load.py 0 Error opening timerlat fd, did you run timerlat -U? $ ps | grep timerlatu 7415 pts/4 00:00:00 timerlatu/0
Fix the issue by checking for params->user_top/hist instead of params->user_workload when setting default thread mode.
Link: https://lore.kernel.org/20241021123140.14652-1-tglozar@redhat.com Fixes: fb9e90a67ee9 ("rtla/timerlat: Make user-space threads the default") Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
cfbfbfc9 | 17-Oct-2024 |
Tomas Glozar <tglozar@redhat.com> |
rtla/timerlat: Add --deepest-idle-state for hist
Support limiting deepest idle state also for timerlat-hist.
Link: https://lore.kernel.org/20241017140914.3200454-6-tglozar@redhat.com Signed-off-by:
rtla/timerlat: Add --deepest-idle-state for hist
Support limiting deepest idle state also for timerlat-hist.
Link: https://lore.kernel.org/20241017140914.3200454-6-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
549b92c9 | 17-Oct-2024 |
Tomas Glozar <tglozar@redhat.com> |
rtla/timerlat: Add --deepest-idle-state for top
Add option to limit deepest idle state on CPUs where timerlat is running for the duration of the workload.
Link: https://lore.kernel.org/202410171409
rtla/timerlat: Add --deepest-idle-state for top
Add option to limit deepest idle state on CPUs where timerlat is running for the duration of the workload.
Link: https://lore.kernel.org/20241017140914.3200454-5-tglozar@redhat.com Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
76b31021 | 11-Oct-2024 |
Tomas Glozar <tglozar@redhat.com> |
rtla/timerlat: Make timerlat_hist_cpu->*_count unsigned long long
Do the same fix as in previous commit also for timerlat-hist.
Link: https://lore.kernel.org/20241011121015.2868751-2-tglozar@redhat
rtla/timerlat: Make timerlat_hist_cpu->*_count unsigned long long
Do the same fix as in previous commit also for timerlat-hist.
Link: https://lore.kernel.org/20241011121015.2868751-2-tglozar@redhat.com Reported-by: Attila Fazekas <afazekas@redhat.com> Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
4eba4723 | 11-Oct-2024 |
Tomas Glozar <tglozar@redhat.com> |
rtla/timerlat: Make timerlat_top_cpu->*_count unsigned long long
Most fields of struct timerlat_top_cpu are unsigned long long, but the fields {irq,thread,user}_count are int (32-bit signed).
This
rtla/timerlat: Make timerlat_top_cpu->*_count unsigned long long
Most fields of struct timerlat_top_cpu are unsigned long long, but the fields {irq,thread,user}_count are int (32-bit signed).
This leads to overflow when tracing on a large number of CPUs for a long enough time: $ rtla timerlat top -a20 -c 1-127 -d 12h ... 0 12:00:00 | IRQ Timer Latency (us) | Thread Timer Latency (us) CPU COUNT | cur min avg max | cur min avg max 1 #43200096 | 0 0 1 2 | 3 2 6 12 ... 127 #43200096 | 0 0 1 2 | 3 2 5 11 ALL #119144 e4 | 0 5 4 | 2 28 16
The average latency should be 0-1 for IRQ and 5-6 for thread, but is reported as 5 and 28, about 4 to 5 times more, due to the count overflowing when summed over all CPUs: 43200096 * 127 = 5486412192, however, 1191444898 (= 5486412192 mod MAX_INT) is reported instead, as seen on the last line of the output, and the averages are thus ~4.6 times higher than they should be (5486412192 / 1191444898 = ~4.6).
Fix the issue by changing {irq,thread,user}_count fields to unsigned long long, similarly to other fields in struct timerlat_top_cpu and to the count variable in timerlat_top_print_sum.
Link: https://lore.kernel.org/20241011121015.2868751-1-tglozar@redhat.com Reported-by: Attila Fazekas <afazekas@redhat.com> Signed-off-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
0eecee34 | 10-Oct-2024 |
Jan Stancek <jstancek@redhat.com> |
tools/rtla: fix collision with glibc sched_attr/sched_set_attr
glibc commit 21571ca0d703 ("Linux: Add the sched_setattr and sched_getattr functions") now also provides 'struct sched_attr' and sched_
tools/rtla: fix collision with glibc sched_attr/sched_set_attr
glibc commit 21571ca0d703 ("Linux: Add the sched_setattr and sched_getattr functions") now also provides 'struct sched_attr' and sched_setattr() which collide with the ones from rtla.
In file included from src/trace.c:11: src/utils.h:49:8: error: redefinition of ‘struct sched_attr’ 49 | struct sched_attr { | ^~~~~~~~~~ In file included from /usr/include/bits/sched.h:60, from /usr/include/sched.h:43, from /usr/include/tracefs/tracefs.h:10, from src/trace.c:4: /usr/include/linux/sched/types.h:98:8: note: originally defined here 98 | struct sched_attr { | ^~~~~~~~~~
Define 'struct sched_attr' conditionally, similar to what strace did: https://lore.kernel.org/all/20240930222913.3981407-1-raj.khem@gmail.com/ and rename rtla's version of sched_setattr() to avoid collision.
Link: https://lore.kernel.org/8088f66a7a57c1b209cd8ae0ae7c336a7f8c930d.1728572865.git.jstancek@redhat.com Signed-off-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
099a8401 | 10-Oct-2024 |
Jan Stancek <jstancek@redhat.com> |
tools/rtla: drop __NR_sched_getattr
It's not used since commit 084ce16df0f0 ("tools/rtla: Remove unused sched_getattr() function").
Link: https://lore.kernel.org/c355dc9ad23470098d6a8d0f31fbd702551
tools/rtla: drop __NR_sched_getattr
It's not used since commit 084ce16df0f0 ("tools/rtla: Remove unused sched_getattr() function").
Link: https://lore.kernel.org/c355dc9ad23470098d6a8d0f31fbd702551c9ea8.1728552769.git.jstancek@redhat.com Signed-off-by: Jan Stancek <jstancek@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|
cfb1ea21 | 26-Sep-2024 |
Gabriele Monaco <gmonaco@redhat.com> |
rtla: Fix consistency in getopt_long for timerlat_hist
Commit e9a4062e1527 ("rtla: Add --trace-buffer-size option") adds a new long option to rtla utilities, but among all affected files, timerlat_h
rtla: Fix consistency in getopt_long for timerlat_hist
Commit e9a4062e1527 ("rtla: Add --trace-buffer-size option") adds a new long option to rtla utilities, but among all affected files, timerlat_hist misses a trailing `:` in the corresponding short option inside the getopt string (e.g. `\3:`). This patch propagates the `:`.
Although this change is not functionally required, it improves consistency and slightly reduces the likelihood a future change would introduce a problem.
Cc: John Kacur <jkacur@redhat.com> Cc: Luis Goncalves <lgoncalv@redhat.com> Cc: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/20240926143417.54039-1-gmonaco@redhat.com Signed-off-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
show more ...
|