Revision tags: v6.12-rc2 |
|
#
c8d430db |
| 06-Oct-2024 |
Paolo Bonzini <pbonzini@redhat.com> |
Merge tag 'kvmarm-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.12, take #1
- Fix pKVM error path on init, making sure we do not change critical system registers as we're about to fail
- Make sure that the host's vector length is capped by a value common to all CPUs
- Fix kvm_has_feat*() handling of "negative" features, as the current code is pretty broken
- Promote Joey to the status of official reviewer, while James steps down -- hopefully only temporarily
|
#
0c436dfe |
| 02-Oct-2024 |
Takashi Iwai <tiwai@suse.de> |
Merge tag 'asoc-fix-v6.12-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.12
A bunch of fixes here that came in during the merge window and the first week of release, plus some new quirks and device IDs. There's nothing major here; it's a bit bigger than it might have been because no fixes were sent during the merge window due to your vacation.
|
#
2cd86f02 |
| 01-Oct-2024 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes
Required for a panthor fix that broke when FOP_UNSIGNED_OFFSET was added in place of FMODE_UNSIGNED_OFFSET.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
Revision tags: v6.12-rc1 |
|
#
3a39d672 |
| 27-Sep-2024 |
Paolo Abeni <pabeni@redhat.com> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
No conflicts and no adjacent changes.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
#
891e8abe |
| 22-Sep-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.12-1-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Use BPF + BTF to collect and pretty print syscall and tracepoint arguments in 'perf trace', done as a GSoC activity
- Data-type profiling improvements:
- Cache debuginfo to speed up data type resolution
- Add the 'typecln' sort order, to show which cacheline in a target is hot or cold. The following shows members in the cfs_rq's first cache line:
$ perf report -s type,typecln,typeoff -H
...
-    2.67%        struct cfs_rq
   +    1.23%        struct cfs_rq: cache-line 2
   +    0.57%        struct cfs_rq: cache-line 4
   +    0.46%        struct cfs_rq: cache-line 6
   -    0.41%        struct cfs_rq: cache-line 0
           0.39%        struct cfs_rq +0x14 (h_nr_running)
           0.02%        struct cfs_rq +0x38 (tasks_timeline.rb_leftmost)
- When a typedef resolves to an unnamed struct, use the typedef name
- When a struct has just one basic type field (int, etc), resolve the type sort order to the name of the struct, not the type of the field
- Support type folding/unfolding in the data-type annotation TUI
- Fix bitfields offsets and sizes
- Initial support for PowerPC, using libcapstone and the usual objdump disassembly parsing routines
- Add support for disassembling and addr2line using the LLVM libraries, speeding up those operations
- Support --addr2line option in 'perf script' as with other tools
- Intel branch counters (LBR event logging) support, only available in recent Intel processors, for instance, the new "brcntr" field can be asked from 'perf script' to print the information collected from this feature:
$ perf script -F +brstackinsn,+brcntr
# Branch counter abbr list:
# branch-instructions:ppp = A
# branch-misses = B
# '-' No event occurs
# '+' Event occurrences may be lost due to branch counter saturated
tchain_edit 332203 3366329.405674: 53030 branch-instructions:ppp: 401781 f3+0x2c (home/sdp/test/tchain_edit)
    f3+31:
    0000000000401774 insn: eb 04 br_cntr: AA # PRED 5 cycles [5]
    000000000040177a insn: 81 7d fc 0f 27 00 00
    0000000000401781 insn: 7e e3 br_cntr: A # PRED 1 cycles [6] 2.00 IPC
    0000000000401766 insn: 8b 45 fc
    0000000000401769 insn: 83 e0 01
    000000000040176c insn: 85 c0
    000000000040176e insn: 74 06 br_cntr: A # PRED 1 cycles [7] 4.00 IPC
    0000000000401776 insn: 83 45 fc 01
    000000000040177a insn: 81 7d fc 0f 27 00 00
    0000000000401781 insn: 7e e3 br_cntr: A # PRED 7 cycles [14] 0.43 IPC
- Support Timed PEBS (Precise Event-Based Sampling), a recent hardware feature in Intel processors
- Add 'perf ftrace profile' subcommand, using ftrace's function-graph tracer so that users can see the total, average, max execution time as well as the number of invocations easily, for instance:
$ sudo perf ftrace profile -G __x64_sys_perf_event_open -- \
    perf stat -e cycles -C1 true 2> /dev/null | head
# Total (us)   Avg (us)   Max (us)   Count   Function
      65.611     65.611     65.611       1   __x64_sys_perf_event_open
      30.527     30.527     30.527       1   anon_inode_getfile
      30.260     30.260     30.260       1   __anon_inode_getfile
      29.700     29.700     29.700       1   alloc_file_pseudo
      17.578     17.578     17.578       1   d_alloc_pseudo
      17.382     17.382     17.382       1   __d_alloc
      16.738     16.738     16.738       1   kmem_cache_alloc_lru
      15.686     15.686     15.686       1   perf_event_alloc
      14.012      7.006     11.264       2   obj_cgroup_charge
- 'perf sched timehist' improvements, including the addition of priority showing/filtering command line options
- Various improvements to 'perf probe', including 'perf test' regression tests
- Introduce the 'perf check', initially to check if some feature is in place, using it in 'perf test'
- Various fixes for 32-bit systems
- Address more leak sanitizer failures
- Fix memory leaks (LBR, disasm lock ops, etc)
- More reference counting fixes (branch_info, etc)
- Constify 'struct perf_tool' parameters to improve code generation and reduce the chances of having its internals changed, which isn't expected
- More constifications in various other places
- Add more build tests, including for JEVENTS
- Add more 'perf test' entries ('perf record LBR', pipe/inject, --setup-filter, 'perf ftrace', 'cgroup sampling', etc)
- Inject build ids for all entries in a call chain in 'perf inject', not just for the main sample
- Improve the BPF based sample filter, allowing root to setup filters in bpffs that then can be used by non-root users
- Allow filtering by cgroups with the BPF based sample filter
- Allow a more compact way for 'perf mem report' using the -T/--type-profile and also provide a --sort option similar to the one in 'perf report', 'perf top', to setup the sort order manually
- Fix --group behavior in 'perf annotate' when leader has no samples, where it was not showing anything even when other events in the group had samples
- Fix spinlock and rwlock accounting in 'perf lock contention'
- Fix libsubcmd fixdep Makefile dependencies
- Improve 'perf ftrace' error message when ftrace isn't available
- Update various Intel JSON vendor event files
- ARM64 CoreSight hardware tracing infrastructure improvements, mostly not visible to users
- Update power10 JSON events
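As a rough illustration of the 'typecln' sort key described in the list above: a member's cache line can be derived from its byte offset within the struct. The sketch below assumes 64-byte cache lines (an assumption, not something stated in the pull request), and `cacheline_of` is a hypothetical helper, not a perf function.

```python
CACHELINE_SIZE = 64  # assumed cache-line size in bytes


def cacheline_of(offset, line_size=CACHELINE_SIZE):
    """Return the index of the cache line that contains a struct member offset."""
    return offset // line_size


# Offsets taken from the 'perf report -s type,typecln,typeoff' output quoted above.
members = {"h_nr_running": 0x14, "tasks_timeline.rb_leftmost": 0x38}
for name, offset in members.items():
    line = cacheline_of(offset)
    print(f"struct cfs_rq +{offset:#x} ({name}): cache-line {line}")
```

Under that assumption both quoted members fall into the first cache line of the struct, which matches the grouping shown in the report.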
* tag 'perf-tools-for-v6.12-1-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (310 commits)
perf trace: Mark the 'head' arg in the set_robust_list syscall as coming from user space
perf trace: Mark the 'rseq' arg in the rseq syscall as coming from user space
perf env: Find correct branch counter info on hybrid
perf evlist: Print hint for group
tools: Drop nonsensical -O6
perf pmu: To info add event_type_desc
perf evsel: Add accessor for tool_event
perf pmus: Fake PMU clean up
perf list: Avoid potential out of bounds memory read
perf help: Fix a typo ("bellow")
perf ftrace: Detect whether ftrace is enabled on system
perf test shell probe_vfs_getname: Remove extraneous '=' from probe line number regex
perf build: Require at least clang 16.0.6 to build BPF skeletons
perf trace: If a syscall arg is marked as 'const', assume it is coming _from_ userspace
perf parse-events: Remove duplicated include in parse-events.c
perf callchain: Allow symbols to be optional when resolving a callchain
perf inject: Lazy build-id mmap2 event insertion
perf inject: Add new mmap2-buildid-all option
perf inject: Fix build ID injection
perf annotate-data: Add pr_debug_scope()
...
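The 'perf ftrace profile' table quoted in this pull request is easy to post-process. The following is a minimal sketch, assuming the five-column layout shown in the message (total, average, max, count, function); `ProfileRow` and `parse_profile` are hypothetical names for illustration and are not part of perf.

```python
from typing import NamedTuple


class ProfileRow(NamedTuple):
    total_us: float
    avg_us: float
    max_us: float
    count: int
    function: str


def parse_profile(text):
    """Parse 'perf ftrace profile' style rows, skipping the '#' header line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if line.lstrip().startswith("#") or len(fields) != 5:
            continue
        total, avg, max_, count, func = fields
        rows.append(ProfileRow(float(total), float(avg), float(max_), int(count), func))
    return rows


# Two rows copied from the output quoted in the commit message.
SAMPLE = """\
# Total (us)   Avg (us)   Max (us)   Count   Function
      65.611     65.611     65.611       1   __x64_sys_perf_event_open
      14.012      7.006     11.264       2   obj_cgroup_charge
"""

for row in parse_profile(SAMPLE):
    # The average column is the total divided by the number of invocations.
    assert abs(row.avg_us - row.total_us / row.count) < 1e-3
    print(row.function, row.total_us, row.count)
```

The consistency check in the loop only restates the relationship between the columns (total, count, average) that the message describes.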
|
Revision tags: v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1 |
|
#
ae8e4f40 |
| 23-Jul-2024 |
James Clark <james.clark@linaro.org> |
perf scripts python cs-etm: Restore first sample log in verbose mode
The linked commit moved the early return on the first sample to before the verbose log, so move the log earlier too. Now the first sample is also logged and not skipped.
Fixes: 2d98dbb4c9c5b09c ("perf scripts python arm-cs-trace-disasm.py: Do not ignore disam first sample")
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: coresight@lists.linaro.org
Cc: gankulkarni@os.amperecomputing.com
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ruidong Tian <tianruidong@linux.alibaba.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20240723132858.12747-1-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
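The ordering issue described in this commit message can be sketched as follows. This is a simplified illustration, not the script's actual code: `process_sample`, the `log` list and the `cpu_data` key layout are hypothetical stand-ins.

```python
def process_sample(sample, cpu_data, verbose, log):
    """Handle one branch sample; return the previous address seen on this CPU."""
    # Log before the early return, so verbose output also covers the
    # first sample seen on a CPU.
    if verbose:
        log.append("Sample = %r" % (sample,))

    key = "%d-addr" % sample["cpu"]  # hypothetical key layout
    if key not in cpu_data:
        # First sample on this CPU: only remember where execution continues.
        cpu_data[key] = sample["addr"]
        return None

    previous = cpu_data[key]
    cpu_data[key] = sample["addr"]
    return previous


log, cpu_data = [], {}
process_sample({"cpu": 0, "addr": 0x1000, "ip": 0}, cpu_data, True, log)
print(len(log))  # the first sample was logged even though it returned early
```

If the verbose logging sat after the early return instead, the first sample on each CPU would be processed without ever appearing in the log.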
|
#
a23e1966 |
| 15-Jul-2024 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.11 merge window.
|
Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2 |
|
#
6f47c7ae |
| 28-May-2024 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.9' into next
Sync up with the mainline to bring in the new cleanup API.
|
Revision tags: v6.10-rc1 |
|
#
60a2f25d |
| 16-May-2024 |
Tvrtko Ursulin <tursulin@ursulin.net> |
Merge drm/drm-next into drm-intel-gt-next
Some display refactoring patches are needed in order to allow conflict-less merging.
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
|
Revision tags: v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7 |
|
#
06d07429 |
| 29-Feb-2024 |
Jani Nikula <jani.nikula@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync to get the drm_printer changes to drm-intel-next.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
2e21dee6 |
| 13-Mar-2024 |
Jiri Kosina <jkosina@suse.com> |
Merge branch 'for-6.9/amd-sfh' into for-linus
- assorted fixes and optimizations for amd-sfh (Basavaraj Natikar)
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
Revision tags: v6.8-rc6, v6.8-rc5 |
|
#
41c177cf |
| 11-Feb-2024 |
Rob Clark <robdclark@chromium.org> |
Merge tag 'drm-misc-next-2024-02-08' into msm-next
Merge the drm-misc tree to uprev MSM CI.
Signed-off-by: Rob Clark <robdclark@chromium.org>
|
Revision tags: v6.8-rc4, v6.8-rc3 |
|
#
4db102dc |
| 29-Jan-2024 |
Maxime Ripard <mripard@kernel.org> |
Merge drm/drm-next into drm-misc-next
Kickstart 6.9 development cycle.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
Revision tags: v6.8-rc2 |
|
#
be3382ec |
| 23-Jan-2024 |
Lucas De Marchi <lucas.demarchi@intel.com> |
Merge drm/drm-next into drm-xe-next
Sync to v6.8-rc1.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
#
03c11eb3 |
| 14-Feb-2024 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'v6.8-rc4' into x86/percpu, to resolve conflicts and refresh the branch
Conflicts:
arch/x86/include/asm/percpu.h
arch/x86/include/asm/text-patching.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
42ac0be1 |
| 26-Jan-2024 |
Ingo Molnar <mingo@kernel.org> |
Merge branch 'linus' into x86/mm, to refresh the branch and pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
06f609b3 |
| 25-Jan-2024 |
Jakub Kicinski <kuba@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
No conflicts or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
f0b7a0d1 |
| 23-Jan-2024 |
Andrew Morton <akpm@linux-foundation.org> |
Merge branch 'master' into mm-hotfixes-stable
|
#
cf79f291 |
| 22-Jan-2024 |
Maxime Ripard <mripard@kernel.org> |
Merge v6.8-rc1 into drm-misc-fixes
Let's kickstart the 6.8 fix cycle.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
Revision tags: v6.8-rc1 |
|
#
9d64bf43 |
| 19-Jan-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo: "Add Namhyung Kim as tools/perf/ co-maintainer, we're taking turns processing patches, switching roles from perf-tools to perf-tools-next at each Linux release.
Data profiling:
- Associate samples that identify loads and stores with data structures. This uses events available on Intel, AMD and others and DWARF info:
# To get memory access samples in kernel for 1 second (on Intel)
$ perf mem record -a -K --ldlat=4 -- sleep 1
# Similar for the AMD (but it requires 6.3+ kernel for BPF filters)
$ perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000' -- sleep 1
Then, amongst several modes of post processing, one can do things like:
$ perf report -s type,typeoff --hierarchy --group --stdio
...
#
# Samples: 10K of events 'cpu/mem-loads,ldlat=4/P, cpu/mem-stores/P, dummy:u'
# Event count (approx.): 602758064
#
#                    Overhead  Data Type / Data Type Offset
# ...........................  ............................
#
    26.09%   3.28%   0.00%     long unsigned int
       26.09%   3.28%   0.00%     long unsigned int +0 (no field)
    18.48%   0.73%   0.00%     struct page
       10.83%   0.02%   0.00%     struct page +8 (lru.next)
        3.90%   0.28%   0.00%     struct page +0 (flags)
        3.45%   0.06%   0.00%     struct page +24 (mapping)
        0.25%   0.28%   0.00%     struct page +48 (_mapcount.counter)
        0.02%   0.06%   0.00%     struct page +32 (index)
        0.02%   0.00%   0.00%     struct page +52 (_refcount.counter)
        0.02%   0.01%   0.00%     struct page +56 (memcg_data)
        0.00%   0.01%   0.00%     struct page +16 (lru.prev)
    15.37%  17.54%   0.00%     (stack operation)
       15.37%  17.54%   0.00%     (stack operation) +0 (no field)
    11.71%  50.27%   0.00%     (unknown)
       11.71%  50.27%   0.00%     (unknown) +0 (no field)
$ perf annotate --data-type
...
Annotate type: 'struct cfs_rq' in [kernel.kallsyms] (13 samples):
============================================================================
    samples     offset       size  field
         13          0        640  struct cfs_rq {
          2          0         16      struct load_weight load {
          2          0          8          unsigned long weight;
          0          8          4          u32 inv_weight;
                                       };
          0         16          8      unsigned long runnable_weight;
          0         24          4      unsigned int nr_running;
          1         28          4      unsigned int h_nr_running;
...
$ perf annotate --data-type=page --group
Annotate type: 'struct page' in [kernel.kallsyms] (480 samples):
 event[0] = cpu/mem-loads,ldlat=4/P
 event[1] = cpu/mem-stores/P
 event[2] = dummy:u
===================================================================================
          samples  offset  size  field
    447   33    0       0    64  struct page {
    108    8    0       0     8      long unsigned int flags;
    319   13    0       8    40      union {
    319   13    0       8    40          struct {
    236    2    0       8    16              union {
    236    2    0       8    16                  struct list_head lru {
    236    1    0       8     8                      struct list_head* next;
      0    1    0      16     8                      struct list_head* prev;
                                                 };
    236    2    0       8    16                  struct {
    236    1    0       8     8                      void* __filler;
      0    1    0      16     4                      unsigned int mlock_count;
                                                 };
    236    2    0       8    16                  struct list_head buddy_list {
    236    1    0       8     8                      struct list_head* next;
      0    1    0      16     8                      struct list_head* prev;
                                                 };
    236    2    0       8    16                  struct list_head pcp_list {
    236    1    0       8     8                      struct list_head* next;
      0    1    0      16     8                      struct list_head* prev;
                                                 };
                                             };
     82    4    0      24     8              struct address_space* mapping;
      1    7    0      32     8              union {
      1    7    0      32     8                  long unsigned int index;
      1    7    0      32     8                  long unsigned int share;
                                             };
      0    0    0      40     8              long unsigned int private;
                                         };
This uses the existing annotate code, calling objdump to do the disassembly, with improvements to avoid having this take too long, but longer term a switch to a disassembler library, possibly reusing code in the kernel will be pursued.
This is the initial implementation, please use it and report impressions and bugs. Make sure the kernel-debuginfo packages match the running kernel. The 'perf report' phase for non short perf.data files may take a while.
There is a great article about it on LWN:
https://lwn.net/Articles/955709/ - "Data-type profiling for perf"
One last test I did while writing this text, on an AMD Ryzen 5950X, using a distro kernel, while doing a simple 'find /' on an otherwise idle system resulted in:
# uname -r
6.6.9-100.fc38.x86_64
# perf -vv | grep BPF_
                   bpf: [ on ]  # HAVE_LIBBPF_SUPPORT
         bpf_skeletons: [ on ]  # HAVE_BPF_SKEL
# rpm -qa | grep kernel-debuginfo
kernel-debuginfo-common-x86_64-6.6.9-100.fc38.x86_64
kernel-debuginfo-6.6.9-100.fc38.x86_64
#
# perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000'
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.199 MB perf.data (2913 samples) ]
#
# ls -la perf.data
-rw-------. 1 root root 2346486 Jan 9 18:36 perf.data
# perf evlist
ibs_op//
dummy:u
# perf evlist -v
ibs_op//: type: 11, size: 136, config: 0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1
dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
#
# perf report -s type,typeoff --hierarchy --group --stdio
# Total Lost Samples: 0
#
# Samples: 2K of events 'ibs_op//, dummy:u'
# Event count (approx.): 1904553038
#
#            Overhead  Data Type / Data Type Offset
# ...................  ............................
#
    73.70%   0.00%     (unknown)
       73.70%   0.00%     (unknown) +0 (no field)
     3.01%   0.00%     long unsigned int
        3.00%   0.00%     long unsigned int +0 (no field)
        0.01%   0.00%     long unsigned int +2 (no field)
     2.73%   0.00%     struct task_struct
        1.71%   0.00%     struct task_struct +52 (on_cpu)
        0.38%   0.00%     struct task_struct +2104 (rcu_read_unlock_special.b.blocked)
        0.23%   0.00%     struct task_struct +2100 (rcu_read_lock_nesting)
        0.14%   0.00%     struct task_struct +2384 ()
        0.06%   0.00%     struct task_struct +3096 (signal)
        0.05%   0.00%     struct task_struct +3616 (cgroups)
        0.05%   0.00%     struct task_struct +2344 (active_mm)
        0.02%   0.00%     struct task_struct +46 (flags)
        0.02%   0.00%     struct task_struct +2096 (migration_disabled)
        0.01%   0.00%     struct task_struct +24 (__state)
        0.01%   0.00%     struct task_struct +3956 (mm_cid_active)
        0.01%   0.00%     struct task_struct +1048 (cpus_ptr)
        0.01%   0.00%     struct task_struct +184 (se.group_node.next)
        0.01%   0.00%     struct task_struct +20 (thread_info.cpu)
        0.00%   0.00%     struct task_struct +104 (on_rq)
        0.00%   0.00%     struct task_struct +2456 (pid)
     1.36%   0.00%     struct module
        0.59%   0.00%     struct module +952 (kallsyms)
        0.42%   0.00%     struct module +0 (state)
        0.23%   0.00%     struct module +8 (list.next)
        0.12%   0.00%     struct module +216 (syms)
     0.95%   0.00%     struct inode
        0.41%   0.00%     struct inode +40 (i_sb)
        0.22%   0.00%     struct inode +0 (i_mode)
        0.06%   0.00%     struct inode +76 (i_rdev)
        0.06%   0.00%     struct inode +56 (i_security)
<SNIP>
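To make the intent of the AMD '--filter' expression used above easier to follow, here is a small Python predicate that mirrors it. This is only a sketch under the assumption that ',' combines terms with logical AND and '||' with logical OR in the filter syntax; `keep_sample` is a hypothetical helper, and the real filtering is done by a BPF program, not in Python.

```python
KERNEL_IP_MIN = 0x8000000000000000  # threshold copied from the filter expression


def keep_sample(mem_op, ip):
    """Mirror: mem_op == load || mem_op == store, ip > 0x8000000000000000."""
    is_load_or_store = mem_op in ("load", "store")
    # On x86_64 kernel text lives in the upper half of the address space,
    # so this comparison selects kernel instruction pointers.
    is_kernel_ip = ip > KERNEL_IP_MIN
    return is_load_or_store and is_kernel_ip


print(keep_sample("load", 0xFFFFFFFF81000000))  # kernel address, load
print(keep_sample("store", 0x0000000000400000))  # user-space address
```

Read this way, the expression plays the same role as the '-K' option in the Intel command: only kernel-side loads and stores are kept.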
perf top/report:
- Don't ignore job control, allowing control+Z + bg to work.
- Add s390 raw data interpretation for PAI (Processor Activity Instrumentation) counters.
perf archive:
- Add new option '--all' to pack perf.data with DSOs.
- Add new option '--unpack' to expand tarballs.
Initialization speedups:
- Lazily initialize zstd streams to save memory when not using it.
- Lazily allocate/size mmap event copy.
- Lazy load kernel symbols in 'perf record'.
- Be lazier in allocating lost samples buffer in 'perf record'.
- Don't synthesize BPF events when disabled via the command line (perf record --no-bpf-event).
Assorted improvements:
- Show note on AMD systems that the :p, :pp, :ppp and :P are all the same, as IBS (Instruction Based Sampling) is used and it is inherently precise, not having levels of precision like in Intel systems.
- When 'cycles' isn't available, fall back to the "task-clock" event when not system wide, not to 'cpu-clock'.
- Add --debug-file option to redirect debug output, e.g.:
$ perf --debug-file /tmp/perf.log record -v true
- Shrink 'struct map' to under one cacheline by avoiding function pointers for selecting if addresses are identity or DSO relative, and using just a byte for some boolean struct members.
- Resolve the arch specific strerrno just once to use in perf_env__arch_strerrno().
- Reduce memory for recording PERF_RECORD_LOST_SAMPLES event.
Assorted fixes:
- Fix the default 'perf top' usage on Intel hybrid systems, now it starts with a browser showing the number of samples for Efficiency (cpu_atom/cycles/P) and Performance (cpu_core/cycles/P). This behaviour is similar on ARM64, with its respective set of big.LITTLE processors.
- Fix segfault on build_mem_topology() error path.
- Fix 'perf mem' error on hybrid related to availability of mem event in a PMU.
- Fix missing reference count gets (map, maps) in the db-export code.
- Avoid recursively taking env->bpf_progs.lock in the 'perf_env' code.
- Use the newly introduced maps__for_each_map() to add missing locking around iteration of 'struct map' entries.
- Parse NOTE segments until the build id is found, don't stop on the first one, ELF files may have several such NOTE segments.
- Remove 'egrep' usage, it's deprecated, use 'grep -E' instead.
- Warn first about missing libelf, not libbpf, that depends on libelf.
- Use alternative to 'find ... -printf' as this isn't supported in busybox.
- Address python 3.6 DeprecationWarning for string escapes.
- Fix memory leak in uniq() in libsubcmd.
- Fix man page formatting for 'perf lock'
- Fix some spelling mistakes.
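One of the fixes in the list above is to keep parsing NOTE segments until the build id is found. A minimal sketch of that idea follows, assuming 4-byte note alignment and little-endian data; `find_build_id`, `build_id_from_segments` and `make_note` are hypothetical helpers written for this illustration and are not perf's implementation.

```python
import struct

NT_GNU_BUILD_ID = 3  # note type of the GNU build ID


def _align4(n):
    return (n + 3) & ~3


def find_build_id(notes, endian="<"):
    """Scan one blob of ELF notes and return the GNU build ID as hex, or None."""
    offset = 0
    while offset + 12 <= len(notes):
        namesz, descsz, ntype = struct.unpack_from(endian + "III", notes, offset)
        offset += 12
        name = notes[offset:offset + namesz]
        offset += _align4(namesz)
        desc = notes[offset:offset + descsz]
        offset += _align4(descsz)
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc.hex()
    return None


def build_id_from_segments(segments):
    """Try every NOTE segment in turn instead of stopping at the first one."""
    for segment in segments:
        build_id = find_build_id(segment)
        if build_id is not None:
            return build_id
    return None


def make_note(name, ntype, desc):
    """Build a synthetic note (for the example only)."""
    def pad(data):
        return data + b"\x00" * (_align4(len(data)) - len(data))
    return struct.pack("<III", len(name), len(desc), ntype) + pad(name) + pad(desc)


# First segment holds an unrelated note, the second one holds the build ID.
other = make_note(b"GNU\x00", 1, b"\x00" * 16)
wanted = make_note(b"GNU\x00", NT_GNU_BUILD_ID, bytes.fromhex("deadbeef"))
print(build_id_from_segments([other, wanted]))
```

A parser that gave up after the first segment would report no build ID for this input, which is the behaviour the fix addresses.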
perf tests:
- Fail shell tests that need some symbol in perf itself if it is stripped. These tests check if a symbol is resolved, if some hot function is indeed detected by profiling, etc.
- The 'perf test sigtrap' test is currently failing on PREEMPT_RT, skip it if sleeping spinlocks are detected (using BTF) and point to the mailing list discussion about it. This test is also being skipped on several architectures (powerpc, s390x, arm and aarch64) due to other pending issues with instruction breakpoints.
- Adjust test case perf record offcpu profiling tests for s390.
- Fix 'Setup struct perf_event_attr' fails on s390 on z/VM guest, addressing issues caused by the fallback from cycles to task-clock done in this release.
- Fix mask for VG register in the user-regs test.
- Use shellcheck on 'perf test' shell scripts automatically to make sure changes don't introduce things it flags as problematic.
- Add option to change objdump binary and allow it to be set via 'perf config'.
- Add basic 'perf script', 'perf list --json' and 'perf diff' tests.
- Basic branch counter support.
- Make DSO tests a suite rather than individual.
- Remove atomics from test_loop to avoid test failures.
- Fix call chain match on powerpc for the record+probe_libc_inet_pton test.
- Improve Intel hybrid tests.
Vendor event files (JSON):
powerpc:
- Update datasource event name to fix duplicate events on IBM's Power10.
- Add PVN for HX-C2000 CPU with Power8 Architecture.
Intel:
- Alderlake/rocketlake metric fixes.
- Update emeraldrapids events to v1.02.
- Update icelakex events to v1.23.
- Update sapphirerapids events to v1.17.
- Add skx, clx, icx and spr upi bandwidth metric.
AMD:
- Add Zen 4 memory controller events.
RISC-V:
- Add StarFive Dubhe-80 and Dubhe-90 JSON files. https://www.starfivetech.com/en/site/cpu-u
- Add T-HEAD C9xx JSON file. https://github.com/riscv-software-src/opensbi/blob/master/docs/platform/thead-c9xx.md
ARM64:
- Remove UTF-8 characters from cmn.json, that were causing build failure in some distros.
- Add core PMU events and metrics for Ampere One X.
- Rename Ampere One's BPU_FLUSH_MEM_FAULT to GPC_FLUSH_MEM_FAULT
libperf:
- Rename several perf_cpu_map constructor names to clarify what they really do.
- Ditto for some other methods, coping with some issues in their semantics, like perf_cpu_map__empty() -> perf_cpu_map__has_any_cpu_or_is_empty().
- Document perf_cpu_map__nr()'s behavior
perf stat:
- Exit if parse groups fails.
- Combine the -A/--no-aggr and --no-merge options.
- Fix help message for --metric-no-threshold option.
Hardware tracing:
ARM64 CoreSight:
- Bump minimum OpenCSD version to ensure a bugfix is present.
- Add 'T' itrace option for timestamp trace
- Set start vm addr of executable file to 0 and don't ignore first sample on the arm-cs-trace-disasm.py 'perf script'"
* tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (179 commits)
MAINTAINERS: Add Namhyung as tools/perf/ co-maintainer
perf test: test case 'Setup struct perf_event_attr' fails on s390 on z/vm
perf db-export: Fix missing reference count get in call_path_from_sample()
perf tests: Add perf script test
libsubcmd: Fix memory leak in uniq()
perf TUI: Don't ignore job control
perf vendor events intel: Update sapphirerapids events to v1.17
perf vendor events intel: Update icelakex events to v1.23
perf vendor events intel: Update emeraldrapids events to v1.02
perf vendor events intel: Alderlake/rocketlake metric fixes
perf x86 test: Add hybrid test for conflicting legacy/sysfs event
perf x86 test: Update hybrid expectations
perf vendor events amd: Add Zen 4 memory controller events
perf stat: Fix hard coded LL miss units
perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event
perf env: Avoid recursively taking env->bpf_progs.lock
perf annotate: Add --insn-stat option for debugging
perf annotate: Add --type-stat option for debugging
perf annotate: Support event group display
perf annotate: Add --data-type option
...
|
Revision tags: v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6 |
|
#
2d98dbb4 |
| 14-Dec-2023 |
Ruidong Tian <tianruidong@linux.alibaba.com> |
perf scripts python arm-cs-trace-disasm.py: Do not ignore disam first sample
arm-cs-trace-disasm skips disassembly for the first branch sample. In the following example, the instructions between 0x0000ffffae878750 and 0x0000ffffae878754 are lost:
ARM CoreSight Trace Data Assembler Dump
Event type: branches:uH
Sample = { cpu: 0000 addr: 0x0000ffffae878750 phys_addr: 0x0000000000000000 ip: 0x0000000000000000 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 }
Event type: branches:uH
Sample = { cpu: 0000 addr: 0x0000000000000000 phys_addr: 0x0000000000000000 ip: 0x0000ffffae878754 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 }
Initialize cpu_data earlier to fix it:
ARM CoreSight Trace Data Assembler Dump
Event type: branches:uH
Sample = { cpu: 0000 addr: 0x0000000000000000 phys_addr: 0x0000000000000000 ip: 0x0000ffffae878754 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 }
0000000000028740 <ioctl>: (base address is 0x0000ffffae850000)
   28750: b13ffc1f cmn x0, #4095
   28754: 54000042 b.hs 0x2875c <ioctl+0x1c>
test 4003489/4003489 [0000] 26765.151766034 __GI___ioctl+0x14 /usr/lib64/libc-2.32.so
Event type: branches:uH
Sample = { cpu: 0000 addr: 0x0000ffffa67535ac phys_addr: 0x0000000000000000 ip: 0x0000000000000000 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 }
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Tor Jeremiassen <tor@ti.com>
Link: https://lore.kernel.org/r/20231214123304.34087-4-tianruidong@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
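The effect described in this commit message, where the first sample's branch target is needed as the start of the next disassembly range, can be sketched as below. This is a simplified model rather than the script's real logic: `disasm_ranges` and the dictionary-based sample layout are assumptions made for the illustration.

```python
def disasm_ranges(samples):
    """Yield (cpu, start, stop) address ranges between consecutive branch samples."""
    last_target = {}  # per-CPU branch target of the previous sample
    for s in samples:
        cpu = s["cpu"]
        start = last_target.get(cpu)
        # Record the state for this CPU before any early exit, so the first
        # sample still provides the start address for the next one.
        last_target[cpu] = s["addr"]
        if not start or s["ip"] == 0:
            continue
        yield cpu, start, s["ip"]


# The two samples quoted in the commit message (other fields omitted).
SAMPLES = [
    {"cpu": 0, "addr": 0x0000FFFFAE878750, "ip": 0x0000000000000000},
    {"cpu": 0, "addr": 0x0000000000000000, "ip": 0x0000FFFFAE878754},
]

for cpu, start, stop in disasm_ranges(SAMPLES):
    print("cpu %d: disassemble %#x .. %#x" % (cpu, start, stop))
```

With the per-CPU state recorded for the very first sample, the range from 0x0000ffffae878750 to 0x0000ffffae878754 is produced, which corresponds to the two instructions shown in the fixed output.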
|
#
c344675a |
| 14-Dec-2023 |
Ruidong Tian <tianruidong@linux.alibaba.com> |
perf scripts python arm-cs-trace-disasm.py: Set start vm addr of exectable file to 0
For an executable ELF file, whose e_type is ET_EXEC, the dso start address is an absolute address rather than an offset. Just set vm_start to zero when dso start is 0x400000, which means it is an executable file.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Tor Jeremiassen <tor@ti.com>
Link: https://lore.kernel.org/r/20231214123304.34087-3-tianruidong@linux.alibaba.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
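The commit above detects executables by checking whether the dso start is 0x400000. As an alternative illustration of the same idea, the sketch below reads e_type straight from the ELF header; this is a substitute technique, not what the commit does, and `elf_type` and `effective_vm_start` are hypothetical helper names.

```python
import struct

ET_EXEC = 2  # ELF e_type of a fixed-address executable
ET_DYN = 3   # ELF e_type of a shared object or position-independent executable


def elf_type(header):
    """Return e_type from the first 18 (or more) bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field that follows the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return e_type


def effective_vm_start(dso_start, e_type):
    """Use 0 as the base for ET_EXEC files, whose addresses are already absolute."""
    return 0 if e_type == ET_EXEC else dso_start


# Synthetic little-endian 64-bit header with e_type = ET_EXEC (example data only).
header = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<H", ET_EXEC)
print(hex(effective_vm_start(0x400000, elf_type(header))))
```

For a shared object such as libc, `effective_vm_start` keeps the dso start unchanged, so addresses are still converted to file-relative offsets by subtracting the mapping base.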
|
Revision tags: v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2 |
|
#
280b4e4a |
| 12-Sep-2023 |
Benjamin Gray <bgray@linux.ibm.com> |
perf tools: Address python 3.6 DeprecationWarning for string scapes
Python 3.6 introduced a DeprecationWarning for invalid escape sequences. This is upgraded to a SyntaxWarning in Python 3.12, and will eventually be a syntax error.
Fix these now to get ahead of it before it's an error.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Ian Abbott <abbotti@mev.co.uk>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mykola Lysenko <mykolal@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Todd E Brandt <todd.e.brandt@linux.intel.com>
Cc: Tom Rix <trix@redhat.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230912060801.95533-6-bgray@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
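The kind of change this commit describes usually amounts to switching regex and similar literals to raw strings. The sketch below shows the warning being triggered and avoided; `invalid_escape_warnings` and the `HEX_ADDR` pattern are hypothetical examples and are not taken from the perf scripts.

```python
import re
import warnings

# Raw string: the backslash reaches the regex engine unchanged.
HEX_ADDR = re.compile(r"\s*([0-9a-f]+):")


def invalid_escape_warnings(source):
    """Compile source text and return the warning categories it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category for w in caught]


# '\s' is not a recognised escape sequence in a normal string literal.
print(invalid_escape_warnings('pattern = "\\s+"'))
# The raw-string spelling of the same pattern compiles without a warning.
print(invalid_escape_warnings('pattern = r"\\s+"'))
```

The category reported for the first case depends on the interpreter version, in line with the DeprecationWarning to SyntaxWarning transition mentioned in the commit message.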
|
#
cdd5b5a9 |
| 07-Nov-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.7 merge window.
|
Revision tags: v6.6-rc1 |
|
#
34069d12 |
| 05-Sep-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.5' into next
Sync up with mainline to bring in updates to the shared infrastructure.
|