Revision tags: v6.12-rc2 |
|
#
c8d430db |
| 06-Oct-2024 |
Paolo Bonzini <pbonzini@redhat.com> |
Merge tag 'kvmarm-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.12, take #1
- Fix pKVM error path on init, making sure we do not chang
Merge tag 'kvmarm-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.12, take #1
- Fix pKVM error path on init, making sure we do not change critical system registers as we're about to fail
- Make sure that the host's vector length is at capped by a value common to all CPUs
- Fix kvm_has_feat*() handling of "negative" features, as the current code is pretty broken
- Promote Joey to the status of official reviewer, while James steps down -- hopefully only temporarly
show more ...
|
#
0c436dfe |
| 02-Oct-2024 |
Takashi Iwai <tiwai@suse.de> |
Merge tag 'asoc-fix-v6.12-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.12
A bunch of fixes here that came in during the merge window and t
Merge tag 'asoc-fix-v6.12-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.12
A bunch of fixes here that came in during the merge window and the first week of release, plus some new quirks and device IDs. There's nothing major here, it's a bit bigger than it might've been due to there being no fixes sent during the merge window due to your vacation.
show more ...
|
#
2cd86f02 |
| 01-Oct-2024 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes
Required for a panthor fix that broke when FOP_UNSIGNED_OFFSET was added in place of FMODE_UNSIGNED_OFFSET.
Signed-off-by: Maarten L
Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixes
Required for a panthor fix that broke when FOP_UNSIGNED_OFFSET was added in place of FMODE_UNSIGNED_OFFSET.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
show more ...
|
Revision tags: v6.12-rc1 |
|
#
3a39d672 |
| 27-Sep-2024 |
Paolo Abeni <pabeni@redhat.com> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
No conflicts and no adjacent changes.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
#
891e8abe |
| 22-Sep-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.12-1-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Use BPF + BTF to collect and
Merge tag 'perf-tools-for-v6.12-1-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Use BPF + BTF to collect and pretty print syscall and tracepoint arguments in 'perf trace', done as an GSoC activity
- Data-type profiling improvements:
- Cache debuginfo to speed up data type resolution
- Add the 'typecln' sort order, to show which cacheline in a target is hot or cold. The following shows members in the cfs_rq's first cache line:
$ perf report -s type,typecln,typeoff -H ... - 2.67% struct cfs_rq + 1.23% struct cfs_rq: cache-line 2 + 0.57% struct cfs_rq: cache-line 4 + 0.46% struct cfs_rq: cache-line 6 - 0.41% struct cfs_rq: cache-line 0 0.39% struct cfs_rq +0x14 (h_nr_running) 0.02% struct cfs_rq +0x38 (tasks_timeline.rb_leftmost)
- When a typedef resolves to a unnamed struct, use the typedef name
- When a struct has just one basic type field (int, etc), resolve the type sort order to the name of the struct, not the type of the field
- Support type folding/unfolding in the data-type annotation TUI
- Fix bitfields offsets and sizes
- Initial support for PowerPC, using libcapstone and the usual objdump disassembly parsing routines
- Add support for disassembling and addr2line using the LLVM libraries, speeding up those operations
- Support --addr2line option in 'perf script' as with other tools
- Intel branch counters (LBR event logging) support, only available in recent Intel processors, for instance, the new "brcntr" field can be asked from 'perf script' to print the information collected from this feature:
$ perf script -F +brstackinsn,+brcntr
# Branch counter abbr list: # branch-instructions:ppp = A # branch-misses = B # '-' No event occurs # '+' Event occurrences may be lost due to branch counter saturated tchain_edit 332203 3366329.405674: 53030 branch-instructions:ppp: 401781 f3+0x2c (home/sdp/test/tchain_edit) f3+31: 0000000000401774 insn: eb 04 br_cntr: AA # PRED 5 cycles [5] 000000000040177a insn: 81 7d fc 0f 27 00 00 0000000000401781 insn: 7e e3 br_cntr: A # PRED 1 cycles [6] 2.00 IPC 0000000000401766 insn: 8b 45 fc 0000000000401769 insn: 83 e0 01 000000000040176c insn: 85 c0 000000000040176e insn: 74 06 br_cntr: A # PRED 1 cycles [7] 4.00 IPC 0000000000401776 insn: 83 45 fc 01 000000000040177a insn: 81 7d fc 0f 27 00 00 0000000000401781 insn: 7e e3 br_cntr: A # PRED 7 cycles [14] 0.43 IPC
- Support Timed PEBS (Precise Event-Based Sampling), a recent hardware feature in Intel processors
- Add 'perf ftrace profile' subcommand, using ftrace's function-graph tracer so that users can see the total, average, max execution time as well as the number of invocations easily, for instance:
$ sudo perf ftrace profile -G __x64_sys_perf_event_open -- \ perf stat -e cycles -C1 true 2> /dev/null | head # Total (us) Avg (us) Max (us) Count Function 65.611 65.611 65.611 1 __x64_sys_perf_event_open 30.527 30.527 30.527 1 anon_inode_getfile 30.260 30.260 30.260 1 __anon_inode_getfile 29.700 29.700 29.700 1 alloc_file_pseudo 17.578 17.578 17.578 1 d_alloc_pseudo 17.382 17.382 17.382 1 __d_alloc 16.738 16.738 16.738 1 kmem_cache_alloc_lru 15.686 15.686 15.686 1 perf_event_alloc 14.012 7.006 11.264 2 obj_cgroup_charge
- 'perf sched timehist' improvements, including the addition of priority showing/filtering command line options
- Varios improvements to the 'perf probe', including 'perf test' regression testings
- Introduce the 'perf check', initially to check if some feature is in place, using it in 'perf test'
- Various fixes for 32-bit systems
- Address more leak sanitizer failures
- Fix memory leaks (LBR, disasm lock ops, etc)
- More reference counting fixes (branch_info, etc)
- Constify 'struct perf_tool' parameters to improve code generation and reduce the chances of having its internals changed, which isn't expected
- More constifications in various other places
- Add more build tests, including for JEVENTS
- Add more 'perf test' entries ('perf record LBR', pipe/inject, --setup-filter, 'perf ftrace', 'cgroup sampling', etc)
- Inject build ids for all entries in a call chain in 'perf inject', not just for the main sample
- Improve the BPF based sample filter, allowing root to setup filters in bpffs that then can be used by non-root users
- Allow filtering by cgroups with the BPF based sample filter
- Allow a more compact way for 'perf mem report' using the -T/--type-profile and also provide a --sort option similar to the one in 'perf report', 'perf top', to setup the sort order manually
- Fix --group behavior in 'perf annotate' when leader has no samples, where it was not showing anything even when other events in the group had samples
- Fix spinlock and rwlock accounting in 'perf lock contention'
- Fix libsubcmd fixdep Makefile dependencies
- Improve 'perf ftrace' error message when ftrace isn't available
- Update various Intel JSON vendor event files
- ARM64 CoreSight hardware tracing infrastructure improvements, mostly not visible to users
- Update power10 JSON events
* tag 'perf-tools-for-v6.12-1-2024-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (310 commits) perf trace: Mark the 'head' arg in the set_robust_list syscall as coming from user space perf trace: Mark the 'rseq' arg in the rseq syscall as coming from user space perf env: Find correct branch counter info on hybrid perf evlist: Print hint for group tools: Drop nonsensical -O6 perf pmu: To info add event_type_desc perf evsel: Add accessor for tool_event perf pmus: Fake PMU clean up perf list: Avoid potential out of bounds memory read perf help: Fix a typo ("bellow") perf ftrace: Detect whether ftrace is enabled on system perf test shell probe_vfs_getname: Remove extraneous '=' from probe line number regex perf build: Require at least clang 16.0.6 to build BPF skeletons perf trace: If a syscall arg is marked as 'const', assume it is coming _from_ userspace perf parse-events: Remove duplicated include in parse-events.c perf callchain: Allow symbols to be optional when resolving a callchain perf inject: Lazy build-id mmap2 event insertion perf inject: Add new mmap2-buildid-all option perf inject: Fix build ID injection perf annotate-data: Add pr_debug_scope() ...
show more ...
|
Revision tags: v6.11 |
|
#
64eed019 |
| 09-Sep-2024 |
Ian Rogers <irogers@google.com> |
perf inject: Lazy build-id mmap2 event insertion
Add -B option that lazily inserts mmap2 events thereby dropping all mmap events without samples. This is similar to the behavior of -b where only bui
perf inject: Lazy build-id mmap2 event insertion
Add -B option that lazily inserts mmap2 events thereby dropping all mmap events without samples. This is similar to the behavior of -b where only build_id events are inserted when a dso is accessed in a sample.
File size savings can be significant in system-wide mode, consider:
$ perf record -g -a -o perf.data sleep 1 $ perf inject -B -i perf.data -o perf.new.data $ ls -al perf.data perf.new.data 5147049 perf.data 2248493 perf.new.data
Give test coverage of the new option in pipe test.
Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Casey Chen <cachen@purestorage.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Link: https://lore.kernel.org/r/20240909203740.143492-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
#
d762ba02 |
| 09-Sep-2024 |
Ian Rogers <irogers@google.com> |
perf inject: Add new mmap2-buildid-all option
Add an option that allows all mmap or mmap2 events to be rewritten as mmap2 events with build IDs.
This is similar to the existing -b/--build-ids and -
perf inject: Add new mmap2-buildid-all option
Add an option that allows all mmap or mmap2 events to be rewritten as mmap2 events with build IDs.
This is similar to the existing -b/--build-ids and --buildid-all options except instead of adding a build_id event an existing mmap/mmap2 event is used as a template and a new mmap2 event synthesized from it.
As mmap2 events are typical this avoids the insertion of build_id events.
Add test coverage to the pipe test.
Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Casey Chen <cachen@purestorage.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Link: https://lore.kernel.org/r/20240909203740.143492-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v6.11-rc7, v6.11-rc6 |
|
#
ccb90046 |
| 29-Aug-2024 |
Ian Rogers <irogers@google.com> |
perf test: Additional pipe tests with pipe output written to a file
Additional pipe tests where piped files are written to disk. This means that spotting a file name of "-" isn't a sufficient "is pi
perf test: Additional pipe tests with pipe output written to a file
Additional pipe tests where piped files are written to disk. This means that spotting a file name of "-" isn't a sufficient "is pipe?" test.
Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240829150154.37929-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v6.11-rc5, v6.11-rc4 |
|
#
a8656614 |
| 17-Aug-2024 |
Ian Rogers <irogers@google.com> |
perf test: Expand pipe/inject test
Test recording of call-graphs and injecting --build-all. Add/expand trap handler.
Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@
perf test: Expand pipe/inject test
Test recording of call-graphs and injecting --build-all. Add/expand trap handler.
Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Casey Chen <cachen@purestorage.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jann Horn <jannh@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Yunseong Kim <yskelg@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Link: https://lore.kernel.org/r/20240817064442.2152089-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v6.11-rc3, v6.11-rc2, v6.11-rc1 |
|
#
a23e1966 |
| 15-Jul-2024 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.11 merge window.
|
Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2 |
|
#
6f47c7ae |
| 28-May-2024 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v6.9' into next
Sync up with the mainline to bring in the new cleanup API.
|
Revision tags: v6.10-rc1 |
|
#
60a2f25d |
| 16-May-2024 |
Tvrtko Ursulin <tursulin@ursulin.net> |
Merge drm/drm-next into drm-intel-gt-next
Some display refactoring patches are needed in order to allow conflict- less merging.
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
|
Revision tags: v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7 |
|
#
06d07429 |
| 29-Feb-2024 |
Jani Nikula <jani.nikula@intel.com> |
Merge drm/drm-next into drm-intel-next
Sync to get the drm_printer changes to drm-intel-next.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
2e21dee6 |
| 13-Mar-2024 |
Jiri Kosina <jkosina@suse.com> |
Merge branch 'for-6.9/amd-sfh' into for-linus
- assorted fixes and optimizations for amd-sfh (Basavaraj Natikar)
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
Revision tags: v6.8-rc6, v6.8-rc5 |
|
#
41c177cf |
| 11-Feb-2024 |
Rob Clark <robdclark@chromium.org> |
Merge tag 'drm-misc-next-2024-02-08' into msm-next
Merge the drm-misc tree to uprev MSM CI.
Signed-off-by: Rob Clark <robdclark@chromium.org>
|
Revision tags: v6.8-rc4, v6.8-rc3 |
|
#
4db102dc |
| 29-Jan-2024 |
Maxime Ripard <mripard@kernel.org> |
Merge drm/drm-next into drm-misc-next
Kickstart 6.9 development cycle.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
Revision tags: v6.8-rc2 |
|
#
be3382ec |
| 23-Jan-2024 |
Lucas De Marchi <lucas.demarchi@intel.com> |
Merge drm/drm-next into drm-xe-next
Sync to v6.8-rc1.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
#
03c11eb3 |
| 14-Feb-2024 |
Ingo Molnar <mingo@kernel.org> |
Merge tag 'v6.8-rc4' into x86/percpu, to resolve conflicts and refresh the branch
Conflicts: arch/x86/include/asm/percpu.h arch/x86/include/asm/text-patching.h
Signed-off-by: Ingo Molnar <mingo@k
Merge tag 'v6.8-rc4' into x86/percpu, to resolve conflicts and refresh the branch
Conflicts: arch/x86/include/asm/percpu.h arch/x86/include/asm/text-patching.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
42ac0be1 |
| 26-Jan-2024 |
Ingo Molnar <mingo@kernel.org> |
Merge branch 'linus' into x86/mm, to refresh the branch and pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
06f609b3 |
| 25-Jan-2024 |
Jakub Kicinski <kuba@kernel.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
No conflicts or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
f0b7a0d1 |
| 23-Jan-2024 |
Andrew Morton <akpm@linux-foundation.org> |
Merge branch 'master' into mm-hotfixes-stable
|
#
cf79f291 |
| 22-Jan-2024 |
Maxime Ripard <mripard@kernel.org> |
Merge v6.8-rc1 into drm-misc-fixes
Let's kickstart the 6.8 fix cycle.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
Revision tags: v6.8-rc1 |
|
#
9d64bf43 |
| 19-Jan-2024 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo: "Add Namhyung Kim as tools/perf/
Merge tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo: "Add Namhyung Kim as tools/perf/ co-maintainer, we're taking turns processing patches, switching roles from perf-tools to perf-tools-next at each Linux release.
Data profiling:
- Associate samples that identify loads and stores with data structures. This uses events available on Intel, AMD and others and DWARF info:
# To get memory access samples in kernel for 1 second (on Intel) $ perf mem record -a -K --ldlat=4 -- sleep 1
# Similar for the AMD (but it requires 6.3+ kernel for BPF filters) $ perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000' -- sleep 1
Then, amongst several modes of post processing, one can do things like:
$ perf report -s type,typeoff --hierarchy --group --stdio ... # # Samples: 10K of events 'cpu/mem-loads,ldlat=4/P, cpu/mem-stores/P, dummy:u' # Event count (approx.): 602758064 # # Overhead Data Type / Data Type Offset # ........................... ............................ # 26.09% 3.28% 0.00% long unsigned int 26.09% 3.28% 0.00% long unsigned int +0 (no field) 18.48% 0.73% 0.00% struct page 10.83% 0.02% 0.00% struct page +8 (lru.next) 3.90% 0.28% 0.00% struct page +0 (flags) 3.45% 0.06% 0.00% struct page +24 (mapping) 0.25% 0.28% 0.00% struct page +48 (_mapcount.counter) 0.02% 0.06% 0.00% struct page +32 (index) 0.02% 0.00% 0.00% struct page +52 (_refcount.counter) 0.02% 0.01% 0.00% struct page +56 (memcg_data) 0.00% 0.01% 0.00% struct page +16 (lru.prev) 15.37% 17.54% 0.00% (stack operation) 15.37% 17.54% 0.00% (stack operation) +0 (no field) 11.71% 50.27% 0.00% (unknown) 11.71% 50.27% 0.00% (unknown) +0 (no field)
$ perf annotate --data-type ... Annotate type: 'struct cfs_rq' in [kernel.kallsyms] (13 samples): ============================================================================ samples offset size field 13 0 640 struct cfs_rq { 2 0 16 struct load_weight load { 2 0 8 unsigned long weight; 0 8 4 u32 inv_weight; }; 0 16 8 unsigned long runnable_weight; 0 24 4 unsigned int nr_running; 1 28 4 unsigned int h_nr_running; ...
$ perf annotate --data-type=page --group Annotate type: 'struct page' in [kernel.kallsyms] (480 samples): event[0] = cpu/mem-loads,ldlat=4/P event[1] = cpu/mem-stores/P event[2] = dummy:u =================================================================================== samples offset size field 447 33 0 0 64 struct page { 108 8 0 0 8 long unsigned int flags; 319 13 0 8 40 union { 319 13 0 8 40 struct { 236 2 0 8 16 union { 236 2 0 8 16 struct list_head lru { 236 1 0 8 8 struct list_head* next; 0 1 0 16 8 struct list_head* prev; }; 236 2 0 8 16 struct { 236 1 0 8 8 void* __filler; 0 1 0 16 4 unsigned int mlock_count; }; 236 2 0 8 16 struct list_head buddy_list { 236 1 0 8 8 struct list_head* next; 0 1 0 16 8 struct list_head* prev; }; 236 2 0 8 16 struct list_head pcp_list { 236 1 0 8 8 struct list_head* next; 0 1 0 16 8 struct list_head* prev; }; }; 82 4 0 24 8 struct address_space* mapping; 1 7 0 32 8 union { 1 7 0 32 8 long unsigned int index; 1 7 0 32 8 long unsigned int share; }; 0 0 0 40 8 long unsigned int private; };
This uses the existing annotate code, calling objdump to do the disassembly, with improvements to avoid having this take too long, but longer term a switch to a disassembler library, possibly reusing code in the kernel will be pursued.
This is the initial implementation, please use it and report impressions and bugs. Make sure the kernel-debuginfo packages match the running kernel. The 'perf report' phase for non short perf.data files may take a while.
There is a great article about it on LWN:
https://lwn.net/Articles/955709/ - "Data-type profiling for perf"
One last test I did while writing this text, on a AMD Ryzen 5950X, using a distro kernel, while doing a simple 'find /' on an otherwise idle system resulted in:
# uname -r 6.6.9-100.fc38.x86_64 # perf -vv | grep BPF_ bpf: [ on ] # HAVE_LIBBPF_SUPPORT bpf_skeletons: [ on ] # HAVE_BPF_SKEL # rpm -qa | grep kernel-debuginfo kernel-debuginfo-common-x86_64-6.6.9-100.fc38.x86_64 kernel-debuginfo-6.6.9-100.fc38.x86_64 # # perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000' ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.199 MB perf.data (2913 samples) ] # # ls -la perf.data -rw-------. 1 root root 2346486 Jan 9 18:36 perf.data # perf evlist ibs_op// dummy:u # perf evlist -v ibs_op//: type: 11, size: 136, config: 0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1 dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # # perf report -s type,typeoff --hierarchy --group --stdio # Total Lost Samples: 0 # # Samples: 2K of events 'ibs_op//, dummy:u' # Event count (approx.): 1904553038 # # Overhead Data Type / Data Type Offset # ................... ............................ # 73.70% 0.00% (unknown) 73.70% 0.00% (unknown) +0 (no field) 3.01% 0.00% long unsigned int 3.00% 0.00% long unsigned int +0 (no field) 0.01% 0.00% long unsigned int +2 (no field) 2.73% 0.00% struct task_struct 1.71% 0.00% struct task_struct +52 (on_cpu) 0.38% 0.00% struct task_struct +2104 (rcu_read_unlock_special.b.blocked) 0.23% 0.00% struct task_struct +2100 (rcu_read_lock_nesting) 0.14% 0.00% struct task_struct +2384 () 0.06% 0.00% struct task_struct +3096 (signal) 0.05% 0.00% struct task_struct +3616 (cgroups) 0.05% 0.00% struct task_struct +2344 (active_mm) 0.02% 0.00% struct task_struct +46 (flags) 0.02% 0.00% struct task_struct +2096 (migration_disabled) 0.01% 0.00% struct task_struct +24 (__state) 0.01% 0.00% struct task_struct +3956 (mm_cid_active) 0.01% 0.00% struct task_struct +1048 (cpus_ptr) 0.01% 0.00% struct task_struct +184 (se.group_node.next) 0.01% 0.00% struct task_struct +20 (thread_info.cpu) 0.00% 0.00% struct task_struct +104 (on_rq) 0.00% 0.00% struct task_struct +2456 (pid) 1.36% 0.00% struct module 0.59% 0.00% struct module +952 (kallsyms) 0.42% 0.00% struct module +0 (state) 0.23% 0.00% struct module +8 (list.next) 0.12% 0.00% struct module +216 (syms) 0.95% 0.00% struct inode 0.41% 0.00% struct inode +40 (i_sb) 0.22% 0.00% struct inode +0 (i_mode) 0.06% 0.00% struct inode +76 (i_rdev) 0.06% 0.00% struct inode +56 (i_security) <SNIP>
perf top/report:
- Don't ignore job control, allowing control+Z + bg to work.
- Add s390 raw data interpretation for PAI (Processor Activity Instrumentation) counters.
perf archive:
- Add new option '--all' to pack perf.data with DSOs.
- Add new option '--unpack' to expand tarballs.
Initialization speedups:
- Lazily initialize zstd streams to save memory when not using it.
- Lazily allocate/size mmap event copy.
- Lazy load kernel symbols in 'perf record'.
- Be lazier in allocating lost samples buffer in 'perf record'.
- Don't synthesize BPF events when disabled via the command line (perf record --no-bpf-event).
Assorted improvements:
- Show note on AMD systems that the :p, :pp, :ppp and :P are all the same, as IBS (Instruction Based Sampling) is used and it is inherentely precise, not having levels of precision like in Intel systems.
- When 'cycles' isn't available, fall back to the "task-clock" event when not system wide, not to 'cpu-clock'.
- Add --debug-file option to redirect debug output, e.g.:
$ perf --debug-file /tmp/perf.log record -v true
- Shrink 'struct map' to under one cacheline by avoiding function pointers for selecting if addresses are identity or DSO relative, and using just a byte for some boolean struct members.
- Resolve the arch specific strerrno just once to use in perf_env__arch_strerrno().
- Reduce memory for recording PERF_RECORD_LOST_SAMPLES event.
Assorted fixes:
- Fix the default 'perf top' usage on Intel hybrid systems, now it starts with a browser showing the number of samples for Efficiency (cpu_atom/cycles/P) and Performance (cpu_core/cycles/P). This behaviour is similar on ARM64, with its respective set of big.LITTLE processors.
- Fix segfault on build_mem_topology() error path.
- Fix 'perf mem' error on hybrid related to availability of mem event in a PMU.
- Fix missing reference count gets (map, maps) in the db-export code.
- Avoid recursively taking env->bpf_progs.lock in the 'perf_env' code.
- Use the newly introduced maps__for_each_map() to add missing locking around iteration of 'struct map' entries.
- Parse NOTE segments until the build id is found, don't stop on the first one, ELF files may have several such NOTE segments.
- Remove 'egrep' usage, its deprecated, use 'grep -E' instead.
- Warn first about missing libelf, not libbpf, that depends on libelf.
- Use alternative to 'find ... -printf' as this isn't supported in busybox.
- Address python 3.6 DeprecationWarning for string scapes.
- Fix memory leak in uniq() in libsubcmd.
- Fix man page formatting for 'perf lock'
- Fix some spelling mistakes.
perf tests:
- Fail shell tests that needs some symbol in perf itself if it is stripped. These tests check if a symbol is resolved, if some hot function is indeed detected by profiling, etc.
- The 'perf test sigtrap' test is currently failing on PREEMPT_RT, skip it if sleeping spinlocks are detected (using BTF) and point to the mailing list discussion about it. This test is also being skipped on several architectures (powerpc, s390x, arm and aarch64) due to other pending issues with intruction breakpoints.
- Adjust test case perf record offcpu profiling tests for s390.
- Fix 'Setup struct perf_event_attr' fails on s390 on z/VM guest, addressing issues caused by the fallback from cycles to task-clock done in this release.
- Fix mask for VG register in the user-regs test.
- Use shellcheck on 'perf test' shell scripts automatically to make sure changes don't introduce things it flags as problematic.
- Add option to change objdump binary and allow it to be set via 'perf config'.
- Add basic 'perf script', 'perf list --json" and 'perf diff' tests.
- Basic branch counter support.
- Make DSO tests a suite rather than individual.
- Remove atomics from test_loop to avoid test failures.
- Fix call chain match on powerpc for the record+probe_libc_inet_pton test.
- Improve Intel hybrid tests.
Vendor event files (JSON):
powerpc:
- Update datasource event name to fix duplicate events on IBM's Power10.
- Add PVN for HX-C2000 CPU with Power8 Architecture.
Intel:
- Alderlake/rocketlake metric fixes.
- Update emeraldrapids events to v1.02.
- Update icelakex events to v1.23.
- Update sapphirerapids events to v1.17.
- Add skx, clx, icx and spr upi bandwidth metric.
AMD:
- Add Zen 4 memory controller events.
RISC-V:
- Add StarFive Dubhe-80 and Dubhe-90 JSON files. https://www.starfivetech.com/en/site/cpu-u
- Add T-HEAD C9xx JSON file. https://github.com/riscv-software-src/opensbi/blob/master/docs/platform/thead-c9xx.md
ARM64:
- Remove UTF-8 characters from cmn.json, that were causing build failure in some distros.
- Add core PMU events and metrics for Ampere One X.
- Rename Ampere One's BPU_FLUSH_MEM_FAULT to GPC_FLUSH_MEM_FAULT
libperf:
- Rename several perf_cpu_map constructor names to clarify what they really do.
- Ditto for some other methods, coping with some issues in their semantics, like perf_cpu_map__empty() -> perf_cpu_map__has_any_cpu_or_is_empty().
- Document perf_cpu_map__nr()'s behavior
perf stat:
- Exit if parse groups fails.
- Combine the -A/--no-aggr and --no-merge options.
- Fix help message for --metric-no-threshold option.
Hardware tracing:
ARM64 CoreSight:
- Bump minimum OpenCSD version to ensure a bugfix is present.
- Add 'T' itrace option for timestamp trace
- Set start vm addr of exectable file to 0 and don't ignore first sample on the arm-cs-trace-disasm.py 'perf script'"
* tag 'perf-tools-for-v6.8-1-2024-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (179 commits) MAINTAINERS: Add Namhyung as tools/perf/ co-maintainer perf test: test case 'Setup struct perf_event_attr' fails on s390 on z/vm perf db-export: Fix missing reference count get in call_path_from_sample() perf tests: Add perf script test libsubcmd: Fix memory leak in uniq() perf TUI: Don't ignore job control perf vendor events intel: Update sapphirerapids events to v1.17 perf vendor events intel: Update icelakex events to v1.23 perf vendor events intel: Update emeraldrapids events to v1.02 perf vendor events intel: Alderlake/rocketlake metric fixes perf x86 test: Add hybrid test for conflicting legacy/sysfs event perf x86 test: Update hybrid expectations perf vendor events amd: Add Zen 4 memory controller events perf stat: Fix hard coded LL miss units perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event perf env: Avoid recursively taking env->bpf_progs.lock perf annotate: Add --insn-stat option for debugging perf annotate: Add --type-stat option for debugging perf annotate: Support event group display perf annotate: Add --data-type option ...
show more ...
|
Revision tags: v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3 |
|
#
c9526a73 |
| 23-Nov-2023 |
Adrian Hunter <adrian.hunter@intel.com> |
perf tests: Skip pipe test if noploop symbol is missing
perf pipe recording and injection test depends on finding symbol noploop in perf, and fails if perf has been stripped and no debug object is a
perf tests: Skip pipe test if noploop symbol is missing
perf pipe recording and injection test depends on finding symbol noploop in perf, and fails if perf has been stripped and no debug object is available. In that case, skip the test instead.
Example:
Before:
$ strip tools/perf/perf $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf` $ tools/perf/perf test -v pipe 86: perf pipe recording and injection test : --- start --- test child forked, pid 47734 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] 47741 47741 -1 |perf [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] cannot find noploop function in pipe #1 test child finished with -1 ---- end ---- perf pipe recording and injection test: FAILED!
After:
$ tools/perf/perf test -v pipe 86: perf pipe recording and injection test : --- start --- test child forked, pid 48996 perf does not have symbol 'noploop' perf is missing symbols - skipping test test child finished with -2 ---- end ---- perf pipe recording and injection test: Skip
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231123075848.9652-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
show more ...
|
Revision tags: v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1 |
|
#
7ae9fb1b |
| 21-Feb-2023 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 6.3 merge window.
|