History log of /linux/tools/perf/pmu-events/arch/arm64/arm/neoverse-v1/memory.json (Results 1 – 25 of 38)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# a23e1966 15-Jul-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 6.11 merge window.


Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2
# 6f47c7ae 28-May-2024 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v6.9' into next

Sync up with the mainline to bring in the new cleanup API.


Revision tags: v6.10-rc1
# 60a2f25d 16-May-2024 Tvrtko Ursulin <tursulin@ursulin.net>

Merge drm/drm-next into drm-intel-gt-next

Some display refactoring patches are needed in order to allow conflict-
less merging.

Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>


Revision tags: v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1
# 0ea5c948 15-Jan-2024 Jani Nikula <jani.nikula@intel.com>

Merge drm/drm-next into drm-intel-next

Backmerge to bring Xe driver to drm-intel-next.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 03c11eb3 14-Feb-2024 Ingo Molnar <mingo@kernel.org>

Merge tag 'v6.8-rc4' into x86/percpu, to resolve conflicts and refresh the branch

Conflicts:
arch/x86/include/asm/percpu.h
arch/x86/include/asm/text-patching.h

Signed-off-by: Ingo Molnar <mingo@k

Merge tag 'v6.8-rc4' into x86/percpu, to resolve conflicts and refresh the branch

Conflicts:
arch/x86/include/asm/percpu.h
arch/x86/include/asm/text-patching.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


# 6b93f350 08-Jan-2024 Jiri Kosina <jkosina@suse.com>

Merge branch 'for-6.8/amd-sfh' into for-linus

- addition of new interfaces to export User presence information and
Ambient light from amd-sfh to other drivers within the kernel (Basavaraj
Natika

Merge branch 'for-6.8/amd-sfh' into for-linus

- addition of new interfaces to export User presence information and
Ambient light from amd-sfh to other drivers within the kernel (Basavaraj
Natikar)

show more ...


Revision tags: v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2
# 3bf3e21c 15-Nov-2023 Maxime Ripard <mripard@kernel.org>

Merge drm/drm-next into drm-misc-next

Let's kickstart the v6.8 release cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>


# 5d2d4a9f 15-Nov-2023 Peter Zijlstra <peterz@infradead.org>

Merge branch 'tip/perf/urgent'

Avoid conflicts, base on fixes.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>


Revision tags: v6.7-rc1
# 7ab89417 03-Nov-2023 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
"Build:

- Compile BPF programs by defaul

Merge tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
"Build:

- Compile BPF programs by default if clang (>= 12.0.1) is available
to enable more features like kernel lock contention, off-cpu
profiling, kwork, sample filtering and so on.

This can be disabled by passing BUILD_BPF_SKEL=0 to make.

- Produce better error messages for bison on debug build (make
DEBUG=1) by defining YYDEBUG symbol internally.

perf record:

- Track sideband events (like FORK/MMAP) from all CPUs even if perf
record targets a subset of CPUs only (using -C option). Otherwise
it may lose some information happened on a CPU out of the target
list.

- Fix checking raw sched_switch tracepoint argument using system BTF.
This affects off-cpu profiling which attaches a BPF program to the
raw tracepoint.

perf lock contention:

- Add --lock-cgroup option to see contention by cgroups. This should
be used with BPF only (using -b option).

$ sudo perf lock con -ab --lock-cgroup -- sleep 1
contended total wait max wait avg wait cgroup

835 14.06 ms 41.19 us 16.83 us /system.slice/led.service
25 122.38 us 13.77 us 4.89 us /
44 23.73 us 3.87 us 539 ns /user.slice/user-657345.slice/session-c4.scope
1 491 ns 491 ns 491 ns /system.slice/connectd.service

- Add -G/--cgroup-filter option to see contention only for given
cgroups.

This can be useful when you identified a cgroup in the above
command and want to investigate more on it. It also works with
other output options like -t/--threads and -l/--lock-addr.

$ sudo perf lock con -ab -G /user.slice/user-657345.slice/session-c4.scope -- sleep 1
contended total wait max wait avg wait type caller

8 77.11 us 17.98 us 9.64 us spinlock futex_wake+0xc8
2 24.56 us 14.66 us 12.28 us spinlock tick_do_update_jiffies64+0x25
1 4.97 us 4.97 us 4.97 us spinlock futex_q_lock+0x2a

- Use per-cpu array for better spinlock tracking. This is to improve
performance of the BPF program and to avoid nested contention on a
lock in the BPF hash map.

- Update callstack check for PowerPC. To find a representative caller
of a lock, it needs to look up the call stacks. It ends the lookup
when it sees 0 in the call stack buffer. However, PowerPC call
stacks can have 0 values in the beginning so skip them when it
expects valid call stacks after.

perf kwork:

- Support 'sched' class (for -k option) so that it can see task
scheduling event (using sched_switch tracepoint) as well as irq and
workqueue items.

- Add perf kwork top subcommand to show more accurate cpu utilization
with sched class above. It works both with a recorded data (using
perf kwork record command) and BPF (using -b option). Unlike perf
top command, it does not support interactive mode (yet).

$ sudo perf kwork top -b -k sched
Starting trace, Hit <Ctrl+C> to stop and report
^C
Total : 160702.425 ms, 8 cpus
%Cpu(s): 36.00% id, 0.00% hi, 0.00% si
%Cpu0 [|||||||||||||||||| 61.66%]
%Cpu1 [|||||||||||||||||| 61.27%]
%Cpu2 [||||||||||||||||||| 66.40%]
%Cpu3 [|||||||||||||||||| 61.28%]
%Cpu4 [|||||||||||||||||| 61.82%]
%Cpu5 [||||||||||||||||||||||| 77.41%]
%Cpu6 [|||||||||||||||||| 61.73%]
%Cpu7 [|||||||||||||||||| 63.25%]

PID SPID %CPU RUNTIME COMMMAND
-------------------------------------------------------------
0 0 38.72 8089.463 ms [swapper/1]
0 0 38.71 8084.547 ms [swapper/3]
0 0 38.33 8007.532 ms [swapper/0]
0 0 38.26 7992.985 ms [swapper/6]
0 0 38.17 7971.865 ms [swapper/4]
0 0 36.74 7447.765 ms [swapper/7]
0 0 33.59 6486.942 ms [swapper/2]
0 0 22.58 3771.268 ms [swapper/5]
9545 9351 2.48 447.136 ms sched-messaging
9574 9351 2.09 418.583 ms sched-messaging
9724 9351 2.05 372.407 ms sched-messaging
9531 9351 2.01 368.804 ms sched-messaging
9512 9351 2.00 362.250 ms sched-messaging
9514 9351 1.95 357.767 ms sched-messaging
9538 9351 1.86 384.476 ms sched-messaging
9712 9351 1.84 386.490 ms sched-messaging
9723 9351 1.83 380.021 ms sched-messaging
9722 9351 1.82 382.738 ms sched-messaging
9517 9351 1.81 354.794 ms sched-messaging
9559 9351 1.79 344.305 ms sched-messaging
9725 9351 1.77 365.315 ms sched-messaging
<SNIP>

- Add hard/soft-irq statistics to perf kwork top. This will show the
total CPU utilization with IRQ stats like below:

$ sudo perf kwork top -b -k sched,irq,softirq
Starting trace, Hit <Ctrl+C> to stop and report
^C
Total : 12554.889 ms, 8 cpus
%Cpu(s): 96.23% id, 0.10% hi, 0.19% si <---- here
%Cpu0 [| 4.60%]
%Cpu1 [| 4.59%]
%Cpu2 [ 2.73%]
%Cpu3 [| 3.81%]
<SNIP>

perf bench:

- Add -G/--cgroups option to perf bench sched pipe. The pipe bench is
good to measure context switch overhead. With this option, it puts
the reader and writer tasks in separate cgroups to enforce context
switch between two different cgroups.

Also it needs to set CPU affinity of the tasks in a CPU to
accurately measure the impact of cgroup context switches.

$ sudo perf stat -e context-switches,cgroup-switches -- \
> taskset -c 0 perf bench sched pipe -l 100000
# Running 'sched/pipe' benchmark:
# Executed 100000 pipe operations between two processes

Total time: 0.307 [sec]

3.078180 usecs/op
324867 ops/sec

Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000':

200,026 context-switches
63 cgroup-switches

0.321637922 seconds time elapsed

You can see small number of cgroup-switches because both write and
read tasks are in the same cgroup.

$ sudo mkdir /sys/fs/cgroup/{AAA,BBB}

$ sudo perf stat -e context-switches,cgroup-switches -- \
> taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB
# Running 'sched/pipe' benchmark:
# Executed 100000 pipe operations between two processes

Total time: 0.351 [sec]

3.512990 usecs/op
284657 ops/sec

Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB':

200,020 context-switches
200,019 cgroup-switches

0.365034567 seconds time elapsed

Now context-switches and cgroup-switches are almost same. And you
can see the pipe operation took little more.

- Kill child processes when perf bench sched messaging exited
abnormally. Otherwise it'd leave the child doing unnecessary work.

perf test:

- Fix various shellcheck issues on the tests written in shell script.

- Skip tests when condition is not satisfied:
- object code reading test for non-text section addresses.
- CoreSight test if cs_etm// event is not available.
- lock contention test if not enough CPUs.

Event parsing:

- Make PMU alias name loading lazy to reduce the startup time in the
event parsing code for perf record, stat and others in the general
case.

- Lazily compute PMU default config. In the same sense, delay PMU
initialization until it's really needed to reduce the startup cost.

- Fix event term values that are raw events. The event specification
can have several terms including event name. But sometimes it
clashes with raw event encoding which starts with 'r' and has
hex-digits.

For example, an event named 'read' should be processed as a normal
event but it was mis-treated as a raw encoding and caused a
failure.

$ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1
event syntax error: '..nning/event=read/'
\___ parser error
Run 'perf list' for a list of valid events

Usage: perf stat [<options>] [<command>]

-e, --event <event> event selector. use 'perf list' to list available events

Event metrics:

- Add "Compat" regex to match event with multiple identifiers.

- Usual updates for Intel, Power10, Arm telemetry/CMN and AmpereOne.

Misc:

- Assorted memory leak fixes and footprint reduction.

- Add "bpf_skeletons" to perf version --build-options so that users
can check whether their perf tools have BPF support easily.

- Fix unaligned access in Intel-PT packet decoder found by
undefined-behavior sanitizer.

- Avoid frequency mode for the dummy event. Surprisingly it'd impact
kernel timer tick handler performance by force iterating all PMU
events.

- Update bash shell completion for events and metrics"

* tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (187 commits)
perf vendor events intel: Update tsx_cycles_per_elision metrics
perf vendor events intel: Update bonnell version number to v5
perf vendor events intel: Update westmereex events to v4
perf vendor events intel: Update meteorlake events to v1.06
perf vendor events intel: Update knightslanding events to v16
perf vendor events intel: Add typo fix for ivybridge FP
perf vendor events intel: Update a spelling in haswell/haswellx
perf vendor events intel: Update emeraldrapids to v1.01
perf vendor events intel: Update alderlake/alderlake events to v1.23
perf build: Disable BPF skeletons if clang version is < 12.0.1
perf callchain: Fix spelling mistake "statisitcs" -> "statistics"
perf report: Fix spelling mistake "heirachy" -> "hierarchy"
perf python: Fix binding linkage due to rename and move of evsel__increase_rlimit()
perf tests: test_arm_coresight: Simplify source iteration
perf vendor events intel: Add tigerlake two metrics
perf vendor events intel: Add broadwellde two metrics
perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metric
perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit
perf callchain: Minor layout changes to callchain_list
perf callchain: Make brtype_stat in callchain_list optional
...

show more ...


Revision tags: v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1
# a484e645 31-Aug-2023 James Clark <james.clark@arm.com>

perf vendor events arm64: Update V1 events using Arm telemetry repo

The new data [1] includes descriptions that may have product specific
details and new groupings that will be consistent with other

perf vendor events arm64: Update V1 events using Arm telemetry repo

The new data [1] includes descriptions that may have product specific
details and new groupings that will be consistent with other products.

The following command was used to generate the jsons:

$ telemetry-solution/tools/perf_json_generator/generate.py \
linux/tools/perf/ --telemetry-files \
telemetry-solution/data/pmu/cpu/neoverse/neoverse-v1.json

[1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-v1.json

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230831161618.134738-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

show more ...


Revision tags: v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1
# 4f2c0a4a 14-Dec-2022 Nick Terrell <terrelln@fb.com>

Merge branch 'main' into zstd-linus


# cfd1f6c1 13-Dec-2022 Jiri Kosina <jkosina@suse.cz>

Merge branch 'for-6.2/apple' into for-linus

- new quirks for select Apple keyboards (Kerem Karabay, Aditya Garg)


# e291c116 12-Dec-2022 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge branch 'next' into for-linus

Prepare input updates for 6.2 merge window.


Revision tags: v6.1
# 6b2b0d83 08-Dec-2022 Petr Mladek <pmladek@suse.com>

Merge branch 'rework/console-list-lock' into for-linus


Revision tags: v6.1-rc8, v6.1-rc7
# 29583dfc 21-Nov-2022 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-next into drm-misc-next-fixes

Backmerging to update drm-misc-next-fixes for the final phase
of the release cycle.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


Revision tags: v6.1-rc6
# 002c6ca7 14-Nov-2022 Rodrigo Vivi <rodrigo.vivi@intel.com>

Merge drm/drm-next into drm-intel-next

Catch up on 6.1-rc cycle in order to solve the intel_backlight
conflict on linux-next.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


Revision tags: v6.1-rc5, v6.1-rc4
# d93618da 04-Nov-2022 Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Merge drm/drm-next into drm-intel-gt-next

Needed to bring in v6.1-rc1 which contains commit f683b9d61319 ("i915: use the VMA iterator")
which is needed for series https://patchwork.freedesktop.org/s

Merge drm/drm-next into drm-intel-gt-next

Needed to bring in v6.1-rc1 which contains commit f683b9d61319 ("i915: use the VMA iterator")
which is needed for series https://patchwork.freedesktop.org/series/110083/ .

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

show more ...


Revision tags: v6.1-rc3, v6.1-rc2
# 14e77332 22-Oct-2022 Nick Terrell <terrelln@fb.com>

Merge branch 'main' into zstd-next


# 1aca5ce0 20-Oct-2022 Thomas Zimmermann <tzimmermann@suse.de>

Merge drm/drm-fixes into drm-misc-fixes

Backmerging to get v6.1-rc1.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>


# 008f05a7 19-Oct-2022 Mark Brown <broonie@kernel.org>

ASoC: jz4752b: Capture fixes

Merge series from Siarhei Volkau <lis8215@gmail.com>:

The patchset fixes:
- Line In path stays powered off during capturing or
bypass to mixer.
- incorrectly repre

ASoC: jz4752b: Capture fixes

Merge series from Siarhei Volkau <lis8215@gmail.com>:

The patchset fixes:
- Line In path stays powered off during capturing or
bypass to mixer.
- incorrectly represented dB values in alsamixer, et al.
- incorrect represented Capture input selector in alsamixer
in Playback tab.
- wrong control selected as Capture Master

show more ...


# a140a6a2 18-Oct-2022 Maxime Ripard <maxime@cerno.tech>

Merge drm/drm-next into drm-misc-next

Let's kick-off this release cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>


# c29a017f 17-Oct-2022 Dmitry Torokhov <dmitry.torokhov@gmail.com>

Merge tag 'v6.1-rc1' into next

Merge with mainline to bring in the latest changes to twl4030 driver.


# 8048b835 17-Oct-2022 Andrew Morton <akpm@linux-foundation.org>

Merge branch 'master' into mm-hotfixes-stable


Revision tags: v6.1-rc1
# d465bff1 12-Oct-2022 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'perf-tools-for-v6.1-1-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools updates from Arnaldo Carvalho de Melo:

- Add support for AMD on 'perf mem'

Merge tag 'perf-tools-for-v6.1-1-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools updates from Arnaldo Carvalho de Melo:

- Add support for AMD on 'perf mem' and 'perf c2c', the kernel
enablement patches went via tip.

Example:

$ sudo perf mem record -- -c 10000
^C[ perf record: Woken up 227 times to write data ]
[ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]

$ sudo perf mem report -F mem,sample,snoop
Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
Memory access Samples Snoop
N/A 700620 N/A
L1 hit 126675 N/A
L2 hit 424 N/A
L3 hit 664 HitM
L3 hit 10 N/A
Local RAM hit 2 N/A
Remote RAM (1 hop) hit 8558 N/A
Remote Cache (1 hop) hit 3 N/A
Remote Cache (1 hop) hit 2 HitM
Remote Cache (2 hops) hit 10 HitM
Remote Cache (2 hops) hit 6 N/A
Uncached hit 4 N/A
$

- "perf lock" improvements:

- Add -E/--entries option to limit the number of entries to
display, say to ask for just the top 5 contended locks.

- Add -q/--quiet option to suppress header and debug messages.

- Add a 'perf test' kernel lock contention entry to test 'perf
lock'.

- "perf lock contention" improvements:

- Ask BPF's bpf_get_stackid() to skip some callchain entries.

The ones closer to the tooling are bpf related and not that
interesting, the ones calling the locking function are the ones
we're interested in, example of a full, unskipped callstack:

- Allow changing the callstack depth and number of entries to skip.

1 10.74 us 10.74 us 10.74 us spinlock __bpf_trace_contention_begin+0xb
0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
0xffffffffbb8b8e75 bpf_trace_run2+0x35
0xffffffffbb7eab9b __bpf_trace_contention_begin+0xb
0xffffffffbb7ebe75 queued_spin_lock_slowpath+0x1f5
0xffffffffbc1c26ff _raw_spin_lock+0x1f
0xffffffffbb841015 tick_do_update_jiffies64+0x25
0xffffffffbb8409ee tick_irq_enter+0x9e

- Show full callstack in verbose mode (-v option), sometimes this
is desirable instead of showing just one callstack entry.

- Allow multiple time ranges in 'perf record --delay' to help in
reducing the amount of data collected from hardware tracing (Intel
PT, etc) when there is a rough idea of periods of time where events
of interest take time.

- Add Intel PT to record only decoder debug messages when error
happens.

- Improve layout of Intel PT man page.

- Add new branch types: alignment, data and inst faults and arch
specific ones, such as fiq, debug_halt, debug_exit, debug_inst and
debug_data on arm64.

Kernel enablement went thru the tip tree.

- Fix 'perf probe' error log check in 'perf test' when no debuginfo is
available.

- Fix 'perf stat' aggregation mode logic, it should be looking at the
CPU not at the core number.

- Fix flags parsing in 'perf trace' filters.

- Introduce compact encoding of CPU range encoding on perf.data, to
avoid having a bitmap with all the CPUs.

- Improvements to the 'perf stat' metrics, including adding
"core_wide", and computing "smt" from the CPU topology.

- Add support to the new PERF_FORMAT_LOST perf_event_attr.read_format,
that allows tooling to ask for the precise number of lost samples for
a given event.

- Add 'addr' sort key to see just the address of sampled instructions:

$ perf record -o- true | perf report -i- -s addr
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
# Samples: 12 of event 'cycles:u'
# Event count (approx.): 252512
#
# Overhead Address
# ........ ..................
42.96% 0x7f96f08443d7
29.55% 0x7f96f0859b50
14.76% 0x7f96f0852e02
8.30% 0x7f96f0855028
4.43% 0xffffffff8de01087

perf annotate: Toggle full address <-> offset display

- Add 'f' hotkey to the 'perf annotate' TUI interface when in
'disassembler output' mode ('o' hotkey) to toggle showing full
virtual address or just the offset.

- Cache DSO build-ids when synthesizing PERF_RECORD_MMAP records for
pre-existing threads, at the start of a 'perf record' session,
speeding up that record startup phase.

- Add a command line option to specify build ids in 'perf inject'.

- Update JSON event files for the Intel alderlake, broadwell,
broadwellde, broadwellx, cascadelakex, haswell, haswellx, icelake,
icelakex, ivybridge, ivytown, jaketown, sandybridge, sapphirerapids,
skylake, skylakex, and tigerlake processors.

- Update vendor JSON event files for the ARM Neoverse V1 and E1
platforms.

- Add a 'perf test' entry for 'perf mem' where a struct has false
sharing and this gets detected in the 'perf mem' output, tested with
Intel, AMD and ARM64 systems.

- Add a 'perf test' entry to test the resolution of java symbols, where
an output like this is expected:

8.18% jshell jitted-50116-29.so [.] Interpreter
0.75% Thread-1 jitted-83602-1670.so [.] jdk.internal.jimage.BasicImageReader.getString(int)

- Add tests for the ARM64 CoreSight hardware tracing feature, with
specially crafted pureloop, memcpy, thread loop and unroll tread that
then gets traced and the output compared with expected output.

Documentation explaining it is also included.

- Add per thread Intel PT 'perf test' entry to check that
PERF_RECORD_TEXT_POKE events are recorded per CPU, resulting in a
mixture of per thread and per CPU events and mmaps, verify that this
gets all recorded correctly.

- Introduce pthread mutex wrappers to allow for building with clang's
-Wthread-safety, i.e. using the "guarded_by" "pt_guarded_by"
"lockable", "exclusive_lock_function", "exclusive_trylock_function",
"exclusive_locks_required", and "no_thread_safety_analysis" compiler
function attributes.

- Fix empty version number when building outside of a git repo.

- Improve feature detection display when multiple versions of a feature
are present, such as for binutils libbfd, that has a mix of possible
ways to detect according to the Linux distribution.

Previously in some cases we had:

Auto-detecting system features
<SNIP>
... libbfd: [ on ]
... libbfd-liberty: [ on ]
... libbfd-liberty-z: [ on ]
<SNIP>

Now for this case we show just the main feature:

Auto-detecting system features
<SNIP>
... libbfd: [ on ]
<SNIP>

- Remove some unused structs, variables, macros, function prototypes
and includes from various places.

* tag 'perf-tools-for-v6.1-1-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (169 commits)
perf script: Add missing fields in usage hint
perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFB
perf mem/c2c: Avoid printing empty lines for unsupported events
perf mem/c2c: Add load store event mappings for AMD
perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events
perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO}
perf amd ibs: Sync arch/x86/include/asm/amd-ibs.h header with the kernel
tools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernel
perf stat: Fix cpu check to use id.cpu.cpu in aggr_printout()
perf test coresight: Add relevant documentation about ARM64 CoreSight testing
perf test: Add git ignore for tmp and output files of ARM CoreSight tests
perf test coresight: Add unroll thread test shell script
perf test coresight: Add unroll thread test tool
perf test coresight: Add thread loop test shell scripts
perf test coresight: Add thread loop test tool
perf test coresight: Add memcpy thread test shell script
perf test coresight: Add memcpy thread test tool
perf test: Add git ignore for perf data generated by the ARM CoreSight tests
perf test: Add arm64 asm pureloop test shell script
perf test: Add asm pureloop test tool
...

show more ...


Revision tags: v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5
# c581e46b 08-Sep-2022 Nick Forrington <nick.forrington@arm.com>

perf vendor events arm64: Move REMOTE_ACCESS to "memory" category

Move REMOTE_ACCESS event from other.json to memory.json for Neoverse
CPUs. This is consistent with other Arm (Cortex) CPUs.

Reviewe

perf vendor events arm64: Move REMOTE_ACCESS to "memory" category

Move REMOTE_ACCESS event from other.json to memory.json for Neoverse
CPUs. This is consistent with other Arm (Cortex) CPUs.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Nick Forrington <nick.forrington@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220908112519.64614-1-nick.forrington@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

show more ...


12