/linux/tools/perf/util/ |
H A D | sort.h | diff f29984226978313039d7dfe9b45eaa55a3aad03d Fri May 23 17:15:47 CEST 2014 Jiri Olsa <jolsa@kernel.org> perf tools: Move elide bool into perf_hpp_fmt struct
After output/sort fields refactoring, it's expensive to check the elide bool in its current location inside the 'struct sort_entry'.
The perf_hpp__should_skip function gets highly noticable in workloads with high number of output/sort fields, like for:
$ perf report -i perf-test.data -F overhead,sample,period,comm,pid,dso,symbol,cpu --stdio
Performance report: 9.70% perf [.] perf_hpp__should_skip
Moving the elide bool into the 'struct perf_hpp_fmt', which makes the perf_hpp__should_skip just single struct read.
Got speedup of around 22% for my test perf.data workload. The change should not harm any other workload types.
Performance counter stats for (10 runs): before: 358,319,732,626 cycles ( +- 0.55% ) 467,129,581,515 instructions # 1.30 insns per cycle ( +- 0.00% )
150.943975206 seconds time elapsed ( +- 0.62% )
now: 278,785,972,990 cycles ( +- 0.12% ) 370,146,797,640 instructions # 1.33 insns per cycle ( +- 0.00% )
116.416670507 seconds time elapsed ( +- 0.31% )
Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20140601142622.GA9131@krava.brq.redhat.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
|
H A D | hist.h | diff f29984226978313039d7dfe9b45eaa55a3aad03d Fri May 23 17:15:47 CEST 2014 Jiri Olsa <jolsa@kernel.org> perf tools: Move elide bool into perf_hpp_fmt struct
After output/sort fields refactoring, it's expensive to check the elide bool in its current location inside the 'struct sort_entry'.
The perf_hpp__should_skip function gets highly noticable in workloads with high number of output/sort fields, like for:
$ perf report -i perf-test.data -F overhead,sample,period,comm,pid,dso,symbol,cpu --stdio
Performance report: 9.70% perf [.] perf_hpp__should_skip
Moving the elide bool into the 'struct perf_hpp_fmt', which makes the perf_hpp__should_skip just single struct read.
Got speedup of around 22% for my test perf.data workload. The change should not harm any other workload types.
Performance counter stats for (10 runs): before: 358,319,732,626 cycles ( +- 0.55% ) 467,129,581,515 instructions # 1.30 insns per cycle ( +- 0.00% )
150.943975206 seconds time elapsed ( +- 0.62% )
now: 278,785,972,990 cycles ( +- 0.12% ) 370,146,797,640 instructions # 1.33 insns per cycle ( +- 0.00% )
116.416670507 seconds time elapsed ( +- 0.31% )
Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20140601142622.GA9131@krava.brq.redhat.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
|
H A D | sort.c | diff f29984226978313039d7dfe9b45eaa55a3aad03d Fri May 23 17:15:47 CEST 2014 Jiri Olsa <jolsa@kernel.org> perf tools: Move elide bool into perf_hpp_fmt struct
After output/sort fields refactoring, it's expensive to check the elide bool in its current location inside the 'struct sort_entry'.
The perf_hpp__should_skip function gets highly noticable in workloads with high number of output/sort fields, like for:
$ perf report -i perf-test.data -F overhead,sample,period,comm,pid,dso,symbol,cpu --stdio
Performance report: 9.70% perf [.] perf_hpp__should_skip
Moving the elide bool into the 'struct perf_hpp_fmt', which makes the perf_hpp__should_skip just single struct read.
Got speedup of around 22% for my test perf.data workload. The change should not harm any other workload types.
Performance counter stats for (10 runs): before: 358,319,732,626 cycles ( +- 0.55% ) 467,129,581,515 instructions # 1.30 insns per cycle ( +- 0.00% )
150.943975206 seconds time elapsed ( +- 0.62% )
now: 278,785,972,990 cycles ( +- 0.12% ) 370,146,797,640 instructions # 1.33 insns per cycle ( +- 0.00% )
116.416670507 seconds time elapsed ( +- 0.31% )
Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20140601142622.GA9131@krava.brq.redhat.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
|
/linux/tools/perf/ui/browsers/ |
H A D | hists.c | diff f29984226978313039d7dfe9b45eaa55a3aad03d Fri May 23 17:15:47 CEST 2014 Jiri Olsa <jolsa@kernel.org> perf tools: Move elide bool into perf_hpp_fmt struct
After output/sort fields refactoring, it's expensive to check the elide bool in its current location inside the 'struct sort_entry'.
The perf_hpp__should_skip function gets highly noticable in workloads with high number of output/sort fields, like for:
$ perf report -i perf-test.data -F overhead,sample,period,comm,pid,dso,symbol,cpu --stdio
Performance report: 9.70% perf [.] perf_hpp__should_skip
Moving the elide bool into the 'struct perf_hpp_fmt', which makes the perf_hpp__should_skip just single struct read.
Got speedup of around 22% for my test perf.data workload. The change should not harm any other workload types.
Performance counter stats for (10 runs): before: 358,319,732,626 cycles ( +- 0.55% ) 467,129,581,515 instructions # 1.30 insns per cycle ( +- 0.00% )
150.943975206 seconds time elapsed ( +- 0.62% )
now: 278,785,972,990 cycles ( +- 0.12% ) 370,146,797,640 instructions # 1.33 insns per cycle ( +- 0.00% )
116.416670507 seconds time elapsed ( +- 0.31% )
Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20140601142622.GA9131@krava.brq.redhat.com Signed-off-by: Jiri Olsa <jolsa@kernel.org>
|