Running the ring-buffer-benchmark's threads at the lowest priority may
work well for keeping it in the background, but it is not appropriate
for the benchmarks.
This patch adds 4 parameters to the module:
consumer_fifo
consumer_nice
producer_fifo
producer_nice
By default the consumer and producer still run at nice +19.
If the *_fifo options are set, they will override the *_nice values.
modprobe ring_buffer_benchmark consumer_nice=0 producer_fifo=10
The above will set the consumer thread to a nice value of 0, and
the producer thread to a RT SCHED_FIFO priority of 10.
Note, this patch also fixes a bug where calling set_user_nice on the
consumer thread would oops the kernel when the parameter "disable_reader"
is set.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The ring-buffer benchmark threads run on nice 0 by default, using
up a lot of CPU time and slowing down the system:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1024 root 20 0 0 0 0 D 95.3 0.0 4:01.67 rb_producer
1023 root 20 0 0 0 0 R 93.5 0.0 2:54.33 rb_consumer
21569 mingo 40 0 14852 1048 772 R 3.6 0.1 0:00.05 top
1 root 40 0 4080 928 668 S 0.0 0.0 0:23.98 init
Renice them to +19 to make them less intrusive.
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The ring_buffer_benchmark does a gettimeofday after every write to the
ring buffer in its measurements. This adds the overhead of the call
to gettimeofday to the measurements and does not give an accurate picture
of the length of time it takes to record a trace.
This was first noticed with perf top:
------------------------------------------------------------------------------
PerfTop: 679 irqs/sec kernel:99.9% [1000Hz cpu-clock-msecs], (all, 4 CPUs)
------------------------------------------------------------------------------
samples pcnt kernel function
_______ _____ _______________
1673.00 - 27.8% : trace_clock_local
806.00 - 13.4% : do_gettimeofday
590.00 - 9.8% : rb_reserve_next_event
554.00 - 9.2% : native_read_tsc
431.00 - 7.2% : ring_buffer_lock_reserve
365.00 - 6.1% : __rb_reserve_next
355.00 - 5.9% : rb_end_commit
322.00 - 5.4% : getnstimeofday
268.00 - 4.5% : ring_buffer_unlock_commit
262.00 - 4.4% : ring_buffer_producer_thread [ring_buffer_benchmark]
113.00 - 1.9% : read_tsc
91.00 - 1.5% : debug_smp_processor_id
69.00 - 1.1% : trace_recursive_unlock
66.00 - 1.1% : ring_buffer_event_data
25.00 - 0.4% : _spin_unlock_irq
And the length of each write to the ring buffer measured at 310ns.
This patch adds a new module parameter called "write_interval" which is
defaulted to 50. This is the number of writes performed between
timestamps. After this patch perf top shows:
------------------------------------------------------------------------------
PerfTop: 244 irqs/sec kernel:100.0% [1000Hz cpu-clock-msecs], (all, 4 CPUs)
------------------------------------------------------------------------------
samples pcnt kernel function
_______ _____ _______________
2842.00 - 40.4% : trace_clock_local
1043.00 - 14.8% : rb_reserve_next_event
784.00 - 11.1% : ring_buffer_lock_reserve
600.00 - 8.5% : __rb_reserve_next
579.00 - 8.2% : rb_end_commit
440.00 - 6.3% : ring_buffer_unlock_commit
290.00 - 4.1% : ring_buffer_producer_thread [ring_buffer_benchmark]
155.00 - 2.2% : debug_smp_processor_id
117.00 - 1.7% : trace_recursive_unlock
103.00 - 1.5% : ring_buffer_event_data
28.00 - 0.4% : do_gettimeofday
22.00 - 0.3% : _spin_unlock_irq
14.00 - 0.2% : native_read_tsc
11.00 - 0.2% : getnstimeofday
do_gettimeofday dropped from 13% usage to a mere 0.4%! (using the default
50 interval) The measurement for each timestamp went from 310ns to 210ns.
That's 100ns (1/3rd) overhead that the gettimeofday call was introducing.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Currently the output of the ring buffer benchmark/test prints to
the console. This test runs for ten seconds every ten seconds and
ouputs the result after every iteration. This needlessly fills up
the logs.
This patch makes the ring buffer benchmark/test print to the ftrace
buffer using trace_printk. To view the test results, you must examine
the debug/tracing/trace file.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
With the addition of commit:
c7b0930857
ring-buffer: prevent adding write in discarded area
The ring buffer may now add discarded events when a write passes
the end of a buffer page. Before, a discarded event was only added
when the tracer deliberately created one. The ring buffer benchmark
test does not handle discarded events when it reads the buffer and
fails when it encounters one.
Also fix the increment for large data entries (luckily, the test did
not add any yet).
[ Impact: fix false failure of ring buffer self test ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Although we check if "missed" is not zero, we divide by hit + missed,
and the addition can possible overflow and become a divide by zero.
This patch checks for this case, and will report it when it happens
then modify "hit" to make the calculation be non zero.
[ Impact: prevent possible divide by zero in ring-buffer-benchmark ]
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The use of numeric constants is discouraged. It is cleaner and more
descriptive to use macros for constant time conversions.
This patch also removes an extra new line.
[ Impact: more descriptive time conversions ]
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
It is nice to see the overhead of the benchmark test when tracing is
disabled. That is, we turn off the ring buffer just to see what the
cost of running the loop that calls into the ring buffer is.
Currently, if no entries wer made, we get 0. This is not informative.
This patch changes it to check if we had any "missed" (non recorded)
events. If so, a total count is also reported.
[ Impact: evaluate the over head of the ring buffer benchmark test ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Calling cond_resched at every iteration of the loop adds a bit of
overhead to the benchmark.
This patch does two things.
1) only calls cond-resched when CONFIG_PREEMPT is not enabled
2) only calls cond-resched after so many traces has been performed.
[ Impact: less overhead to the ring-buffer-benchmark ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Ingo Molnar thought that the code to calculate the time in cond_resched
is a bit too ugly and is not needed. This patch removes it and replaces
it with a simple call to cond_resched. I kept the comment that explains
the reason for the cond_resched.
[ Impact: remove ugly code ]
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The ring buffer benchmark/test runs a producer for 10 seconds.
This is done with preemption and interrupts enabled. But if the kernel
is not compiled with CONFIG_PREEMPT, it basically stops everything
but interrupts for 10 seconds.
Although this is just a test and is not for production, this attribute
can be quite annoying. It can also spawn badness elsewhere.
This patch solves the issues by calling "cond_resched" when the system
is not compiled with CONFIG_PREEMPT. It also keeps track of the time
spent to call cond_resched such that it does not go against the
time calculations. That is, if the task schedules away, the time scheduled
out is removed from the test data. Note, this only works for non PREEMPT
because we do not know when the task is scheduled out if we have PREEMPT
enabled.
[ Impact: prevent test from stopping the world for 10 seconds ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The result of the allocation of the ring buffer read page in the
ring buffer bench mark does not check the return to see if a page
was actually allocated. This patch fixes that.
[ Impact: avoid NULL dereference ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch adds code that can benchmark the ring buffer as well as
test it. This code can be compiled into the kernel (not recommended)
or as a module.
A separate ring buffer is used to not interfer with other users, like
ftrace. It creates a producer and a consumer (option to disable creation
of the consumer) and will run for 10 seconds, then sleep for 10 seconds
and then repeat.
While running, the producer will write 10 byte loads into the ring
buffer with just putting in the current CPU number. The reader will
continually try to read the buffer. The reader will alternate from reading
the buffer via event by event, or by full pages.
The output is a pr_info, thus it will fill up the syslogs.
Starting ring buffer hammer
End ring buffer hammer
Time: 9000349 (usecs)
Overruns: 12578640
Read: 5358440 (by events)
Entries: 0
Total: 17937080
Missed: 0
Hit: 17937080
Entries per millisec: 1993
501 ns per entry
Sleeping for 10 secs
Starting ring buffer hammer
End ring buffer hammer
Time: 9936350 (usecs)
Overruns: 0
Read: 28146644 (by pages)
Entries: 74
Total: 28146718
Missed: 0
Hit: 28146718
Entries per millisec: 2832
353 ns per entry
Sleeping for 10 secs
Time: is the time the test ran
Overruns: the number of events that were overwritten and not read
Read: the number of events read (either by pages or events)
Entries: the number of entries left in the buffer
(the by pages will only read full pages)
Total: Entries + Read + Overruns
Missed: the number of entries that failed to write
Hit: the number of entries that were written
The above example shows that it takes ~353 nanosecs per entry when
there is a reader, reading by pages (and no overruns)
The event by event reader slowed the producer down to 501 nanosecs.
[ Impact: see how changes to the ring buffer affect stability and performance ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>