Benchmarking Ruby Code
Introduction
Before you can improve the performance of your Ruby code, you need to measure it. Guessing which approach is faster often leads you astray — your intuition about what’s slow rarely matches reality. The standard library’s Benchmark module gives you precise, repeatable measurements so you can make decisions based on data.
This guide covers the Benchmark module’s core methods, when to use each one, and how to avoid common pitfalls that make measurements unreliable.
Getting Started with Benchmark.measure
The simplest way to time a block of code is with Benchmark.measure. It returns a Benchmark::Tms object with a breakdown of user CPU time, system CPU time, and wall-clock elapsed time.
require 'benchmark'
time = Benchmark.measure do
  1_000_000.times { a = "1" }
end
puts time
# => 0.220000 0.000000 0.220000 ( 0.227313)
The output shows user time, system time, total CPU time, and elapsed real time (in parentheses). For most purposes, real is the most useful figure — it tells you how long the operation took from start to finish.
You can also access individual fields:
require 'benchmark'
n = 1_000_000
time = Benchmark.measure do
  n.times { |i| i.to_s }
end
puts "User: #{time.utime}"
# => User: 0.210000
puts "Real: #{time.real}"
# => Real: 0.215312
Benchmark.measure is best for timing a single piece of code. When you need to compare multiple approaches, use Benchmark.bm.
Comparing Code with Benchmark.bm
Benchmark.bm runs multiple code blocks and prints a side-by-side comparison table. Pass an integer to set a minimum label width so columns line up neatly.
require 'benchmark'
n = 5_000_000
Benchmark.bm(7) do |x|
  x.report("for:")   { for i in 1..n; a = "1"; end }
  x.report("times:") { n.times { a = "1" } }
  x.report("upto:")  { 1.upto(n) { a = "1" } }
end
Output:
user system total real
for: 1.010000 0.000000 1.010000 ( 1.015688)
times: 1.000000 0.000000 1.000000 ( 1.003611)
upto: 1.030000 0.000000 1.030000 ( 1.028098)
Each x.report call times one block and prints a row. The real column is what you care about most — it shows wall-clock time and is most representative of what your users experience.
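Benchmark.bm also returns the Benchmark::Tms objects it reported as an array, so you can inspect the timings programmatically after the table prints. A small sketch:

```ruby
require 'benchmark'

n = 100_000
# Benchmark.bm returns an array of the Benchmark::Tms objects it reported,
# so the timings can also be examined in code after printing the table.
results = Benchmark.bm(7) do |x|
  x.report("times:") { n.times { |i| i.to_s } }
  x.report("upto:")  { 1.upto(n) { |i| i.to_s } }
end

fastest = results.min_by(&:real)
puts "fastest real time: #{fastest.real.round(4)}s"
```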
Reducing GC Noise with Benchmark.bmbm
One problem with Benchmark.bm is that garbage collection runs unpredictably. If one benchmark block creates many objects, it can trigger GC during the next block — making that block look slower than it actually is.
Benchmark.bmbm addresses this by running each block twice. First a “rehearsal” pass warms up memory, then a second pass calls GC.start before each block to minimise allocation-related variance.
require 'benchmark'
array = (1..1_000_000).map { rand }
Benchmark.bmbm(7) do |x|
  x.report("sort!:") { array.dup.sort! }
  x.report("sort:")  { array.dup.sort }
end
Output:
Rehearsal -----------------------------------------
sort!: 1.440000 0.010000 1.450000 ( 1.446833)
sort: 1.440000 0.000000 1.440000 ( 1.448257)
--------------------------------- total: 2.890000sec
user system total real
sort!: 1.460000 0.000000 1.460000 ( 1.458065)
sort: 1.450000 0.010000 1.450000 ( 1.455963)
The rehearsal pass warms up memory and fills caches; its numbers are printed but should be ignored. The second pass is the real measurement, with GC.start called before each block.
Keep in mind that bmbm only handles GC variance. It cannot protect against JIT compilation effects, OS scheduler preemption, or CPU frequency scaling.
Measuring Simple Elapsed Time
If you only need the wall-clock time and nothing else, Benchmark.realtime returns a plain Float:
require 'benchmark'
elapsed = Benchmark.realtime do
  sleep 0.1
end
puts elapsed.round(2)
# => 0.1
This is useful when you want a quick measurement without dealing with Benchmark::Tms objects.
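Because Benchmark.realtime returns a plain Float, it composes well into small conveniences. The timed method below is a hypothetical helper, not part of the standard library:

```ruby
require 'benchmark'

# Hypothetical helper (not part of the standard library): run a block,
# print a labelled wall-clock time, and return the elapsed seconds.
def timed(label)
  elapsed = Benchmark.realtime { yield }
  puts format('%s: %.4fs', label, elapsed)
  elapsed
end

timed('parse') { 10_000.times { Integer('42') } }
```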
Controlling Garbage Collection Manually
For very tight microbenchmarks where even bmbm’s GC handling is insufficient, you can disable GC entirely during the measurement:
require 'benchmark'
GC.disable
begin
  result = Benchmark.bm do |x|
    x.report("fast") { 1_000_000.times { 1 + 2 } }
  end
ensure
  GC.enable
end
Always re-enable GC in an ensure block so GC.enable runs even if your benchmark raises an exception. Leaving GC disabled for too long lets memory accumulate and can make your system unresponsive.
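To confirm that GC really stayed quiet during a measurement, you can compare GC.stat(:count), the total number of GC runs in the process, before and after the timed region. A sketch:

```ruby
require 'benchmark'

GC.disable
begin
  gc_runs_before = GC.stat(:count) # total GC runs so far in this process
  time = Benchmark.measure { 500_000.times { 1 + 2 } }
  gc_runs_after = GC.stat(:count)
  # With GC disabled, this difference should be zero.
  puts "GC ran #{gc_runs_after - gc_runs_before} times during measurement"
ensure
  GC.enable
end
```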
For most benchmarks, a better approach is to call GC.start once before the entire benchmark run rather than disabling GC entirely:
GC.start # prime GC before measuring
Benchmark.bm(7) do |x|
  x.report("approach A") { ... }
  x.report("approach B") { ... }
end
Iterations Per Second with benchmark-ips
Ruby’s standard Benchmark module measures absolute time. For very fast code, iterations per second (IPS) is often more intuitive. The benchmark-ips gem measures how many times your code can run per second.
Install it with:
gem install benchmark-ips
require 'benchmark/ips'
Benchmark.ips do |x|
  x.config(warmup: 2, time: 5)
  x.report("addition") { 1 + 2 }
  x.report("multiply") { 1 * 2 }
  x.compare!
end
Output:
Warming up --------------------------------------
addition 3.572M i/100ms
multiply 3.581M i/100ms
Calculating -------------------------------------
addition 36.209M (± 2.8%) i/s (27.62 ns/i)
multiply 36.215M (± 2.1%) i/s (27.61 ns/i)
Comparison:
addition: 36209044.5 i/s
multiply: 36215173.7 i/s
x.config(warmup: 2, time: 5) sets 2 seconds of warmup and 5 seconds of measurement. These happen to be benchmark-ips's defaults, so the call is shown here only to make the settings explicit. x.compare! prints a comparison table at the end.
High standard deviation (above 10%) means your results are noisy and you should run the benchmark again with longer warmup or measurement time.
Warm-up Effects and JIT
Ruby 3.x ships with YJIT, a just-in-time compiler that optimises code after it has run a few times (MJIT was removed in Ruby 3.3, and YJIT must be enabled explicitly with --yjit). With a JIT enabled, the first run is slower than subsequent runs, so always warm up with a few iterations before taking measurements:
require 'benchmark'
# warm up JIT
3.times { 1_000_000.times { 1 + 2 } }
# now measure
time = Benchmark.measure do
  1_000_000.times { 1 + 2 }
end
puts time.real
Run your benchmark multiple times across separate Ruby invocations and compare the results. JIT state persists within a process, so one slow run might be a cold-start effect rather than a genuine performance difference.
Common Pitfalls
Single-run measurements are noisy. Always run multiple iterations. If one run shows “A is 20% faster than B” but the next run shows the opposite, increase the iteration count or run time.
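One way to tame that noise is to repeat the measurement and take the median, which is robust to a single slow outlier. The median_realtime helper below is hypothetical:

```ruby
require 'benchmark'

# Hypothetical helper: measure a block `runs` times and return the
# median wall-clock time; the median is less sensitive to one-off
# outliers (GC pauses, scheduler hiccups) than a single run or the mean.
def median_realtime(runs: 5)
  times = Array.new(runs) { Benchmark.realtime { yield } }
  times.sort[runs / 2]
end

puts median_realtime(runs: 5) { 50_000.times { |i| i.to_s } }
```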
Loop overhead can dominate. If you wrap a tiny operation in a loop, the loop itself becomes the dominant cost. Use Benchmark.measure for single executions or let benchmark-ips handle iteration counting automatically.
Real time vs CPU time. If your code waits on I/O or calls sleep, the real time will exceed user + system. Always check real first.
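A sleep makes the distinction concrete: the process burns almost no CPU while waiting, so user + system stays near zero while real covers the full wait:

```ruby
require 'benchmark'

time = Benchmark.measure { sleep 0.2 }
# Sleeping consumes essentially no CPU, so CPU time stays near zero
# while real time covers the full 0.2-second wait.
puts format('cpu: %.3fs  real: %.3fs', time.utime + time.stime, time.real)
```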
Comparing across processes. Ruby’s GC and JIT state persist within a process. To compare two different implementations fairly, run them in separate Ruby invocations.
See Also
- Array Methods — Benchmark results often involve sorting or searching arrays
- Enumerable — Collection iteration patterns frequently compared in benchmarks
- Hash Methods — Hash lookup and manipulation benchmarks