Bug #16243
closedcase/when is slower than if on MRI
Description
This comes from a discussion on a PR on Sidekiq: https://github.com/mperham/sidekiq/pull/4303
This benchmark:
# frozen_string_literal: true require "benchmark/ips" def deep_dup_case(obj) case obj when Integer, Float, TrueClass, FalseClass, NilClass obj when String obj.dup when Array obj.map { |e| deep_dup_case(e) } when Hash duped = obj.dup duped.each_pair do |key, value| duped[key] = deep_dup_case(value) end else obj.dup end end def deep_dup_if(obj) if Integer === obj || Float === obj || TrueClass === obj || FalseClass === obj || NilClass === obj obj elsif String === obj obj.dup elsif Array === obj obj.map { |e| deep_dup_if(e) } elsif Hash === obj duped = obj.dup duped.each_pair do |key, value| duped[key] = deep_dup_if(value) end duped else obj.dup end end obj = { "class" => "FooWorker", "args" => [1, 2, 3, "foobar"], "jid" => "123987123" } Benchmark.ips do |x| x.report("deep_dup_case") do deep_dup_case(obj) end x.report("deep_dup_if") do deep_dup_if(obj) end x.compare! end gives in MRI 2.6.5:
Warming up -------------------------------------- deep_dup_case 37.767k i/100ms deep_dup_if 41.802k i/100ms Calculating ------------------------------------- deep_dup_case 408.046k (± 0.9%) i/s - 2.077M in 5.090997s deep_dup_if 456.657k (± 0.9%) i/s - 2.299M in 5.035040s Comparison: deep_dup_if: 456657.4 i/s deep_dup_case: 408046.1 i/s - 1.12x slower It seems @tenderlovemaking (Aaron Patterson) already noticed this a while ago due to missing inline caches for case, but the performance bug seems to still remain:
https://youtu.be/b77V0rkr5rk?t=1442
Do you think this could be fixed in MRI?
AFAIK both JRuby and TruffleRuby support inline cached === calls in case/when.
I think the code with case is more idiomatic and readable and should be preferred to if, and this performance issue makes that less clear.
Files
Updated by Eregon (Benoit Daloze) about 6 years ago
- Description updated (diff)
Updated by nobu (Nobuyoshi Nakada) about 6 years ago
Interesting.
I tried a couple approaches, but couldn't get significant results.
-
checkmatchwith call cache
https://github.com/nobu/ruby/pull/new/feature/checkmatch-cache - replace simple
checkmatchwith:===call
https://github.com/nobu/ruby/pull/new/feature/checkmatch-to-send
Updated by Eregon (Benoit Daloze) about 6 years ago
- File case_vs_if_sum.rb case_vs_if_sum.rb added
The original benchmark allocates quite a bit, if we change it to just sum elements, the difference is around 20%:
Warming up -------------------------------------- deep_dup_case 44.804k i/100ms deep_dup_if 53.648k i/100ms Calculating ------------------------------------- deep_dup_case 487.684k (± 0.6%) i/s - 2.464M in 5.053100s deep_dup_if 584.710k (± 0.6%) i/s - 2.951M in 5.046530s Comparison: deep_dup_if: 584709.9 i/s deep_dup_case: 487684.5 i/s - 1.20x slower @nobu (Nobuyoshi Nakada) thanks for the experiments, do you have an idea why case is slower than if?
Updated by jeremyevans0 (Jeremy Evans) over 4 years ago
- Status changed from Open to Closed
5ada23ac1265a1da5d7ef82e1c71f14c40dddc26 changed case statements to use a === with a call cache, instead of using the checkmatch VM instruction. In my benchmarking on master, I'm only seeing about 1-3% difference with the revised benchmark, as opposed to the much larger difference on ruby 3.0. So I think this performance difference has been addressed.