Bug #20807
open
String#gsub fails when called from string subclass with a block passed
Description
When String#gsub is called via a String subclass with a block, Regexp.last_match is nil, but the passed block is still executed. Here is example code:
    def call_gsub(str)
      str.gsub(/%/) do
        puts "checking #{str.class}"
        puts "Special variable value: #{$&}"
        puts "Regexp.last_match = #{Regexp.last_match.inspect}\n\n"

        raise "Special variable $& is not assigned, but block is called" if $&.nil?
      end
    end

    class MyString < String
      def gsub(*args, &block)
        super(*args, &block) # just forward everything
      end
    end

    text = 'test%text_with_special_character'

    call_gsub(String.new(text)) # original string
    call_gsub(MyString.new(text)) # string subclass

Result:
    checking String
    Special variable value: %
    Regexp.last_match = #<MatchData "%">

    checking MyString
    Special variable value: 
    Regexp.last_match = nil

    gsub_bug.rb:7:in `block in call_gsub': Special variable $& is not assigned, but block is called (RuntimeError)
    	from gsub_bug.rb:13:in `gsub'
    	from gsub_bug.rb:13:in `gsub'
    	from gsub_bug.rb:2:in `call_gsub'
    	from gsub_bug.rb:20:in `<main>'

I expect the result to be the same for both classes, since MyString just wraps the same method:
    checking String
    Special variable value: %
    Regexp.last_match = #<MatchData "%">

    checking MyString
    Special variable value: %
    Regexp.last_match = #<MatchData "%">

Maybe there is something off with the control frame when params are forwarded?
Thanks in advance!
Updated by Dan0042 (Daniel DeLorme) about 1 year ago
Regexp.last_match and the other regexp-related pseudo-globals ($~, $& and so on) are frame-local: they do not work across more than one stack frame. Since you override #gsub, they are set only inside MyString#gsub, not in call_gsub, where your block is evaluated.
You can confirm with this:
    def test(klass)
      p klass
      klass.new("test").gsub(/s/,'x')
      p result: $~
    end

    class MyString1 < String
    end
    test(MyString1)
    #prints:
    #{:result=>#<MatchData "s">}

    class MyString2 < String
      def gsub(...)
        super
      ensure
        p ensure: $~
      end
    end
    test(MyString2)
    #prints:
    #{:ensure=>#<MatchData "s">}
    #{:result=>nil}

It would be possible to fix this by propagating Regexp.last_match up every "super" stack frame until we reach the originating non-super frame. It would allow some interesting use cases (like logging the time spent in every Regexp#match). But it's a lot of work for a very niche use.
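Until something like that exists, a possible workaround is to stop relying on $~ in the caller's block and hand the match over explicitly. The sketch below is only an illustration under assumptions: ForwardingString and replace_percent are hypothetical names, and passing the MatchData as a second block argument is an invented convention, not something String#gsub does by itself. It relies on the behaviour shown above, namely that $~ is set inside the overriding method's own frame.

```ruby
# Hypothetical subclass for illustration; not part of the original report.
class ForwardingString < String
  def gsub(*args, &block)
    return super unless block

    # This inner block is created in this method's frame, so $~ is visible
    # here. Forward it as a second block argument (an assumed convention).
    super(*args) { |matched| block.call(matched, $~) }
  end
end

# Hypothetical helper, standing in for call_gsub from the report.
def replace_percent(str)
  seen = []
  result = str.gsub(/%/) do |matched, match_data|
    # `matched` is always yielded by String#gsub. `match_data` is supplied
    # only by ForwardingString, so fall back to $~ for a plain String.
    md = match_data || $~
    seen << [matched, md && md[0]]
    "<pct>"
  end
  [result, seen]
end

text = 'test%text_with_special_character'
p replace_percent(String.new(text))
p replace_percent(ForwardingString.new(text))
```

With this arrangement both calls should report the same match. A simpler variant, if only the matched text is needed, is to use the block's first argument instead of $& and leave the subclass untouched.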
Updated by jeremyevans0 (Jeremy Evans) 12 months ago
- Related to Bug #8444: Regexp vars $~ and friends are not thread local added
- Related to Bug #12689: Thread isolation of $~ and $_ added
- Related to Bug #14364: Regexp last match variable in procs added
Updated by jeremyevans0 (Jeremy Evans) 12 months ago
- Related to Bug #11808: Different behavior between Enumerable#grep and Array#grep added