Bug #16842

closed

`inspect` prints the UTF-8 character U+0085 (NEXT LINE) verbatim even though it is not printable

Added by sawa (Tsuyoshi Sawada) over 5 years ago. Updated over 3 years ago.

Status:
Closed
Target version:
-
ruby -v:
ruby 2.8.0dev (2020-05-09T13:24:57Z master 889b0fe46f) [x86_64-linux]
[ruby-core:98231]

Description

The UTF-8 character U+0085 (NEXT LINE) is not printable, but `inspect` prints the character verbatim (within double quotation marks):

0x85.chr(Encoding::UTF_8).match?(/\p{print}/) # => false
0x85.chr(Encoding::UTF_8).inspect # => "\" \"" (U+0085 itself appears between the escaped quotes)

My understanding is that non-printable characters are not printed verbatim with `inspect`:

"\n".match?(/\p{print}/) # => false
"\n".inspect # => "\"\\n\""

while printable characters are:

"a".match?(/\p{print}/) # => true
"a".inspect # => "\"a\""

I ran the following script, and found that U+0085 is the only character within the range U+0000 to U+FFFF that behaves like this.

def verbatim?(char)
  !char.inspect.start_with?(%r{\"\\[a-z]})
end

def printable?(char)
  char.match?(/\p{print}/)
end

(0x0000..0xffff).each do |i|
  begin
    char = i.chr(Encoding::UTF_8)
  rescue RangeError
    next
  end
  puts '%#x' % i unless verbatim?(char) == printable?(char)
end
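
For a quick side-by-side view of the three spot checks in this report, something like the following sketch may help. The method name `inspect_report` and the `samples` array are hypothetical and not part of the original report. It only uses `String#match?`, `String#inspect`, `String#ord` and `Integer#chr`. The `inspected` value for U+0085 is deliberately not annotated, since it depends on whether the Ruby in use shows the behavior described here.

```ruby
# Hypothetical helper (an illustration, not code from the report):
# for each single-character string, record its code point, whether it
# matches \p{print}, and what inspect returns for it.
def inspect_report(chars)
  chars.map do |char|
    {
      codepoint: format('U+%04X', char.ord),
      printable: char.match?(/\p{print}/),
      inspected: char.inspect
    }
  end
end

# The three characters discussed in the report.
samples = ["a", "\n", 0x85.chr(Encoding::UTF_8)]

inspect_report(samples).each { |row| p row }
```

A row where `printable` is false but `inspected` still contains the raw character is the mismatch described in this report.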