Description
Hello up there. I've hit a discrepancy in how the 'z' flag is handled by '%' in unicode and bytes:
- for unicode, '%' rejects it as "unsupported format character", according to the original discussion in "string formatting: normalize negative zero" #90153 (= BPO-45995);
- however, for bytes, '%' fully handles "z":
kirr@deca:~$ python3
Python 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> '%zf' % 1           <-- unicode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unsupported format character 'z' (0x7a) at index 1
>>> b'%zf' % 1          <-- bytes
b'1.000000'
>>> b'%zf' % 0.0        <-- +0 -> 0
b'0.000000'
>>> b'%zf' % -0.0       <-- -0 -> 0
b'0.000000'
>>> b'%f' % -0.0        <-- -0 -> -0 if run without 'z'
b'-0.000000'
In other words, '%' handles 'z' inconsistently between unicode and bytes, and the bytes behaviour is also inconsistent with BPO-45995, where 'z' was supposed to be handled by .format and not by '%'.
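To check this on another interpreter, a small probe along the following lines may help. It is only a sketch: the helper names try_percent and try_format are made up for illustration, and the outcome of the bytes '%zf' case is expected to depend on the Python version (the linked PRs suggest later releases reject it), so the script reports results instead of assuming them.

```python
def try_percent(template, value):
    """Apply %-formatting; return ("ok", result) or ("error", message)."""
    try:
        return ("ok", template % value)
    except ValueError as exc:
        return ("error", str(exc))


def try_format(spec, value):
    """Apply format(); return ("ok", result) or ("error", message)."""
    try:
        return ("ok", format(value, spec))
    except ValueError as exc:
        # Interpreters older than 3.11 do not know the 'z' option at all.
        return ("error", str(exc))


results = {
    "str %zf": try_percent("%zf", -0.0),      # unicode %-format with 'z'
    "bytes %zf": try_percent(b"%zf", -0.0),   # bytes %-format with 'z'
    "bytes %f": try_percent(b"%f", -0.0),     # bytes %-format without 'z'
    "format z.6f": try_format("z.6f", -0.0),  # format() with 'z'
}

for label, outcome in results.items():
    print(f"{label:12} -> {outcome}")
```

If the "str %zf" and "bytes %zf" rows disagree (one "error", one "ok"), the interpreter shows the inconsistency described above.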
'z' handling was implemented in #30049 and indeed there I see b'%z' being fully handled:
b0b836b20cb5#diff-f6d440aad34e1c4535c0d898c0197a95490766c745991caace6f64b5dd1ece51
but u'%z' being only partly handled internally, without the corresponding frontend parsing that bytes has:
b0b836b20cb5#diff-34c966e7876d6f8bf801dd51896327e4f68bba02cddb95fbf3963f0b2e39c38a
In my view the fix should be either a) to add '%z' handling to unicode, or b) to remove '%z' handling from bytes.
Thanks beforehand,
Kirill
- CPython versions tested on: 3.11.2
- Operating system and architecture: Debian GNU/Linux 12 on AMD64
/cc @belm0, @mdickinson
Linked PRs
- gh-104018: disallow "z" format specifier in %-format of byte strings #104033
- [3.11] gh-104018: disallow "z" format specifier in %-format of byte strings (GH-104033) #104058
- gh-104018: remove unused format "z" handling in string formatfloat() #104107
- [3.11] gh-104018: remove unused format "z" handling in string formatfloat() (GH-104107) #104260