"z" format specifier is treated differently in unicode and bytes

@belm0

Hello up there. I've hit a discrepancy in how the "z" flag is handled by % in unicode and bytes:

For unicode, % rejects it as an "unsupported format character", in line with the original discussion in "string formatting: normalize negative zero" #90153 (= BPO-45995);
however, for bytes, % fully handles "z":

kirr@deca:~$ python3
Python 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> '%zf' % 1        <-- unicode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unsupported format character 'z' (0x7a) at index 1
>>> b'%zf' % 1       <-- bytes
b'1.000000'
>>> b'%zf' % 0.0     <-- +0 -> 0
b'0.000000'
>>> b'%zf' % -0.0    <-- -0 -> 0
b'0.000000'
>>> b'%f' % -0.0     <-- -0 -> -0 if run without 'z'
b'-0.000000'
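The same check can be run as a script, which makes it easier to compare interpreters. This is only a minimal sketch: `probe` is a hypothetical helper name introduced here, and the bytes outcome depends on the Python version being tested (the session above is 3.11.2).

```python
def probe(template, value):
    """Apply %-formatting and report either the result or the error text."""
    try:
        return ("ok", template % value)
    except ValueError as exc:
        return ("error", str(exc))


# Same template, once as str (unicode) and once as bytes.
str_outcome = probe("%zf", -0.0)
bytes_outcome = probe(b"%zf", -0.0)

print("str  :", str_outcome)
print("bytes:", bytes_outcome)
# True only if both types agree on accepting or rejecting 'z'.
print("consistent:", str_outcome[0] == bytes_outcome[0])
```

On an interpreter that shows the discrepancy, the last line prints `consistent: False`.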

In other words, '%' handles 'z' inconsistently between unicode and bytes, and the bytes behaviour is also inconsistent with what was originally discussed on BPO-45995, where 'z' was supposed to be handled by .format and not by '%'.
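For comparison, here is a small sketch of the behaviour that discussion intended: 'z' accepted by format() / .format, and '%' left without it. It assumes Python 3.11 or later for the 'z' flag; on older interpreters the format() call is expected to raise ValueError, which the sketch catches.

```python
import sys

value = -0.0

# %-formatting without 'z' keeps the sign of negative zero.
percent_result = "%.1f" % value

# format() with 'z' is expected to normalize negative zero on 3.11+;
# an interpreter without the flag raises ValueError instead.
try:
    format_result = format(value, "z.1f")
except ValueError:
    format_result = None

print(sys.version_info[:2], repr(percent_result), repr(format_result))
```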

'z' handling was implemented in #30049 and indeed there I see b'%z' being fully handled:

b0b836b20cb5#diff-f6d440aad34e1c4535c0d898c0197a95490766c745991caace6f64b5dd1ece51

but u'%z' being only partly handled internally without corresponding frontend parsing that bytes has:

b0b836b20cb5#diff-34c966e7876d6f8bf801dd51896327e4f68bba02cddb95fbf3963f0b2e39c38a

In my view the fix should be either a) to add '%z' handling to unicode, or b) to remove '%z' handling from bytes.

Thanks beforehand,
Kirill

CPython versions tested on: 3.11.2
Operating system and architecture: Debian GNU/Linux 12 on AMD64

/cc @belm0, @mdickinson

"z" format specifier is treated differently in unicode and bytes #104018
