builtins: Audit bytes arguments #7631

JelleZijlstra · 2022-04-16T05:20:52Z

As a followup from #7589 (comment),
I audited all occurrences of bytes in builtins.pyi by reading the corresponding C code
on CPython main.

Most use the C buffer protocol, so _typeshed.ReadableBuffer is the right type. A few
check specifically for bytes and bytearray.
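As a rough illustration of what a `ReadableBuffer` annotation means for calling code, the sketch below defines a hypothetical helper, `byte_length`, which is not part of typeshed or this PR. `_typeshed` exists only for type checkers, so the import is guarded by `TYPE_CHECKING` and the annotation is written as a string.

```python
import array
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # _typeshed is a stub-only package; it cannot be imported at runtime.
    from _typeshed import ReadableBuffer


def byte_length(data: "ReadableBuffer") -> int:
    """Hypothetical helper: size in bytes of any object exposing the buffer protocol."""
    return memoryview(data).nbytes


# bytes, bytearray, memoryview and array.array all implement the buffer protocol.
for obj in (b"abc", bytearray(b"abcd"), memoryview(b"ab"), array.array("B", [1, 2, 3])):
    print(type(obj).__name__, byte_length(obj))
```

Annotating the same parameter as `bytes` would make a type checker reject the `array.array` call, even though it works at runtime.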

JelleZijlstra · 2022-04-16T05:21:43Z

stdlib/builtins.pyi

- def __new__(cls: type[Self], __x: str | bytes | SupportsInt | SupportsIndex | SupportsTrunc = ...) -> Self: ...
+ def __new__(cls: type[Self], __x: str | ReadableBuffer | SupportsInt | SupportsIndex | SupportsTrunc = ...) -> Self: ...
 @overload
 def __new__(cls: type[Self], __x: str | bytes | bytearray, base: SupportsIndex) -> Self: ...


>>> int(memoryview(b"0xdeadbeef"), 16)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int() can't convert non-string with explicit base
>>> int(memoryview(b"123"))
123

Showing that the first overload accepts buffers but the second doesn't.
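The difference between the two overloads can be checked with a small script. `try_int` is a hypothetical wrapper introduced only for this sketch; it returns either the converted value or the `TypeError` message.

```python
def try_int(value, *args):
    """Return int(value, *args), or the TypeError message if the call is rejected."""
    try:
        return int(value, *args)
    except TypeError as exc:
        return f"TypeError: {exc}"


# Single-argument form: any buffer is accepted.
print(try_int(memoryview(b"123")))

# Explicit base: bytes and bytearray are accepted, memoryview is not.
print(try_int(b"ff", 16))
print(try_int(bytearray(b"ff"), 16))
print(try_int(memoryview(b"ff"), 16))
```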

https://github.com/bradfitz/deadbeef

JelleZijlstra · 2022-04-16T05:22:33Z

stdlib/builtins.pyi

 def from_bytes(
 cls: type[Self],
- bytes: Iterable[SupportsIndex] | SupportsBytes, # TODO buffer object argument
+ bytes: Iterable[SupportsIndex] | SupportsBytes | ReadableBuffer,


>>> int.from_bytes([1, 2, 3])
66051
>>> int.from_bytes(memoryview(b"123"))
3224115
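A script version of the same check is sketched below. It passes `byteorder="big"` explicitly, because the default shown in the session above only exists on newer Python versions (3.11 and later). The `array.array` case is an extra example not taken from the PR.

```python
import array

# An iterable of ints and several buffer objects all work as the first argument.
from_list = int.from_bytes([1, 2, 3], "big")
from_view = int.from_bytes(memoryview(b"123"), "big")
from_bytearray = int.from_bytes(bytearray(b"123"), "big")
from_array = int.from_bytes(array.array("B", [1, 2, 3]), "big")

print(from_list, from_view, from_bytearray, from_array)
```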

JelleZijlstra · 2022-04-16T05:23:43Z

stdlib/builtins.pyi

+ self, __sub: ReadableBuffer | SupportsIndex, __start: SupportsIndex | None = ..., __end: SupportsIndex | None = ...
 ) -> int: ...
 if sys.version_info >= (3, 8):
 def hex(self, sep: str | bytes = ..., bytes_per_sep: SupportsIndex = ...) -> str: ...


>>> b"xy".hex(memoryview(b"x"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sep must be str or bytes.
>>> b"xy".hex(bytearray(b"x"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sep must be str or bytes.

JelleZijlstra · 2022-04-16T05:25:26Z

stdlib/builtins.pyi

 @overload
 def __getitem__(self, __s: slice) -> bytes: ...
- def __add__(self, __s: bytes) -> bytes: ...
+ def __add__(self, __s: ReadableBuffer) -> bytes: ...


>>> b"x" + memoryview(b"y")
b'xy'
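The same behavior can be checked for a few right-hand operands. `concat` is a hypothetical helper used only in this sketch, and the `bytearray` cases are additional examples beyond the one shown in the review comment.

```python
def concat(left, right):
    """Hypothetical helper: plain `+`, so the left operand's type handles the concatenation."""
    return left + right


# bytes.__add__ accepts other buffer objects and returns bytes.
print(concat(b"x", memoryview(b"y")))
print(concat(b"x", bytearray(b"y")))

# bytearray.__add__ behaves the same way but returns a bytearray.
print(concat(bytearray(b"x"), memoryview(b"y")))
```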

JelleZijlstra · 2022-04-16T05:26:23Z

stdlib/builtins.pyi

 def __setitem__(self, __s: slice, __x: Iterable[SupportsIndex] | bytes) -> None: ...
 def __delitem__(self, __i: SupportsIndex | slice) -> None: ...
- def __add__(self, __s: bytes) -> bytearray: ...
- def __iadd__(self: Self, __s: Iterable[int]) -> Self: ...


This was wrong; ba += [1, 2, 3] fails
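A short sketch of the failure and of the forms that do work. `iadd` is a hypothetical wrapper added for illustration; `bytearray.extend` is the method that accepts an iterable of integers.

```python
def iadd(ba, other):
    """Apply `ba += other`; return the result or the TypeError message."""
    try:
        ba += other
    except TypeError as exc:
        return f"TypeError: {exc}"
    return ba


# In-place addition needs a buffer object; a list of ints is rejected.
print(iadd(bytearray(b"x"), [1, 2, 3]))
print(iadd(bytearray(b"x"), b"yz"))
print(iadd(bytearray(b"x"), memoryview(b"yz")))

# extend() is the way to append an iterable of integers.
extended = bytearray(b"x")
extended.extend([1, 2, 3])
print(extended)
```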

JelleZijlstra · 2022-04-16T05:26:59Z

stdlib/builtins.pyi

 opener: _Opener | None = ...,
 ) -> IO[Any]: ...
-def ord(__c: str | bytes) -> int: ...
+def ord(__c: str | bytes | bytearray) -> int: ...


>>> ord(memoryview(b"x"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: ord() expected string of length 1, but memoryview found
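For comparison, the accepted and rejected argument types can be exercised side by side. `try_ord` is a hypothetical wrapper introduced for this sketch.

```python
def try_ord(value):
    """Return ord(value), or the TypeError message if the argument type is rejected."""
    try:
        return ord(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


# str, bytes and bytearray of length 1 are accepted.
print(try_ord("x"))
print(try_ord(b"x"))
print(try_ord(bytearray(b"x")))

# Other buffer objects such as memoryview are not.
print(try_ord(memoryview(b"x")))
```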

github-actions · 2022-04-16T05:36:47Z

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

srittau

Thanks, I didn't double check, but the changes look reasonable.

srittau · 2022-04-16T12:13:21Z

stdlib/builtins.pyi

- def join(self, __iterable_of_bytes: Iterable[ByteString | memoryview]) -> bytes: ...
- def ljust(self, __width: SupportsIndex, __fillchar: bytes = ...) -> bytes: ...
+ def join(self, __iterable_of_bytes: Iterable[ReadableBuffer]) -> bytes: ...
+ def ljust(self, __width: SupportsIndex, __fillchar: bytes | bytearray = ...) -> bytes: ...


Unfortunately, this will also accept memoryview at the moment, but having it more explicit can't hurt.

That's a mypy bug :)

It's working as documented. In the past when reviewing I've always asked people to remove bytearray from argument types due to that.

Update stdlib/builtins.pyi

446691f

type ignore

a7b6ab8

srittau approved these changes Apr 16, 2022

srittau merged commit ee09d9e into python:master Apr 16, 2022

JelleZijlstra deleted the bytes branch April 16, 2022 13:57
