3.11 regression: traceback.format_list raises UnicodeDecodeError in certain scenarios #98744

@sebastinas

Bug report

Take the following piece of code:

import sys, traceback
try:
    width
except:
    _, _, tb = sys.exc_info()
    tblist = traceback.extract_tb(tb)
    print(traceback.format_list(tblist))

With Python 3.10 and earlier versions, executing a file with this code produces:

['  File "/tmp/test.py", line 3, in <module>\n    width\n']

With Python 3.11, a UnicodeDecodeError is raised instead:

Traceback (most recent call last):
  File "/tmp/test.py", line 9, in <module>
    print(traceback.format_list(tblist))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/traceback.py", line 41, in format_list
    return StackSummary.from_list(extracted_list).format()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/traceback.py", line 531, in format
    formatted_frame = self.format_frame_summary(frame_summary)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/traceback.py", line 478, in format_frame_summary
    colno = _byte_offset_to_character_offset(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/traceback.py", line 566, in _byte_offset_to_character_offset
    return len(as_utf8[:offset + 1].decode("utf-8"))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xef in position 4: unexpected end of data
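The last frame of that traceback suggests what goes wrong: a byte offset is used to slice the UTF-8 encoded source line, and the slice ends one byte into a multi-byte character. The sketch below reproduces only that expression, outside of the traceback module. It assumes the offending source line is indented by four spaces and that the name is written in full-width letters; that is an assumption on my side, chosen because such characters start with byte 0xef in UTF-8. The helpers `strict_offset` and `tolerant_offset` are hypothetical names for illustration, and the tolerant variant is just one possible way to avoid the error, not necessarily what CPython should do.

```python
# Assumed source line: four spaces, then "width" in full-width letters
# (U+FF57 U+FF49 U+FF44 U+FF54 U+FF48). Each of them takes 3 bytes in UTF-8.
line = "    \uff57\uff49\uff44\uff54\uff48"
as_utf8 = line.encode("utf-8")
offset = 4  # byte offset of the first character after the indentation


def strict_offset(data, offset):
    # Same expression as in the last traceback frame: the slice keeps
    # only the first byte of a 3-byte character, so decoding fails.
    return len(data[:offset + 1].decode("utf-8"))


def tolerant_offset(data, offset):
    # Hypothetical alternative: slice exactly at the byte offset and
    # replace undecodable bytes instead of raising.
    return len(data[:offset].decode("utf-8", errors="replace"))


try:
    strict_offset(as_utf8, offset)
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)
print(tolerant_offset(as_utf8, offset))
```

With these assumed inputs the strict variant raises the same UnicodeDecodeError as above (byte 0xef at position 4), while the tolerant variant returns 4.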

Your environment

  • CPython versions tested on: 3.10.8 and 3.11.0
  • Operating system and architecture: Debian unstable, amd64
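As a stopgap, the 3.10-style output can be rebuilt by hand from the public attributes of the `FrameSummary` entries, which avoids the column computation entirely. This is only a sketch: `format_list_plain` is a hypothetical helper, not part of the traceback module, and it drops the caret markers that 3.11 would normally add.

```python
import traceback


def format_list_plain(frames):
    """Format FrameSummary entries without any column handling."""
    formatted = []
    for frame in frames:
        entry = f'  File "{frame.filename}", line {frame.lineno}, in {frame.name}\n'
        if frame.line:
            # frame.line is the source text of the line, if it is available
            entry += f"    {frame.line.strip()}\n"
        formatted.append(entry)
    return formatted


# Usage with a hand-built entry; in the script above, tblist would be passed instead.
example = traceback.FrameSummary("/tmp/test.py", 3, "<module>", line="width")
print(format_list_plain([example]))
```

For the hand-built entry this prints the same list as the Python 3.10 output shown above.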

Metadata

Labels: 3.11 (only security fixes), 3.12 (only security fixes), topic-unicode, type-bug (An unexpected behavior, bug, or error)
