When computing the anchors on the traceback, results may be wrong if unicode chars are used

Bug report

Consider the code below:

    d = {
        "ó": {
            "á": {
                "í": {
                    "theta": 1
                }
            }
        }
    }

    try:
        result = d["ó"]["á"]["í"]["beta"]
    except:
        import traceback;traceback.print_exc()

The output provided is:

    Traceback (most recent call last):
      File "W:\pydev.debugger\check\snippet2.py", line 12, in <module>
        result = d["ó"]["á"]["í"]["beta"]
                 ~~~~~~~~~~~~~~~~~~~^^^^^^^^
    KeyError: 'beta'

Notice that for each non-ASCII character on the line, one extra `~` is added, so the underline runs past the end of the expression it is meant to mark.

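This seems to happen because, when computing the anchors in `traceback._extract_caret_anchors_from_line_segment`, the column offsets of the AST nodes generated by `ast.parse` are UTF-8 byte offsets rather than character offsets. Each of "ó", "á" and "í" occupies two bytes in UTF-8, which accounts for the three extra `~` characters.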
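The mismatch can be reproduced outside the traceback machinery. The sketch below parses the failing line with `ast.parse` and compares the width of the sub-expression in bytes and in characters. `byte_to_char_offset` is a hypothetical helper written for this illustration, not a function from the standard library, and the non-ASCII characters are written as escapes only so the snippet does not depend on the encoding of the file it is pasted into.

```python
import ast

# The failing line from the report: d["ó"]["á"]["í"]["beta"]
line = 'result = d["\u00f3"]["\u00e1"]["\u00ed"]["beta"]'

# The right-hand side is the outermost subscript, (...)["beta"];
# its .value is the part that gets underlined with '~'.
subscript = ast.parse(line).body[0].value
left = subscript.value

# ast reports col_offset and end_col_offset as UTF-8 byte offsets.
byte_start = left.col_offset
byte_end = left.end_col_offset


def byte_to_char_offset(text: str, byte_offset: int) -> int:
    """Hypothetical helper: convert a UTF-8 byte offset into a character offset."""
    prefix = text.encode("utf-8")[:byte_offset]
    return len(prefix.decode("utf-8", errors="replace"))


char_start = byte_to_char_offset(line, byte_start)
char_end = byte_to_char_offset(line, byte_end)

print("width in bytes:     ", byte_end - byte_start)   # 19
print("width in characters:", char_end - char_start)   # 16
```

The byte width of 19 matches the number of `~` characters in the output above, while the expression is only 16 characters wide. Converting the offsets to character positions before drawing the underline is one way to remove the discrepancy.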

Your environment

CPython versions tested on: 3.11.0
Operating system and architecture: Windows 10

PR: gh-99103: Normalize specialized traceback anchors against the current line #99145

PR: [3.11] gh-99103: Normalize specialized traceback anchors against the current line #99423
