This is really a corner case, but I ran across the problem today. The Unicode data file for East Asian widths states:
# - All code points, assigned or unassigned, that are not listed
#   explicitly are given the value "N".
However, that does not seem to hold in the unicodedata module, e.g.:
$ python3
Python 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
>>> char = chr(0xfe75)  # arbitrary unassigned code point
>>> unicodedata.name(char)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: no such name
>>> unicodedata.east_asian_width(char)
'F'
I'd be happy to fix this, if people agree that it should be fixed. FWIW, PyPy has always returned 'N' in this situation. For assigned code points everything is fine.
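To get a feel for how many code points are affected, a rough scan along these lines might help. This is only a sketch: the helper name `suspicious_widths` is made up for illustration, and it assumes that general category "Cn" is a good enough proxy for "unassigned". The data file, as far as I can tell, also gives some unassigned CJK ranges a default of "W", so hits in those ranges would not necessarily be bugs; the example therefore only scans the block around U+FE75.

```python
import unicodedata


def suspicious_widths(start, stop):
    """Return (code point, width) pairs for code points in range(start, stop)
    that are unassigned (category "Cn") but report a width other than "N".

    Hypothetical helper for illustration; not part of unicodedata.
    """
    hits = []
    for cp in range(start, stop):
        ch = chr(cp)
        if unicodedata.category(ch) == "Cn":
            width = unicodedata.east_asian_width(ch)
            if width != "N":
                hits.append((cp, width))
    return hits


# Scan the block that contains U+FE75 from the session above.
for cp, width in suspicious_widths(0xFE70, 0xFF00):
    print(f"U+{cp:04X} -> {width!r}")
```

On an interpreter without the problem the loop should print nothing; what it prints otherwise depends on the Python version and its bundled Unicode database.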