Closed
Labels
Arrow (pyarrow functionality), Bug, Strings (String extension data type and string data)
Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
from re import sub

teststr = "var.one[0]"
print(sub(r'\[(\d+)\]',r'(\1)',teststr))

s = pd.Series(["var.one[0]", "var.two[1]", "var.three[2]"]).convert_dtypes(dtype_backend="pyarrow")

t = s.str.replace(r'\[(\d+)\]',r'(\1)',regex=True)
print(t)

Issue Description
The most recent pandas version produces the following error message:
var.one(0)
Traceback (most recent call last):
  File "/workspaces/verbose-system/testfile.py", line 9, in <module>
    t = s.str.replace(r'\[(\d+)\]',r'(\1)',regex=True)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/verbose-system/.new/lib/python3.12/site-packages/pandas/core/strings/accessor.py", line 140, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/verbose-system/.new/lib/python3.12/site-packages/pandas/core/strings/accessor.py", line 1580, in replace
    result = self._data.array._str_replace(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/verbose-system/.new/lib/python3.12/site-packages/pandas/core/arrays/_arrow_string_mixins.py", line 182, in _str_replace
    raise NotImplementedError(
NotImplementedError: replace is not supported with a re.Pattern, callable repl, case=False, flags!=0, or when the replacement string contains named group references (\g<...>, \d+)

I have tried all combinations of \1, \g<1>, named groups, precompiled patterns, etc., but the error persists. I believe this is related to #57636.
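The last condition in the error message, a replacement string containing a group reference, can be illustrated with a minimal standard-library sketch. It assumes the condition amounts to scanning the replacement string for `\g<...>` or a backslash followed by digits; the helper name `has_group_reference` and its pattern are hypothetical and are not taken from the pandas source. The second function, `replace_elementwise`, is likewise a hypothetical fallback that applies `re.sub` to each string in a plain Python list rather than to a pyarrow-backed Series.

```python
import re

# Hypothetical pattern: matches \g<name>, \g<1>, or a numeric backreference such as \1.
# This is an assumption based on the wording of the error message, not pandas code.
_GROUP_REF = re.compile(r"\\g<[^>]+>|\\\d+")


def has_group_reference(repl: str) -> bool:
    """Return True if the replacement string contains a group reference."""
    return _GROUP_REF.search(repl) is not None


def replace_elementwise(values, pat, repl):
    """Fallback sketch: apply re.sub to each string and pass None through unchanged."""
    compiled = re.compile(pat)
    return [None if v is None else compiled.sub(repl, v) for v in values]


data = ["var.one[0]", "var.two[1]", "var.three[2]"]
print(has_group_reference(r"(\1)"))  # the replacement string from the report
print(has_group_reference("(0)"))    # a plain literal replacement
print(replace_elementwise(data, r"\[(\d+)\]", r"(\1)"))
```

With pandas, the same per-element function could presumably be applied through `Series.map`, at the cost of bypassing the pyarrow kernel; that route is not verified here against pandas 2.3.3.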
Expected Behavior
With pandas 2.3.0 the output of the script above was:
var.one(0)
0      var.one(0)
1      var.two(1)
2    var.three(2)
dtype: string[pyarrow]

Installed Versions
INSTALLED VERSIONS
------------------
commit : 9c8bc3e55188c8aff37207a74f1dd144980b8874
python : 3.12.11
python-bits : 64
OS : Linux
OS-release : 6.8.0-1030-azure
Version : #35~22.04.1-Ubuntu SMP Mon May 26 18:08:30 UTC 2025
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : C.UTF-8
pandas : 2.3.3
numpy : 2.3.3
pytz : 2025.2
dateutil : 2.9.0.post0
pip : 25.0.1
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 21.0.0
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.2
qtpy : None
pyqt5 : None