BUG: Series.str.replace stopped working with regex groups #62653

@worc4021

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
from re import sub

teststr = "var.one[0]"
print(sub(r'\[(\d+)\]',r'(\1)',teststr))

s = pd.Series(["var.one[0]", "var.two[1]", "var.three[2]"]).convert_dtypes(dtype_backend="pyarrow")
t = s.str.replace(r'\[(\d+)\]',r'(\1)',regex=True)
print(t)

Issue Description

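With the most recent pandas version (2.3.3, see Installed Versions), the plain re.sub call prints the expected result, but Series.str.replace raises NotImplementedError: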

var.one(0)
Traceback (most recent call last):
  File "/workspaces/verbose-system/testfile.py", line 9, in <module>
    t = s.str.replace(r'\[(\d+)\]',r'(\1)',regex=True)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/verbose-system/.new/lib/python3.12/site-packages/pandas/core/strings/accessor.py", line 140, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/verbose-system/.new/lib/python3.12/site-packages/pandas/core/strings/accessor.py", line 1580, in replace
    result = self._data.array._str_replace(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/verbose-system/.new/lib/python3.12/site-packages/pandas/core/arrays/_arrow_string_mixins.py", line 182, in _str_replace
    raise NotImplementedError(
NotImplementedError: replace is not supported with a re.Pattern, callable repl, case=False, flags!=0, or when the replacement string contains named group references (\g<...>, \d+)
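
The error message lists the inputs that the Arrow-backed path refuses. To make the last condition concrete, here is a rough sketch of a check in that spirit. It is not the pandas source: `looks_like_group_reference` is a hypothetical helper, and the exact test pandas performs may differ. It only illustrates why a replacement such as `(\1)` would trip a check of this kind while a literal replacement would not.

```python
import re


def looks_like_group_reference(repl: str) -> bool:
    # Hypothetical helper, not part of pandas: flags a replacement string
    # that contains a \g<...> reference or a backslash followed by a digit.
    return r"\g<" in repl or re.search(r"\\\d", repl) is not None


print(looks_like_group_reference(r"(\1)"))     # True
print(looks_like_group_reference(r"(\g<1>)"))  # True
print(looks_like_group_reference("(0)"))       # False
```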

I have tried every combination I could think of (\1, \g<1>, named groups, precompiled patterns, etc.), and the error persists in each case. I believe this is related to #57636.
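
As a possible stopgap while the accessor rejects group references, the substitution can be done per element with the standard library. The sketch below reuses the pattern and replacement from the reproducible example; `bracket_to_paren` is a hypothetical name introduced only for illustration.

```python
import re

# Pattern and replacement are taken from the reproducible example.
_INDEX = re.compile(r"\[(\d+)\]")


def bracket_to_paren(text: str) -> str:
    """Rewrite every '[n]' as '(n)' using a numbered backreference."""
    return _INDEX.sub(r"(\1)", text)


values = ["var.one[0]", "var.two[1]", "var.three[2]"]
converted = [bracket_to_paren(v) for v in values]
print(converted)  # ['var.one(0)', 'var.two(1)', 'var.three(2)']
```

On a Series, the same function could presumably be applied with s.map(bracket_to_paren, na_action="ignore"). That route bypasses the Arrow kernel, so it is likely slower, and the resulting dtype may need to be cast back to the pyarrow-backed string type; I have not verified this on pandas 2.3.3.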

Expected Behavior

With pandas 2.3.0 the output of the script above was:

var.one(0)
0      var.one(0)
1      var.two(1)
2    var.three(2)
dtype: string[pyarrow]

Installed Versions

INSTALLED VERSIONS
------------------
commit                : 9c8bc3e55188c8aff37207a74f1dd144980b8874
python                : 3.12.11
python-bits           : 64
OS                    : Linux
OS-release            : 6.8.0-1030-azure
Version               : #35~22.04.1-Ubuntu SMP Mon May 26 18:08:30 UTC 2025
machine               : x86_64
processor             :
byteorder             : little
LC_ALL                : None
LANG                  : C.UTF-8
LOCALE                : C.UTF-8

pandas                : 2.3.3
numpy                 : 2.3.3
pytz                  : 2025.2
dateutil              : 2.9.0.post0
pip                   : 25.0.1
Cython                : None
sphinx                : None
IPython               : None
adbc-driver-postgresql: None
adbc-driver-sqlite    : None
bs4                   : None
blosc                 : None
bottleneck            : None
dataframe-api-compat  : None
fastparquet           : None
fsspec                : None
html5lib              : None
hypothesis            : None
gcsfs                 : None
jinja2                : None
lxml.etree            : None
matplotlib            : None
numba                 : None
numexpr               : None
odfpy                 : None
openpyxl              : None
pandas_gbq            : None
psycopg2              : None
pymysql               : None
pyarrow               : 21.0.0
pyreadstat            : None
pytest                : None
python-calamine       : None
pyxlsb                : None
s3fs                  : None
scipy                 : None
sqlalchemy            : None
tables                : None
tabulate              : None
xarray                : None
xlrd                  : None
xlsxwriter            : None
zstandard             : None
tzdata                : 2025.2
qtpy                  : None
pyqt5                 : None

Labels: Arrow (pyarrow functionality), Bug, Strings (String extension data type and string data)
