Description
Specifically, this program succeeds in `regex 1.9.x` but fails in `regex 1.10.1`:
    fn main() -> anyhow::Result<()> {
        let re = regex::Regex::new(r"(\\N\{[^}]+})|([{}])").unwrap();
        let hay = r#"hiya \N{snowman} bye"#;
        let matches = re.find_iter(hay).map(|m| m.range()).collect::<Vec<_>>();
        assert_eq!(matches, vec![5..16]);
        Ok(())
    }
Its output with `1.10.1`:
    $ cargo run -q
    thread 'main' panicked at main.rs:7:5:
    assertion `left == right` failed
      left: [7..8, 15..16]
     right: [5..16]
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
I believe the issue here was my change to broaden the reverse suffix optimization to use one of many possible literals. But this turns out to be not quite correct, since the rules that govern prefixes don't apply to suffixes. In this case, the literal optimization extracts `{` and `}` as suffixes. It looks for a `{` first and finds a match at that position via the second alternate in the regex. But this winds up missing the match that came before it with the first alternate, since the `{` isn't a suffix of the first alternate.
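To make the offsets concrete, here is a minimal std-only sketch of why a scan for `{` or `}` lands in the wrong place on this haystack. It does not use the regex crate's internals; `first_brace` is a hypothetical helper that only stands in for the literal scan described above.

```rust
/// Hypothetical stand-in for the literal scan: byte offset of the
/// first `{` or `}` in `hay`, if any.
fn first_brace(hay: &str) -> Option<usize> {
    hay.find(|c: char| c == '{' || c == '}')
}

fn main() {
    let hay = r"hiya \N{snowman} bye";
    let candidate = first_brace(hay).unwrap();

    // The scan stops at the `{` at offset 7, where the second
    // alternate `[{}]` matches on its own (the reported 7..8).
    assert_eq!(candidate, 7);

    // The correct leftmost match is `\N{snowman}` at 5..16, which
    // starts before the candidate position.
    assert_eq!(&hay[5..16], r"\N{snowman}");
    assert!(candidate > 5);
}
```

The candidate literal sits in the middle of the first alternate's match rather than at its end, so a match accepted at that position skips the earlier, correct one.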
This is why we should, at least at present, only use this optimization when there is a non-empty longest common suffix. In that case, and only that case, we know that it is a suffix of every possible path through the regex.
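The condition above can be sketched as a check on the extracted suffix literals. This is an illustration only, not the crate's actual code: `longest_common_suffix` is a hypothetical helper, and the literal sets passed to it are assumptions (the first mirrors the `{` and `}` suffixes described above, the second is an invented contrast).

```rust
/// Hypothetical helper: the longest byte suffix shared by every literal.
/// Returns an empty slice when the set is empty or nothing is shared.
fn longest_common_suffix<'a>(lits: &[&'a [u8]]) -> &'a [u8] {
    let (first, rest) = match lits.split_first() {
        Some((&first, rest)) => (first, rest),
        None => return &[],
    };
    let mut len = first.len();
    for lit in rest {
        // Count matching bytes while walking both literals from the end.
        let common = first
            .iter()
            .rev()
            .zip(lit.iter().rev())
            .take_while(|(a, b)| a == b)
            .count();
        len = len.min(common);
    }
    &first[first.len() - len..]
}

fn main() {
    // Assumed suffix literals for the regex in this report: `{` and `}`.
    let braces: [&[u8]; 2] = [b"{", b"}"];
    // Nothing is shared, so the optimization should be skipped.
    assert!(longest_common_suffix(&braces).is_empty());

    // Invented contrast: every literal ends in `bar`, so `bar` ends
    // every possible match and is safe to scan for.
    let words: [&[u8]; 2] = [b"foobar", b"quxbar"];
    assert_eq!(longest_common_suffix(&words), &b"bar"[..]);
}
```

Requiring a non-empty result is conservative: it gives up the optimization for some regexes, in exchange for never scanning for a literal that only some alternates end with.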
Thank you to @charliermarsh for finding this! See: astral-sh/ruff#7980