
Conversation

@Marwes (Contributor) commented Oct 20, 2025

Canonicalize is taking up a significant amount of time due to a regex with a huge number of character ranges (generated by [lalrpop](https://github.com/lalrpop/lalrpop)'s lexer expanding multiple `\w` in a token). While this could perhaps be fixed in lalrpop, I did notice the TODO in the code; after changing it so that we automatically union and compress on each push, instead of re-canonicalizing on every push, the performance problem was fixed.

I did see the earlier attempt at this, #1051, and it seems that it was reverted and regression tests were added, so I hope those and the existing tests are enough (I don't have a clear idea of what tests might be missing).
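
For context, here is a minimal sketch of the union-and-compress-on-push idea, using hypothetical `Range`/`RangeSet` types rather than the actual regex-syntax internals: the range list is kept sorted and merged as each range is pushed, so no separate canonicalize pass over the whole set is needed after every mutation.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Range {
    start: u32,
    end: u32, // inclusive
}

#[derive(Debug, Default)]
struct RangeSet {
    // Invariant: sorted by `start`, non-overlapping, non-adjacent.
    ranges: Vec<Range>,
}

impl RangeSet {
    /// Insert `new`, unioning it with any ranges it overlaps or abuts,
    /// so the invariant holds after every push.
    fn push(&mut self, mut new: Range) {
        // First existing range that could touch `new`:
        // a range `r` touches `new` if r.end + 1 >= new.start.
        let i = self
            .ranges
            .partition_point(|r| r.end.saturating_add(1) < new.start);
        // Absorb every range that overlaps or is adjacent to `new`.
        let mut j = i;
        while j < self.ranges.len()
            && self.ranges[j].start <= new.end.saturating_add(1)
        {
            new.start = new.start.min(self.ranges[j].start);
            new.end = new.end.max(self.ranges[j].end);
            j += 1;
        }
        // Replace the touched ranges [i, j) with the single merged range.
        self.ranges.splice(i..j, std::iter::once(new));
    }
}

fn main() {
    let mut set = RangeSet::default();
    set.push(Range { start: 10, end: 20 });
    set.push(Range { start: 30, end: 40 });
    set.push(Range { start: 15, end: 31 }); // bridges both ranges
    assert_eq!(set.ranges, vec![Range { start: 10, end: 40 }]);
}
```

Each push costs a binary search plus a splice, instead of the sort-and-merge over all ranges that a full re-canonicalization implies, which is what matters when a class accumulates a huge number of ranges.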

