- Notifications
You must be signed in to change notification settings - Fork 484
syntax: optimize some functions in IntervalSet. #1051
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Replace the old canonicalize part with the in-place way with constant memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
w00t! This is awesome. Nice work. :-)
I see you're working on more stuff here. If possible, please avoid merge commits. Instead, please rebase and force push. |
Sorry about the inconvenience I brought here, and thank you for your professional advices. I am still learning to contribute codes in public repo. The work on this PR is down from my perspective and sorry again for my inexperienced in git. |
No need to apologize! Everyone deals with git and commit history a little differently. If you look at the commit log for this repo, it's a straight linear line with (usually) decent messages. Basically, if you can just avoid merges then it should be easier for me to massage your PR. |
I see. Thx for the explanation. I'll pay attention to this point in future potential PRs. |
5c97a7b
to c4b718f
Compare It now behaves like difference and others.
c4b718f
to a71e3c8
Compare 2f2db21
to dd46d34
Compare This reduces or eliminates allocation when combining Unicode classes and should make some things faster. It's unlikely for these optimizations to matter much in practice, but they are likely to help in niche or pathological cases where there are a lot of ops in a class. Closes #1051
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
Some of this looks... quite gnarly. But I feel pretty confident with the test suite. I also tried some random mutations to provoke test failures, and it did. So that's good too.
This is really great work. I remember banging my head against the wall trying to write these routines without using extra memory and just gave up.
This reduces or eliminates allocation when combining Unicode classes and should make some things faster. It's unlikely for these optimizations to matter much in practice, but they are likely to help in niche or pathological cases where there are a lot of ops in a class. Closes #1051
This reduces or eliminates allocation when combining Unicode classes and should make some things faster. It's unlikely for these optimizations to matter much in practice, but they are likely to help in niche or pathological cases where there are a lot of ops in a class. Closes #1051
This reduces or eliminates allocation when combining Unicode classes and should make some things faster. It's unlikely for these optimizations to matter much in practice, but they are likely to help in niche or pathological cases where there are a lot of ops in a class. Closes #1051
This reverts commit 6d2b09e. Sadly I just don't have the time to fix this code myself. It's too subtle. So I'm just reverting it entirely for now. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63155 Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63153 Ref #1051
This reverts commit 6d2b09e. Sadly I just don't have the time to fix this code myself. It's too subtle. So I'm just reverting it entirely for now. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63155 Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63153 Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63168 Ref #1051
FYI, I ended up needing to revert this change because of bugs caught by OSS-fuzz. See #1102 for more info. |
This reverts commit 6d2b09e. Sadly I just don't have the time to fix this code myself. It's too subtle. So I'm just reverting it entirely for now. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63155 Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63153 Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63168 Ref #1051
This reverts commit 6d2b09e. Sadly I just don't have the time to fix this code myself. It's too subtle. So I'm just reverting it entirely for now. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63155 Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63153 Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63168 Ref #1051
Fix the `push` function in rust-lang#1051
Based on the test cases, it turns out that the implementation of |
Canonicalize is taking up a significant amount due to a regex with a huge amount of character ranges (generated by [lalrpop](https://github.com/lalrpop/lalrpop)'s lexer expanding multiple `\w` in a token). While this could perhaps be fixed in lalrpop I did notice the TODO in the code and after addressing this so we automatically union and compress on each push instead of re-canonicalizing on every push and that fixed the performance problem. I did see the earlier attempt at this rust-lang#1051 and it seems like that was reverted and regression tests were added so I hope that and the existing tests are enough (I don't have a clear idea on what tests might be missing).
Canonicalize is taking up a significant amount due to a regex with a huge amount of character ranges (generated by [lalrpop](https://github.com/lalrpop/lalrpop)'s lexer expanding multiple `\w` in a token). While this could perhaps be fixed in lalrpop I did notice the TODO in the code and after addressing this so we automatically union and compress on each push instead of re-canonicalizing on every push and that fixed the performance problem. I did see the earlier attempt at this rust-lang#1051 and it seems like that was reverted and regression tests were added so I hope that and the existing tests are enough (I don't have a clear idea on what tests might be missing).
Replace the canonicalize function the in-place way with constant memory.
Replace the negate function the in-place way with constant memory.
Simplify the difference function without changing the algorithm.
Implement symmetric_difference function like others.
Implement push function like others.
Force the symmetric_difference function in Interval return ranges in order.