-
- Notifications
You must be signed in to change notification settings - Fork 34.2k
util: optimize toASCIILower function using V8's native toLowerCase #61107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
util: optimize toASCIILower function using V8's native toLowerCase #61107
Conversation
bf24628 to b391b73 Compare b391b73 to 6b08d1a Compare 6b08d1a to 57b817f Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not identical (and there is a reason why the function checks character ranges)
This function is supposed to do ASCII lowercasing, not Unicode lowercasing
TextDecoder has the same issue for label normalization btw, as it incorrectly uses .toLowerCase at
Line 332 in b1c01fc
| return encodings.get(trimAsciiWhitespace(label.toLowerCase())); |
See spec: https://tc39.es/ecma262/#sec-string.prototype.tolowercase and notes there
As an example, '\u212A'.toLowerCase() === 'k'
toLowerCase can convert non-ascii characters into ascii characters (which makes e.g. current TextDecoder label handling incorrect, among all its other bugs)
text/html charset=gb\u212A should not turn into text/html charset=gbk
| What would be helpful is to move |
Thanks I will try this idea |
Co-authored-by: Nikita Skovoroda <chalkerx@gmail.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@ ## main #61107 +/- ## ========================================== - Coverage 88.53% 88.52% -0.01% ========================================== Files 703 703 Lines 208546 208548 +2 Branches 40217 40224 +7 ========================================== - Hits 184634 184620 -14 - Misses 15926 15958 +32 + Partials 7986 7970 -16
🚀 New features to boost your workflow:
|
RafaelGSS left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. With green CI and confirmed by benchmark CI.
benchmark results (mimetype-instantiation.js):
application/ecmascript +54%
text/html charset=gbk +24%
text/html long...=x charset=gbk +21%