Skip to content

Conversation

@mertcanaltin
Copy link
Member

@mertcanaltin mertcanaltin commented Dec 18, 2025

benchmark results (mimetype-instantiation.js):

application/ecmascript +54%
text/html charset=gbk +24%
text/html long...=x charset=gbk +21%

@nodejs-github-bot nodejs-github-bot added the needs-ci PRs that need a full CI run. label Dec 18, 2025
@mertcanaltin mertcanaltin force-pushed the mert/perf/mime-toasciilower branch from bf24628 to b391b73 Compare December 18, 2025 08:49
@mertcanaltin mertcanaltin changed the title perf: optimize toASCIILower function using V8's native toLowerCase util: optimize toASCIILower function using V8's native toLowerCase Dec 18, 2025
@mertcanaltin mertcanaltin force-pushed the mert/perf/mime-toasciilower branch from b391b73 to 6b08d1a Compare December 18, 2025 08:50
@mertcanaltin mertcanaltin force-pushed the mert/perf/mime-toasciilower branch from 6b08d1a to 57b817f Compare December 18, 2025 08:55
Copy link
Member

@ChALkeR ChALkeR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not identical (and there is a reason why the function checks character ranges)

This function is supposed to do ASCII lowercasing, not Unicode lowercasing

TextDecoder has the same issue for label normalization btw, as it incorrectly uses .toLowerCase at

return encodings.get(trimAsciiWhitespace(label.toLowerCase()));

See spec: https://tc39.es/ecma262/#sec-string.prototype.tolowercase and notes there

As an example, '\u212A'.toLowerCase() === 'k'

toLowerCase can convert non-ascii characters into ascii characters (which makes e.g. current TextDecoder label handling incorrect, among all its other bugs)

text/html charset=gb\u212A should not turn into text/html charset=gbk

@ChALkeR
Copy link
Member

ChALkeR commented Dec 18, 2025

What would be helpful is to move toASCIILower out to some helper lib and to fix TextDecoder label handling to use it instead of .toLowerCase.

@mertcanaltin
Copy link
Member Author

mertcanaltin commented Dec 18, 2025

What would be helpful is to move toASCIILower out to some helper lib and to fix TextDecoder label handling to use it instead of .toLowerCase.

Thanks I will try this idea

mertcanaltin and others added 3 commits December 18, 2025 12:46
Co-authored-by: Nikita Skovoroda <chalkerx@gmail.com>
@codecov
Copy link

codecov bot commented Dec 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.52%. Comparing base (4f24aff) to head (34692cf).
⚠️ Report is 17 commits behind head on main.

Additional details and impacted files
@@ Coverage Diff @@ ## main #61107 +/- ## ========================================== - Coverage 88.53% 88.52% -0.01%  ========================================== Files 703 703 Lines 208546 208548 +2 Branches 40217 40224 +7 ========================================== - Hits 184634 184620 -14  - Misses 15926 15958 +32  + Partials 7986 7970 -16 
Files with missing lines Coverage Δ
lib/internal/mime.js 98.48% <100.00%> (+<0.01%) ⬆️

... and 43 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
Copy link
Member

@RafaelGSS RafaelGSS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. With green CI and confirmed by benchmark CI.

@RafaelGSS RafaelGSS added the request-ci Add this label to start a Jenkins CI on a PR. label Dec 19, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Dec 19, 2025
@addaleax addaleax added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Dec 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

author ready PRs that have at least one approval, no pending requests for changes, and a CI started. needs-benchmark-ci PR that need a benchmark CI run. needs-ci PRs that need a full CI run.

6 participants