[js] [atoms] Fixed text transformation issue with text-transform: capitalize #16275

vicky-iv · 2025-08-29T20:52:00Z

User description

🔗 Related Issues

Fixes #14271

💥 What does this PR do?

Fixes incorrect capitalization behavior for text-transform: capitalize, addressing:

Accented Latin letters (e.g., “expiración” → “Expiración”, not “ExpiraciÓN”).
Enye (e.g., “mañana” → “Mañana”).

🔧 Implementation Notes

Replaced the previous boundary regex with a two-step approach:

Step 1 uses a negated “separator” class that treats ASCII letters, extended Latin letters (\u00C0–\u02AF, \u1E00–\u1EFF), and combining marks (\u0300–\u036F, \u1AB0–\u1AFF, \u1DC0–\u1DFF) as in-word characters. It also keeps _ and apostrophes out of the boundary set, so snake_case and contractions are not split.
Step 2 uses a small second regex that capitalizes the first letter after an opening _ or *, but only when those symbols act as wrappers (preceded by the start of the string or a non-word character), so it does not interfere with snake_case.
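
The two passes described above can be sketched as a standalone function. This is a minimal sketch rather than the exact code in javascript/atoms/dom.js: the function name `capitalizeSketch` and the named callback parameters are assumptions made for illustration, while the two regexes are copied from the PR diff.

```javascript
// Hypothetical standalone wrapper; in the PR this logic sits inline in dom.js.
function capitalizeSketch(text) {
  // Pass 1: capitalize a letter that follows the start of the string or a
  // separator. '_', apostrophes, digits, Latin letters and combining marks
  // are listed inside the negated class, so they do NOT count as separators.
  var re = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9\u0300-\u036F\u1AB0-\u1AFF\u1DC0-\u1DFF])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])/g;
  text = text.replace(re, function (match, boundary, ch) {
    return boundary + ch.toUpperCase();
  });

  // Pass 2: capitalize the letter after an opening '_' or '*', but only when
  // that marker follows the start of the string or a non-word character.
  re = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])([_*])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24D0-\u24E9])/g;
  return text.replace(re, function (match, boundary, marker, ch) {
    return boundary + marker + ch.toUpperCase();
  });
}

console.log(capitalizeSketch('expiración mañana')); // Expiración Mañana
console.log(capitalizeSketch('snake_case'));        // Snake_case
console.log(capitalizeSketch('_wrapped_ text'));    // _Wrapped_ Text
```

Because 'ó' and 'ñ' fall inside \u00C0–\u02AF, they are in-word characters in pass 1, so the letter after them is left alone.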

All tests in text_test.html pass locally in Firefox, Chrome, and Safari.

💡 Additional Considerations

Scope limited to Latin scripts + circled letters; other scripts (Greek/Cyrillic/etc.) can be added by extending ranges if needed.
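
One way such an extension could look is sketched below. It is not part of the PR: the helper names (`buildBoundaryRegex`, `capitalizeWith`) and the choice of the basic Greek (U+0370–U+03FF) and Cyrillic (U+0400–U+04FF) blocks are assumptions for illustration, and those blocks are coarse (they include a few non-letter code points).

```javascript
// Ranges taken from the PR's first-pass regex.
var LATIN_RANGES = 'A-Za-z\\u00C0-\\u02AF\\u1E00-\\u1EFF\\u24B6-\\u24E9';
var COMBINING_MARKS = '\\u0300-\\u036F\\u1AB0-\\u1AFF\\u1DC0-\\u1DFF';
// Assumed extension, not in the PR: basic Greek and Cyrillic blocks.
var GREEK_CYRILLIC = '\\u0370-\\u03FF\\u0400-\\u04FF';

// Hypothetical helper: builds the first-pass regex from a string of letter
// ranges, so the same ranges appear in both the boundary and letter classes.
function buildBoundaryRegex(letterRanges) {
  return new RegExp(
      "(^|[^'_0-9" + letterRanges + COMBINING_MARKS + '])' +
      '([' + letterRanges + '])', 'g');
}

// Hypothetical helper: applies the first capitalization pass with a given regex.
function capitalizeWith(re, text) {
  return text.replace(re, function (match, boundary, ch) {
    return boundary + ch.toUpperCase();
  });
}

var latinOnly = buildBoundaryRegex(LATIN_RANGES);
var extended = buildBoundaryRegex(LATIN_RANGES + GREEK_CYRILLIC);

console.log(capitalizeWith(latinOnly, 'привет мир')); // привет мир (unchanged)
console.log(capitalizeWith(extended, 'привет мир'));  // Привет Мир
```

Adding the new ranges to both classes matters: if they were added only to the letter group, Cyrillic or Greek letters would still act as separators inside a word.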

🔄 Types of changes

Bug fix (backwards compatible)

PR Type

Bug fix

Description

Fixed text-transform: capitalize for accented Latin letters
Added support for enye and extended Unicode ranges
Improved boundary detection to preserve snake_case
Added test cases for Spanish accented characters

Diagram Walkthrough

flowchart LR
  A["Old regex boundary detection"] --> B["Enhanced Unicode-aware regex"]
  B --> C["Preserve snake_case"]
  B --> D["Support accented letters"]
  E["Add test cases"] --> F["Spanish characters validation"]

File Walkthrough

Relevant files

Bug fix

dom.js (javascript/atoms/dom.js, +9/-2): Enhanced text capitalization with Unicode support
Replaced simple boundary regex with Unicode-aware pattern
Added support for extended Latin and combining marks ranges
Implemented two-step capitalization to handle edge cases
Preserved snake_case and contractions from incorrect splitting

Tests

text_test.html (javascript/atoms/test/text_test.html, +12/-2): Added Unicode capitalization test cases
Added test cases for Spanish accented characters
Fixed whitespace expectations in preformatted text tests
Added validation for "expiración" and "mañana" capitalization


qodo-merge-pro · 2025-08-29T20:52:35Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Unicode Range Accuracy
Verify the enclosed/circled letter ranges: the capital range included is U+24B6–U+24E9, which spans capital A–Z (U+24B6–U+24CF) and then continues into additional symbols; ensure this is intentional and that the mapping truly pairs small (U+24D0–U+24E9) to corresponding capitals without affecting unrelated characters. Also confirm combining marks following the first letter don’t break capitalization.

IE Compatibility
The old code had an IE-specific fallback avoiding /u. New regexes drop that branch and rely on Unicode character classes without the 'u' flag but include non-ASCII ranges; confirm this still behaves in legacy environments the atoms target (especially old IE) and doesn’t degrade performance.

False Positives Around Punctuation
The boundary class excludes apostrophes to preserve contractions; confirm cases like l'état, O’Neill, and words after punctuation (e.g., “hello—world”) capitalize correctly and that hyphenated words “bla-bla” still behave as desired.

Code referenced by all three focus areas:

// 1) don't treat '_' as a separator (protects snake_case)
var re = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9\u0300-\u036F\u1AB0-\u1AFF\u1DC0-\u1DFF])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])/g;
text = text.replace(re, function () {
  return arguments[1] + arguments[2].toUpperCase();
});
// 2) capitalize after opening "_" or "*"
// Preceded by start or a non-word (so it won't fire for snake_case)
re = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])([_*])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24D0-\u24E9])/g;
text = text.replace(re, function () {
  return arguments[1] + arguments[2] + arguments[3].toUpperCase();
});
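
The punctuation and circled-letter questions raised above can be probed directly. The following is a minimal sketch, assuming a hypothetical wrapper `capitalizeProbe` around the two regexes from the snippet; it is not part of the PR or its test suite. In the Unicode Enclosed Alphanumerics block, U+24B6–U+24CF are the circled capitals and U+24D0–U+24E9 are the circled small letters, so the range U+24B6–U+24E9 covers those 52 characters.

```javascript
// Hypothetical probe function wrapping the two regexes under review.
function capitalizeProbe(text) {
  var re = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9\u0300-\u036F\u1AB0-\u1AFF\u1DC0-\u1DFF])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])/g;
  text = text.replace(re, function (match, boundary, ch) {
    return boundary + ch.toUpperCase();
  });
  re = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])([_*])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24D0-\u24E9])/g;
  return text.replace(re, function (match, boundary, marker, ch) {
    return boundary + marker + ch.toUpperCase();
  });
}

// ASCII apostrophe is in-word; U+2019, U+2014 and '-' act as separators.
console.log(capitalizeProbe("l'état"));           // L'état
console.log(capitalizeProbe('o\u2019neill'));     // O’Neill
console.log(capitalizeProbe('hello\u2014world')); // Hello—World
console.log(capitalizeProbe('bla-bla'));          // Bla-Bla

// Check that every circled small letter uppercases to its circled capital.
var circledOk = true;
for (var i = 0; i < 26; i++) {
  var small = String.fromCharCode(0x24D0 + i);
  var capital = String.fromCharCode(0x24B6 + i);
  if (small.toUpperCase() !== capital) {
    circledOk = false;
  }
}
console.log(circledOk); // true
```

Note that the typographic apostrophe (U+2019) is treated as a separator while the ASCII apostrophe is not, so "o’neill" and "o'neill" capitalize differently under these regexes.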

qodo-merge-pro · 2025-08-29T20:55:06Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category: Possible issue
Suggestion: Align Unicode ranges and boundaries
Make this regex consistent with the first capitalization pass. Include combining mark ranges in the boundary class and use the full circled Latin range for the letter group to avoid missed matches (e.g., uppercase circled letters) and incorrect boundaries after combining marks.
javascript/atoms/dom.js [1184]
-re = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])([_*])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24D0-\u24E9])/g;
+re = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9\u0300-\u036F\u1AB0-\u1AFF\u1DC0-\u1DFF])([_*])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])/g;
Suggestion importance[1-10]: 7
Why: The suggestion correctly identifies an inconsistency in Unicode ranges between the two regular expressions used for capitalization, which could lead to bugs in edge cases.
Impact: Medium

Category: Learned best practice
Suggestion: Use explicit callback parameters
Use named parameters in the replace callbacks instead of indexing into "arguments" to improve clarity and avoid reliance on the implicit arguments object.
javascript/atoms/dom.js [1175-1188]
 if (textTransform == 'capitalize') {
   // 1) don't treat '_' as a separator (protects snake_case)
   var re = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9\u0300-\u036F\u1AB0-\u1AFF\u1DC0-\u1DFF])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])/g;
-  text = text.replace(re, function () {
-    return arguments[1] + arguments[2].toUpperCase();
+  text = text.replace(re, function (match, boundary, ch) {
+    return boundary + ch.toUpperCase();
   });
   // 2) capitalize after opening "_" or "*"
   // Preceded by start or a non-word (so it won't fire for snake_case)
   re = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])([_*])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24D0-\u24E9])/g;
-  text = text.replace(re, function () {
-    return arguments[1] + arguments[2] + arguments[3].toUpperCase();
+  text = text.replace(re, function (match, boundary, marker, ch) {
+    return boundary + marker + ch.toUpperCase();
   });
 } else if (textTransform == 'uppercase') {
Suggestion importance[1-10]: 5
Why: Relevant best practice - Prefer clear, language-idiomatic APIs: use explicit callback parameters over the implicit "arguments" object for readability and maintainability.
Impact: Low
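
To make the first suggestion concrete, here is a minimal sketch of an input where the two variants of the second regex differ. The helper `applySecondPass` and the sample string are assumptions for illustration; the regexes are the "before" and "after" lines of the suggested diff. The input is a snake_case word whose accent is written as a decomposed combining mark (e + U+0301), as it would look after the first pass.

```javascript
// Second-pass regex as submitted: boundary class without combining marks.
var original = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])([_*])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24D0-\u24E9])/g;
// Second-pass regex as suggested: combining marks count as in-word characters.
var aligned = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9\u0300-\u036F\u1AB0-\u1AFF\u1DC0-\u1DFF])([_*])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])/g;

// Hypothetical helper: runs only the second capitalization pass.
function applySecondPass(re, text) {
  return text.replace(re, function (match, boundary, marker, ch) {
    return boundary + marker + ch.toUpperCase();
  });
}

// 'Café_bar' with a decomposed accent: 'e' followed by U+0301.
var input = 'Cafe\u0301_bar';

// With the original class, U+0301 looks like a non-word character, so the
// '_' is treated as an opening wrapper and 'b' is capitalized.
console.log(applySecondPass(original, input) === 'Cafe\u0301_Bar'); // true
// With the aligned class, U+0301 is in-word and the snake_case is preserved.
console.log(applySecondPass(aligned, input) === 'Cafe\u0301_bar');  // true
```

The precomposed form ('é' as U+00E9) is unaffected either way, since it already falls inside \u00C0–\u02AF.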

Usielrivas · 2025-08-29T22:44:51Z

javascript/atoms/dom.js

+ // Preceded by start or a non-word (so it won't fire for snake_case)
+ re = /(^|[^'_0-9A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24B6-\u24E9])([_*])([A-Za-z\u00C0-\u02AF\u1E00-\u1EFF\u24D0-\u24E9])/g;
+ text = text.replace(re, function () {
+ return arguments[1] + arguments[2] + arguments[3].toUpperCase();


Here you can remove the magic numbers and leave something more descriptive like:

text = text.replace(re, function (_match, prefix, divider, char) { return prefix + divider + char.toUpperCase(); });

vicky-iv

Sure, I just followed the existing implementation and coding style.

Usielrivas

No worries, it's just a coding tip.

diemol

Thank you, @vicky-iv!

[js] [atoms] Fixed text transformation issue with text-transform: capitalize

a9cd945

selenium-ci added the B-atoms (JavaScript chunks generated by Google closure) label Aug 29, 2025

qodo-merge-pro bot added the Review effort 3/5 label Aug 29, 2025

vicky-iv mentioned this pull request Aug 29, 2025

[🐛 Bug]: Text transformation issue with text-transform: capitalize; using Selenium WebDriver with Firefox 127.0.2 #14271

Closed

Usielrivas reviewed Aug 29, 2025

diemol added 2 commits September 1, 2025 09:45

Merge branch 'trunk' into fix-text-capitalization

692c90e

Merge branch 'trunk' into fix-text-capitalization

46b4b10

diemol approved these changes Sep 2, 2025

diemol merged commit 775cfb3 into SeleniumHQ:trunk Sep 2, 2025
31 of 32 checks passed

vicky-iv deleted the fix-text-capitalization branch September 5, 2025 19:48

vicky-iv restored the fix-text-capitalization branch September 5, 2025 19:48

vicky-iv deleted the fix-text-capitalization branch September 5, 2025 19:48
