src: improve windows1252 decoding speed #61120
Draft
+555 −164
Tracking: #61041
This builds on top of #61093 and #61118 and gives an additional ~1.4x improvement by using iteration in pairs for this single encoding, as it's commonly used (it's aliased as `latin1`).

Combined, this is ~114x faster than `main` on ASCII (due to #61093 + #61118) and ~71x faster than `main` on non-ASCII.

This could be improved further with latin1 checks and by moving ASCII checks to prefixes instead, but let #61119 land first.
I'm not sure if this even makes sense at this point: it comes at the cost of a 128 KiB cache (even though the cache is only allocated on the first large use).
Perhaps there is some other way, or we could ignore this.
Warning
Very crude, just a concept demonstration at this point
See #61118 for previous benchmarks