
@milseman
Member

_StoredCapture stores an Any? that is rarely set to anything non-nil, but because it can be, the struct is non-trivial. This experiment measures the perf overhead of that non-triviality.
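To illustrate what "non-trivial" means here, a minimal sketch with two hypothetical structs (not the real _StoredCapture definition, whose fields are assumed): one holds an Any?, the other holds only an integer index. The standard library's underscored `_isPOD` query reports whether a type can be copied bitwise with no retain/release work.

```swift
// Hypothetical stand-ins for illustration only; field names are assumptions.
struct CaptureWithValue {
  var range: Range<Int>? = nil
  var value: Any? = nil        // existential: copies may need retain/release
}

struct CaptureWithIndex {
  var range: Range<Int>? = nil
  var valueIndex: Int? = nil   // plain integer: bitwise-copyable
}

// _isPOD is an underscored standard-library function, so treat it as a
// diagnostic tool rather than stable API.
print(_isPOD(CaptureWithValue.self)) // false
print(_isPOD(CaptureWithIndex.self)) // true
```

A non-trivial struct pays for value witness calls whenever it is copied or destroyed, even if the Any? is nil every time, which is the overhead this experiment is probing.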

@milseman
Member Author

This gives us a somewhat robust (given our benchmark suite's variability) perf improvement, typically of a few percent. In its current state, this code breaks support for value captures; fixing that would mean setting valueIndex to the index into an array of capture values maintained by the processor.

=== Regressions ======================================================================
- CompilerMessages_All  104ms  102ms  1.3ms  1.3%
- CompilerMessages_All_Scalar  86.7ms  85.6ms  1.01ms  1.2%
- EmailLookaheadNoMatches_All  41.8ms  41.3ms  434µs  1.1%
- SubtractionCCC_All_Scalar  21.9ms  21.5ms  388µs  1.8%
- IntersectionCCC_All  22.5ms  22.2ms  363µs  1.6%
- AnchoredNotFound_Whole  9.08ms  8.92ms  154µs  1.7%
- AnchoredNotFound_Whole_Scalar  5.81ms  5.67ms  143µs  2.5%
- EmailLookahead_All_Scalar  22.8ms  22.7ms  99.9µs  0.4%
- EagarQuantWithTerminal_Whole_Scalar  1.42ms  1.36ms  58.7µs  4.3%

=== Improvements =====================================================================
- EmailRFCNoMatches_All  136ms  139ms  -3.34ms  -2.4%
- symDiffCCC_All  48.4ms  50ms  -1.63ms  -3.3%
- EmojiRegex_All  71.6ms  73.2ms  -1.6ms  -2.2%
- EmojiRegex_All_Scalar  48.4ms  50ms  -1.59ms  -3.2%
- symDiffCCC_All_Scalar  48.5ms  50ms  -1.51ms  -3.0%
- DiceRollsInText_All_Scalar  43.6ms  44.7ms  -1.09ms  -2.4%
- Words_All_Scalar  12.5ms  13.6ms  -1.02ms  -7.5%
- InvertedCCC_All  19.7ms  20.7ms  -996µs  -4.8%
- ReluctantQuant_Whole_Scalar  9.77ms  10.7ms  -955µs  -8.9%
- ReluctantQuant_Whole  9.89ms  10.8ms  -937µs  -8.7%
- Words_All  13.2ms  14.1ms  -935µs  -6.6%
- IntersectionCCC_All_Scalar  21.7ms  22.4ms  -669µs  -3.0%
- DiceRollsInText_All  46.5ms  47.2ms  -668µs  -1.4%
- InvertedCCC_All_Scalar  19.8ms  20.4ms  -668µs  -3.3%
- EmailRFC_All  63.7ms  64.3ms  -631µs  -1.0%
- EmailBuiltinCharacterClass_All_Scalar  12.5ms  13.1ms  -600µs  -4.6%
- NotFound_All_Scalar  6.3ms  6.76ms  -460µs  -6.8%
- NotFound_All  7.02ms  7.47ms  -448µs  -6.0%
- Css_All  3.36ms  3.79ms  -434µs  -11.4%
- BasicBuiltinCharacterClass_All_Scalar  7.34ms  7.75ms  -415µs  -5.4%
- BasicBuiltinCharacterClass_All  8.06ms  8.45ms  -384µs  -4.5%
- Css_All_Scalar  2.88ms  3.26ms  -380µs  -11.6%
- BasicRangeCCC_All  10.8ms  11.2ms  -345µs  -3.1%
- LiteralSearchNotFound_All_Scalar  5.68ms  6ms  -326µs  -5.4%
- BasicCCC_All_Scalar  10.4ms  10.7ms  -313µs  -2.9%
- Numbers_All_Scalar  6.42ms  6.73ms  -308µs  -4.6%
- CaseInsensitiveCCC_All  11.6ms  11.9ms  -304µs  -2.6%
- LiteralSearchNotFound_All  6.47ms  6.77ms  -302µs  -4.5%
- LiteralSearch_All  6.69ms  6.98ms  -295µs  -4.2%
- BasicCCC_All  10.4ms  10.7ms  -287µs  -2.7%
- HangulSyllable_All  6.89ms  7.17ms  -285µs  -4.0%
- GraphemeBreakNoCap_All  4.09ms  4.37ms  -280µs  -6.4%
- BasicRangeCCC_All_Scalar  10.8ms  11.1ms  -279µs  -2.5%
- CaseInsensitiveCCC_All_Scalar  11.5ms  11.8ms  -278µs  -2.4%
- SubtractionCCC_All  21.4ms  21.6ms  -274µs  -1.3%
- EmailLookaheadNoMatches_All_Scalar  27.8ms  28.1ms  -267µs  -0.9%
- GraphemeBreakNoCap_All_Scalar  3.52ms  3.76ms  -243µs  -6.5%
- LiteralSearch_All_Scalar  5.96ms  6.2ms  -237µs  -3.8%
- HangulSyllable_All_Scalar  6.16ms  6.39ms  -232µs  -3.6%
- MACAddress  2.77ms  2.98ms  -216µs  -7.2%
- MACAddress_Scalar  2.35ms  2.54ms  -187µs  -7.4%
- IPv6Address  3.85ms  3.99ms  -145µs  -3.6%
- EmailBuiltinCharacterClass_All  12.7ms  12.8ms  -132µs  -1.0%
- HangulSyllable_First_Scalar  2.94ms  3.07ms  -130µs  -4.2%
- IPv4Address  2.4ms  2.51ms  -117µs  -4.7%
- HangulSyllable_First  3.3ms  3.4ms  -97.5µs  -2.9%
- DiceNotation_Scalar  4.59ms  4.69ms  -94.1µs  -2.0%
- DiceNotation  4.93ms  5.01ms  -82.4µs  -1.6%
- IPv6Address_Scalar  2.77ms  2.85ms  -77.2µs  -2.7%
- Lines_All  1.76ms  1.84ms  -76.8µs  -4.2%
- Lines_All_Scalar  1.71ms  1.78ms  -76µs  -4.3%
- IPv4Address_Scalar  2.19ms  2.25ms  -61.8µs  -2.8%
- EmailLookaheadList_Scalar  5.15ms  5.21ms  -58.1µs  -1.1%
- ReluctantQuantWithTerminal_Whole_Scalar  6.63ms  6.68ms  -49.3µs  -0.7%
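For the value-capture fix, a rough sketch of the side-table idea might look like the following. Everything here is hypothetical: `StoredCaptureSketch`, `ProcessorSketch`, `captureValues`, and the method names are invented for illustration and are not the actual types or API in this repository. The point is only that the capture struct keeps a trivial integer index while the processor owns the array of Any values.

```swift
// Hypothetical sketch; names and shapes are assumptions, not the real code.
struct StoredCaptureSketch {
  var range: Range<Int>? = nil
  var valueIndex: Int? = nil   // index into the processor's captureValues
}

struct ProcessorSketch {
  // Side table owned by the processor; the only place Any values live.
  var captureValues: [Any] = []
  var captures: [StoredCaptureSketch]

  init(captureCount: Int) {
    captures = Array(repeating: StoredCaptureSketch(), count: captureCount)
  }

  // Record a value capture by appending to the side table and storing
  // the new element's index in the (trivial) capture struct.
  mutating func registerValue(
    _ value: Any, range: Range<Int>, forCapture n: Int
  ) {
    captureValues.append(value)
    captures[n].range = range
    captures[n].valueIndex = captureValues.count - 1
  }

  // Look the value back up through the index, if one was recorded.
  func value(ofCapture n: Int) -> Any? {
    guard let i = captures[n].valueIndex else { return nil }
    return captureValues[i]
  }
}

var processor = ProcessorSketch(captureCount: 2)
processor.registerValue(42, range: 0..<2, forCapture: 1)
```

A real implementation would presumably also need to decide what happens to the side table when the processor backtracks; this sketch ignores that.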

Beyond this, we could also improve the struct's layout, but that's a much lower-impact change for further down the line.

@natecook1000
Member

Bulk closing old PRs.
