Skip to content

Conversation

@codeflash-ai
Copy link

@codeflash-ai codeflash-ai bot commented Oct 24, 2025

📄 53% (0.53x) speedup for _encode_error_event in src/deepgram/extensions/telemetry/proto_encoder.py

⏱️ Runtime : 3.87 milliseconds 2.53 milliseconds (best of 356 runs)

📝 Explanation and details

The optimized version achieves a 53% speedup through several key optimizations targeting the hot paths in protobuf encoding:

1. Single-byte varint caching: A precomputed cache _varint_single_byte_cache eliminates repeated bytearray allocations for values 0-127 (common in field numbers, booleans, small integers). This directly optimizes _varint() and _bool() functions.

2. List-based concatenation strategy: Both _map_str_str() and _encode_error_event() now use list accumulation with b"".join() instead of repeated bytearray += operations. This reduces memory copying overhead significantly when building large messages.

3. Local function reference optimization: In _map_str_str(), frequently called functions are cached as local variables (append = outs.append, ld = _len_delimited, s = _string) to avoid repeated attribute lookups in the inner loop.

Performance impact by test case:

  • Large-scale tests show the biggest gains: 61.6% faster for 1000 attributes, 62.4% faster for large maps with long strings
  • Small/medium tests: Generally neutral to slightly faster (1-8% improvements)
  • Edge cases: Slight variations but consistent correctness

The optimizations are most effective when encoding many map entries or building large messages, as evidenced by the dramatic improvements in tests with hundreds of attributes. For typical small error events, the overhead is minimal while maintaining the same significant benefits for high-throughput scenarios.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 41 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 2 Passed
📊 Tests Coverage 100.0%
🌀 Generated Regression Tests and Runtime
from __future__ import annotations import typing from typing import Dict # imports import pytest # used for our unit tests from deepgram.extensions.telemetry.proto_encoder import _encode_error_event # unit tests # Helper function to decode varint from bytes for test validation def decode_varint(data, offset=0): result = 0 shift = 0 pos = offset while True: b = data[pos] result |= ((b & 0x7F) << shift) pos += 1 if not (b & 0x80): break shift += 7 return result, pos # Helper function to extract field numbers and wire types from the encoded message def extract_fields(data): fields = [] i = 0 while i < len(data): key, next_i = decode_varint(data, i) field_number = key >> 3 wire_type = key & 0x7 fields.append((field_number, wire_type, i)) i = next_i # Length-delimited wire type if wire_type == 2: length, next_i = decode_varint(data, i) i = next_i + length # Varint wire type elif wire_type == 0: _, next_i = decode_varint(data, i) i = next_i # Other wire types not used in this proto else: raise ValueError("Unexpected wire type: %d" % wire_type) return fields # -------------------- # 1. Basic Test Cases # -------------------- def test_basic_minimal_fields(): # Only required fields, no optionals codeflash_output = _encode_error_event( err_type="TypeError", message="Something went wrong", severity=3, handled=True, ts=1620000000.0, attributes=None, ); result = codeflash_output # 9.00μs -> 9.33μs (3.52% slower) # Should contain fields: 1 (err_type), 2 (message), 7 (severity), 8 (handled), 9 (timestamp) fields = [f[0] for f in extract_fields(result)] def test_basic_all_fields_present(): # All fields present attrs = {"foo": "bar", "baz": "qux"} codeflash_output = _encode_error_event( err_type="ValueError", message="Invalid value", severity=2, handled=False, ts=1620000000.123456, attributes=attrs, stack_trace="traceback here", file="main.py", line=42, column=7, ); result = codeflash_output # 15.3μs -> 14.5μs (5.67% faster) fields = [f[0] for f in extract_fields(result)] for field_num in [1,2,3,4,5,6,7,8,9,10]: pass def test_basic_utf8_strings(): # UTF-8 strings in err_type/message/attributes codeflash_output = _encode_error_event( err_type="Ошибка", message="Произошла ошибка", severity=1, handled=True, ts=0.0, attributes={"ключ": "значение"}, ); result = codeflash_output # 10.1μs -> 9.82μs (2.78% faster) def test_basic_false_handled(): # handled=False should encode as varint 0 codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=False, ts=1.0, attributes=None, ); result = codeflash_output # 7.45μs -> 7.35μs (1.31% faster) # Find field 8, check that its value is 0 for field_num, wire_type, idx in extract_fields(result): if field_num == 8: val, _ = decode_varint(result, idx+1) break else: raise AssertionError("Field 8 not found") def test_basic_true_handled(): # handled=True should encode as varint 1 codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=1.0, attributes=None, ); result = codeflash_output # 7.06μs -> 7.16μs (1.42% slower) # Find field 8, check that its value is 1 for field_num, wire_type, idx in extract_fields(result): if field_num == 8: val, _ = decode_varint(result, idx+1) break else: raise AssertionError("Field 8 not found") def test_basic_severity_enum(): # Severity values should be encoded as given for sev in [1,2,3,4,10,0,255]: codeflash_output = _encode_error_event( err_type="E", message="M", severity=sev, handled=True, ts=1.0, attributes=None, ); result = codeflash_output # 29.9μs -> 27.5μs (8.40% faster) for field_num, wire_type, idx in extract_fields(result): if field_num == 7: val, _ = decode_varint(result, idx+1) break else: raise AssertionError("Field 7 not found") # -------------------- # 2. Edge Test Cases # -------------------- def test_edge_empty_strings(): # Empty strings should not encode fields for err_type/message codeflash_output = _encode_error_event( err_type="", message="", severity=1, handled=True, ts=1.0, attributes=None, ); result = codeflash_output # 5.33μs -> 5.89μs (9.36% slower) fields = [f[0] for f in extract_fields(result)] def test_edge_none_optional_fields(): # All optional fields as None codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=1.0, attributes=None, stack_trace=None, file=None, line=None, column=None, ); result = codeflash_output # 6.96μs -> 7.14μs (2.55% slower) fields = [f[0] for f in extract_fields(result)] def test_edge_zero_line_column(): # line=0 and column=0 should be encoded codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=1.0, attributes=None, line=0, column=0, ); result = codeflash_output # 8.00μs -> 7.92μs (0.985% faster) fields = [f[0] for f in extract_fields(result)] def test_edge_negative_line_column(): # Negative values for line/column should be encoded as unsigned varint codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=1.0, attributes=None, line=-1, column=-123, ); result = codeflash_output # 11.0μs -> 11.3μs (2.63% slower) # Should encode as unsigned 64-bit for field_num, wire_type, idx in extract_fields(result): if field_num == 5: val, _ = decode_varint(result, idx+1) if field_num == 6: val, _ = decode_varint(result, idx+1) def test_edge_empty_attributes(): # Empty dict for attributes should not encode field 10 codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=1.0, attributes={}, ); result = codeflash_output # 6.90μs -> 6.80μs (1.37% faster) fields = [f[0] for f in extract_fields(result)] def test_edge_large_timestamp_nanos(): # Timestamp with large nanos, e.g. ts=1.999999999 should roll over to next sec ts = 1.999999999 codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=ts, attributes=None, ); result = codeflash_output # 8.63μs -> 9.20μs (6.16% slower) # Extract timestamp message (field 9) for field_num, wire_type, idx in extract_fields(result): if field_num == 9: # Length-delimited length, next_i = decode_varint(result, idx+1) ts_msg = result[next_i:next_i+length] # Should encode sec=2, nanos=0 sec, pos = decode_varint(ts_msg, 1) # skip key for field 1 # Should not have field 2 (nanos) # If field 2 present, it must be 0 if len(ts_msg) > pos: key, next_pos = decode_varint(ts_msg, pos) field2 = key >> 3 nanos, _ = decode_varint(ts_msg, next_pos) break def test_edge_long_strings(): # Very long strings for err_type/message long_str = "x" * 1000 codeflash_output = _encode_error_event( err_type=long_str, message=long_str, severity=1, handled=True, ts=1.0, attributes=None, ); result = codeflash_output # 9.28μs -> 9.44μs (1.61% slower) def test_edge_attributes_special_chars(): # Attributes with special characters attrs = {"a\nb": "c\td", "e\u2603": "f\u20ac"} codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=1.0, attributes=attrs, ); result = codeflash_output # 11.1μs -> 10.3μs (7.91% faster) for k, v in attrs.items(): pass # -------------------- # 3. Large Scale Test Cases # -------------------- def test_large_many_attributes(): # Large number of attributes (up to 1000) attrs = {f"key{i}": f"value{i}" for i in range(1000)} codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=1.0, attributes=attrs, ); result = codeflash_output # 1.34ms -> 830μs (61.6% faster) # All keys/values should be present for i in [0, 499, 999]: pass # Should not take excessive time/memory def test_large_long_stack_trace(): # Large stack trace string stack = "\n".join([f"File 'file{i}.py', line {i}" for i in range(500)]) codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=1.0, attributes=None, stack_trace=stack, ); result = codeflash_output # 10.8μs -> 11.6μs (7.11% slower) def test_large_long_file_name(): # Very long file name fname = "a"*500 + ".py" codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=1.0, attributes=None, file=fname, ); result = codeflash_output # 9.07μs -> 9.41μs (3.56% slower) def test_large_extreme_severity(): # Severity at max int codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=2**31-1, handled=True, ts=1.0, attributes=None, ); result = codeflash_output # 8.26μs -> 8.49μs (2.63% slower) # Find field 7, check value for field_num, wire_type, idx in extract_fields(result): if field_num == 7: val, _ = decode_varint(result, idx+1) break def test_large_extreme_line_column(): # line/column at max int codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=1.0, attributes=None, line=2**31-1, column=2**31-1, ); result = codeflash_output # 9.38μs -> 9.71μs (3.32% slower) for field_num, wire_type, idx in extract_fields(result): if field_num == 5: val, _ = decode_varint(result, idx+1) if field_num == 6: val, _ = decode_varint(result, idx+1) def test_large_timestamp_high_precision(): # High precision timestamp ts = 1620000000.987654321 codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=ts, attributes=None, ); result = codeflash_output # 9.66μs -> 10.2μs (5.42% slower) # Extract timestamp message (field 9) for field_num, wire_type, idx in extract_fields(result): if field_num == 9: length, next_i = decode_varint(result, idx+1) ts_msg = result[next_i:next_i+length] sec, pos = decode_varint(ts_msg, 1) key, next_pos = decode_varint(ts_msg, pos) field2 = key >> 3 nanos, _ = decode_varint(ts_msg, next_pos) break def test_large_attributes_long_keys_values(): # Attributes with long keys and values attrs = {("k"*100): ("v"*100)} codeflash_output = _encode_error_event( err_type="TypeError", message="msg", severity=1, handled=True, ts=1.0, attributes=attrs, ); result = codeflash_output # 10.1μs -> 10.1μs (0.149% faster) def test_large_all_fields_maximal(): # All fields present and large attrs = {f"k{i}": "v"*50 for i in range(100)} codeflash_output = _encode_error_event( err_type="E"*100, message="M"*100, severity=4, handled=False, ts=9999999999.999999, attributes=attrs, stack_trace="S"*500, file="F"*500, line=2**31-1, column=2**31-1, ); result = codeflash_output # 152μs -> 98.9μs (54.4% faster) for i in [0, 99]: pass # codeflash_output is used to check that the output of the original code is the same as that of the optimized code. #------------------------------------------------ from __future__ import annotations import struct import time import typing from typing import Dict # imports import pytest from deepgram.extensions.telemetry.proto_encoder import _encode_error_event # unit tests # Helper to parse varint from bytes def parse_varint(data, offset=0): shift = 0 result = 0 while True: b = data[offset] result |= ((b & 0x7F) << shift) offset += 1 if not (b & 0x80): break shift += 7 return result, offset # Helper to parse key (field_number, wire_type) def parse_key(data, offset=0): key, offset = parse_varint(data, offset) field_number = key >> 3 wire_type = key & 0x7 return field_number, wire_type, offset # Helper to parse length-delimited field def parse_len_delimited(data, offset=0): length, offset = parse_varint(data, offset) payload = data[offset:offset+length] return payload, offset + length # Helper to parse a string field def parse_string_field(data, offset): payload, offset = parse_len_delimited(data, offset) return payload.decode('utf-8'), offset # Helper to parse timestamp message def parse_timestamp_message(data): offset = 0 sec = None nanos = 0 while offset < len(data): field_number, wire_type, offset = parse_key(data, offset) if field_number == 1 and wire_type == 0: sec, offset = parse_varint(data, offset) elif field_number == 2 and wire_type == 0: nanos, offset = parse_varint(data, offset) else: raise ValueError("Unexpected field in timestamp") return sec, nanos # BASIC TEST CASES def test_all_fields_present(): # All fields provided now = time.time() attrs = {"foo": "bar", "baz": "qux"} codeflash_output = _encode_error_event( err_type="ValueError", message="Something went wrong", severity=4, handled=False, ts=now, attributes=attrs, stack_trace="Traceback (most recent call last): ...", file="main.py", line=42, column=7 ); result = codeflash_output # 18.4μs -> 18.1μs (1.30% faster) # Check that all fields are present by their field numbers # We'll scan and collect field numbers offset = 0 found_fields = [] while offset < len(result): field, wire, next_offset = parse_key(result, offset) found_fields.append(field) offset = next_offset # Skip payloads if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) # Should contain all fields 1-10 (except 11+) # attributes is field 10, timestamp is 9, etc. for field in [1,2,3,4,5,6,7,8,9,10]: pass def test_optional_fields_none(): # All optional fields set to None codeflash_output = _encode_error_event( err_type="IOError", message="File not found", severity=2, handled=False, ts=42.0, attributes=None, stack_trace=None, file=None, line=None, column=None ); result = codeflash_output # 7.98μs -> 8.35μs (4.50% slower) # Only required fields should be present offset = 0 fields = [] while offset < len(result): field, wire, next_offset = parse_key(result, offset) fields.append(field) offset = next_offset if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) # Should not contain 3,4,5,6,10 for opt in [3,4,5,6,10]: pass def test_attributes_empty_dict(): # attributes is empty dict, should not emit field 10 codeflash_output = _encode_error_event( err_type="KeyError", message="Missing key", severity=1, handled=True, ts=0.0, attributes={} ); result = codeflash_output # 7.80μs -> 8.17μs (4.46% slower) # Scan for field 10 offset = 0 has_field_10 = False while offset < len(result): field, wire, next_offset = parse_key(result, offset) if field == 10: has_field_10 = True offset = next_offset if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) def test_stack_trace_and_file_present(): # stack_trace and file are present codeflash_output = _encode_error_event( err_type="RuntimeError", message="Oops", severity=3, handled=False, ts=1.5, attributes=None, stack_trace="stack", file="file.py" ); result = codeflash_output # 10.5μs -> 10.4μs (0.929% faster) # Should contain fields 3 and 4 offset = 0 fields = [] while offset < len(result): field, wire, next_offset = parse_key(result, offset) fields.append(field) offset = next_offset if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) # EDGE TEST CASES def test_empty_strings_and_zero_values(): # err_type and message are empty strings, severity 0, handled False, ts 0, attributes None codeflash_output = _encode_error_event( err_type="", message="", severity=0, handled=False, ts=0.0, attributes=None ); result = codeflash_output # 6.10μs -> 6.64μs (8.12% slower) # Should skip fields 1 and 2 (empty strings) offset = 0 fields = [] while offset < len(result): field, wire, next_offset = parse_key(result, offset) fields.append(field) offset = next_offset if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) def test_negative_line_and_column(): # Negative line and column numbers should be encoded as unsigned varint (two's complement) codeflash_output = _encode_error_event( err_type="TypeError", message="Negative line/col", severity=1, handled=True, ts=1.0, attributes=None, line=-1, column=-42 ); result = codeflash_output # 11.3μs -> 11.8μs (4.22% slower) # Find and parse line and column fields offset = 0 found = {} while offset < len(result): field, wire, next_offset = parse_key(result, offset) if field == 5 and wire == 0: val, offset2 = parse_varint(result, next_offset) found['line'] = val offset = offset2 elif field == 6 and wire == 0: val, offset2 = parse_varint(result, next_offset) found['column'] = val offset = offset2 else: offset = next_offset if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) def test_large_severity_and_timestamp(): # Large values for severity and timestamp big_severity = 2**31 big_ts = 2**40 + 0.123456789 codeflash_output = _encode_error_event( err_type="BigError", message="Big numbers", severity=big_severity, handled=False, ts=big_ts, attributes=None ); result = codeflash_output # 10.6μs -> 11.1μs (4.97% slower) # Find severity and timestamp offset = 0 found_severity = None found_sec = None found_nanos = None while offset < len(result): field, wire, next_offset = parse_key(result, offset) if field == 7 and wire == 0: found_severity, offset = parse_varint(result, next_offset) elif field == 9 and wire == 2: payload, offset = parse_len_delimited(result, next_offset) sec, nanos = parse_timestamp_message(payload) found_sec = sec found_nanos = nanos else: offset = next_offset if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) def test_unicode_and_non_ascii_strings(): # Unicode in err_type, message, stack_trace, file, attributes codeflash_output = _encode_error_event( err_type="Ошибка", message="Произошла ошибка", severity=2, handled=True, ts=1000.0, attributes={"ключ": "значение", "emoji": "😀"}, stack_trace="Трассировка", file="файл.py" ); result = codeflash_output # 13.9μs -> 12.9μs (7.38% faster) # Parse and check that all strings decode as utf-8 offset = 0 seen_unicode = False while offset < len(result): field, wire, next_offset = parse_key(result, offset) if wire == 2: payload, offset2 = parse_len_delimited(result, next_offset) try: # Try decoding as utf-8 payload.decode('utf-8') seen_unicode = True except UnicodeDecodeError: pass offset = offset2 elif wire == 0: _, offset = parse_varint(result, next_offset) def test_empty_attributes_and_stack_trace(): # attributes is empty dict, stack_trace is empty string codeflash_output = _encode_error_event( err_type="Test", message="Empty fields", severity=1, handled=True, ts=0.0, attributes={}, stack_trace="" ); result = codeflash_output # 7.37μs -> 7.69μs (4.23% slower) # stack_trace should not be present (empty string) offset = 0 fields = [] while offset < len(result): field, wire, next_offset = parse_key(result, offset) fields.append(field) offset = next_offset if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) # LARGE SCALE TEST CASES def test_large_attributes_map(): # attributes map with 999 entries attrs = {f"key{i}": f"value{i}" for i in range(999)} codeflash_output = _encode_error_event( err_type="OverflowError", message="Too many attributes", severity=3, handled=True, ts=123.456, attributes=attrs ); result = codeflash_output # 1.35ms -> 830μs (62.4% faster) # Count number of field 10 entries in the output offset = 0 count_10 = 0 while offset < len(result): field, wire, next_offset = parse_key(result, offset) if field == 10: count_10 += 1 offset = next_offset if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) def test_long_strings(): # Very long err_type, message, stack_trace, file long_str = "A" * 1000 codeflash_output = _encode_error_event( err_type=long_str, message=long_str, severity=1, handled=False, ts=0.0, attributes=None, stack_trace=long_str, file=long_str ); result = codeflash_output # 11.8μs -> 12.4μs (4.17% slower) # Check that the length of each string field is 1000 offset = 0 lengths = [] while offset < len(result): field, wire, next_offset = parse_key(result, offset) if field in [1,2,3,4] and wire == 2: payload, offset2 = parse_len_delimited(result, next_offset) lengths.append(len(payload)) offset = offset2 else: offset = next_offset if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) def test_many_optional_fields(): # All optional fields, with line/column at upper bound codeflash_output = _encode_error_event( err_type="Test", message="Test", severity=1, handled=True, ts=0.0, attributes=None, stack_trace="x", file="y", line=2**63-1, column=2**63-1 ); result = codeflash_output # 11.7μs -> 12.0μs (2.18% slower) # Parse and check line and column offset = 0 found = {} while offset < len(result): field, wire, next_offset = parse_key(result, offset) if field == 5 and wire == 0: val, offset2 = parse_varint(result, next_offset) found['line'] = val offset = offset2 elif field == 6 and wire == 0: val, offset2 = parse_varint(result, next_offset) found['column'] = val offset = offset2 else: offset = next_offset if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) def test_performance_with_large_map_and_long_strings(): # Large attributes and long message attrs = {f"k{i}": "v"*50 for i in range(500)} long_msg = "x" * 900 codeflash_output = _encode_error_event( err_type="E", message=long_msg, severity=2, handled=False, ts=999.999, attributes=attrs ); result = codeflash_output # 686μs -> 422μs (62.4% faster) # Check that all 500 attributes are present and message is correct length offset = 0 attr_count = 0 msg_len = None while offset < len(result): field, wire, next_offset = parse_key(result, offset) if field == 2 and wire == 2: payload, offset2 = parse_len_delimited(result, next_offset) msg_len = len(payload) offset = offset2 elif field == 10 and wire == 2: attr_count += 1 _, offset = parse_len_delimited(result, next_offset) else: offset = next_offset if wire == 2: _, offset = parse_len_delimited(result, offset) elif wire == 0: _, offset = parse_varint(result, offset) # codeflash_output is used to check that the output of the original code is the same as that of the optimized code. #------------------------------------------------ from deepgram.extensions.telemetry.proto_encoder import _encode_error_event def test__encode_error_event(): _encode_error_event(err_type='\x00', message='\x00', severity=0, handled=True, ts=0.0, attributes={'\U00040000\x00': ''}, stack_trace=None, file='\U00040000', line=0, column=128) def test__encode_error_event_2(): _encode_error_event(err_type='', message='', severity=0, handled=False, ts=0.0, attributes={}, stack_trace='𐀀𐀀𐀀', file='', line=None, column=0)
🔎 Concolic Coverage Tests and Runtime
Test File::Test Function Original ⏱️ Optimized ⏱️ Speedup
codeflash_concolic_7zeygj7s/tmpw2vaginn/test_concolic_coverage.py::test__encode_error_event 13.2μs 13.2μs 0.417%✅
codeflash_concolic_7zeygj7s/tmpw2vaginn/test_concolic_coverage.py::test__encode_error_event_2 7.30μs 7.53μs -3.09%⚠️

To edit these changes git checkout codeflash/optimize-_encode_error_event-mh4jvqtd and push.

Codeflash

The optimized version achieves a **53% speedup** through several key optimizations targeting the hot paths in protobuf encoding: **1. Single-byte varint caching**: A precomputed cache `_varint_single_byte_cache` eliminates repeated bytearray allocations for values 0-127 (common in field numbers, booleans, small integers). This directly optimizes `_varint()` and `_bool()` functions. **2. List-based concatenation strategy**: Both `_map_str_str()` and `_encode_error_event()` now use list accumulation with `b"".join()` instead of repeated `bytearray +=` operations. This reduces memory copying overhead significantly when building large messages. **3. Local function reference optimization**: In `_map_str_str()`, frequently called functions are cached as local variables (`append = outs.append`, `ld = _len_delimited`, `s = _string`) to avoid repeated attribute lookups in the inner loop. **Performance impact by test case**: - **Large-scale tests show the biggest gains**: 61.6% faster for 1000 attributes, 62.4% faster for large maps with long strings - **Small/medium tests**: Generally neutral to slightly faster (1-8% improvements) - **Edge cases**: Slight variations but consistent correctness The optimizations are most effective when encoding many map entries or building large messages, as evidenced by the dramatic improvements in tests with hundreds of attributes. For typical small error events, the overhead is minimal while maintaining the same significant benefits for high-throughput scenarios.
@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 24, 2025 07:49
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash

1 participant