Skip to content

Conversation

@SyMind
Copy link
Member

@SyMind SyMind commented Oct 25, 2025

No description provided.

Copilot AI review requested due to automatic review settings October 25, 2025 00:46
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the rope data structure to use BTreeMap<usize, &str> instead of Vec<(&str, usize)> for internal storage, aiming to improve performance characteristics. The key change is storing string chunks indexed by their starting byte position in a BTreeMap, which provides better lookup performance for operations like byte indexing and slicing.

Key changes:

  • Replaced Vec<(&str, usize)> with BTreeMap<usize, &str> in the Repr::Full variant
  • Updated all methods to work with the new BTreeMap structure, swapping tuple order from (chunk, position) to (position, chunk)
  • Refactored the Lines iterator implementation to use BTreeMap iterators instead of vector indexing

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/rope.rs Outdated
let pos = byte_index - start_pos;
Some(s.as_bytes()[pos])
Repr::Full(tree) => {
if let Some((start_pos, chunk)) = tree.range(byte_index..).next()
Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The range query should use ..=byte_index instead of byte_index.. to find the chunk containing the byte at byte_index. Currently, this looks for chunks starting at or after byte_index, but we need the chunk whose starting position is less than or equal to byte_index.

Suggested change
if let Some((start_pos, chunk)) = tree.range(byte_index..).next()
if let Some((start_pos, chunk)) = tree.range(..=byte_index).next_back()
Copilot uses AI. Check for mistakes.
Comment on lines +420 to +474
todo!();
// // [start_chunk
// let start_chunk_index = data
// .binary_search_by(|(_, start_pos)| start_pos.cmp(&start_range))
// .unwrap_or_else(|insert_pos| insert_pos.saturating_sub(1));

// // end_chunk)
// let end_chunk_index = data
// .binary_search_by(|(chunk, start_pos)| {
// let end_pos = start_pos + chunk.len(); // exclusive
// end_pos.cmp(&end_range)
// })
// .unwrap_or_else(|insert_pos| insert_pos);

// // same chunk
// if start_chunk_index == end_chunk_index {
// // SAFETY: start_chunk_index guarantees valid range
// let (chunk, start_pos) =
// unsafe { data.get_unchecked(start_chunk_index) };
// let start = start_range - start_pos;
// let end = end_range - start_pos;
// // SAFETY: invariant guarantees valid range
// return Rope::from(unsafe { chunk.get_unchecked(start..end) });
// }

// if end_chunk_index < start_chunk_index {
// return Rope::new();
// }

// let mut raw =
// Vec::with_capacity(end_chunk_index - start_chunk_index + 1);
// let mut len = 0;

// // different chunk
// // [start_chunk, end_chunk]
// (start_chunk_index..end_chunk_index + 1).for_each(|i| {
// // SAFETY: [start_chunk_index, end_chunk_index] guarantees valid range
// let (chunk, start_pos) = unsafe { data.get_unchecked(i) };

// if start_chunk_index == i {
// let start = start_range - start_pos;
// // SAFETY: invariant guarantees valid range
// let chunk = unsafe { chunk.get_unchecked(start..) };
// raw.push((chunk, len));
// len += chunk.len();
// } else if end_chunk_index == i {
// let end = end_range - start_pos;
// // SAFETY: invariant guarantees valid range
// let chunk = unsafe { chunk.get_unchecked(..end) };
// raw.push((chunk, len));
// len += chunk.len();
// } else {
// raw.push((chunk, len));
// len += chunk.len();
// }
// });

// Rope {
// repr: Repr::Full(Rc::new(raw)),
// }
Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unsafe slice_unchecked method is not implemented. This leaves critical functionality incomplete and will panic at runtime when called.

Suggested change
todo!();
// // [start_chunk
// let start_chunk_index = data
// .binary_search_by(|(_, start_pos)| start_pos.cmp(&start_range))
// .unwrap_or_else(|insert_pos| insert_pos.saturating_sub(1));
// // end_chunk)
// let end_chunk_index = data
// .binary_search_by(|(chunk, start_pos)| {
// let end_pos = start_pos + chunk.len(); // exclusive
// end_pos.cmp(&end_range)
// })
// .unwrap_or_else(|insert_pos| insert_pos);
// // same chunk
// if start_chunk_index == end_chunk_index {
// // SAFETY: start_chunk_index guarantees valid range
// let (chunk, start_pos) =
// unsafe { data.get_unchecked(start_chunk_index) };
// let start = start_range - start_pos;
// let end = end_range - start_pos;
// // SAFETY: invariant guarantees valid range
// return Rope::from(unsafe { chunk.get_unchecked(start..end) });
// }
// if end_chunk_index < start_chunk_index {
// return Rope::new();
// }
// let mut raw =
// Vec::with_capacity(end_chunk_index - start_chunk_index + 1);
// let mut len = 0;
// // different chunk
// // [start_chunk, end_chunk]
// (start_chunk_index..end_chunk_index + 1).for_each(|i| {
// // SAFETY: [start_chunk_index, end_chunk_index] guarantees valid range
// let (chunk, start_pos) = unsafe { data.get_unchecked(i) };
// if start_chunk_index == i {
// let start = start_range - start_pos;
// // SAFETY: invariant guarantees valid range
// let chunk = unsafe { chunk.get_unchecked(start..) };
// raw.push((chunk, len));
// len += chunk.len();
// } else if end_chunk_index == i {
// let end = end_range - start_pos;
// // SAFETY: invariant guarantees valid range
// let chunk = unsafe { chunk.get_unchecked(..end) };
// raw.push((chunk, len));
// len += chunk.len();
// } else {
// raw.push((chunk, len));
// len += chunk.len();
// }
// });
// Rope {
// repr: Repr::Full(Rc::new(raw)),
// }
// Find the start and end chunk indices
let mut start_chunk_index = None;
let mut end_chunk_index = None;
let mut chunk_starts: Vec<_> = data.keys().cloned().collect();
chunk_starts.sort_unstable();
for (i, &pos) in chunk_starts.iter().enumerate() {
if start_chunk_index.is_none() && pos <= start_range {
start_chunk_index = Some(i);
}
if end_chunk_index.is_none() && pos + data[&pos].len() > end_range {
end_chunk_index = Some(i);
}
}
// If not found, set to last/first as appropriate
let start_chunk_index = start_chunk_index.unwrap_or(0);
let end_chunk_index = end_chunk_index.unwrap_or(chunk_starts.len().saturating_sub(1));
if end_chunk_index < start_chunk_index {
return Rope::from("");
}
// If the slice is within a single chunk
if start_chunk_index == end_chunk_index {
let start_pos = chunk_starts[start_chunk_index];
let chunk = data.get(&start_pos).unwrap();
let start = start_range - start_pos;
let end = end_range - start_pos;
// SAFETY: invariant guarantees valid range
return Rope::from(unsafe { chunk.get_unchecked(start..end) });
}
// Otherwise, collect substrings from multiple chunks
let mut raw = BTreeMap::new();
let mut len = 0;
for i in start_chunk_index..=end_chunk_index {
let start_pos = chunk_starts[i];
let chunk = data.get(&start_pos).unwrap();
if i == start_chunk_index {
let start = start_range - start_pos;
// SAFETY: invariant guarantees valid range
let sub = unsafe { chunk.get_unchecked(start..) };
raw.insert(len, sub);
len += sub.len();
} else if i == end_chunk_index {
let end = end_range - start_pos;
// SAFETY: invariant guarantees valid range
let sub = unsafe { chunk.get_unchecked(..end) };
raw.insert(len, sub);
len += sub.len();
} else {
raw.insert(len, chunk);
len += chunk.len();
}
}
Rope {
repr: Repr::Full(Rc::new(raw)),
}
Copilot uses AI. Check for mistakes.
src/rope.rs Outdated
let start_key = tree.range(..=start_range).next_back().map(|(k, _)| *k);
let mut iter = match start_key {
Some(k) => tree.range(k..),
None => tree.range(start_range..),
Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When start_key is None, the fallback tree.range(start_range..) will skip all chunks with keys less than start_range. This is incorrect because if no key is <= start_range, the tree is empty or all keys are greater, and we should start from the beginning of the tree with tree.range(..) or return early if empty.

Suggested change
None => tree.range(start_range..),
None => tree.range(..),
Copilot uses AI. Check for mistakes.
}

true
}
Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing return true; statement at the end of the equality comparison. The function should return true when all chunks have been successfully compared, but currently falls through without returning a value.

Suggested change
}
// If the loop terminates, all chunks matched.
true
Copilot uses AI. Check for mistakes.
Comment on lines +605 to +604
if self.ended {
return None;
} else if byte_idx == total_bytes {
if trailing_line_break_as_newline {
*ended = true;
} else if self.byte_idx == self.total_bytes {
if self.trailing_line_break_as_newline {
self.ended = true;
Copy link

Copilot AI Oct 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The use of self.field instead of the existing pattern match bindings (*ended, *byte_idx) is inconsistent within the same match arm. Either use the pattern bindings consistently or access fields directly throughout.

Copilot uses AI. Check for mistakes.
@codspeed-hq
Copy link

codspeed-hq bot commented Oct 25, 2025

CodSpeed Performance Report

Merging #180 will degrade performances by 54.98%

Comparing SyMind:perf-rope (dcb4cec) with main (faf2138)

Summary

❌ 1 regression
✅ 8 untouched
⏩ 4 skipped1

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
repetitive_react_components_source 634.5 µs 1,409.2 µs -54.98%

Footnotes

  1. 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@SyMind SyMind force-pushed the perf-rope branch 3 times, most recently from 9ade21c to 0cdd63e Compare October 27, 2025 08:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant