Skip to content
Merged
Prev Previous commit
Next Next commit
Index enumeration by minimally sized type
  • Loading branch information
varkor committed Mar 16, 2018
commit bdcc6f939a10bc23a434b2ef301238650841dec9
37 changes: 25 additions & 12 deletions src/liballoc/slice.rs
Original file line number Diff line number Diff line change
Expand Up @@ -102,6 +102,7 @@ use core::mem::size_of;
use core::mem;
use core::ptr;
use core::slice as core_slice;
use core::{u8, u16, u32};

use borrow::{Borrow, BorrowMut, ToOwned};
use boxed::Box;
Expand Down Expand Up @@ -1329,19 +1330,31 @@ impl<T> [T] {
pub fn sort_by_key<K, F>(&mut self, f: F)
where F: FnMut(&T) -> K, K: Ord
{
let mut indices: Vec<_> = self.iter().map(f).enumerate().map(|(i, k)| (k, i)).collect();
// The elements of `indices` are unique, as they are indexed, so any sort will be stable
// with respect to the original slice. We use `sort_unstable` here because it requires less
// memory allocation.
indices.sort_unstable();
for i in 0..self.len() {
let mut index = indices[i].1;
while index < i {
index = indices[index].1;
}
indices[i].1 = index;
self.swap(i, index);
// Helper macro for indexing our vector by the smallest possible type, to reduce allocation.
macro_rules! sort_by_key {
($t:ty, $slice:ident, $f:ident) => ({
let mut indices: Vec<_> =
Copy link

@ghost ghost Mar 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to be able to use SmallVec here and save an allocation when sorting short slices.

Just a suggestion, though. Maybe leave that for a future PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea; I might leave that for a future PR though, yeah.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than pulling in a full crate (which is mostly about encapsulating stuff in a single type that can be moved as a unit), you can reproduce that functionality with a couple of stack variables. Something like:

pub fn foo<T>(s: &[T], f: &Fn(&T) -> T) { let iter = s.iter().map(f).enumerate().map(|(i, k)| (k, i as u16)); let mut vec: Vec<_>; let mut array: [_; ARRAY_LEN]; const ARRAY_LEN: usize = 32; let len = iter.len(); let mapped = if len <= ARRAY_LEN { array = unsafe { std::mem::uninitialized() }; for (x, slot) in iter.zip(&mut array) { *slot = x } &mut array[..len] } else { vec = iter.collect(); &mut vec[..] }; // … }
Copy link
Contributor

@bluss bluss Mar 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Just a caveat, that's the gist of it and the array code needs to take more precautions to be safe.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bluss Oh, do you mean panic-safe? Yes, I forgot about that, the array should be inside a ManuallyDrop.

Copy link
Contributor

@bluss bluss Mar 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And use ptr::write or the equivalent with a ManuallyDrop around each element, that's the two things I could spot.

$slice.iter().map($f).enumerate().map(|(i, k)| (k, i as $t)).collect();
// The elements of `indices` are unique, as they are indexed, so any sort will be
// stable with respect to the original slice. We use `sort_unstable` here because it
// requires less memory allocation.
indices.sort_unstable();
for i in 0..$slice.len() {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need some kind of proof (even an informal one would be okay) that this loop takes linear time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use the pictures in this article.

I'm having a harder time proving it is correct rather than linear time, but it looks good to me.

It should be linear time because the inner while loop is walking one of those cycles (illustrated in the article after “will create the following set of cycles:”).

Whenever we walked k + 1 links in a cycle, the indices[i].1 = index write means that we shorten the cycle by k links. (All the now-unreachable links have an index that is < i, so they are not going to be used again.)

That means that the inner while loop's body can only be visited O(self.len()) times.

let mut index = indices[i].1;
while (index as usize) < i {
index = indices[index as usize].1;
}
indices[i].1 = index;
$slice.swap(i, index as usize);
}
})
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the time complexity of this for loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I was too concerned about correctness to notice that I might have slipped up with the time complexity! I think this is quadratic in the worst case, given some thought. I'll switch to the one you suggested — that seems like a safer choice in any respect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think I can easily modify my method to reduce it to linear. It also seems to have a lower constant factor overhead than the one in the article, so a win both ways :)


let len = self.len();
if len <= ( u8::MAX as usize) { return sort_by_key!( u8, self, f) }
if len <= (u16::MAX as usize) { return sort_by_key!(u16, self, f) }
if len <= (u32::MAX as usize) { return sort_by_key!(u32, self, f) }
sort_by_key!(usize, self, f)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we have four cases, and each one will use a separate monomorphized version of sort_unstable. This creates some code bloat, but the opportunity of choosing the most suitable version at runtime is hopefully worth it (any benchmarks?).

However, my feeling is that in many cases size_of::<(K, u8)>() will be equal to size_of::<(K, usize)>() anyway so it doesn't matter which index type we choose.

Let's add another set of conditions to these ifs to remove the code bloat whenever possible, like this:

let size_u8 = mem::size_of::<(K, u8)>(); let size_u16 = mem::size_of::<(K, u16)>(); let size_u32 = mem::size_of::<(K, u32)>(); let size_usize = mem::size_of::<(K, usize)>(); if size_u8 < size_u16 && len <= ( u8::MAX as usize) { ... } if size_u16 < size_u32 && len <= (u16::MAX as usize) { ... } if size_u32 < size_usize && len <= (u32::MAX as usize) { ... }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did check that modifying the sizes like this had a positive effect, though I don't think I kept the results now; I can do more later if you'd like to see them.

}

/// Sorts the slice, but may not preserve the order of equal elements.
Expand Down