Vectors smaller than 32 allocate or reuse small arrays when possible #7743

joshlemer · 2019-02-12T22:19:42Z

One critique often made against Vectors is that small Vectors consume a lot of memory (see Li Haoyi's post here). Before this PR, Vectors with sizes in the range [1, 31] would always round up and allocate an Array of size 32. Now, when creating a Vector with a known size < 32, an Array of exactly that size is allocated.

Also, when creating a Vector from varargs of AnyRefs (collection.immutable.ArraySeq[T <: AnyRef]) of size <= 32, we reuse the underlying Array[AnyRef].

Performance of creating a Vector from varargs is greatly improved, for both AnyRef and AnyVal types.
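To make the two cases above concrete, here is a minimal sketch of the leaf-array choice. It is not the code in this PR: `SmallLeafSketch`, `oldLeaf` and `newLeaf` are hypothetical names, and it only handles collections that fit in a single 32-slot block.

```scala
import scala.collection.immutable.ArraySeq

object SmallLeafSketch {
  val BlockSize = 32

  // Behaviour described for the old code: always a full 32-slot leaf.
  def oldLeaf(elems: Iterable[Any]): Array[AnyRef] = {
    require(elems.size <= BlockSize, "sketch only covers a single block")
    val leaf = new Array[AnyRef](BlockSize)
    var i = 0
    elems.foreach { e => leaf(i) = e.asInstanceOf[AnyRef]; i += 1 }
    leaf
  }

  // Behaviour described for the new code (illustrative only).
  def newLeaf(elems: Iterable[Any]): Array[AnyRef] = elems match {
    // Varargs of AnyRefs arrive as ArraySeq.ofRef: reuse the backing array.
    case refs: ArraySeq.ofRef[_] if refs.length <= BlockSize =>
      refs.unsafeArray.asInstanceOf[Array[AnyRef]]
    // Known size below 32: allocate exactly that many slots.
    case _ =>
      val n = elems.knownSize
      require(n >= 0 && n < BlockSize, "sketch only covers small collections of known size")
      val leaf = new Array[AnyRef](n)
      var i = 0
      elems.foreach { e => leaf(i) = e.asInstanceOf[AnyRef]; i += 1 }
      leaf
  }
}
```

Reusing the varargs array is only safe because neither the ArraySeq nor the resulting Vector ever writes to it again.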

Benchmarks

Code:

@Benchmark def apply5String(bh: Blackhole): Unit = {
  bh.consume(Vector("1", "2", "3", "4", "5"))
}

@Benchmark def apply5Int(bh: Blackhole): Unit = {
  bh.consume(Vector(1, 2, 3, 4, 5))
}

@Benchmark def apply5StringOld(bh: Blackhole): Unit = {
  bh.consume(Vector.applyOld("1", "2", "3", "4", "5"))
}

@Benchmark def apply5IntOld(bh: Blackhole): Unit = {
  bh.consume(Vector.applyOld(1, 2, 3, 4, 5))
}

GC-profiled benchmark:

[info] Benchmark Mode Cnt Score Error Units
[info] VectorBenchmark.apply5Int avgt 10 37.862 ± 0.941 ns/op
[info] VectorBenchmark.apply5Int:·gc.alloc.rate avgt 10 3223.910 ± 77.807 MB/sec
[info] VectorBenchmark.apply5Int:·gc.alloc.rate.norm avgt 10 192.000 ± 0.001 B/op
[info] VectorBenchmark.apply5Int:·gc.churn.PS_Eden_Space avgt 10 3222.198 ± 273.634 MB/sec
[info] VectorBenchmark.apply5Int:·gc.churn.PS_Eden_Space.norm avgt 10 191.965 ± 17.591 B/op
[info] VectorBenchmark.apply5Int:·gc.churn.PS_Survivor_Space avgt 10 0.110 ± 0.071 MB/sec
[info] VectorBenchmark.apply5Int:·gc.churn.PS_Survivor_Space.norm avgt 10 0.007 ± 0.004 B/op
[info] VectorBenchmark.apply5Int:·gc.count avgt 10 106.000 counts
[info] VectorBenchmark.apply5Int:·gc.time avgt 10 83.000 ms
[info] VectorBenchmark.apply5IntOld avgt 10 73.371 ± 0.792 ns/op
[info] VectorBenchmark.apply5IntOld:·gc.alloc.rate avgt 10 2979.859 ± 32.103 MB/sec
[info] VectorBenchmark.apply5IntOld:·gc.alloc.rate.norm avgt 10 344.000 ± 0.001 B/op
[info] VectorBenchmark.apply5IntOld:·gc.churn.PS_Eden_Space avgt 10 2984.372 ± 249.212 MB/sec
[info] VectorBenchmark.apply5IntOld:·gc.churn.PS_Eden_Space.norm avgt 10 344.551 ± 29.454 B/op
[info] VectorBenchmark.apply5IntOld:·gc.churn.PS_Survivor_Space avgt 10 0.100 ± 0.061 MB/sec
[info] VectorBenchmark.apply5IntOld:·gc.churn.PS_Survivor_Space.norm avgt 10 0.012 ± 0.007 B/op
[info] VectorBenchmark.apply5IntOld:·gc.count avgt 10 106.000 counts
[info] VectorBenchmark.apply5IntOld:·gc.time avgt 10 80.000 ms
[info] VectorBenchmark.apply5String avgt 10 10.452 ± 0.080 ns/op
[info] VectorBenchmark.apply5String:·gc.alloc.rate avgt 10 5835.117 ± 45.558 MB/sec
[info] VectorBenchmark.apply5String:·gc.alloc.rate.norm avgt 10 96.000 ± 0.001 B/op
[info] VectorBenchmark.apply5String:·gc.churn.PS_Eden_Space avgt 10 5730.019 ± 452.277 MB/sec
[info] VectorBenchmark.apply5String:·gc.churn.PS_Eden_Space.norm avgt 10 94.270 ± 7.385 B/op
[info] VectorBenchmark.apply5String:·gc.churn.PS_Survivor_Space avgt 10 0.104 ± 0.049 MB/sec
[info] VectorBenchmark.apply5String:·gc.churn.PS_Survivor_Space.norm avgt 10 0.002 ± 0.001 B/op
[info] VectorBenchmark.apply5String:·gc.count avgt 10 112.000 counts
[info] VectorBenchmark.apply5String:·gc.time avgt 10 91.000 ms
[info] VectorBenchmark.apply5StringOld avgt 10 71.918 ± 1.024 ns/op
[info] VectorBenchmark.apply5StringOld:·gc.alloc.rate avgt 10 3110.849 ± 44.149 MB/sec
[info] VectorBenchmark.apply5StringOld:·gc.alloc.rate.norm avgt 10 352.000 ± 0.001 B/op
[info] VectorBenchmark.apply5StringOld:·gc.churn.PS_Eden_Space avgt 10 3125.554 ± 234.281 MB/sec
[info] VectorBenchmark.apply5StringOld:·gc.churn.PS_Eden_Space.norm avgt 10 353.583 ± 23.029 B/op
[info] VectorBenchmark.apply5StringOld:·gc.churn.PS_Survivor_Space avgt 10 0.100 ± 0.063 MB/sec
[info] VectorBenchmark.apply5StringOld:·gc.churn.PS_Survivor_Space.norm avgt 10 0.011 ± 0.007 B/op
[info] VectorBenchmark.apply5StringOld:·gc.count avgt 10 119.000 counts
[info] VectorBenchmark.apply5StringOld:·gc.time avgt 10 88.000 ms
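For a quick read of the numbers above, this small sketch derives speedup and per-operation allocation savings from the reported scores (ns/op and gc.alloc.rate.norm). `SpeedupSketch` and `Row` are hypothetical helper names; the figures are copied from the benchmark output, and they work out to roughly 1.9x (Int) and 6.9x (String) faster, with 152 and 256 B/op less allocation respectively.

```scala
object SpeedupSketch {
  // One row per benchmark pair: old vs. new time and normalized allocation.
  final case class Row(name: String, oldNs: Double, newNs: Double,
                       oldBytes: Double, newBytes: Double) {
    def speedup: Double = oldNs / newNs          // how many times faster the new code is
    def bytesSaved: Double = oldBytes - newBytes // B/op no longer allocated
  }

  val rows: List[Row] = List(
    Row("apply5Int", 73.371, 37.862, 344.0, 192.0),
    Row("apply5String", 71.918, 10.452, 352.0, 96.0)
  )

  def main(args: Array[String]): Unit =
    rows.foreach { r =>
      println(f"${r.name}%-14s ${r.speedup}%.2fx faster, ${r.bytesSaved}%.0f B/op fewer")
    }
}
```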

Edit:
Some benchmarks for Vector.from(ArrayBuffer):

val ab = collection.mutable.ArrayBuffer(1, 2, 3, 4, 5)

@Benchmark def from5Old(bh: Blackhole): Unit = {
  bh.consume(Vector.fromOld(ab))
}

@Benchmark def from5(bh: Blackhole): Unit = {
  bh.consume(Vector.from(ab))
}

[info] Benchmark Mode Cnt Score Error Units
[info] VectorBenchmark.from5 avgt 10 39.882 ± 0.785 ns/op
[info] VectorBenchmark.from5:·gc.alloc.rate avgt 10 2295.068 ± 46.150 MB/sec
[info] VectorBenchmark.from5:·gc.alloc.rate.norm avgt 10 144.000 ± 0.001 B/op
[info] VectorBenchmark.from5:·gc.churn.PS_Eden_Space avgt 10 2325.590 ± 232.257 MB/sec
[info] VectorBenchmark.from5:·gc.churn.PS_Eden_Space.norm avgt 10 145.898 ± 13.829 B/op
[info] VectorBenchmark.from5:·gc.churn.PS_Survivor_Space avgt 10 0.106 ± 0.052 MB/sec
[info] VectorBenchmark.from5:·gc.churn.PS_Survivor_Space.norm avgt 10 0.007 ± 0.003 B/op
[info] VectorBenchmark.from5:·gc.count avgt 10 104.000 counts
[info] VectorBenchmark.from5:·gc.time avgt 10 94.000 ms
[info] VectorBenchmark.from5Old avgt 10 68.592 ± 1.548 ns/op
[info] VectorBenchmark.from5Old:·gc.alloc.rate avgt 10 2743.165 ± 61.659 MB/sec
[info] VectorBenchmark.from5Old:·gc.alloc.rate.norm avgt 10 296.000 ± 0.001 B/op
[info] VectorBenchmark.from5Old:·gc.churn.PS_Eden_Space avgt 10 2717.767 ± 422.149 MB/sec
[info] VectorBenchmark.from5Old:·gc.churn.PS_Eden_Space.norm avgt 10 293.202 ± 44.510 B/op
[info] VectorBenchmark.from5Old:·gc.churn.PS_Survivor_Space avgt 10 0.090 ± 0.052 MB/sec
[info] VectorBenchmark.from5Old:·gc.churn.PS_Survivor_Space.norm avgt 10 0.010 ± 0.005 B/op
[info] VectorBenchmark.from5Old:·gc.count avgt 10 86.000 counts
[info] VectorBenchmark.from5Old:·gc.time avgt 10 79.000 ms

src/library/scala/collection/immutable/Vector.scala

joshlemer · 2019-02-14T20:32:07Z

Unfortunately I am seeing about a 40% slowdown in this code:

@Benchmark def appended0To33(bh: Blackhole): Unit = {
  var v: Vector[Int] = Vector.empty[Int]
  var i = 0
  while (i < 33) {
    v = v.appended(i)
    i += 1
  }
  bh.consume(v)
}

Hoping to avoid that...

joshlemer · 2019-02-14T20:43:18Z

❗ (Tentatively) fixed it: turned that around into a 50-70% speedup over 2.13.x, and on par with ArraySeq.

// this branch
[info] Benchmark Mode Cnt Score Error Units
[info] VectorBenchmark.appended0To32 avgt 10 853.292 ± 17.276 ns/op
[info] VectorBenchmark.appended0To32ArraySeq avgt 10 813.642 ± 204.851 ns/op
[info] VectorBenchmark.appended0To33 avgt 10 942.780 ± 8.932 ns/op

// 2.13.x
[info] Benchmark Mode Cnt Score Error Units
[info] VectorBenchmark.appended0To32 avgt 10 3371.617 ± 1899.937 ns/op
[info] VectorBenchmark.appended0To33 avgt 10 2699.577 ± 85.111 ns/op

In the more general case:

@Param(Array("1", "10", "100", "1000", "10000"))
var size: Int = _

@Benchmark def appended0ToN(bh: Blackhole): Unit = {
  var v: Vector[Int] = Vector.empty[Int]
  var i = 0
  while (i < size) {
    v = v.appended(i)
    i += 1
  }
  bh.consume(v)
}

branch small-vectors
[info] Benchmark (size) Mode Cnt Score Error Units
[info] VectorBenchmark.appended0ToN 1 avgt 10 9.776 ± 0.220 ns/op
[info] VectorBenchmark.appended0ToN 10 avgt 10 229.207 ± 11.370 ns/op
[info] VectorBenchmark.appended0ToN 100 avgt 10 7135.687 ± 301.753 ns/op
[info] VectorBenchmark.appended0ToN 1000 avgt 10 90524.928 ± 867.974 ns/op
[info] VectorBenchmark.appended0ToN 10000 avgt 10 946733.072 ± 14830.945 ns/op

branch 2.13.x
[info] Benchmark (size) Mode Cnt Score Error Units
[info] VectorBenchmark.appended0ToN 1 avgt 10 17.879 ± 0.230 ns/op
[info] VectorBenchmark.appended0ToN 10 avgt 10 840.940 ± 9.183 ns/op
[info] VectorBenchmark.appended0ToN 100 avgt 10 9266.415 ± 277.926 ns/op
[info] VectorBenchmark.appended0ToN 1000 avgt 10 106326.725 ± 4853.579 ns/op
[info] VectorBenchmark.appended0ToN 10000 avgt 10 935528.214 ± 11043.096 ns/op

joshlemer · 2019-02-15T15:20:21Z

src/library/scala/collection/immutable/Vector.scala

- s.display0(lo) = value.asInstanceOf[AnyRef]
+ val thisLength = length
+ val result =
+ if (depth == 1 && thisLength < 32) {


This branch and the Vector.single branch are the only changed branches; the rest of this method is just re-indented.
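As an illustration of what a `depth == 1 && thisLength < 32` branch has to do once leaves are exactly sized, here is a minimal sketch of appending and prepending by copying into an array one slot larger. The names `SmallAppendSketch`, `appendedSmall` and `prependedSmall` are hypothetical, and the sketch works on a bare array starting at index 0 rather than on Vector's actual fields.

```scala
object SmallAppendSketch {
  // Append: copy the existing elements, then write the new value at the end.
  def appendedSmall(leaf: Array[AnyRef], value: Any): Array[AnyRef] = {
    val n = leaf.length
    val out = new Array[AnyRef](n + 1)
    System.arraycopy(leaf, 0, out, 0, n)
    out(n) = value.asInstanceOf[AnyRef]
    out
  }

  // Prepend: copy the existing elements shifted by one, then write slot 0.
  def prependedSmall(leaf: Array[AnyRef], value: Any): Array[AnyRef] = {
    val n = leaf.length
    val out = new Array[AnyRef](n + 1)
    System.arraycopy(leaf, 0, out, 1, n)
    out(0) = value.asInstanceOf[AnyRef]
    out
  }
}
```

The original array is never mutated, so any Vector still pointing at it is unaffected.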

joshlemer · 2019-02-15T15:23:44Z

src/library/scala/collection/immutable/Vector.scala

- } else {
- val shift = startIndex & ~((1 << (5 * (depth - 1))) - 1)
- val shiftBlocks = startIndex >>> (5 * (depth - 1))
+ } else if (thisLength > 0) {


Mostly indentation, plus a change to the condition to make the logic a bit more obvious:

old:
if (startIndex != endIndex)
new:
if (thisLength > 0) // thisLength already computed anyways
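The two conditions are interchangeable as long as the length is derived from the indices. A minimal sketch, assuming `length = endIndex - startIndex` and `startIndex <= endIndex` (the object and method names are hypothetical):

```scala
object EmptinessCheckSketch {
  // Assumption: the length is simply the distance between the two indices.
  def length(startIndex: Int, endIndex: Int): Int = endIndex - startIndex

  def oldCheck(startIndex: Int, endIndex: Int): Boolean = startIndex != endIndex
  def newCheck(thisLength: Int): Boolean = thisLength > 0

  // Exhaustively compare both checks for all valid index pairs up to maxIndex.
  def agree(maxIndex: Int): Boolean =
    (0 to maxIndex).forall { s =>
      (s to maxIndex).forall { e =>
        oldCheck(s, e) == newCheck(length(s, e))
      }
    }
}
```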

retronym · 2019-03-27T04:46:53Z

src/library/scala/collection/immutable/Vector.scala

+ System.arraycopy(display0, startIndex, newDisplay0, 1, thisLength)
+ newDisplay0(0) = value.asInstanceOf[AnyRef]
+ s.display0 = newDisplay0
 s


👍 Already covered by the releaseFence below.

retronym · 2019-03-27T04:47:13Z

src/library/scala/collection/immutable/Vector.scala

+ System.arraycopy(display0, startIndex, newDisplay0, 0, thisLength)
+ newDisplay0(thisLength) = value.asInstanceOf[AnyRef]
+ s.display0 = newDisplay0
 s


👍 Already covered by the releaseFence below.

retronym · 2019-03-27T04:52:07Z

Just pushed a commit with the fences
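For readers unfamiliar with the pattern, a minimal sketch of a release fence before publication follows. `FenceSketch` and `Leaf` are hypothetical stand-ins for a structure with a non-final array field; `scala.runtime.Statics.releaseFence()` is the fence call, and its placement here is illustrative rather than a copy of the commit.

```scala
import scala.runtime.Statics

object FenceSketch {
  // Stand-in for an "immutable" structure whose field is assigned after construction.
  final class Leaf { var display0: Array[AnyRef] = _ }

  def appended(old: Leaf, value: Any): Leaf = {
    val s = new Leaf
    val n = old.display0.length
    val arr = new Array[AnyRef](n + 1)
    System.arraycopy(old.display0, 0, arr, 0, n)
    arr(n) = value.asInstanceOf[AnyRef]
    s.display0 = arr
    // Ensure the writes above are not reordered after the reference to `s`
    // escapes, so another thread handed `s` does not see a half-built leaf.
    Statics.releaseFence()
    s
  }
}
```

Without the fence, a non-final field written after construction carries no safe-publication guarantee under the Java memory model.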

retronym

Looking forward to these slimmer Vectors!

SethTisue · 2019-03-27T16:31:05Z

rebased (trivial merge conflict involving imports), let's merge once CI likes it

SethTisue · 2019-03-30T08:03:12Z

> Looking forward to these slimmer Vectors!

yeah, it's awesome this got in, thanks Josh

joshlemer added the WIP, performance, and library:collections labels Feb 12, 2019

scala-jenkins added this to the 2.13.1 milestone Feb 12, 2019

joshlemer changed the title ~~Vectors smaller than 32 allocate or reuse small arrays when possible.~~ [WIP][do not merge]Vectors smaller than 32 allocate or reuse small arrays when possible. Feb 12, 2019

viktorklang reviewed Feb 12, 2019

joshlemer force-pushed the small-vectors branch 3 times, most recently from ef7b1bd to f8c01a6 Compare February 13, 2019 04:40

joshlemer commented Feb 14, 2019

joshlemer force-pushed the small-vectors branch 4 times, most recently from 821065a to 07f352f Compare February 15, 2019 02:50

joshlemer commented Feb 15, 2019

joshlemer changed the title ~~[WIP][do not merge]Vectors smaller than 32 allocate or reuse small arrays when possible.~~ Vectors smaller than 32 allocate or reuse small arrays when possible. Mar 22, 2019

joshlemer force-pushed the small-vectors branch from 07f352f to 2277048 Compare March 22, 2019 18:44

joshlemer removed the WIP label Mar 22, 2019

joshlemer force-pushed the small-vectors branch from 2277048 to 0ca94c6 Compare March 22, 2019 18:47

adriaanm modified the milestones: 2.13.1, 2.13.0-RC1 Mar 26, 2019

retronym reviewed Mar 27, 2019

retronym reviewed Mar 27, 2019

retronym reviewed Mar 27, 2019

retronym reviewed Mar 27, 2019

retronym approved these changes Mar 27, 2019

joshlemer and others added 2 commits March 27, 2019 17:27

b5bce42 Vectors smaller than 32 allocate small arrays when possible. When creating from varargs, reuse underlying array

fe80a23 Add releaseFences before we return a freshly mutated Vector to client code

SethTisue force-pushed the small-vectors branch from 67752df to fe80a23 Compare March 27, 2019 16:30

SethTisue merged commit 12cde30 into scala:2.13.x Mar 27, 2019

joshlemer deleted the small-vectors branch March 27, 2019 17:22

SethTisue added the release-notes label Apr 4, 2019

SethTisue changed the title ~~Vectors smaller than 32 allocate or reuse small arrays when possible.~~ Vectors smaller than 32 allocate or reuse small arrays when possible Apr 4, 2019

joshlemer mentioned this pull request Sep 9, 2019

Construct Vector without an invalid internal state #8246

Closed
