Models for Concurrent Programming
Tobias Ivarsson, hacker @ Neo Technology
tobias@neotechnology.com | twitter: @thobe | web: http://thobe.org/ | http://neo4j.org/
Common misconceptions 2
More threads == More throughput 3
Finite number of cores 4
Other limiting factors, e.g. I/O is only one pipe 5
Locking always impedes performance 6
Concurrency on the track, only one in the station 7
Amdahl’s law: Speedup ≤ 1 / (F + (1 − F) / N), where N is the number of processors and F is the serial fraction 8
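As a quick worked example (numbers mine, not from the slides): with a serial fraction F = 0.1 and N = 8 processors, Speedup ≤ 1 / (0.1 + 0.9/8) ≈ 4.7, and even with unlimited processors the speedup can never exceed 1 / F = 10.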
Throughput with synchronized regions: adding threads raises throughput (3, then 6, then 9) but it levels off at 11; once the synchronized region is saturated, additional threads just wait at the lock. 9-11
The fundamentals
๏ Threads
๏ Mutual exclusion: synchronization and locks
๏ Java Memory Model: read/write barriers & publication 12
The Java Memory Model (JSR 133): what you need to know
๏ Visibility / publication of objects/fields
๏ (Un)acceptable out-of-order behavior, i.e. guarantees about ordering
๏ Implementations based on the hardware memory model of the target platform 13
Concurrency abstractions 14
Java as we know it 15
Threads
๏ Abstract the line of execution from the CPU(s)
๏ Provided by most (all mainstream) operating systems
๏ CPU execution can continue on another thread if one thread blocks
๏ Increases CPU utilization 16
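A minimal sketch (my addition, not from the slides) of creating, starting and joining a thread in Java; the Runnable body stands in for real work:

class ThreadExample {
    public static void main(String[] args) throws InterruptedException {
        Runnable work = new Runnable() {
            public void run() {
                System.out.println("working on " + Thread.currentThread().getName());
            }
        };
        Thread worker = new Thread(work, "worker-1");
        worker.start(); // runs concurrently with main
        worker.join();  // main blocks until the worker finishes
    }
}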
Monitors
synchronized (someMonitor) { // read barrier
    // single threaded region
} // write barrier 17
Volatile read/write 18
class Thing {
    volatile String name;
    String getName() {
        return this.name; // read barrier
    }
    void setName( String newName ) {
        this.name = newName; // write barrier
    }
}
๏ Guarantees visibility: always read the latest value
๏ Guarantees safe publication: no re-ordering of pre-write ops with post-write ops 19
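A minimal sketch (my addition) of the safe-publication pattern those bullets describe: everything written before the volatile write is visible to any thread that subsequently observes the volatile read:

class Publisher {
    private int payload;            // plain, non-volatile field
    private volatile boolean ready; // publication flag

    void publish() {
        payload = 42;  // (1) ordinary write
        ready = true;  // (2) volatile write: (1) may not be reordered after it
    }

    Integer tryRead() {
        if (ready) {         // volatile read
            return payload;  // guaranteed to see 42 once ready is true
        }
        return null; // not published yet
    }
}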
Monitors
synchronized (someMonitor) { // read barrier
    // single threaded region
} // write barrier
๏ Writes may not be reordered across the end
๏ Reads may not be reordered across the start 20
java.util.concurrent 21
ConcurrentMap<String,Thing> map = new ConcurrentHashMap();
๏ Adds atomic operations to the Map interface:
  - putIfAbsent(key, value)
  - remove(key, value)
  - replace(key, [oldVal,] newVal)
  - but not yet putIfAbsent(key, #{ makeValue() })
๏ CHM is lock striped, i.e. synchronized on regions 22
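A small sketch (my addition, reusing the Thing class from the earlier slide) of how putIfAbsent turns the racy check-then-act of a plain Map into an atomic get-or-create:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class GetOrCreateExample {
    static final ConcurrentMap<String, Thing> cache = new ConcurrentHashMap<String, Thing>();

    static Thing getOrCreate(String key) {
        Thing existing = cache.get(key);
        if (existing != null) return existing;
        Thing fresh = new Thing();
        // Another thread may have won the race; keep whichever value got in first.
        Thing previous = cache.putIfAbsent(key, fresh);
        return previous != null ? previous : fresh;
    }
}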
List<Thing> list = new CopyOnWriteArrayList();
๏ Synchronize writers, unrestricted readers
๏ Good for: #reads >> #writes
๏ Readers get snapshot state: gives stable iteration
๏ Volatile variable for immutable state to ensure publication 23
public class CopyOnWriteArrayList<T> {
    private volatile Object[] array = new Object[0];

    public synchronized boolean add(T value) {
        Object[] newArray = Arrays.copyOf(array, array.length + 1);
        newArray[array.length] = value;
        array = newArray; // write barrier
        return true;
    } // write barrier

    public Iterator<T> iterator() {
        return new ArrayIterator<T>(this.array); // read barrier
    }

    private static class ArrayIterator<T> implements Iterator<T> {
        // I’ve written too many of these to be amused any more
        ...
    }
} 24
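A short usage sketch (my addition) of the real java.util.concurrent.CopyOnWriteArrayList, showing the snapshot semantics: the iterator keeps reading the array it captured even while another write happens mid-iteration:

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIterationExample {
    public static void main(String[] args) {
        List<String> names = new CopyOnWriteArrayList<String>();
        names.add("alice");
        names.add("bob");

        for (String name : names) {     // iterates over the snapshot taken when the loop started
            names.add(name + "-copy");  // allowed: no ConcurrentModificationException
            System.out.println(name);   // prints only "alice" and "bob"
        }
    }
}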
ExecutorService executor = Executors.newFixedThreadPool(nThreads);
executor.submit(new Runnable() {
    public void run() { doWork(); }
});
๏ Focus on tasks - not threads
๏ Mitigate thread start/stop overhead
๏ Ensure a balanced number of threads for the target platform 25
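A slightly fuller sketch (my addition) using a Callable so the task can return a result through a Future; the computation is a stand-in:

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorExample {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(4);

        Future<Integer> result = executor.submit(new Callable<Integer>() {
            public Integer call() {
                return 21 + 21; // stand-in for real work
            }
        });

        System.out.println(result.get()); // blocks until the task completes
        executor.shutdown();              // stop accepting tasks, let the workers exit
    }
}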
java.util.concurrent.locks 26
LockSupport.park(); // parks the current thread
LockSupport.unpark(Thread t);
๏ The thread will “sleep” until unparked
๏ If unpark happens before park, the thread is marked as unparked and park returns immediately
๏ ... and there’s still the chance of spurious wakeups ...
๏ Too low level for most people 27
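A minimal sketch (my addition) of a hand-rolled one-shot latch on top of park/unpark; the loop handles both spurious wakeups and unpark-before-park. In real code, prefer higher-level tools such as CountDownLatch:

import java.util.concurrent.locks.LockSupport;

class OneShotLatch {
    private volatile boolean released;

    void await() {
        // Loop: park may return spuriously, or because we were unparked early.
        while (!released) {
            LockSupport.park(this);
        }
    }

    void release(Thread waiter) {
        released = true;
        LockSupport.unpark(waiter); // wakes the waiter if it is parked (or about to park)
    }
}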
Lock lock = new ReentrantLock();
ReadWriteLock rwLock = new ReentrantReadWriteLock();
Lock read = rwLock.readLock();   // Two related locks
Lock write = rwLock.writeLock();

lock.lock();
try { doWork(); } finally { lock.unlock(); }

if ( lock.tryLock() ) {
    try { doWork(); } finally { lock.unlock(); }
} else {
    backOffAndTryAgainLater();
} 28
java.util.concurrent.atomic 29
AtomicReference<Thing> ref = new AtomicReference();
AtomicReferenceArray<Thing> array = new AtomicReferenceArray(length);

class MyAtomicThingReference {
    volatile Thing thing;
    static AtomicReferenceFieldUpdater<MyAtomicThingReference,Thing> THING = newUpdater(...);
    Thing swap( Thing newThing ) {
        return THING.getAndSet( this, newThing );
    }
}
๏ Atomic* gives you compareAndSet()
๏ Atomic*Updater saves indirection:
  ๏ One less object header - less memory
  ๏ One read-address-get-object operation: better cache locality 30
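A small sketch (my addition) of the canonical compareAndSet retry loop on an AtomicReference: read the current value, compute a new one, and only install it if nobody changed the reference in between:

import java.util.concurrent.atomic.AtomicReference;

class CasLoopExample {
    public static void main(String[] args) {
        AtomicReference<String> name = new AtomicReference<String>("initial");

        while (true) {
            String current = name.get();
            String updated = current + "!";
            if (name.compareAndSet(current, updated)) {
                break; // our update won; otherwise another thread changed it, so retry
            }
        }
        System.out.println(name.get()); // "initial!"
    }
}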
Java of Tomorrow (actually Today, JDK7 is already out) 31
ForkJoin
๏ More “advanced” Executor
๏ Assumes/requires more of the tasks it runs
๏ Tasks are first split into multiple sub-tasks, pushed on a stack
๏ Idle workers “steal” work from the bottom of the stack of other workers 32
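A compact sketch (my addition, JDK 7 API) of a RecursiveTask that sums an array by forking sub-tasks until the pieces are small enough to compute directly; idle workers can steal the forked halves:

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    private final long[] values;
    private final int from, to;

    SumTask(long[] values, int from, int to) {
        this.values = values;
        this.from = from;
        this.to = to;
    }

    protected Long compute() {
        if (to - from <= 1000) { // small enough: sum sequentially
            long sum = 0;
            for (int i = from; i < to; i++) sum += values[i];
            return sum;
        }
        int mid = (from + to) / 2;
        SumTask left = new SumTask(values, from, mid);
        SumTask right = new SumTask(values, mid, to);
        left.fork();                          // make the left half available for stealing
        return right.compute() + left.join(); // compute the right half here, then combine
    }
}

// usage: long total = new ForkJoinPool().invoke(new SumTask(data, 0, data.length));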
Models not in Java today 33
ParallelArray 34
ParallelArray<Student> students = ...
double highestScore = students
    .filter( #{ Student s -> s.gradYear == 2010 } )
    .map( #{ Student s -> s.score } )
    .max(); 35
Transactional Memory 36
void transfer( TransactionalLong source, TransactionalLong target, long amount ) {
    try ( Transaction tx = txManager.beginTransaction() ) {
        long sourceFunds = source.getValue();
        if (sourceFunds < amount) {
            throw new InsufficientFundsException();
        }
        source.setValue( sourceFunds - amount );
        target.setValue( target.getValue() + amount );
        tx.success();
    }
} 37
Actors 38
class SimpleActor extends Actor {
    var state
    def receive = {
        case Get => self reply state
        case Set(newState) => state = newState
    }
}

// this sends a message:
val response = simple !! Set( "new state" ) 39
Scala Parallel Collection Framework 40
Efficient Collection splitting & combining 41
Splittable maps (diagrams: the same map entries arranged as a tree that can be split into independent sub-trees) 42-43
Hash tries
๏ Similar performance to hash tables
๏ Splittable (like arrays)
๏ Memory (re)allocation more like trees
๏ No resizing races 44
Clojure 45
Immutable by default 46
Thread local (and scoped) override 47
(def x 10)           ; create a root binding
(def x 42)           ; redefine the root binding
(binding [x 13] ...) ; create a thread-local override 48
Transactional memory (ref) and (dosync ...) 49
(def street (ref nil))
(def city (ref nil))

(defn move [new-street new-city]
  (dosync ; synchronize the changes
    (ref-set street new-street)
    (ref-set city new-city))) ; will retry if the txn fails due to a concurrent change

(defn print-address []
  (dosync ; get a snapshot state of the refs
    (printf "I live on %s in %s%n" @street @city))) 50
(atom) easier API for AtomicReference 51
(defstruct address-t :street :city)
(def address (atom nil))

(defn move [new-street new-city]
  (reset! address (struct address-t new-street new-city)))

(defn print-address []
  (let [addr @address] ; get snapshot
    (printf "I live on %s in %s%n" (get addr :street) (get addr :city)))) 52
Explicit asynchronous updates 53
(agent) A simpler cousin to actors 54
(def account (agent 0)) ; agent holding the balance

(defn deposit [balance amount]
  (+ balance amount))

(send account deposit 1000) ; asynchronously applies deposit to the current balance 55
we’re hiring http://neotechnology.com/about-us/jobs http://neotechnology.com
Questions?

[JavaOne 2011] Models for Concurrent Programming