Parallel Programming Patterns Аудиторія: розробники Олександр Павлишак, 2011 pavlyshak@gmail.com
Зміст - Тренд - Основні терміни - Managing state - Паралелізм - Засоби
Вчора
Сьогодні
Завтра
Що відбувається? - Ріст частоти CPU вповільнився - Через фізичні обмеження - Free lunch is over - ПЗ більше не стає швидшим саме по собі
Сучасні тренди - Manycore, multicore - GPGPU, GPU acceleration, heterogeneous computing - Distributed computing, HPC
Основні поняття - Concurrency - Many interleaved threads of control - Parallelism - Same result, but faster - Concurrency != Parallelism - It is not always necessary to care about concurrency while implementing parallelism - Multithreading - Asynchrony
Задачі - CPU-bound - number crunching - I/O-bound - network, disk
Стан - Shared - accessible by more than one thread - sharing is transitive - Private - used by single thread only
Task-based program Application Tasks (CPU, I/O) Runtime (queuing, scheduling) Processors (threads, processes)
Managing state
Isolation - Avoiding shared state - Own copy of state - Examples: - process isolation - intraprocess isolation - by convention
Immutability - Multiple read -- not a problem! - All functions are pure - Requires immutable collections - Functional way: Haskell, F#, Lisp
Synchronization - The only thing that remains to deal with shared mutable state - Kinds: - data synchronization - control synchronization
Data synchronization - Why? To avoid race conditions and data corruption - How? Mutual exclusion - Data remains consistent - Critical regions - locks, monitors, critical sections, spin locks - Code-centered - rather than associated with data
Critical region |Thread 1 |Thread 2 |// ... |// ... |lock (locker) | |{ | | // ... | | data.Operation(); | | // ... | |} | |// ... |lock (locker) | |{ | | // ... | | data.Operation(); | // ...
Control synchronization - To coordinate control flow - exchange data - orchestrate threads - Waiting, notifications - spin waiting - events - alternative: continuations
Three ways to manage state - Isolation: simple, loosely coupled, highly scalable, right data structures, locality - Immutability: avoids sync - Synchronization: complex, runtime overheads, contention - in that order
Паралелізм
Підходи до розбиття задач - Data parallelism - Task parallelism - Message based parallelism
Data parallelism How? - Data is divided up among hardware processors - Same operation is performed on elements - Optionally -- final aggregation
Data parallelism When? - Large amounts of data - Processing operation is costly - or both
Data parallelism Why? - To achieve speedup - For example, with GPU acceleration: - hours instead of days!
Data parallelism Embarrassingly parallel problems - parallelizable loops - image processing Non-embarrassingly parallel problems - parallel QuickSort
Data parallelism ... ... Thread 1 Thread 2
Data parallelism Structured parallelism - Well defined begin and end points - Examples: - CoBegin - ForAll
CoBegin var firstDataset = new DataItem[1000]; var secondDataset = new DataItem[1000]; var thirdDataset = new DataItem[1000]; Parallel.Invoke( () => Process(firstDataset), () => Process(secondDataset), () => Process(thirdDataset) );
Parallel For var items = new DataItem[1000 * 1000]; // ... Parallel.For(0, items.Length, i => { Process(items[i]); });
Parallel ForEach var tickers = GetNasdaqTickersStream(); Parallel.ForEach(tickers, ticker => { Process(ticker); });
Striped Partitioning ... Thread 1 Thread 2
Iterate complex data structures var tree = new TreeNode(); // ... Parallel.ForEach( TraversePreOrder(tree), node => { Process(node); });
Iterate complex data Thread 1 Thread 2 ...
Declarative parallelism var items = new DataItem[1000 * 1000]; // ... var validItems = from item in items.AsParallel() let processedItem = Process(item) where processedItem.Property > 42 select Convert(processedItem); foreach (var item in validItems) { // ... }
Data parallelism Challenges - Partitioning - Scheduling - Ordering - Merging - Aggregation - Concurrency hazards: data races, contention
Task parallelism How? - Programs are already functionally partitioned: statements, methods etc. - Run independent pieces in parallel - Control synchronization - State isolation
Task parallelism Why? - To achieve speedup
Task parallelism Kinds - Structured - clear begin and end points - Unstructured - often demands explicit synchronization
Fork/join - Fork: launch tasks asynchronously - Join: wait until they complete - CoBegin, ForAll - Recursive decomposition
Fork/join Task 1 Task 2 Task 3 Seq Seq
Fork/join Parallel.Invoke( () => LoadDataFromFile(), () => SavePreviousDataToDB(), () => RenewOtherDataFromWebService());
Fork/join Task loadData = Task.Factory.StartNew(() => { // ... }); Task saveAnotherDataToDB = Task.Factory.StartNew(() => { // ... }); // ... Task.WaitAll(loadData, saveAnotherDataToDB); // ...
Fork/join void Walk(TreeNode node) { var tasks = new[] { Task.Factory.StartNew(() => Process(node.Value)), Task.Factory.StartNew(() => Walk(node.Left)), Task.Factory.StartNew(() => Walk(node.Right)) }; Task.WaitAll(tasks); }
Fork/join recursive Root Node Left Seq Left Seq Right Right Node Left Right
Dataflow parallelism: Futures Task<DataItem[]> loadDataFuture = Task.Factory.StartNew(() => { //... return LoadDataFromFile(); }); var dataIdentifier = SavePreviousDataToDB(); RenewOtherDataFromWebService(dataIdentifier); //... DisplayDataToUser(loadDataFuture.Result);
Dataflow parallelism: Futures Future Seq Seq Seq
Dataflow parallelism: Futures Future Future Future Seq Seq Seq Seq Seq
Continuations Task Task Task Seq Seq Seq
Continuations var loadData = Task.Factory.StartNew(() => { return LoadDataFromFile(); }); var writeToDB = loadData.ContinueWith(dataItems => { WriteToDatabase(dataItems.Result); }); var reportToUser = writeToDB.ContinueWith(t => { // ... }); reportProgressToUser.Wait();
Producer/consumer pipeline reading parsing storing parsed lines DB lines
Producer/consumer pipeline lines parsed lines DB
Producer/consumer var lines = new BlockingCollection<string>(); Task.Factory.StartNew(() => { foreach (var line in File.ReadLines(...)) lines.Add(line); lines.CompleteAdding(); });
Producer/consumer var dataItems = new BlockingCollection<DataItem>(); Task.Factory.StartNew(() => { foreach (var line in lines.GetConsumingEnumerable() ) dataItems.Add(Parse(line)); dataItems.CompleteAdding(); });
Producer/consumer var dbTask = Task.Factory.StartNew(() => { foreach (var item in dataItems.GetConsumingEnumerable() ) WriteToDatabase(item); }); dbTask.Wait();
Task parallelism Challenges - Scheduling - Cancellation - Exception handling - Concurrency hazards: deadlocks, livelocks, priority inversions etc.
Message based parallelism - Accessing shared state vs. local state - No distinction, unfortunately - Idea: encapsulate shared state changes into messages - Async events - Actors, agents
Засоби
Concurrent data structures - Concurrent Queues, Stacks, Sets, Lists - Blocking collections, - Work stealing queues - Lock free data structures - Immutable data structures
Synchronization primitives - Critical sections, - Monitors, - Auto- and Manual-Reset Events, - Coundown Events, - Mutexes, - Semaphores, - Timers, - RW locks - Barriers
Thread local state - A way to achieve isolation var parser = new ThreadLocal<Parser>( () => CreateParser()); Parallel.ForEach(items, item => parser.Value.Parse(item));
Thread pools ThreadPool.QueueUserWorkItem(_ => { // do some work });
Async Task.Factory.StartNew(() => { //... return LoadDataFromFile(); }) .ContinueWith(dataItems => { WriteToDatabase(dataItems.Result); }) .ContinueWith(t => { // ... });
Async var dataItems = await LoadDataFromFileAsync(); textBox.Text = dataItems.Count.ToString(); await WriteToDatabaseAsync(dataItems); // continue work
Технології - TPL, PLINQ, C# async, TPL Dataflow - PPL, Intel TBB, OpenMP - CUDA, OpenCL, C++ AMP - Actors, STM - Many others
Підсумок - Програмування для багатьох CPU - Concurrency != parallelism - CPU-bound vs. I/O-bound tasks - Private vs. shared state
Підсумок - Managing state: - Isolation - Immutability - Synchronization - Data: mutual exclusion - Control: notifications
Підсумок - Паралелізм: - Data parallelism: scalable - Task parallelism: less scalable - Message based parallelism
Підсумок - Data parallelism - CoBegin - Parallel ForAll - Parallel ForEach - Parallel ForEach over complex data structures - Declarative data parallelism - Challenges: partitioning, scheduling, ordering, merging, aggregation, concurrency hazards
Підсумок - Task parallelism: structured, unstructured - Fork/Join - CoBegin - Recursive decomposition - Futures - Continuations - Producer/consumer (pipelines) - Challenges: scheduling, cancellation, exceptions, concurrency hazards
Підсумок - Засоби/інструменти - Компілятори, бібліотеки - Concurrent data structures - Synchronization primitives - Thread local state - Thread pools - Async invocations - ...
Q/A

Parallel programming patterns - Олександр Павлишак

  • 1.
    Parallel Programming Patterns Аудиторія: розробники Олександр Павлишак, 2011 pavlyshak@gmail.com
  • 2.
    Зміст - Тренд - Основні терміни - Managing state - Паралелізм - Засоби
  • 3.
  • 4.
  • 5.
  • 6.
    Що відбувається? - Ріст частоти CPU вповільнився - Через фізичні обмеження - Free lunch is over - ПЗ більше не стає швидшим саме по собі
  • 7.
    Сучасні тренди - Manycore,multicore - GPGPU, GPU acceleration, heterogeneous computing - Distributed computing, HPC
  • 8.
    Основні поняття - Concurrency - Many interleaved threads of control - Parallelism - Same result, but faster - Concurrency != Parallelism - It is not always necessary to care about concurrency while implementing parallelism - Multithreading - Asynchrony
  • 9.
    Задачі - CPU-bound - number crunching - I/O-bound - network, disk
  • 10.
    Стан - Shared - accessible by more than one thread - sharing is transitive - Private - used by single thread only
  • 11.
    Task-based program Application Tasks (CPU, I/O) Runtime (queuing, scheduling) Processors (threads, processes)
  • 12.
  • 13.
    Isolation - Avoiding sharedstate - Own copy of state - Examples: - process isolation - intraprocess isolation - by convention
  • 14.
    Immutability - Multiple read -- not a problem! - All functions are pure - Requires immutable collections - Functional way: Haskell, F#, Lisp
  • 15.
    Synchronization - The onlything that remains to deal with shared mutable state - Kinds: - data synchronization - control synchronization
  • 16.
    Data synchronization - Why?To avoid race conditions and data corruption - How? Mutual exclusion - Data remains consistent - Critical regions - locks, monitors, critical sections, spin locks - Code-centered - rather than associated with data
  • 17.
    Critical region |Thread 1 |Thread 2 |// ... |// ... |lock (locker) | |{ | | // ... | | data.Operation(); | | // ... | |} | |// ... |lock (locker) | |{ | | // ... | | data.Operation(); | // ...
  • 18.
    Control synchronization - Tocoordinate control flow - exchange data - orchestrate threads - Waiting, notifications - spin waiting - events - alternative: continuations
  • 19.
    Three ways tomanage state - Isolation: simple, loosely coupled, highly scalable, right data structures, locality - Immutability: avoids sync - Synchronization: complex, runtime overheads, contention - in that order
  • 20.
  • 21.
    Підходи до розбиттязадач - Data parallelism - Task parallelism - Message based parallelism
  • 22.
    Data parallelism How? - Datais divided up among hardware processors - Same operation is performed on elements - Optionally -- final aggregation
  • 23.
    Data parallelism When? - Largeamounts of data - Processing operation is costly - or both
  • 24.
    Data parallelism Why? - Toachieve speedup - For example, with GPU acceleration: - hours instead of days!
  • 25.
    Data parallelism Embarrassingly parallelproblems - parallelizable loops - image processing Non-embarrassingly parallel problems - parallel QuickSort
  • 26.
    Data parallelism ... ... Thread 1 Thread 2
  • 27.
    Data parallelism Structured parallelism -Well defined begin and end points - Examples: - CoBegin - ForAll
  • 28.
    CoBegin var firstDataset =new DataItem[1000]; var secondDataset = new DataItem[1000]; var thirdDataset = new DataItem[1000]; Parallel.Invoke( () => Process(firstDataset), () => Process(secondDataset), () => Process(thirdDataset) );
  • 29.
    Parallel For var items= new DataItem[1000 * 1000]; // ... Parallel.For(0, items.Length, i => { Process(items[i]); });
  • 30.
    Parallel ForEach var tickers= GetNasdaqTickersStream(); Parallel.ForEach(tickers, ticker => { Process(ticker); });
  • 31.
    Striped Partitioning ... Thread 1 Thread 2
  • 32.
    Iterate complex datastructures var tree = new TreeNode(); // ... Parallel.ForEach( TraversePreOrder(tree), node => { Process(node); });
  • 33.
    Iterate complex data Thread 1 Thread 2 ...
  • 34.
    Declarative parallelism var items= new DataItem[1000 * 1000]; // ... var validItems = from item in items.AsParallel() let processedItem = Process(item) where processedItem.Property > 42 select Convert(processedItem); foreach (var item in validItems) { // ... }
  • 35.
    Data parallelism Challenges - Partitioning - Scheduling - Ordering - Merging - Aggregation - Concurrency hazards: data races, contention
  • 36.
    Task parallelism How? - Programsare already functionally partitioned: statements, methods etc. - Run independent pieces in parallel - Control synchronization - State isolation
  • 37.
  • 38.
    Task parallelism Kinds - Structured - clear begin and end points - Unstructured - often demands explicit synchronization
  • 39.
    Fork/join - Fork: launch tasks asynchronously - Join: wait until they complete - CoBegin, ForAll - Recursive decomposition
  • 40.
    Fork/join Task 1 Task 2 Task 3 Seq Seq
  • 41.
    Fork/join Parallel.Invoke( () => LoadDataFromFile(), () => SavePreviousDataToDB(), () => RenewOtherDataFromWebService());
  • 42.
    Fork/join Task loadData = Task.Factory.StartNew(() => { // ... }); Task saveAnotherDataToDB = Task.Factory.StartNew(() => { // ... }); // ... Task.WaitAll(loadData, saveAnotherDataToDB); // ...
  • 43.
    Fork/join void Walk(TreeNode node){ var tasks = new[] { Task.Factory.StartNew(() => Process(node.Value)), Task.Factory.StartNew(() => Walk(node.Left)), Task.Factory.StartNew(() => Walk(node.Right)) }; Task.WaitAll(tasks); }
  • 44.
    Fork/join recursive Root Node Left Seq Left Seq Right Right Node Left Right
  • 45.
    Dataflow parallelism: Futures Task<DataItem[]>loadDataFuture = Task.Factory.StartNew(() => { //... return LoadDataFromFile(); }); var dataIdentifier = SavePreviousDataToDB(); RenewOtherDataFromWebService(dataIdentifier); //... DisplayDataToUser(loadDataFuture.Result);
  • 46.
  • 47.
    Dataflow parallelism: Futures Future Future Future Seq Seq Seq Seq Seq
  • 48.
    Continuations Task Task Task Seq Seq Seq
  • 49.
    Continuations var loadData =Task.Factory.StartNew(() => { return LoadDataFromFile(); }); var writeToDB = loadData.ContinueWith(dataItems => { WriteToDatabase(dataItems.Result); }); var reportToUser = writeToDB.ContinueWith(t => { // ... }); reportProgressToUser.Wait();
  • 50.
    Producer/consumer pipeline reading parsing storing parsed lines DB lines
  • 51.
    Producer/consumer pipeline lines parsed lines DB
  • 52.
    Producer/consumer var lines = new BlockingCollection<string>(); Task.Factory.StartNew(() => { foreach (var line in File.ReadLines(...)) lines.Add(line); lines.CompleteAdding(); });
  • 53.
    Producer/consumer var dataItems = new BlockingCollection<DataItem>(); Task.Factory.StartNew(() => { foreach (var line in lines.GetConsumingEnumerable() ) dataItems.Add(Parse(line)); dataItems.CompleteAdding(); });
  • 54.
    Producer/consumer var dbTask =Task.Factory.StartNew(() => { foreach (var item in dataItems.GetConsumingEnumerable() ) WriteToDatabase(item); }); dbTask.Wait();
  • 55.
    Task parallelism Challenges - Scheduling -Cancellation - Exception handling - Concurrency hazards: deadlocks, livelocks, priority inversions etc.
  • 56.
    Message based parallelism -Accessing shared state vs. local state - No distinction, unfortunately - Idea: encapsulate shared state changes into messages - Async events - Actors, agents
  • 57.
  • 58.
    Concurrent data structures - Concurrent Queues, Stacks, Sets, Lists - Blocking collections, - Work stealing queues - Lock free data structures - Immutable data structures
  • 59.
    Synchronization primitives - Critical sections, - Monitors, - Auto- and Manual-Reset Events, - Coundown Events, - Mutexes, - Semaphores, - Timers, - RW locks - Barriers
  • 60.
    Thread local state -A way to achieve isolation var parser = new ThreadLocal<Parser>( () => CreateParser()); Parallel.ForEach(items, item => parser.Value.Parse(item));
  • 61.
  • 62.
    Async Task.Factory.StartNew(() => { //... return LoadDataFromFile(); }) .ContinueWith(dataItems => { WriteToDatabase(dataItems.Result); }) .ContinueWith(t => { // ... });
  • 63.
    Async var dataItems = await LoadDataFromFileAsync(); textBox.Text = dataItems.Count.ToString(); await WriteToDatabaseAsync(dataItems); // continue work
  • 64.
    Технології - TPL, PLINQ, C# async, TPL Dataflow - PPL, Intel TBB, OpenMP - CUDA, OpenCL, C++ AMP - Actors, STM - Many others
  • 65.
    Підсумок - Програмування для багатьох CPU - Concurrency != parallelism - CPU-bound vs. I/O-bound tasks - Private vs. shared state
  • 66.
    Підсумок - Managing state: - Isolation - Immutability - Synchronization - Data: mutual exclusion - Control: notifications
  • 67.
    Підсумок - Паралелізм: - Data parallelism: scalable - Task parallelism: less scalable - Message based parallelism
  • 68.
    Підсумок - Data parallelism - CoBegin - Parallel ForAll - Parallel ForEach - Parallel ForEach over complex data structures - Declarative data parallelism - Challenges: partitioning, scheduling, ordering, merging, aggregation, concurrency hazards
  • 69.
    Підсумок - Task parallelism:structured, unstructured - Fork/Join - CoBegin - Recursive decomposition - Futures - Continuations - Producer/consumer (pipelines) - Challenges: scheduling, cancellation, exceptions, concurrency hazards
  • 70.
    Підсумок - Засоби/інструменти - Компілятори, бібліотеки - Concurrent data structures - Synchronization primitives - Thread local state - Thread pools - Async invocations - ...
  • 71.