Presentation Fork/Join and Akka (part 1/2)


Fork/Join is no silver bullet when it comes to scaling computation, and it is somewhat low-level. Therefore we will also look at how it compares to other techniques such as Map-Reduce and of course good-old threads.
Published on: 2012-05-07T18:49:28.000Z
Channel: BeJUG (all)
Tags: forkjoin concurrency akka
Speakers:

Sander Mak


Sander studeerde Software Technology aan de Universiteit Utrecht en is momenteel werkzaam als software architect bij Info Support BV. Hij blogt sinds maart 2012 op http://branchandbound.net over software development in het JVM landschap. Je kent hem wellicht van artikelen in Java Magazine (o.a. 'Scala voor Java ontwikkelaars' en 'Hibernate Performance & Tuning') of eerdere presentaties op J-Fall en soortgelijke conferenties.

PDF: slides.pdf

Slides:

ForkJoin


ForkJoin @Sander_Mak

About


About Coding at ( ) Writing Blog @ branchandbound.net Speaking

Agenda


Agenda ForkJoin Setting the scene ForkJoin API & Patterns Comparisons & Future break Akka Introduction Actors Async IO

Problem?


Problem?

Problem?


Problem?

Problem?


Problem?

Problem?


Problem?

Problem?


Problem? Architectural mismatch

Problem?


Problem? So let the compiler/runtime solve it!

Problem?


Problem? So let the compiler/runtime solve it! n < 2 n if (n < 2) { return n; } else { return fib(n - 1) + fib(n - 2); }

Problem?


Problem? So let the compiler/runtime solve it! Unfortunately not in , but feasible in pure functional languages like

First attempt


First attempt public static int fib(final int n) { if (n < 2) { return n; } else { final int[] t1 = {0}, t2 = {0}; Thread thread1 = new Thread() { public void run() { t1[0] = fib(n - 1); } }; Thread thread2 = new Thread() { public void run() { t2[0] = fib(n - 2); } }; thread1.start(); thread2.start(); thread1.join(); thread2.join(); return t1[0] + t2[0]; } } if (n < 2) { return n; } else { return fib(n - 1) + fib(n - 2); }

Demo



Threads


Threads Threads are mostly waiting Improve with threadpooling but not by much starvation What if we sum in a new thread? synchronization visibility issues (Java Memory Model)

ForkJoin


ForkJoin Fork: Recursively decompose large task into subtasks Result = 3 Fib(4) Join: Await results of recursive tasks and combine Fib(1) Fib(3) Fib(2) Fib(2) Fib(1) Fib(1) Fib(0) Fib(0)

ForkJoin


ForkJoin Introducing: ForkJoinPool java.util.concurrent.ForkJoinTask java.util.concurrent. ForkJoinTask RecursiveAction RecursiveTask ForkJoinPool: void T execute (ForkJoinTask) invoke (ForkJoinTask)

ForkJoin


ForkJoin compute(problem) { if (problem.size < threshold) directlySolve(problem) else { do-forked { leftResult = compute(left(problem)) rightResult = compute(right(problem)) } join(leftResult, rightResult) return combine(leftResult, rightResult) } } Keep overhead down: sequential cutoff

Demo



ForkJoin


ForkJoin compute(problem) { if (problem.size < threshold) directlySolve(problem) else { do-forked { leftResult = compute(left(problem)) rightResult = compute(right(problem)) } join(leftResult, rightResult) return combine(leftResult, rightResult) } } Keep overhead down: sequential cutoff

ForkJoinPool


ForkJoinPool Implements ExecutorService Autosize workers (overridable) Work queue per worker Work-stealing between queues

Work stealing


Work stealing ForkJoinPool

Work stealing


Work stealing ForkJoinPool 1

Work stealing


Work stealing ForkJoinPool 2 3 1

Work stealing


Work stealing ForkJoinPool st ol en 3 2

Work stealing


Work stealing ForkJoinPool 4 5 3 6 7 2

Work stealing


Work stealing ForkJoinPool 4 sto len 5 7 6

Work stealing


Work stealing ForkJoinPool 4 8 9 5 10 11 12 13 sto len 7 6

Scalability


Scalability ForkJoinPool

Patterns


Patterns #1 Problem structure Acyclic CPU Bound - I/O upfront - No webcrawlers please...

Patterns


Patterns #2 Sequential cutoff Guidelines > 100 and < 10.000 `basic computational steps' Experiment and tune Never lock/synchronize compute(problem) { if (problem.size < threshold) directlySolve(problem) else { do-forked { ... } join ... } }

Patterns


Patterns #3 Fork once, fool me twice Why? Implementation specific Avoids overhead Especially on smallish tasks left.fork() rightResult = right.compute() leftResult = left.join() return leftResult + rightResult left.fork() leftResult = left.join() rightResult = right.compute() return leftResult + rightResult

Patterns


Patterns #4 Use convenience methods invoke fjTask.invoke(); // Equivalent to fjTask.fork(); fjTask.join(); // But always tries execution // in current thread first! invokeAll invokeAll(fjTask1, fjTask2, fjTaskN); // Or: invokeAll(collectionOfTasks);

Demo


Demo 2.797.245 world cities (Maxmind.com) Demo 1: simple search Demo 2: lat/long bounded