Presentation Fork/Join and Akka (part 1/2)
Fork/Join is no silver bullet when it comes to scaling computation, and it is somewhat low-level. Therefore we will also look at how it compares to other techniques such as Map-Reduce and of course good-old threads.
Published on: 2012-05-07T18:49:28.000Z
Channel: BeJUG (all)
Tags: forkjoin concurrency akka
Speakers:
Sander Mak
Sander studeerde Software Technology aan de Universiteit Utrecht en is momenteel werkzaam als software architect bij Info Support BV. Hij blogt sinds maart 2012 op http://branchandbound.net over software development in het JVM landschap. Je kent hem wellicht van artikelen in Java Magazine (o.a. 'Scala voor Java ontwikkelaars' en 'Hibernate Performance & Tuning') of eerdere presentaties op J-Fall en soortgelijke conferenties.
PDF: slides.pdf
Slides:
ForkJoin
ForkJoin
@Sander_Mak
About
About
Coding
at
(
)
Writing
Blog @ branchandbound.net
Speaking
Agenda
Agenda
ForkJoin Setting the scene ForkJoin API & Patterns Comparisons & Future
break
Akka Introduction Actors Async IO
Problem?
Problem?
Problem?
Problem?
Problem?
Problem?
Problem?
Problem?
Problem?
Problem?
Architectural mismatch
Problem?
Problem?
So let the compiler/runtime solve it!
Problem?
Problem?
So let the compiler/runtime solve it!
n
<
2
n
if
(n
<
2)
{
return
n;
} else
{
return
fib(n
-
1)
+
fib(n
-
2);
}
Problem?
Problem?
So let the compiler/runtime solve it!
Unfortunately not in , but feasible in pure functional languages like
First attempt
First attempt
public
static
int
fib(final
int
n)
{
if
(n
<
2)
{
return
n;
}
else
{
final
int[]
t1
=
{0},
t2
=
{0};
Thread
thread1
=
new
Thread()
{
public
void
run()
{
t1[0]
=
fib(n
-
1);
}
};
Thread
thread2
=
new
Thread()
{
public
void
run()
{
t2[0]
=
fib(n
-
2);
}
};
thread1.start();
thread2.start();
thread1.join();
thread2.join();
return
t1[0]
+
t2[0];
}
}
if
(n
<
2)
{
return
n;
} else
{
return
fib(n
-
1)
+
fib(n
-
2);
}
Demo
Threads
Threads
Threads are mostly waiting Improve with threadpooling but not by much starvation What if we sum in a new thread? synchronization visibility issues (Java Memory Model)
ForkJoin
ForkJoin
Fork:
Recursively decompose large task into subtasks
Result = 3 Fib(4)
Join:
Await results of recursive tasks and combine
Fib(1) Fib(3) Fib(2) Fib(2) Fib(1) Fib(1) Fib(0)
Fib(0)
ForkJoin
ForkJoin
Introducing: ForkJoinPool java.util.concurrent.ForkJoinTask
java.util.concurrent.
ForkJoinTask
RecursiveAction
RecursiveTask
ForkJoinPool:
void T
execute (ForkJoinTask>) invoke (ForkJoinTask)
ForkJoin
ForkJoin
compute(problem)
{
if
(problem.size
<
threshold)
directlySolve(problem)
else
{
do-forked
{
leftResult
=
compute(left(problem))
rightResult
=
compute(right(problem))
}
join(leftResult,
rightResult)
return
combine(leftResult,
rightResult)
} }
Keep overhead down: sequential cutoff
Demo
ForkJoin
ForkJoin
compute(problem)
{
if
(problem.size
<
threshold)
directlySolve(problem)
else
{
do-forked
{
leftResult
=
compute(left(problem))
rightResult
=
compute(right(problem))
}
join(leftResult,
rightResult)
return
combine(leftResult,
rightResult)
} }
Keep overhead down: sequential cutoff
ForkJoinPool
ForkJoinPool
Implements ExecutorService Autosize workers (overridable) Work queue per worker Work-stealing between queues
Work stealing
Work stealing
ForkJoinPool
Work stealing
Work stealing
ForkJoinPool
1
Work stealing
Work stealing
ForkJoinPool
2 3 1
Work stealing
Work stealing
ForkJoinPool
st
ol
en
3
2
Work stealing
Work stealing
ForkJoinPool
4 5 3
6 7 2
Work stealing
Work stealing
ForkJoinPool
4
sto len
5
7
6
Work stealing
Work stealing
ForkJoinPool
4 8 9 5
10 11 12 13
sto len
7
6
Scalability
Scalability
ForkJoinPool
Patterns
Patterns
#1 Problem structure Acyclic CPU Bound
- I/O upfront - No webcrawlers please...
Patterns
Patterns
#2 Sequential cutoff Guidelines
> 100 and < 10.000 `basic computational steps' Experiment and tune Never lock/synchronize
compute(problem)
{
if
(problem.size
<
threshold)
directlySolve(problem)
else
{
do-forked
{
...
}
join
...
} }
Patterns
Patterns
#3 Fork once, fool me twice
Why?
Implementation specific Avoids overhead Especially on smallish tasks
left.fork() rightResult
=
right.compute() leftResult
=
left.join() return
leftResult
+
rightResult
left.fork() leftResult
=
left.join() rightResult
=
right.compute() return
leftResult
+
rightResult
Patterns
Patterns
#4 Use convenience methods
invoke
fjTask.invoke(); //
Equivalent
to fjTask.fork(); fjTask.join(); //
But
always
tries
execution //
in
current
thread
first!
invokeAll
invokeAll(fjTask1,
fjTask2,
fjTaskN); //
Or: invokeAll(collectionOfTasks);
Demo
Demo
2.797.245 world cities (Maxmind.com) Demo 1: simple search Demo 2: lat/long bounded