Presentation Real-World Akka Recipes

In the brave new world of actor programming, conventional design patterns are frequently not applicable, as witnessed by the questions we get on the mailing list and at conferences. That is why we have collected a number of common solutions and best practices for typical problems you will encounter when building scalable and robust systems with Akka actors. In this session we will show you how to implement flow control, distributed workers, blocking resources, reliable messaging, and more.

Slides

Real World Akka Recipes

Jamie Allen, Björn Antonsson, Patrik Nordwall

The Fallacy of Guaranteed Delivery

Jamie Allen @jamie_allen

Guaranteed Delivery

• From Enterprise Integration Patterns
• Messaging system uses built-in store to persist
• ACK everywhere
  – Producer to sender
  – Sender to receiver
  – Receiver to consumer

Akka Guarantees

• Not much, intentionally
• At most once, with no reordering
• Pick your poison:
  – At most once
  – At least once
  – Exactly once
• You have to add it on top

How Do I Do It?

• Handle “at least” semantics on receiver to deal with duplicates
  – Idempotent behavior in receiver
  – Check message ID
• Handle “at most” semantics on the sender via retries
  – ACK every time message is handled
  – Cancel repeated send
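The two halves of this slide can be sketched together. What follows is a minimal model in plain Python, not Akka code: `Receiver`, `Sender`, and `lossy_channel` are hypothetical names introduced only for illustration, and the retry loop stands in for what would more likely be a timer-driven resend in an actor system. The sender keeps resending until it sees an ACK, and the receiver uses the message ID to make repeated deliveries harmless.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Handles each message ID only once, but ACKs every delivery."""
    seen_ids: set = field(default_factory=set)
    handled: list = field(default_factory=list)

    def receive(self, msg_id, payload):
        if msg_id not in self.seen_ids:      # duplicate check by message ID
            self.seen_ids.add(msg_id)
            self.handled.append(payload)     # the real work happens only once
        return msg_id                        # ACK, even for duplicates


@dataclass
class Sender:
    """Keeps unacknowledged messages and resends them until an ACK arrives."""
    pending: dict = field(default_factory=dict)

    def send(self, msg_id, payload):
        self.pending[msg_id] = payload

    def retry_round(self, channel):
        """Resend every pending message once; channel returns an ACK id or None."""
        for msg_id, payload in list(self.pending.items()):
            ack = channel(msg_id, payload)
            if ack is not None:
                self.pending.pop(ack, None)  # ACK received: cancel the repeated send


def lossy_channel(receiver, drop_acks):
    """Hypothetical transport: delivers the message but loses the first `drop_acks` ACKs."""
    state = {"left": drop_acks}

    def channel(msg_id, payload):
        ack = receiver.receive(msg_id, payload)
        if state["left"] > 0:
            state["left"] -= 1
            return None                      # simulate a lost ACK
        return ack

    return channel


receiver = Receiver()
sender = Sender()
sender.send(1, "debit account")
channel = lossy_channel(receiver, drop_acks=2)

rounds = 0
while sender.pending:
    sender.retry_round(channel)
    rounds += 1
```

In this run the message is delivered three times because two ACKs are lost, yet the receiver performs the work once. The sketch deliberately ignores message loss on the way to the receiver and unbounded growth of `seen_ids`, both of which a real system would have to address.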

Durable Mailboxes?

Uh, no.

Durable Mailboxes

• Don’t work with future-based message sending (ask, ?)
• No guarantee that the message even reached the mailbox in a distributed system
• Asking for guarantees in an uncertain world

Event Sourcing?

• Wonderful pattern for compiling a list of time-series events
• Separation of concerns from actor mailboxes
• Still lots of things that can go wrong
  – Disk writing
  – Replication consistency
  – Network partitions
  – Application latency

External Durable Message Queue

• You still have to ACK
• No certainty the message you needed even got this far
• Additional dependencies in your architecture

Guaranteed Delivery Doesn’t Exist

• We don’t know what we don’t know
• Increased effort
• Increased complexity
• Increased latency
• No guarantees of consistency
• Doesn’t guarantee ordering

So What Do We Do?

• This falls outside of actor supervision; nothing the actors know about has gone wrong
• Listen to Roland Kuhn: “Recovery ... should ideally be automatic in order to restore normal service as quickly as possible.”

Sentinels

• Supervisors handle failure BELOW them. Sentinels handle failure ABOVE.
• A sentinel is responsible for querying a “source of truth” and getting the latest state
• It sends the data to the supervisor, which resolves the differences between the actor instances that should exist and those that do
• The supervisor forwards the data to the instances that should exist so they can resolve their internal state
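The resolution step the supervisor performs can be read as a set difference. Below is a minimal sketch in plain Python, not Akka code: `reconcile`, `Supervisor`, and `on_sentinel_data` are hypothetical names, child actors are modeled as dictionary entries, and the example data is an assumption chosen to mirror a supervisor that missed an add, a delete, and an update.

```python
def reconcile(existing_ids, source_of_truth):
    """Compare running child IDs with the source of truth.

    Returns (to_create, to_stop, to_refresh) as sorted lists of IDs.
    """
    wanted = set(source_of_truth)
    existing = set(existing_ids)
    to_create = sorted(wanted - existing)    # should exist but does not
    to_stop = sorted(existing - wanted)      # exists but should not
    to_refresh = sorted(wanted & existing)   # exists; may hold stale state
    return to_create, to_stop, to_refresh


class Supervisor:
    """Models child actors as a dict of id -> state (illustration only)."""

    def __init__(self, children=None):
        self.children = dict(children or {})

    def on_sentinel_data(self, source_of_truth):
        to_create, to_stop, to_refresh = reconcile(self.children, source_of_truth)
        for child_id in to_stop:
            del self.children[child_id]                        # stop the stale instance
        for child_id in to_create + to_refresh:
            self.children[child_id] = source_of_truth[child_id]  # create or forward latest state
        return to_create, to_stop, to_refresh


# The supervisor missed "add 3", "delete 1" and "update 2".
supervisor = Supervisor({1: "v1", 2: "v1"})
latest = {2: "v2", 3: "v1"}                  # what the sentinel read from the source of truth
changes = supervisor.on_sentinel_data(latest)
```

After the call, the supervisor's children match the source of truth regardless of which of the three events were lost, which is the point of the pattern: convergence does not depend on any individual message arriving.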

Missed Events

[Diagram: a Customer Supervisor with children Customer 1 and Customer 2; events shown: Add Customer 3, Delete Customer 1, Update Customer 2.]

Sentinels

[Diagram sequence (5 slides): a Customer Sentinel next to the Customer Supervisor and its Customer actors; labels shown: Customer 1 (marked with an X on one slide), Customer 2, Customer 3.]

Sentinels

• Localize them for each kind of data that must be synchronized in your supervisor hierarchy
• Do not create one big one and try to resolve the entire tree at once

Drawbacks

• Doesn’t work well with localized event sourcing: time series can be lost
• Does introduce additional complexity and tunable latency over applications with no guarantees
• Pattern only works when there is a queryable source of truth

Inconsistent Views?

• Using Sentinels at multiple levels of a supervisory hierarchy can lead to temporarily inconsistent views when child actors are resolved before parents on delete (no atomicity)
• But is this necessarily bad?

Sentinels in Hierarchy

[Diagram sequence (6 slides): a supervisor hierarchy with nodes labeled CS, C1, C2, AS, A1, A2, DS, D1, D2 and sentinels (S) attached at several levels. Later slides mark a deletion with an X and show fewer nodes; the final slide shows only CS, C1, and S.]

A Huge Win

• Your system is resilient to external failures
• You can tune sentinel update frequency to meet changing requirements
• Your system is considerably less complex than attempting to guarantee no message loss

Flow Control

Björn Antonsson @bantonsson

Pure Push Applications

• Often the first Actor application you write
  – Once you start telling and stop asking
• Easy to implement and reason about
• Fits nicely with short-lived jobs that come at a fixed rate

Why do you need anything else?

• Produce jobs faster than you can finish them
• Jobs are expensive compute/memory wise
• External resources impose limits
• Unpredictable job patterns

What can you do instead?

• Push with rate limiting
  – A fixed number of jobs per time unit are pushed
• Push with acknowledgment
  – A fixed number of jobs can be in progress
  – New jobs are pushed after old jobs finish
• Pull
  – Jobs are pulled from the master at the rate that they are completed

Push with rate limiting

• A timer sends the master ticks at fixed intervals
• When a tick arrives, the master fills up its token count
• If a job arrives and there are no tokens, it gets queued
• When the master has tokens, it pulls jobs off the queue and pushes them
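The four steps above can be sketched as a small state machine. This is a plain Python model, not Akka code: `RateLimitedMaster`, `tokens_per_tick`, and the `push` callback are hypothetical names, and the timer is replaced by explicit `on_tick()` calls so the behavior is easy to follow.

```python
from collections import deque


class RateLimitedMaster:
    """Pushes at most `tokens_per_tick` jobs between two ticks."""

    def __init__(self, tokens_per_tick, push):
        self.tokens_per_tick = tokens_per_tick
        self.tokens = 0                  # no tokens until the first tick
        self.queue = deque()
        self.push = push                 # callback standing in for "send to a worker"

    def on_tick(self):
        self.tokens = self.tokens_per_tick   # fill the token count back up
        self._drain()

    def on_job(self, job):
        self.queue.append(job)               # queued if there are no tokens
        self._drain()

    def _drain(self):
        # While tokens remain, pull jobs off the queue and push them.
        while self.tokens > 0 and self.queue:
            self.tokens -= 1
            self.push(self.queue.popleft())


pushed = []
master = RateLimitedMaster(tokens_per_tick=2, push=pushed.append)

for job in ["a", "b", "c", "d", "e"]:
    master.on_job(job)                   # nothing is pushed yet: no tokens
after_jobs = list(pushed)

master.on_tick()
after_first_tick = list(pushed)          # two jobs pushed
master.on_tick()
master.on_tick()
after_third_tick = list(pushed)          # remaining jobs pushed, one token left over
```

Refilling to a fixed count on every tick means unused tokens do not accumulate, so a quiet period cannot be followed by a burst larger than `tokens_per_tick`. Whether that is the desired behavior is a design choice the slide leaves open.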

Push with acknowledgement

• The master pushes a fixed number of jobs before waiting for an acknowledgement
• If a job arrives and the master can't push, it gets queued
• To keep workers busy, push more than one job per worker
  – You can use a high water mark to stop and a low water mark to start pushing
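One way to read the high and low water marks is as hysteresis on the number of unacknowledged jobs. The sketch below is plain Python, not Akka code: `AckMaster`, `high_water`, `low_water`, and the `push` callback are hypothetical names, and the exact thresholds (stop at `>= high_water`, resume at `<= low_water`) are an assumption rather than something the slide specifies.

```python
from collections import deque


class AckMaster:
    """Stops pushing at the high water mark, resumes at the low water mark."""

    def __init__(self, high_water, low_water, push):
        if not 0 <= low_water < high_water:
            raise ValueError("need 0 <= low_water < high_water")
        self.high_water = high_water
        self.low_water = low_water
        self.in_flight = 0               # pushed but not yet acknowledged
        self.pushing = True
        self.queue = deque()
        self.push = push                 # callback standing in for "send to a worker"

    def on_job(self, job):
        self.queue.append(job)           # queued if the master cannot push right now
        self._drain()

    def on_ack(self):
        self.in_flight -= 1
        if self.in_flight <= self.low_water:
            self.pushing = True          # low water mark reached: start pushing again
        self._drain()

    def _drain(self):
        while self.pushing and self.queue:
            self.push(self.queue.popleft())
            self.in_flight += 1
            if self.in_flight >= self.high_water:
                self.pushing = False     # high water mark reached: stop pushing


pushed = []
master = AckMaster(high_water=4, low_water=2, push=pushed.append)

for job in range(10):
    master.on_job(job)
after_jobs = list(pushed)                # first four jobs are in flight

master.on_ack()
after_one_ack = list(pushed)             # still above the low water mark
master.on_ack()
after_two_acks = list(pushed)            # refilled up to the high water mark
```

The gap between the two marks controls how chatty the master is: a wider gap means fewer, larger refills, while a gap of one approximates pushing a new job for every acknowledgement.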

Pull

• The master actor queues incoming jobs
• Worker actors ask the master for a job and receive jobs when available
• The workers don't need to do active polling
• Can lead to lag if jobs are small compared to the time it takes to get a new one
  – Use batching to counteract lag
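The pull protocol, including batching, can be sketched as follows. This is plain Python rather than Akka code: `PullMaster`, `Worker`, `on_request`, and `on_batch` are hypothetical names, and method calls stand in for messages. A worker registers a single request and is answered whenever work becomes available, which is why no active polling is needed.

```python
from collections import deque


class Worker:
    def __init__(self, name):
        self.name = name
        self.done = []

    def on_batch(self, batch):
        self.done.extend(batch)          # "process" the jobs


class PullMaster:
    """Queues jobs and hands them out only to workers that asked for work."""

    def __init__(self, batch_size=1):
        self.batch_size = batch_size
        self.jobs = deque()
        self.idle = deque()              # workers with an outstanding request

    def on_job(self, job):
        self.jobs.append(job)
        self._dispatch()

    def on_request(self, worker):
        self.idle.append(worker)         # remembered until a job is available
        self._dispatch()

    def _dispatch(self):
        while self.jobs and self.idle:
            worker = self.idle.popleft()
            size = min(self.batch_size, len(self.jobs))
            batch = [self.jobs.popleft() for _ in range(size)]
            worker.on_batch(batch)       # several jobs per round trip counteract lag


master = PullMaster(batch_size=2)
w1, w2 = Worker("w1"), Worker("w2")

master.on_request(w1)                    # no jobs yet: w1 simply waits
master.on_job("a")                       # delivered to the waiting w1 immediately
for job in ["b", "c", "d"]:
    master.on_job(job)                   # nobody is idle: jobs are queued
master.on_request(w2)                    # w2 receives a full batch
master.on_request(w1)                    # w1 receives what is left
```

Each worker has to call `on_request` again after finishing a batch. In this model the caller does that explicitly; in an actor system the worker would typically send the request itself when it completes its work.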

References

• Push with rate limiting
  – Kaspar Fischer http://letitcrash.com/post/28901663062/throttling-messages-in-akka-2
• Pull
  – Derek Wyatt http://letitcrash.com/post/29044669086/balancing-workload-across-nodes-withakka-2
  – Michael Pollmeier http://www.michaelpollmeier.com/akka-work-pulling-pattern-to-throttle-work/

Distributed Workers

Patrik Nordwall @patriknw

Goal

• elastic addition/removal of front end nodes
• elastic addition/removal of workers
• thousands of workers
• jobs should not be lost

DistributedPubSubMediator
