You are on page 1of 60

Reactive Streams

Handling Data-Flows the Reactive Way

Dr. Roland Kuhn


Akka Tech Lead
@rolandkuhn
Introduction: Streams
What is a Stream?
ephemeral flow of data
focused on describing transformation
possibly unbounded in size

3
Common uses of Streams
bulk data transfer
real-time data sources
batch processing of large data sets
monitoring and analytics

4
What is special about Reactive Streams?
The Four Reactive Traits

http://reactivemanifesto.org/

Reactive Applications 6
Needed: Asynchrony
Resilience demands it:
encapsulation
isolation
Scalability demands it:
distribution across nodes
distribution across cores

7
Many Kinds of Async Boundaries

8
Many Kinds of Async Boundaries
between different applications

8
Many Kinds of Async Boundaries
between different applications
between network nodes

8
Many Kinds of Async Boundaries
between different applications
between network nodes
between CPUs

8
Many Kinds of Async Boundaries
between different applications
between network nodes
between CPUs
between threads

8
Many Kinds of Async Boundaries
between different applications
between network nodes
between CPUs
between threads
between actors

8
The Problem:
!
Getting Data across an Async Boundary
Possible Solutions

10
Possible Solutions
the Traditional way: blocking calls

10
Possible Solutions

10
Possible Solutions
!
the Push way: buffering and/or dropping

11
Possible Solutions

11
Possible Solutions
!
!
the Reactive way:
non-blocking & non-dropping & bounded

12
How do we achieve that?
Supply and Demand
data items flow downstream
demand flows upstream
data items flow only when there is demand
recipient is in control of incoming data rate
data in flight is bounded by signaled demand
demand
Publisher Subscriber
data

14
Dynamic PushPull
push behavior when consumer is faster
pull behavior when producer is faster
switches automatically between these
batching demand allows batching data

demand
Publisher Subscriber
data

15
Explicit Demand: Tailored Flow Control

demand
data

splitting the data means merging the demand

16
Explicit Demand: Tailored Flow Control

merging the data means splitting the demand

17
Reactive Streams
asynchronous non-blocking data flow
asynchronous non-blocking demand flow
minimal coordination and contention
message passing allows for distribution
across applications
across nodes
across CPUs
across threads
across actors

18
Are Streams Collections?
What is a Collection?

20
What is a Collection?
Oxford Dictionary:
a group of things or people

20
What is a Collection?
Oxford Dictionary:
a group of things or people
wikipedia:
a grouping of some variable number of data items

20
What is a Collection?
Oxford Dictionary:
a group of things or people
wikipedia:
a grouping of some variable number of data items
backbone.js:
collections are simply an ordered set of models

20
What is a Collection?
Oxford Dictionary:
a group of things or people
wikipedia:
a grouping of some variable number of data items
backbone.js:
collections are simply an ordered set of models
java.util.Collection:
definite size, provides an iterator, query membership

20
User Expectations
an Iterator is expected to visit all elements
(especially with immutable collections)
x.head + x.tail == x
the contents does not depend on who is
processing the collection
the contents does not depend on when the
processing happens
(especially with immutable collections)

21
Streams have Unexpected Properties
the observed sequence depends on
when the observer subscribed to the stream
whether the observer can process fast enough
whether the streams flows fast enough

22
Streams are not Collections!
java.util.stream:
Stream is not derived from Collection
Streams differ from Colls in several ways
no storage
functional in nature
laziness seeking
possibly unbounded
consumable

23
Streams are not Collections!
a collection can be streamed
a stream observer can create a collection
but saying that a Stream is just a lazy
Collection evokes the wrong associations

24
So, Reactive Streams:
why not just java.util.stream.Stream?
Java 8 Stream
import java.util.stream.*;
!
// get some stream
final Stream<Integer> s = Stream.of(1, 2, 3);
// describe transformation
final Stream<String> s2 = s.map(i -> "a" + i);
// make a pull collection
s2.iterator();
// or alternatively push it somewhere
s2.forEach(i -> System.out.println(i));
// (need to pick one, Stream is consumable)

26
Java 8 Stream
provides a DSL for describing transformation
introduces staged computation
(but does not allow reuse)

prescribes an eager model of execution


offers either push or pull, chosen statically

27
What about RxJava?
RxJava
import rx.Observable;
import rx.Observable.*;
!
// get some stream source
final Observable<Integer> obs = range(1, 3);
// describe transformation
final Observable<String> obs2 =
obs.map(i -> "b" + i);
// and use it twice
obs2.subscribe(i -> System.out.println(i));
obs2.filter(i -> i.equals("b2"))
.subscribe(i -> System.out.println(i));

29
RxJava
implements pure push model
includes extensive DSL for transformations
only allows blocking for back pressure
currently uses unbounded buffering for
crossing an async boundary
work on distributed Observables sparked
participation in Reactive Streams

30
The Reactive Streams Project
Participants
Engineers from
Netflix
Oracle
Pivotal
Red Hat
Twitter
Typesafe
Individuals like Doug Lea and Todd Montgomery

32
The Motivation
all participants had the same basic problem
all are building tools for their community
a common solution benefits everybody
interoperability to make best use of efforts
e.g. use Reactor data store driver with Akka
transformation pipeline and Rx monitoring to drive a
vert.x REST API (purely made up, at this point)

see also Jon Brisbins post on Tribalism as a Force for Good

33
Recipe for Success
minimal interfaces
rigorous specification of semantics
full TCK for verification of implementation
complete freedom for many idiomatic APIs

34
The Meat
trait Publisher[T] {
def subscribe(sub: Subscriber[T]): Unit
}
trait Subscription {
def requestMore(n: Int): Unit
def cancel(): Unit
}
trait Subscriber[T] {
def onSubscribe(s: Subscription): Unit
def onNext(elem: T): Unit
def onError(thr: Throwable): Unit
def onComplete(): Unit
}

35
The Sauce
all calls on Subscriber must dispatch async
all calls on Subscription must not block
Publisher is just there to create Subscriptions

36
How does it Connect?
Publisher Subscriber

subscribe

Subscription onSubscribe

requestMore

onNext

37
Akka Streams
Akka Streams
powered by Akka Actors
execution
distribution
resilience
type-safe streaming through Actors with
bounded buffering

39
Basic Akka Example
implicit val system = ActorSystem("Sys")
val mat = FlowMaterializer(...)
!
Flow(text.split("\\s").toVector).
map(word => word.toUpperCase).
foreach(tranformed => println(tranformed)).
onComplete(mat) {
case Success(_) => system.shutdown()
case Failure(e) =>
println("Failure: " + e.getMessage)
system.shutdown()
}

40
More Advanced Akka Example
val s = Source.fromFile("log.txt", "utf-8")
Flow(s.getLines()).
groupBy {
case LogLevelPattern(level) => level
case other => "OTHER"
}.
foreach { case (level, producer) =>
val out = new PrintWriter(level+".txt")
Flow(producer).
foreach(line => out.println(line)).
onComplete(mat)(_ => Try(out.close()))
}.
onComplete(mat)(_ => Try(s.close()))

41
Java 8 Example
final ActorSystem system = ActorSystem.create("Sys");
final MaterializerSettings settings =
MaterializerSettings.create();
final FlowMaterializer materializer =
FlowMaterializer.create(settings, system);
!
final String[] lookup = { "a", "b", "c", "d", "e", "f" };
final Iterable<Integer> input = Arrays.asList(0, 1, 2, 3, 4, 5);
!
Flow.create(input).drop(2).take(3). // leave 2, 3, 4
map(elem -> lookup[elem]). // translate to "c","d","e"
filter(elem -> !elem.equals("c")). // filter out the "c"
grouped(2). // make into a list
mapConcat(list -> list). // flatten the list
fold("", (acc, elem) -> acc + elem). // accumulate into "de"
foreach(elem -> System.out.println(elem)). // print it
consume(materializer);

42
Akka TCP Echo Server
val future = IO(StreamTcp) ? Bind(addr, ...)
future.onComplete {
case Success(server: TcpServerBinding) =>
println("listen at "+server.localAddress)
!
Flow(server.connectionStream).foreach{ c =>
println("client from "+c.remoteAddress)
c.inputStream.produceTo(c.outputStream)
}.consume(mat)
!
case Failure(ex) =>
println("cannot bind: "+ex)
system.shutdown()
}

43
Akka HTTP Sketch (NOT REAL CODE)
val future = IO(Http) ? RequestChannelSetup(...)
future.onSuccess {
case ch: HttpRequestChannel =>
val p = ch.processor[Int]
val body = Flow(new File(...)).toProducer(mat)
Flow(HttpRequest(method = POST,
uri = "http://ex.amp.le/42",
entity = body) -> 42)
.produceTo(mat, p)
Flow(p).map { case (resp, token) =>
val out = FileSink(new File(...))
Flow(resp.entity.data).produceTo(mat, out)
}.consume(mat)
}

44
Closing Remarks
Current State
Early Preview is available:
"org.reactivestreams" % "reactive-streams-spi" % "0.2"
"com.typesafe.akka" %% "akka-stream-experimental" % "0.3"

check out the Activator template


"Akka Streams with Scala!"
(https://github.com/typesafehub/activator-akka-stream-scala)

46
Next Steps
we work towards inclusion in future JDK
try it out and give feedback!
http://reactive-streams.org/
https://github.com/reactive-streams

47
Typesafe 2014 All Rights Reserved

You might also like