1. 1. Meet Kafka
a. Publish/Subscribe Messaging
i. How It Starts
ii. Individual Queue Systems
b. Enter Kafka
c. Why Kafka?
i. Multiple Producers
ii. Multiple Consumers
iii. Disk-Based Retention
iv. Scalable
v. High Performance
d. The Data Ecosystem
i. Use Cases
e. Kafka’s Origin
i. LinkedIn’s Problem
ii. The Birth of Kafka
iii. Open Source
iv. Commercial Engagement
v. The Name
Kafka: The Definitive Guide
SECOND EDITION
With Early Release ebooks, you get books in their earliest form—
the authors’ raw and unedited content as they write—so you can
take advantage of these technologies long before the official
release of these titles.
See http://oreilly.com/catalog/errata.csp?isbn=9781492043089 for release details.
While the publisher and the authors have used good faith
efforts to ensure that the information and instructions
contained in this work are accurate, the publisher and the
authors disclaim all responsibility for errors or omissions,
including without limitation responsibility for damages
resulting from the use of or reliance on this work. Use of the
information and instructions contained in this work is at your
own risk. If any code samples or other technology this work
contains or describes is subject to open source licenses or the
intellectual property rights of others, it is your responsibility to
ensure that your use thereof complies with such licenses
and/or rights.
978-1-492-04301-0
Chapter 1. Meet Kafka
If you have comments about how we might improve the content and/or examples in
this book, or if you notice missing material within this chapter, please reach out to the
author at cshapi+ktdg@gmail.com.
The faster we can move data, the more agile and responsive our
organizations can be. The less effort we spend on moving data
around, the more we can focus on the core business at hand.
This is why the pipeline is a critical component in the data-
driven enterprise. How we move the data becomes nearly as
important as the data itself.
Publish/Subscribe Messaging
Before discussing the specifics of Apache Kafka, it is important
for us to understand the concept of publish/subscribe
messaging and why it is important. Publish/subscribe
messaging is a pattern that is characterized by the sender
(publisher) of a piece of data (message) not specifically
directing it to a receiver. Instead, the publisher classifies the
message somehow, and the receiver (subscriber) subscribes to
receive certain classes of messages. Pub/sub systems often
have a broker, a central point where messages are published, to
facilitate this.
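The decoupling this pattern provides can be sketched in a few lines of Python. This is a toy in-process broker for illustration only, not Kafka's API: publishers and subscribers never reference each other, only the message classes (topics) they care about.

```python
from collections import defaultdict

class Broker:
    """Toy in-process broker: publishers and subscribers are
    decoupled and interact only through named topics."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher does not know (or care) who receives this.
        for handler in self._subscribers[topic]:
            handler(message)

broker = Broker()
received = []
broker.subscribe("metrics", received.append)
broker.publish("metrics", {"cpu": 0.42})
broker.publish("logs", "ignored by the metrics subscriber")
print(received)  # [{'cpu': 0.42}]
```

Note that the metrics subscriber sees only the "metrics" topic; the "logs" message is routed elsewhere without either side coordinating.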
How It Starts
Many use cases for publish/subscribe start out the same way:
with a simple message queue or interprocess communication
channel. For example, you create an application that needs to
send monitoring information somewhere, so you write in a
direct connection from your application to an app that displays
your metrics on a dashboard, and push metrics over that
connection, as seen in Figure 1-1.
Enter Kafka
Apache Kafka is a publish/subscribe messaging system
designed to solve this problem. It is often described as a
“distributed commit log” or, more recently, as a “distributed
streaming platform.” A filesystem or database commit log is
designed to provide a durable record of all transactions so that
they can be replayed to consistently build the state of a system.
Similarly, data within Kafka is stored durably, in order, and can
be read deterministically. In addition, the data can be
distributed within the system to provide additional protections
against failures, as well as significant opportunities for scaling
performance.
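The commit-log abstraction is easy to sketch: records are appended in order, each gets an offset, and any reader can replay the log deterministically from any offset. The following is an illustrative model only, not Kafka's implementation:

```python
class CommitLog:
    """Toy append-only log: records are stored in order and
    read deterministically by offset."""
    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset):
        """Replay all records from a given offset onward."""
        return self._records[offset:]

log = CommitLog()
log.append("create account")
off = log.append("deposit 100")
log.append("withdraw 30")

# Two independent readers replaying from the same offset see
# exactly the same records, in exactly the same order.
assert log.read(0) == ["create account", "deposit 100", "withdraw 30"]
assert log.read(off) == ["deposit 100", "withdraw 30"]
```

Because reads are just "give me everything from offset N," many readers can consume the same data independently, each keeping its own position.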
Schemas
While messages are opaque byte arrays to Kafka itself, it is
recommended that additional structure, or schema, be imposed
on the message content so that it can be easily understood.
There are many options available for message schema,
depending on your application’s individual needs. Simple
formats, such as JavaScript Object Notation (JSON) and
Extensible Markup Language (XML), are easy to use and
human-readable. However, they lack features such as robust
type handling and compatibility between schema versions.
Many Kafka developers favor the use of Apache Avro, which is
a serialization framework originally developed for Hadoop.
Avro provides a compact serialization format; schemas that are
separate from the message payloads and that do not require
code to be generated when they change; and strong data typing
and schema evolution, with both backward and forward
compatibility.
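Avro schemas are themselves defined in JSON, and backward compatibility conventionally comes from new fields carrying defaults, so data written under an old schema remains readable under a new one. The function below mimics the spirit of Avro's schema resolution in plain Python for illustration; it is not the Avro library, and the schema shapes are hypothetical.

```python
import json

# A writer schema (v1) and a reader schema (v2) that adds an
# optional field with a default -- the usual Avro convention for
# backward compatibility. (Illustrative schemas, not from a real service.)
schema_v1 = {"type": "record", "name": "PageView",
             "fields": [{"name": "url", "type": "string"}]}
schema_v2 = {"type": "record", "name": "PageView",
             "fields": [{"name": "url", "type": "string"},
                        {"name": "referrer", "type": ["null", "string"],
                         "default": None}]}

def read_with_schema(payload, schema):
    """Mimic schema resolution: fields missing from the payload
    fall back to the reader schema's defaults."""
    record = json.loads(payload)
    return {f["name"]: record.get(f["name"], f.get("default"))
            for f in schema["fields"]}

old_message = json.dumps({"url": "/home"})  # written under schema_v1
print(read_with_schema(old_message, schema_v2))
# {'url': '/home', 'referrer': None}
```

The old message lacks `referrer`, yet a v2 reader still gets a complete record, which is exactly what lets producers and consumers upgrade their schemas independently.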
Multiple Clusters
As Kafka deployments grow, it is often advantageous to have
multiple clusters. There are several reasons why this can be
useful:
Why Kafka?
There are many choices for publish/subscribe messaging
systems, so what makes Apache Kafka a good choice?
Multiple Producers
Kafka is able to seamlessly handle multiple producers, whether
those clients are using many topics or the same topic. This
makes the system ideal for aggregating data from many
frontend systems and making it consistent. For example, a site
that serves content to users via a number of microservices can
have a single topic for page views that all services can write to
using a common format. Consumer applications can then
receive a single stream of page views for all applications on the
site without having to coordinate consuming from multiple
topics, one for each application.
Multiple Consumers
In addition to multiple producers, Kafka is designed for
multiple consumers to read any single stream of messages
without interfering with each other. This is in contrast to many
queuing systems where once a message is consumed by one
client, it is not available to any other. Multiple Kafka
consumers can choose to operate as part of a group and share a
stream, ensuring that the entire group processes a given
message only once.
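The way a group divides work can be pictured as assigning each partition of a topic to exactly one consumer in the group, so the group as a whole processes every message once. A simplified round-robin assignment is sketched below; Kafka's actual rebalance protocol is considerably more involved.

```python
from collections import defaultdict

def assign_partitions(partitions, consumers):
    """Round-robin partition assignment: each partition goes to
    exactly one consumer in the group, so the group as a whole
    processes each message once."""
    assignment = defaultdict(list)
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return dict(assignment)

partitions = list(range(6))
group = ["consumer-a", "consumer-b", "consumer-c"]
print(assign_partitions(partitions, group))
# {'consumer-a': [0, 3], 'consumer-b': [1, 4], 'consumer-c': [2, 5]}
```

A second, independent group would get its own assignment over the same six partitions, which is how multiple consuming applications read the same stream without interfering.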
Disk-Based Retention
Not only can Kafka handle multiple consumers, but durable
message retention means that consumers do not always need to
work in real time. Messages are committed to disk, and will be
stored with configurable retention rules. These options can be
selected on a per-topic basis, allowing for different streams of
messages to have different amounts of retention depending on
the consumer needs. Durable retention means that if a
consumer falls behind, either due to slow processing or a burst
in traffic, there is no danger of losing data. It also means that
maintenance can be performed on consumers, taking
applications offline for a short period of time, with no concern
about messages backing up on the producer or getting lost.
Consumers can be stopped, and the messages will be retained
in Kafka. This allows them to restart and pick up processing
messages where they left off with no data loss.
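Retention can be pictured as a time-based cutoff applied to an append-only log. The toy model below drops records older than a configured window, mirroring in spirit a topic-level setting such as Kafka's `retention.ms`; it is an illustration, not how the broker actually manages log segments.

```python
class RetainedTopic:
    """Toy per-topic retention: records older than retention_ms
    are dropped; everything newer stays readable even if a
    consumer is offline or lagging."""
    def __init__(self, retention_ms):
        self.retention_ms = retention_ms
        self._records = []  # (timestamp_ms, payload)

    def append(self, now_ms, payload):
        self._records.append((now_ms, payload))
        self._prune(now_ms)

    def _prune(self, now_ms):
        cutoff = now_ms - self.retention_ms
        self._records = [(t, p) for t, p in self._records if t >= cutoff]

    def read_all(self):
        return [p for _, p in self._records]

topic = RetainedTopic(retention_ms=1_000)
topic.append(0, "old")
topic.append(700, "mid")
topic.append(1_600, "new")   # pruning drops "old" (timestamp 0 < cutoff 600)
print(topic.read_all())      # ['mid', 'new']
```

A consumer restarted within the retention window simply resumes from its last offset; only records older than the window are gone.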
Scalable
Kafka’s flexible scalability makes it easy to handle any amount
of data. Users can start with a single broker as a proof of
concept, expand to a small development cluster of three
brokers, and move into production with a larger cluster of tens
or even hundreds of brokers that grows over time as the data
scales up. Expansions can be performed while the cluster is
online, with no impact on the availability of the system as a
whole. This also means that a cluster of multiple brokers can
handle the failure of an individual broker, and continue
servicing clients. Clusters that need to tolerate more
simultaneous failures can be configured with higher replication
factors. Replication is discussed in more detail in Chapter 8.
High Performance
All of these features come together to make Apache Kafka a
publish/subscribe messaging system with excellent
performance under high load. Producers, consumers, and
brokers can all be scaled out to handle very large message
streams with ease. This can be done while still providing
sub-second latency from the time a message is produced until it
is available to consumers.
Use Cases
ACTIVITY TRACKING
MESSAGING