Table of Contents
General Notes
Hands-On Exercise: Using Your Exercise Environment
Demonstration: NiFi User Interface
Hands-On Exercise: Build Your First Dataflow
Hands-On Exercise: Start Building a Dataflow Using Processors
Hands-On Exercise: Connect Processors in a Dataflow
Hands-On Exercise: Build a More Complex Dataflow
Hands-On Exercise: Creating a Fork Using Relationships
Hands-On Exercise: Set Back Pressure Thresholds
Hands-On Exercise: Simplify Dataflows Using Process Groups
Hands-On Exercise: Using Data Provenance
Hands-On Exercise: Creating, Using, and Managing Templates
Hands-On Exercise: Versioning Flows Using NiFi Registry
Hands-On Exercise: Working with FlowFile Attributes
Hands-On Exercise: Using the NiFi Expression Language
Hands-On Exercise: Building an Optimized Dataflow
Hands-On Exercise: Building Site-to-Site Dataflows
Hands-On Exercise: Monitoring and Reporting
Hands-On Exercise: Adding Apache Hive Controller
Hands-On Exercise: Integrating Dataflows with Kafka and HDFS
General Notes
• scripts—contains the course setup script and other scripts required to complete
the exercises.
• In some steps in the exercises, you will see instructions to enter commands like this:
The dollar sign ($) at the beginning of each line indicates the Linux shell prompt. The
actual prompt will include additional information such as user name, host name, and
current directory (for example, [training@localhost ~]$) but this is omitted
from these instructions for brevity.
The backslash (\) at the end of a line signifies that the command is not complete
and continues on the next line. You can enter the code exactly as shown (on multiple
lines), or you can enter it on a single line. If you do the latter, you should not type in
the backslash.
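For example, a command shown like this (a generic illustration, not an actual exercise command):
$ ls -l \
/tmp/nifi
could also be entered on a single line as ls -l /tmp/nifi, without the backslash.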
• Graphical editors
If you prefer a graphical text editor, you can use Pluma. You can start Pluma using an
icon from the remote desktop tool bar. (You can also use Emacs if you prefer.)
Catching Up
Many of the exercises in this course build on dataflows you created in previous
exercises. If you are unable to complete an exercise, you can use one of the provided
solution dataflows in the area of the NiFi canvas labeled Solutions. When a dataflow
from a previous exercise is needed, the necessary solution dataflow is indicated
toward the beginning of the exercise. Some solution dataflows are not prerequisites
for subsequent exercises, but are there simply to show the solution for a given exercise
if needed. The exercise "Optimizing a Dataflow" is an example. Not all exercises have
solution dataflows, for example, "Monitoring and Reporting".
The solution dataflow for the peer side of "Building Site-to-Site Dataflows" and
the solution dataflows for "Integrating Dataflows with Kafka and HDFS" are not on
the canvas initially. These are templates that must be imported and instantiated
individually if you need them (they are not prerequisites for any other exercises).
The template files for these dataflows are site-to-site-dataflow-solution-peer.xml and integrate-dataflow-solution.xml, located in ~/training_materials/nifi/exercises, as are all other individual dataflow solutions. You learn to import and instantiate templates in this course.
Connecting to Your Remote Host
1. In your local browser, open the URL provided by your instructor to view the exercise environment portal.
2. The environment portal page displays thumbnail images for your exercise
environment hosts. The hosts should be started already (indicated by a green
background on the thumbnail image), but might be suspended or powered off,
indicated by a gray or blue background.
There are two hosts: Master and Peer.
a. If the Peer host is running, shut it down by clicking the “Power options for this
VM” icon above the thumbnail (indicated by a power button symbol).
Note: The Peer host is not used until the exercise in which a site-to-site
dataflow is built later in the course. It saves resources to keep it turned off until
it is needed.
3. Click the host thumbnail of the Master host to open a new window showing the
remote host machine.
The Master remote host desktop will display. The exercises refer to this as the
“remote desktop” to distinguish it from your own local machine’s desktop.
All exercises in the course are performed on the remote desktop of the Master host
except for the site-to-site exercise, which is completed using both the Master and
Peer host systems.
4. Open a new terminal window using the remote host desktop shortcut.
5. Run the cluster health check script:
$ check-health.sh
6. Confirm that all required services are noted as “good.” Required services: HDFS, Hive, Kafka, NiFi, NiFi Registry, Zookeeper.
7. If any of the required services are “bad”, wait several minutes and then try again. (It
can take up to 15 minutes for all services to start fully.) If they are still not running
correctly, try restarting all the services by running the following script:
$ start-cluster.sh
Wait until the script completes and then check the health again.
If you need to restart your environment later, follow the instructions above in the Connecting to Your Remote Host section.
The remote desktop toolbar provides several important functions:
• Return to the environment portal
• Copy text between the remote desktop’s clipboard and your local machine’s clipboard
• Resize the desktop to the size of your local browser window
Note: Depending on the size of your browser window, you might need to hide the
browser client toolbar to be able to see the full desktop toolbar.
a. Go to https://university.cloudera.com/user/learning/enrollments and log in to your account. Then from the dashboard, find this course in the list of your current courses.
b. Select the course title, then click to download the Exercise Manual under
Materials.
This will save the Exercise Manual PDF file in the Downloads folder in the
training user’s home directory on the remote host.
a. Open a terminal window using the shortcut on the remote desktop, and then
start the Atril PDF viewer:
$ atril &
b. In Atril, select menu item File > Open and open the Exercise Manual PDF file in
the Downloads directory.
Demo Instructions
Explore Canvas Navigation
1. If you have not already done so, start the Firefox browser on your remote desktop.
2. You will see browser bookmarks for NiFi and NiFi Registry.
3. In the “bird’s-eye view” area of the Navigate palette, drag the rectangle
representing the view left and right to pan the canvas. Also try scrolling by dragging
the canvas itself.
4. Try dragging the panning rectangle up and to the left, so that it extends out of the
bird’s-eye view of the canvas. When you release the rectangle, the available canvas
will grow.
You can do this whenever your canvas is full and you need a clear area to work on.
5. Use the + and - buttons on the Navigate palette to zoom in and out. Also try
zooming by using your mouse’s scrolling function.
6. Use the Navigate palette “fit” and “actual” buttons to adjust the canvas view.
7. Select a processor component from the toolbar at the top of the canvas and drag it
onto a clear area of the canvas.
8. The Add Processor dialog window will be displayed showing a list of available
processor types.
In the search box in the upper right, enter the text lists3. This will filter the list to
locate the ListS3 processor.
Note: The ListS3 processor retrieves a list of files from an Amazon Web Services
S3 bucket. However, this demo will not show the usage of the processor, only how
to create and manipulate a processor on the canvas.
9. Select ListS3 in the list of processors and click the ADD button. The ListS3
processor will appear on the canvas.
10. Click the component to select it, then try dragging it to a different location on
the canvas. Moving processors allows you to keep your canvas and dataflows
organized.
11. Locate the Operate palette (below the Navigate palette). Selecting a component on the canvas activates buttons to operate on that component. Note that the palette displays the name of the currently selected component.
While the ListS3 component is selected, click the “Copy” button to copy the
selected component to the clipboard.
Note: You can also copy a component by right-clicking on the component to bring
up the context menu and selecting Copy.
12. Click on the canvas to unselect the ListS3 component, then click the “Paste”
button on the Operate palette. A copy of the first component will be pasted near the
existing one.
Note: You can also paste a component by right-clicking on the canvas to bring up
the context menu and selecting Paste. (The Paste option will only show if you have
copied a component to the clipboard.)
13. Select the two components you created. You can select multiple components in one of two ways:
• Hold down the SHIFT key while you click on the components you want to select.
• Hold down the SHIFT key while you drag your pointer on the canvas and draw a
rectangle around the components you would like to select.
Using one of these methods, select the two ListS3 processor components you
added earlier.
14. Note that the Operate panel now says Multiple components selected. Selecting
multiple components activates the “Group” button, which will move your
components into a process group. Click the “Group” button.
15. The “Group” button will prompt you for a group name. Enter Demo Group and
click ADD.
The two components you selected will be replaced on the canvas by a process
group.
16. Double-click on the Demo Group process group to display the process group’s
canvas. The canvas should include the two processors you added to the group.
17. Return to the root canvas by clicking the NiFi Flow link in the breadcrumb trail in
the lower left.
18. Open the Global Menu, indicated by three horizontal lines in the upper right corner.
(This is also sometimes called the “hamburger menu” because of the appearance of
the icon.)
19. The options on the Global Menu let you view and configure details about all your flows and components, the state of the cluster, FlowFile data, and so on.
Select the Flow Configuration History item for an example of a global function.
This will display a list of various changes that have been made on the canvas,
including ones you made above.
Close the history window by clicking the X in the upper right corner.
20. One of the most important options for this class is being able to review the NiFi
documentation from within the UI.
Select the Help item on the Global Menu.
Explore the documentation by clicking on a few different subjects on the left and
viewing the documentation on the subjects on the right. Feel free to refer to this
documentation throughout the remainder of the course.
When you are done exploring, close the documentation window by clicking the X in the upper right corner.
21. Click the search button on the right side of the toolbar (indicated by a magnifying
glass) to open a search box. Enter the search string lists3. This will display a list
of all the ListS3 components on the canvas.
If you select one of the search results, it will select and center the corresponding
component.
22. You will not need the processors and process group you created in this demo in later exercises. You can delete them now so that they do not clutter your canvas.
Select the Demo Group process group on the root canvas and click DELETE in the Operate panel.
a. In the NiFi UI, drag a processor onto the canvas and filter with generateflow.
b. Click ADD.
e. Click APPLY.
a. Drag a processor to the canvas and filter with logattribute to add it.
e. Click APPLY.
b. You will see a Create Connection screen. Ensure that success is checked on the
DETAILS tab.
c. Click ADD.
Your dataflow should look like this:
Note: Do not start this dataflow. You will have opportunities to run dataflows in
subsequent exercises.
a. In the NiFi UI, drag a processor onto the canvas and filter with TailFile.
b. Click ADD.
c. Right-click on the TailFile processor and select Configure. (You can also
double-click on the processor).
d. On the SETTINGS tab, name the processor Put app log TailFile.
g. Click APPLY.
a. Drag a processor to the canvas and filter with splittext to add it.
c. On the SETTINGS tab, name the processor Put app log SplitText.
d. Check the boxes next to failure and original for Automatically Terminate
Relationships.
Your SETTINGS tab should look like this:
f. Click APPLY.
c. Check the boxes next to failure and success for Automatically Terminate
Relationships. This is because the FlowFiles will be written to disk at this point
and are not needed subsequently.
Your SETTINGS tab should look like this:
e. Click APPLY.
The three processors on your canvas should look something like this:
2. Select all processors by holding down the shift key and clicking each one, or holding
down the shift key and dragging your pointer around all components.
4. On the lower left of your screen, click NiFi Flow to return to the root canvas.
5. Use the Navigate palette or click on the canvas and drag your pointer to the right to
create an empty space for the components.
6. Right-click and select Paste to paste the processors onto the canvas.
7. Move the processors to a convenient spot on the canvas to work with them in this
exercise.
a. Hover over the TailFile processor with your pointer and a connection symbol will appear. Click on this symbol and drag it on top of the SplitText processor. A dashed line will appear and turn green when the connection is properly positioned.
b. You will see a Create Connection screen. Ensure that success is checked on the
DETAILS tab.
c. Click ADD.
b. On the DETAILS tab, check the box next to splits. You are only connecting FlowFiles that have been split at this point. The failure and original relationships were terminated in the SplitText processor.
c. Click ADD.
Your dataflow should look something like this:
a. While holding the shift key, select the processors in the dataflow. (You can also
hold down the shift key and drag an outline around all components).
b. On the Operate palette, click the “Start” button (shown as an arrow icon) to start the dataflow.
After a few seconds, you should see statistics on the surface of the processors,
indicating that data is moving through your dataflow.
c. In a terminal window, use the ls -l command to list the files in the /tmp/
nifi/putfile directory.
$ ls -l /tmp/nifi/putfile
You should see one or more files. The filenames are of the form
nifi-app.xxx-yyy.log. The numbers xxx and yyy indicate the byte range
from the input file (nifi-app.log) that is in the saved file.
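For example, a file named nifi-app.0-16384.log (a hypothetical name; your byte ranges will differ) would contain bytes 0 through 16384 of nifi-app.log.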
In this exercise, you will build a more complex dataflow by adding additional
processors and connections to the dataflow previously created.
You will add CompressContent and UpdateAttribute processors to the dataflow.
You will then start the dataflow and observe as the dataflow writes files to the disk.
The dataflow will implement the following scenario:
1. Select all dataflow components from inside this process group and copy them.
2. Return to the root canvas, create an empty space for the dataflow, and paste it onto
the canvas.
3. As needed, move the dataflow to a convenient spot on the canvas to work with in
this exercise.
3. Move the connection that is currently between the SplitText and PutFile
processors to be between the SplitText and CompressContent processors.
a. Click on the connection. The ends of the connection arrow will be highlighted red and blue.
b. Drag the blue end (the end by the point of the arrow) from the PutFile
processor to the CompressContent processor.
4. Drag the PutFile processor down on the canvas to make room for the next
processor.
In the dataflow so far, you have collected the log data, split it as needed, and
compressed the resulting file. You now need to rename the new files to include a
prefix with the current time.
b. In the PROPERTIES tab, click the + symbol and add a new property called
filename.
Note: User-defined properties and the NiFi Expression Language are covered in
depth later.
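The exact property value to enter is not reproduced here. Based on the output filenames observed later in this exercise (compressed, timestamped files beginning with nifi-applog) and the analogous user log expression used in a later exercise, it is likely an expression of the form:
nifi-applog.${now():format('HH:mm:ss')}.gz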
After a few moments, you will see that the statistics are changing as data flows
through the processors.
$ ls /tmp/nifi/putfile
11. Stop the dataflow after you have confirmed that it is working. You can use the
same techniques to stop the dataflow that you did to start it above, using the “Stop”
button instead of the “Start” button.
In this exercise, you will add additional PutFile processors to your existing
dataflow to configure different relationships.
The dataflow will implement the following scenario:
b. Under SETTINGS, observe that the boxes next to failure and original are
checked, while splits is not.
With failure checked, files will be dropped at this point if they cannot be split,
with no further processing.
With original checked, if files are successfully split, the original files will be
dropped at this point, but split files will be sent on in the dataflow for further
processing.
a. Ensure all the processors in the Put app log dataflow are stopped.
c. Change the names of each processor in the duplicated flow, replacing Put app
log with Relation at the beginning of each name.
c. Ensure the Conflict Resolution Strategy is set to replace and Create Missing
Directories is set to true.
d. On the SETTINGS tab, select failure and success for Automatically Terminate
Relationships, since there is nothing further to do with the FlowFiles after they
are saved to disk at this point.
c. Ensure the Conflict Resolution Strategy is set to replace and Create Missing
Directories is set to true.
d. On the SETTINGS tab, select failure and success for Automatically Terminate
Relationships.
6. Uncheck the failure and original relationship boxes under SETTINGS in the
SplitText processor.
No boxes should be checked under Automatically Terminate Relationships at
this point. Each relationship—failure, original, and splits—will now be routed to
different PutFile processors and should not automatically be terminated.
The relevant part of your dataflow should now look something like this:
10. After about 15 seconds, stop the entire dataflow to prevent it from using up too
much space on the system.
$ ls -l /tmp/nifi/relation
You should see zipped, timestamped files for today’s date, having filenames
beginning with nifi-applog.
$ ls -l /tmp/nifi/original
You should see files having filenames beginning with nifi-app and ending
with log.
c. You should not see a failure directory with any files in it.
$ ls -l /tmp/nifi/failure
This should return No such file or directory because there has not
been a failure in splitting files yet.
d. Observe that the Relation SplitText processor shows files inbound and
outbound, and the Relation PutFile Failure processor shows files and
data inbound.
e. Hover over the bulletin section (upper right corner) of the Relation
SplitText processor and observe the errors generated.
$ ls -l /tmp/nifi/failure
You should see files having filenames beginning with nifi-app and ending
with log.
13. Return the Relation SplitText properties to their default values by removing
the Maximum Fragment Size value and setting the Header Line Count back to 0.
14. Start the dataflow and note that the Relation PutFile Failure processor
no longer receives files or data, while the Relation PutFile Original and
Relation PutFile processors do.
In this exercise, you will set back pressure object and size thresholds in
connections to create conditions in which back pressure can be observed.
The Put app log dataflow will be used throughout this exercise.
Note that the canvas only refreshes by default every 30 seconds. As you observe
statistics on the connection queues and processors in this exercise, you can right-
click on the canvas and select Refresh in order to see the very latest statistics in your
dataflow.
Note also that statistics shown on processor surfaces will gradually return to zero
after 5 minutes when there is no data being processed, though the statistics on the
connection queues will remain. Some of the screen shots in this exercise show zero
statistics on processor surfaces because of this. You may observe the same thing,
depending on the pace with which you move through the exercise. This is normal NiFi
behavior.
Important: This exercise depends on completion of Hands-On Exercise: Build a More
Complex Dataflow. If you did not complete that exercise, use the Complex Dataflow
Solution process group provided for you on the Solutions area of your canvas.
Copy the dataflow inside this process group to an empty area of the root canvas prior to
beginning this exercise.
2. Click and drag the SplitText processor to make space for another processor. You will add this processor back to the dataflow at the end of the exercise.
9. Start the GenerateFlowFile and SplitText processors. Do not start any other
processors in the dataflow.
11. Set the Back Pressure Object Threshold of the first connection (between the
GenerateFlowFile and SplitText processors) to 10.
12. Start the GenerateFlowFile and SplitText processors. Do not start any other
processors in the dataflow.
Note that when the number of objects queued in the first connection reaches 10, the
GenerateFlowFile processor stops sending files.
Observe that the number of objects queued in the splits connection does not
change because it still exceeds the maximum object threshold.
Files still cannot be processed by the SplitText processor.
14. Set the Back Pressure Object Threshold of the first connection (between the
GenerateFlowFile and SplitText processors) to 20.
Note that the object threshold in the first connection queue is no longer exceeded.
15. Start the GenerateFlowFile and SplitText processors. Do not start any other
processors in the dataflow.
More files are now being queued in the first connection, but the splits connection
receives no more files and files still are not being processed by the SplitText
processor.
16. After the queued file count in the first connection reaches 20, stop the
GenerateFlowFile and SplitText processors.
17. Set the splits connection Back Pressure Object Threshold to 250 and the Size
Threshold to 5 KB.
Note: If the number of files shown in your splits connection queue is greater
than 250, set the Back Pressure Object Threshold to something greater than the
number of files shown in the queue.
Without starting any processors, the splits connection queue shows the same
number of files queued, but the size threshold is exceeded immediately. You may
need to refresh your canvas to see this.
19. Start the GenerateFlowFile and SplitText processors. Do not start any other
processors in the dataflow.
Note that more files are now queued in the splits connection and the first
connection queue goes empty. The SplitText processor also shows some data
being processed.
21. Stop the entire dataflow after running it for 20-30 seconds.
22. Verify that files have been written to the /tmp/nifi/putfile directory for this
dataflow.
23. Reset the Back Pressure Object Threshold to 10000 and the Back Pressure Size
Threshold to 1 GB in the first and splits connections.
In this exercise, you will practice using process groups with dataflows.
• Duplicate the existing flow that collects the output of an application log
• Partition both flows into separate process groups
• Create a third group to save the data using input and output ports
1. Ensure all the processors in the Put app log dataflow are stopped.
2. Select all the existing dataflow components. Right-click and select Copy, then right-
click and select Paste. This will duplicate the entire flow and create a new dataflow
to run in parallel with the original one.
3. Change the names of each processor in the new flow you just created, replacing Put app log at the beginning of each name with Put user log.
4. Modify the TailFile processor in the new Put user log dataflow.
5. View the state of the Put app log TailFile processor by right-clicking on the processor and selecting View State, then click Clear State. This resets the read pointer so the processor can begin reading data again.
6. Repeat the previous step to clear the state of the Put user log TailFile
processor in the new dataflow if necessary.
7. Change the filename property of the new Put user log UpdateAttribute
processor to nifi-userlog.${now():format('HH:mm:ss')}.gz.
Now you have two parallel flows, each collecting data from a different log, processing it, and saving it to the same output directory with different file names.
8. Select all the components of the original Put app log dataflow, then click the
“Group” button on the Operate palette. This will prompt you for a process group
name. Enter Put app log Group.
9. Repeat the same steps for the Put user log (duplicate) dataflow. Give it the
name Put user log Group.
10. Now you have two independent process groups working in parallel, each with its own flow. Start the two flows and confirm they are working correctly.
11. Stop the flows in each process group before moving to the next step.
12. Drag a process group icon onto the canvas. When prompted, set the name to Save
app-user log Group and click ADD.
13. Double-click the Put app log Group process group to view its dataflow.
14. Right-click the PutFile processor and select Copy. (Simply clicking the processor
and entering Ctrl+C will also work.)
15. Return to the root canvas by clicking NiFi Flow in the navigation breadcrumb trail
below the canvas.
16. Double-click to open the Save app-user log Group process group. Paste the
PutFile processor you copied above.
17. Now you have a PutFile processor inside the Save app-user log Group
process group. The next step is to add an input port to the process group to receive
data.
a. Drag the input port icon onto the canvas and name the port Put app-user
log InputPort.
18. Open the Put app log Group process group again and delete the connection
between the UpdateAttribute and PutFile processors, then delete the
PutFile processor.
19. Add an output port to the canvas and name it Put app log OutputPort.
20. Create a connection from the UpdateAttribute processor to Put app log
OutputPort.
Note: The port will stay in an invalid state until you connect it to another processor.
21. Repeat the same steps inside the Put user log Group process group to add an
output port called Put user log OutputPort.
23. Return to the root canvas. Connect Put app log Group to Save app-user log Group. In the configuration pop-up window, simply click ADD, as there is only one input port and one output port at the source and destination.
24. Repeat the steps above to create a connection from Put user log Group to
Save app-user log Group.
25. Start the process groups and confirm that the flows still work as they did before.
When you are done, stop the process groups.
2. List the files in the /tmp/nifi/original directory in time order and note the
most recent FlowFile saved there.
$ ls -lt /tmp/nifi/original
You should see files having filenames beginning with nifi-app and ending with log. These are files written to disk by the Relation PutFile Original processor.
3. Copy the filename for the most recent file, for example nifi-app.16358743-18061163.log.
4. Open the global menu using the “Open menu” icon in the upper right corner of the
UI (indicated by three horizontal lines) and select Data Provenance.
5. Peruse the list of provenance events and hover over the “View Details” (to the left of
each event), and “Show Lineage” and “Go To” icons (to the right of each event). Note
how many events are in the list. In the example screenshot, 1,000 events are shown.
6. Click on the search icon (indicated by a magnifying glass) on the top right of the list
and enter the filename you copied from the /tmp/nifi/original directory.
If your work on this exercise continues from one day to the next, you will need to enter a value for Start Date in the search dialog, specifying the day you first ran this dataflow.
7. Click SEARCH.
Note the resulting list of events from this search. In this example, 755 events are
listed now.
The list is ordered from the most recent to the oldest events.
Note: Depending on the width of your browser, you might not be able to see the
full contents of some of the cells in the list. If you hover your pointer over the cell, a
popup will display showing the full value in the cell.
8. Click on the Date/Time column to reorder the events from oldest to newest. Note
that the first events listed are RECEIVE, FORK, DROP, and SEND. All remaining
events are CONTENT_MODIFIED.
10. Scroll to the end of the list and note the timestamp of that event. In this example it is
09:53:16.602.
11. Calculate the difference between these two times. In this example, the difference
is 0.688 seconds. This means that in less than one second, 755 events were
processed for this single FlowFile. This gives some perspective on how fast NiFi can
process files. Your results will vary.
12. Scroll back to the top of the event list. Ensure that the list is still sorted by Date/
Time, from oldest to newest.
The first entry is a RECEIVE event. Notice that this event was produced by the
Relation TailFile component. This is the first processor in the dataflow and
the FlowFile was generated by tailing the nifi-app.log.
13. Click on the “View Details” icon to the left of the RECEIVE event. Ensure you are on
the DETAILS tab.
Note the values shown for Time, Event Duration, Type, FlowFile Uuid, File Size,
Component Name and Component Type. You might need to scroll down to see all
these details.
Note that the Source FlowFile Id field shows the bytes of the source file that the FlowFile contains. The Transit Uri displays the source filename: /var/log/nifi/nifi-app.log.
14. Click on the ATTRIBUTES tab. The filename attribute shows the name that the
TailFile processor assigned to the FlowFile it generated when reading from the
source file (/var/log/nifi/nifi-app.log). Note that the filename is in the
form nifi-app.xxx-yyy.log. The numbers xxx and yyy indicate the byte
range of the data read from the source file that is contained in this FlowFile.
16. Click on the “View Details” icon for this FORK event.
Note that the FlowFile Uuid shown on the DETAILS tab is the same as for the RECEIVE event (that is, it is the same file) and that it is the parent FlowFile of many child FlowFiles.
17. On the ATTRIBUTES tab, note that the filename is the same as for the RECEIVE
event.
From the SplitText processor, the original FlowFile is sent to the Relation PutFile Original processor, where it is written to the /tmp/nifi/original directory and then dropped.
20. Scroll down if necessary and note that the Transit Uri refers to the filename of the FlowFile it received from the upstream processor (SplitText).
The PutFile processor uses the transit URI of each FlowFile it receives as the
destination to which the FlowFile contents are saved.
24. Scroll down if necessary and note under Details that this event was Auto-
Terminated by success Relationship.
25. Click on the ATTRIBUTES tab and note that the filename is the same as the
FlowFiles in the previous events.
View the event’s details and note that the UUID is now different from the ones in
the previous events. This FlowFile is one of the files you saw in the list of Child
FlowFiles shown in the details of the FORK event. The Parent FlowFile is the
FlowFile first generated by TailFile, as you can see from its UUID.
27. On the ATTRIBUTES tab, note that the filename and segment.original.filename give the name of the first FlowFile—in this example, nifi-app.16358743-18061163.log. tailfile.original.path gives the path to the nifi-app.log read by the TailFile processor, and text.line.count gives the number of lines specified for splitting FlowFiles in the SplitText processor—15.
In the example, the first FlowFile was split into 751 child FlowFiles, as shown in the
provenance event list.
29. Click on the “Show Lineage” icon to the right of the event line for the
CONTENT_MODIFIED event.
The red circle indicates the event for which the lineage is being examined
—CONTENT_MODIFIED in this example.
This lineage shows the events involved from compression of the first file generated
by the SplitText processor to its being given a timestamp and a new filename in
the UpdateAttribute processor, to its being written to disk and dropped.
31. Right-click on the CONTENT MODIFIED event of the lineage and click View details.
This is the same information seen for the first CONTENT_MODIFIED event by
clicking the “View Details” icon from the provenance event list. The FlowFile Uuid
specified here is the FlowFile for which this lineage is being examined.
33. Right-click on the ATTRIBUTES MODIFIED event of the lineage and click View
details.
Under DETAILS, note that the FlowFile Uuid is the same as for the
CONTENT_MODIFIED event seen previously.
36. Right-click on the SEND event of the lineage and click View details.
Note that the FlowFile Uuid is still the same as for the first CONTENT_MODIFIED
event and that the Transit Uri shows that the compressed, timestamped file was
written to the /tmp/nifi/relation directory as specified in the Relation
PutFile processor.
37. On the ATTRIBUTES tab, note that the filename is the same as observed in the
ATTRIBUTES MODIFIED part of the lineage.
40. On the ATTRIBUTES tab, observe that the filename is the same as for the SEND
lineage event (timestamped and zipped) and the UUID (seen at the bottom of the
window) is the same as the FlowFile UUID for the CONTENT_MODIFIED event.
The file is dropped after being written to disk.
The lineage is expanded to show the parent lineage of the CONTENT MODIFIED
FlowFile.
Another FlowFile icon is visible, representing the first FlowFile coming from a
RECEIVE event that is now also visible. Hovering over this icon shows its lineage
through being written to disk and dropped.
A diagram of child FlowFile lineages is displayed. You can drag the view to the right
or left to see more of the diagram.
46. Click the arrow icon on the upper right of the window to return to the provenance
event list.
47. Click the arrow on the far right of the first CONTENT_MODIFIED event to view the
Relation CompressContent processor on the canvas that generated the event.
Creating a Template
Build a Simple Dataflow
Build a very simple flow from which you will create a template.
Create a Template
4. Select all the components of the simple flow you created above.
6. Give the template the name Simple Flow Template, then click CREATE.
8. When prompted, ensure that the Simple Flow Template you created above is
selected, then click ADD.
9. A second copy of the simple flow will be added to the canvas. Compare the configurations of the new flow’s components with those of the original to confirm that they are the same.
11. The NiFi Templates window appears showing a list of all templates, including the
one you created in the previous section.
12. Click the “Download” button to the right of your template. This will create an XML file containing the template.
13. Firefox will ask if you want to open or save the template XML file.
Choose Save File, then click OK. This will save the template to the file
Simple_Flow_Template.xml in your Downloads directory on your remote
desktop’s file system.
14. Open the XML file in an editor to review it. You can use whatever editor you prefer.
If you do not have a preference, you can use the Pluma editor following these steps:
a. Open Pluma using the editor icon on your remote desktop toolbar at the top of
the browser window.
c. In the Places selector on the left, click training to view the contents of your
home directory on the right.
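Optionally, you can confirm from a terminal that the downloaded file is a NiFi template by checking its root XML element (a quick check, not part of the original steps):
$ grep -m 1 '<template' ~/Downloads/Simple_Flow_Template.xml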
When you have completed this section of the exercise, you can delete the components of
both the original and duplicate simple flows you created above.
Importing a Template
15. Click the “Upload Template” button on the Operate palette.
You do not need to have anything selected on the canvas.
17. In the Places selector on the left, click training to view the contents of your home
directory on the right.
Browse to the training_materials/nifi/exercises directory. Open the
putfile-dataflow-template file.
You should get a message confirming that the template was imported correctly.
18. Select the Templates option on the global menu. Confirm that the template you just imported is on the template list. If you are not sure which one it is, find the most recent one based on the Date/Time column.
19. Return to the canvas. Drag the template icon onto a clear area of the canvas and
choose the newly uploaded template.
The flow defined by the imported template will appear.
When you are done with this section of the exercise, delete the flow created by the
imported template.
20. Make sure you are viewing the root canvas. Select all the components on the canvas.
The easiest way to do this is using Ctrl+A.
21. Click the “Create Template” button on the Operate palette as you did earlier.
22. Give the new template a name and description that provides key information about
what the backup template contains, such as a date and what was included.
23. Review the template list to confirm that the template was saved.
2. In the NiFi Registry UI, click the “Settings” icon (indicated by a symbol of a wrench
or spanner). This opens the Administration page which shows the list of buckets.
(The list is probably empty at this point.)
6. Return to the NiFi Registry main page by clicking the NiFi Registry navigation link
at the top of the window.
9. Select the REGISTRY CLIENTS tab. The list of clients should be empty.
10. Click the “Register a new registry client” button, indicated by a plus sign (+).
11. Enter the name Local registry and the URL of the NiFi Registry server:
http://master.example.com:18080. Then click ADD.
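If you want to verify that the Registry’s REST API is reachable before adding the client, you can query it from a terminal (an optional check, assuming the default API path):
$ curl -s http://master.example.com:18080/nifi-registry-api/buckets
This should return a JSON array describing the buckets you have created.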
13. On the root canvas, locate the flow with the label First Dataflow.
14. Select the components of the flow and copy them to the clipboard using the “Copy”
button on the Operate palette.
15. Drag the process group icon from the toolbar onto the canvas and name the group
Versioned Flow Group.
16. Double-click to open the process group, then paste the components you copied
above.
17. Use the breadcrumbs on the lower left of the canvas to return to the root canvas.
Start Versioning
18. Right-click on the process group and select Version > Start Version Control.
19. You will be prompted to name the flow and enter comments for the current version.
Use the following values:
• Registry: Local registry (you created this registry client earlier)
• Bucket: Exercise Flows (you created this bucket earlier)
• Flow Name: Simple Versioned Flow
• Flow Description (optional): a description of your choosing
• Version Comments: initial version
20. Click SAVE to commit the initial version of the flow in the process group into the
flow repository.
Note the green check mark that now appears on the process group. This indicates
that the current flow version is the most recent one in the repository.
21. Open the versioned process group and make small changes to the flow. For
instance, try changing the names of the two processors to include the word
Versioned at the beginning.
22. Return to the tab in which you were viewing the NiFi Registry UI. You should be
viewing the top-level page showing a list of versioned flows. The list should now
include the flow you started versioning above. Notice the name of the flow (which
you set when you started versioning) and the number of current versions (just one,
so far).
23. Return to the root canvas and notice that the green check mark on the process
group has changed to a gray star. This indicates that you have made changes to the
flow locally, but not yet added the new version to the repository.
24. View the list of changes by right-clicking on the process group and selecting
Version > Show local changes.
A list of all the changes you have made since the initial version was committed will be displayed.
25. Note that each change shows you the name of the component that was changed,
the type of change, and a comparison between the old and new versions of the
component.
26. Click the “Go To” button to the right of one of the changes (indicated by an arrow)
to view the changed component on the canvas.
27. Return to the root canvas. Right-click the process group and select Version >
Commit local changes.
28. Enter a comment that describes the changes you have made, then click SAVE.
29. Notice that the green check mark is now shown on the process group again,
indicating that the version on the canvas is the same as the latest version in the
repository.
31. Review the flow information on the left and the change log on the right. Notice
that the changes are shown in reverse chronological order. Try clicking to switch
between Version 1 and Version 2 to view the version comments and commit date.
c. Under PROPERTIES, set File Size to 1 b and ensure Batch Size is set to 1,
Data Format is Text and Unique FlowFiles is false.
a. Add a LogAttribute processor to the canvas and position it below and to the
left of the RouteOnAttribute processor.
a. Add a LogAttribute processor to the canvas and position it below and to the
right of the RouteOnAttribute processor.
Run the Dataflow to View Attributes That Are Set and Logged
You will now run successive portions of the dataflow just created, to observe FlowFiles
moving through it and to check attributes that are set and logged.
Note that attributes are logged to the file /var/log/nifi/nifi-app.log.
17. Start the Generate Small Files processor. Let it run until at least two files are
queued in the subsequent connection.
Refresh the canvas as needed to see the latest number of files queued.
20. Once all FlowFiles have been sent to the queues coming out of the
UpdateAttribute 1 processor, stop it.
21. View the values of myAttribute in both queues coming out of the
UpdateAttribute 1 processor.
b. Click the View Details icon to the left of any queue item line.
d. View other queue items for both connections to verify that all FlowFiles have
the Initial-File attribute.
22. Start the UpdateAttribute 2 processor, then stop it after a few seconds.
23. View the values of myAttribute in the queue items for the connection coming out
of this processor.
It should be Prod1-File.
24. Start the UpdateAttribute 3 processor, then stop it after a few seconds.
25. View the values of myAttribute in the queue items for the connection coming out
of this processor.
It should be Prod2-File.
27. Hover over the bulletin icon in the upper right of the processor and note the
attributes shown.
29. View the values of myAttribute in the queue items for the connection coming
from the LogAttribute 1 processor.
Both Prod1-File and Prod2-File should be visible.
30. Start the RouteOnAttribute processor and then stop it after a few seconds.
31. View the values of myAttribute in the queue items for both connections coming
from this processor.
33. View bulletin information by selecting Bulletin Board from the Global Menu.
Logged attributes should be visible here as well.
$ less /var/log/nifi/nifi-app.log
b. Once you are viewing the file, search for logged attributes by typing /
Standard FlowFile Attributes to find logged attribute entries related
to this dataflow.
For example:
--------------------------------------------------
Standard FlowFile Attributes
Key: 'entryDate'
Value: 'Wed Sep 11 14:16:42 PDT 2019'
Key: 'lineageStartDate'
Value: 'Wed Sep 11 14:16:42 PDT 2019'
Key: 'fileSize'
Value: '1'
FlowFile Attribute Map Content
Key: 'filename'
Value: 'c464c054-9520-4b91-b281-2543dcf78b00'
Key: 'myAttribute'
Value: 'Prod2-File'
Key: 'path'
Value: './'
Key: 'uuid'
Value: 'c464c054-9520-4b91-b281-2543dcf78b00'
--------------------------------------------------
In this exercise, you will practice using the NiFi Expression Language.
Your dataflow will
a. Add a ListFile processor to the process group and name it NEL ListFile.
4. Connect the ListFile and FetchFile processors for the success relationship.
5. Determine the file types of the incoming files based on the file extension.
Set the property to an expression to extract the file type—that is, the portion of
the filename after the dot (.): ${filename:substringAfter('.')}
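For example, for a FlowFile whose filename attribute is data.xml (a hypothetical name), this expression evaluates to xml, the value stored in the FileExtension attribute used in the next step.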
7. Route the files based on the FileExtension attribute set in the upstream
processor you just added.
c. Set the value of the new property to an expression that returns true if the file
extension is xml or false otherwise: ${FileExtension:equals('xml')}
10. Connect the RouteOnAttribute processor to the new PutFile processor for
the isXML relationship.
Your dataflow should look something like this:
12. Connect the RouteOnAttribute and SplitText processors for the unmatched
relationship. This makes sure files other than XML files go to the SplitText
processor.
c. To have each matched record on separate lines, add a single empty line under
the Demarcator property, using Shift+Enter.
16. Make a connection from the MergeContent processor to the existing PutFile
processor for the merged relationship. This will save files in the same destination
directory.
Your dataflow should now look something like this:
17. Before starting the dataflow, verify the files to be pumped into the flow.
$ cd ~/training_materials/nifi/data/nel
$ ls -l
$ more Exchange_Data.txt
$ ls -l /tmp/nifi/nel
Note that the size of the XML file is still about 2K, but the size of the txt file is
significantly smaller due to extracting only records containing the string 2003-07.
20. Use the more command to view the contents of the txt file.
$ more /tmp/nifi/nel/Exchange_Data.txt
Note that only records containing the string 2003-07 are shown. The content of
the file has been filtered using NiFi expression language.
24. Remove the connections around the UpdateAttribute processor and then
remove the processor, as it is no longer needed.
$ ls -l /tmp/nifi/nel
27. Start the dataflow and check the files written to /tmp/nifi/nel to verify correct
functionality.
Clear the state of the ListFile processor first, if necessary.
Note that the timestamps on the files should have changed.
c. Set the File Size property to 1KB and keep the remaining defaults.
f. Set the File Size property to 10KB and keep the remaining defaults.
i. Set the File Size property to 100KB and keep the remaining defaults.
c. Ensure the Rate Control Criteria property is set to data rate, and the Time
Duration property is set to 1 min.
The ADVANCED tab is available from any of the other tabs in the
processor configuration.
c. On the subsequent screen under FlowFile Policy, click the drop-down arrow
and select use original.
f. Click the + to the right of Conditions and add the following Expression:
${fileSize:le(1024)}
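Note: le(1024) is the Expression Language less-than-or-equal function, so this condition returns true only for FlowFiles whose fileSize is at most 1,024 bytes; that is, the 1 KB files generated earlier in this dataflow.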
j. Click ADD.
Note that the name of the rule appears on the upper right of the screen when a
rule is selected on the left side of the screen.
m. In the Add Rule screen in the box Copy From Existing Rule (Optional), type
Size.
This will show the Size1KB rule previously configured.
n. Select the existing Size1KB rule to use as a template for this new rule.
o. Click ADD.
p. Click in the box containing the expression for the Size10KB rule and change it
to:
${fileSize:gt(1024)}
q. Click OK.
s. Click in the Value entry for the filename attribute and change the string small to medium.
t. Click SAVE.
Your rules should now look like this:
v. In the Add Rule screen in the box Copy From Existing Rule (Optional), type
Size.
This will show the Size1KB and Size10KB rules previously configured.
w. Select the existing Size10KB rule to use as a template for this rule and click
ADD.
y. Click in the Value entry for the filename attribute and change the string medium to large.
z. Click SAVE.
Your rules should now look like this:
aa. Click the X in the upper right corner of the Rules screen.
c. Set the Compression Format property to gzip and Update Filename to true.
This will compress the file using the gzip format and add a gz extension to the
filename automatically.
12. Select all the processors of the dataflow and start them.
13. Note the number of files and amount of data shown on the surface of the
ControlRate processor.
Refresh the canvas frequently in order to see current statistics as you run this
dataflow.
14. Observe that when the amount of data flowing to the ControlRate processor
exceeds 120KB, data starts to be queued in the connection just preceding it.
Recall that the rate of 1KB file generation is every 10 seconds, that of the 10KB files
is every 30 seconds, and that of the 100KB files is every 60 seconds. One file of each
size is generated immediately upon starting the dataflow, then the timer generates
files per this schedule. Therefore, 111KB of data is sent through the dataflow right
away (one 100KB, one 10KB, and one 1KB file).
Over the next 30 seconds, two (or three, depending on timing) more 1KB files are
sent (10 seconds each) and one 10KB file is sent (at the 30 second mark). This totals
six (or seven) files with 123KB (or 124KB) of data, and at that point, additional files
should be queued.
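As a quick back-of-the-envelope check of these totals (sizes in KB, using plain bash arithmetic; shown only as a sketch):
$ echo $((100 + 10 + 1))     # t=0s: one file of each size = 111 KB
$ echo $((111 + 2*1 + 10))   # by t=30s: two more 1 KB files plus one 10 KB file = 123 KB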
15. Note when queued files begin to be released to continue through the dataflow.
16. Observe the number of files and amount of data shown on the surface statistics of
the remaining processors in the dataflow. Compare this to the number of files and
amount of data generated by the three GenerateFlowFile processors.
$ ls -l /tmp/nifi/opt
The filenames reflect file sizes and have timestamps, per the configuration in the
UpdateAttribute processor.
Note the number of files shown and compare this to the file count that has passed
through the dataflow.
Compare the file count for each file size with the numbers generated for the
dataflow.
18. To verify the timing of files sent through the dataflow, sort the file listing by
modification time by adding the t option:
$ ls -lt /tmp/nifi/opt
In this exercise, you will use a remote process group to implement a Site-to-Site
dataflow.
You will use the existing flows to collect the output in nifi-app.log and nifi-
user.log, and send the data to a remote NiFi instance.
Important: This exercise depends on completion of Hands-On Exercise: Simplify
Dataflows Using Process Groups. If you did not complete that exercise, use the
Process Groups with I/O Ports Solution process group provided for you
on the Solutions area of your canvas. Copy the process groups and ports inside this
process group to an empty area of the root canvas prior to beginning this exercise.
3. Click the peer host thumbnail to open a new window showing the remote host
desktop. If the VM was not already running, wait 5-10 minutes for the VM and its
services to start.
$ check-health.sh
6. Confirm that all required services are noted as “good.” Required services: HDFS,
Hive, Kafka, NiFi, NiFi Registry, Zookeeper.
7. If any of the required services are “bad”, wait several minutes and then try again. (It
may take up to 10 minutes for all services to start fully.) If they are still not running
correctly, try restarting all the services by running the following script:
$ start-cluster.sh
Wait until the script completes and then check the health again.
8. Start the Firefox browser and click on the NiFi bookmark. The URL is http://peer.example.com:8080/nifi.
Verify that you can see the NiFi canvas.
10. Configure a PutFile processor on the peer system to write data received from the
master (local) system.
d. Set the Conflict Resolution Strategy to replace and ensure that Create
Missing Directories is set to true.
11. Drag the input port icon to the canvas and name it Remote-InputPort-1.
14. Locate the dataflow you created for the exercise entitled Hands-On Exercise:
Simplify Dataflows Using Process Groups. This dataflow includes three process
groups: Put app log Group, Put user log Group, and Save app-user
log Group.
15. Select all components of the dataflow, then copy and paste the dataflow to a new
area of the canvas.
The copied process group names will now have a prefix of Copy of followed by
the remainder of the process group name.
16. Rename the Put app log and Put user log process groups to be Remote
Put app log Group and Remote Put user log Group, respectively. Leave
the Save app-user log Group name as is, because it will be deleted.
17. Delete the connections between the Save app-user log Group and the other
two process groups. You should now have three unconnected process groups.
19. Drag the remote process group icon onto the canvas. When prompted for a URL, enter the URL of the remote NiFi instance: http://peer.example.com:8080/nifi/.
20. Connect the Put app log process group to the remote process group. The Create Connection dialog window should show Put app log OutputPort for From Output and Remote-InputPort-1 for To Input.
21. Similarly, connect the Put user log process group to the remote process group. The Create Connection dialog window should show Put user log OutputPort for From Output and Remote-InputPort-1 for To Input.
Your dataflow on the master should now look something like this:
23. In the master (local) NiFi UI, right-click on the remote process group and select Enable Transmission to activate the remote flow.
24. Start the Put app log and Put user log process groups.
25. In the master NiFi UI, observe the statistics for the remote flow process.
26. In a terminal window on the peer, verify that files were saved in the specified
destination directory.
$ ls -l /tmp/nifi/remote-data
27. Stop the dataflows running on both the remote and local NiFi systems.
28. In the master NiFi UI, disable transmission for the remote process group
component.
29. Move all the components you created in the master NiFi instance to a new process
group called Site-to-Site Dataflow.
30. Go to the URL for your exercise environment portal and shut down the peer host
using the "Power options for this VM" icon. Select the Shut down option.
1. In a terminal window, check the current disk usage of the root filesystem:
$ df -h /
Take note of the value in the Use% column, indicating the percentage of the current disk space used.
2. In the NiFi UI, select Controller Settings from the global menu.
4. Click the “Create a new reporting task” button (indicated by a plus sign).
6. Notice that the warning icon displays next to the new reporting task, indicating that
it is misconfigured. Hover your pointer over the warning icon on the left side of the
entry to see the problem. In this case, it is indicating that you must set a value for
Directory Location.
Click the “Edit” button (indicated by a pencil icon) and configure the task as follows:
• Run schedule: 1 min
• Properties
◦ Threshold: 30% (This is not a typical threshold you would set in a production
environment, but to be able to test this task, the value must be lower than the
used disk space you noted earlier.)
◦ Directory Location: / (the local filesystem root directory)
◦ Directory Display Name: root (this is how the statistics will be labeled in the report)
8. Start the MonitorDiskUsage reporting task by using the “Start” button to the
right of the task in the list.
9. Let the task run for a few minutes, then refresh the list of tasks using the refresh icon in the bottom left.
You should see a red outline next to the task indicating that reporting events have
occurred—that is, the per-minute check of the disk usage was over the configured
threshold of 30%.
Hover your pointer over the indicator to see details.
Note that the directory display name you configured above (root) is used to
identify the disk that is being monitored.
10. Stop the reporting task using the “Stop” icon to the right of the task list, then close
the NiFi Settings window.
12. The list of bulletins should include the same warnings you saw in the notification
pop-up on the NiFi Settings page above.
13. Click the X in the upper right of the screen to close the Bulletin Board.
14. View the status history of a component (right-click it and select View status history). If needed, run the flow again to generate some statistics to view. After running the flow, wait several minutes for the statistics to become available.
15. You should see a graph showing statistics for the selected metric (FlowFiles Out by
default).
Try dragging a rectangle around a small portion of the lower graph. The larger
graph above will zoom in on the selected area.
16. Try selecting different types of metrics. You can view and select from the available
metrics by opening the selection box at the top of the upper graph.
17. When you are done exploring, close the Status History window.
5. Filter on the string hive, select the HiveConnectionPool controller, and click
ADD.
7. Click APPLY.
9. Click on the X at the upper-right of the screen to exit NiFi Flow Configuration.
In this exercise, you will explore NiFi integration with Kafka and HDFS.
1. On the NiFi UI, add a ListFile processor to the canvas with an Input Directory
of /home/training/training_materials/nifi/data/integrate. Name
it Integrate ListFile.
2. Add a FetchFile processor to the canvas and check failure, not.found, and
permission.denied for Automatically Terminate Relationships. Name it
Integrate FetchFile.
3. Drag a connection from the ListFile processor to the FetchFile processor for
the success relationship.
4. The FetchFile processor will fetch a JSON file containing sample data that will be sent to Kafka. If desired, you may examine the file using a terminal window:
$ cat ~/training_materials/nifi/data/integrate/mydata.json
[
  {
    "id":1,
    "name":"Jobin",
    "title":"consultant"
  },
  {
    "id":2,
    "name":"Sam",
    "title":"developer"
  },
  {
    "id":3,
    "name":"Mary",
    "title":"software engineer"
  }
]
5. In a terminal window, create a Kafka topic named nifidata and then list Kafka
topics:
$ kafka-topics \
--create --zookeeper master.example.com:2181/kafka \
--replication-factor 1 --partitions 1 \
--topic nifidata
$ kafka-topics --list --zookeeper master.example.com:2181/kafka
Toward the end of the output from the command to list Kafka topics,
you should see the string nifidata after a line that contains INFO
zookeeper.ZooKeeperClient: [ZooKeeperClient] Connected.
d. Click in the Value field for the Record Reader property to configure a Record
Reader controller service.
g. Click CREATE.
The JsonTreeReader should now be listed as the value for Record Reader.
h. Click on the arrow at the right of the Record Reader property. Click YES when
asked to Save changes before going to this Controller Service?.
On the PROPERTIES tab, note that, by default, the schema of the data to be used
by the PublishKafkaRecord_2_0 processor is inferred. This means that the
controller will automatically create a schema by examining the data it finds in
the FlowFile.
m. Click the X in the upper-right of the NiFi Flow Configuration screen to return
to the canvas.
o. Click in the Value field for the Record Writer property to configure a Record
Writer controller service.
r. Click on the arrow to the right of the Record Writer property to configure the
JsonRecordSetWriter controller service.
Click YES to Save changes before going to this Controller Service?.
v. Click the X in the upper-right of the NiFi Flow Configuration screen to return
to the canvas.
$ kafka-console-consumer \
--bootstrap-server master.example.com:9092 \
--topic nifidata
The prompt will not return as this command waits to display messages sent to
Kafka.
{"id":1,"name":"Jobin","title":"consultant"}
{"id":2,"name":"Sam","title":"developer"}
{"id":3,"name":"Mary","title":"software engineer"}
11. Stop the processors. Leave the terminal window open where the kafka-
console-consumer command is running.
Note that Offset Reset is set to latest. This picks up the most
recent data from the Kafka topic. If it is set to earliest, it will
pick up all data from the specified topic from the beginning.
This can result in a very large amount of data processed by the
ConsumeKafka_2_0 processor.
• Hadoop Configuration Resources - /home/training/training_materials/nifi/data/hdfs/core-site.xml
• Directory - /tmp/nifidata
• Conflict Resolution Strategy - replace
19. In a different terminal window, run the following command to view the data written
to HDFS.
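The listing command itself is not reproduced here; assuming the standard HDFS command-line client and the destination directory configured above, it would be along these lines:
$ hdfs dfs -ls /tmp/nifidata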
20. Choose some files and view their contents to verify rows written. Use the following
command for each filename, substituting alphanumeric filenames returned in the
previous command for filename in your commands:
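Assuming the same HDFS command-line client, the per-file command would likely be:
$ hdfs dfs -cat /tmp/nifidata/filename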
Each file should contain one of the lines returned in the terminal window running
the kafka-console-consumer command, for example:
{"id":1,"name":"Jobin","title":"consultant"}
If the prompt comes back on the same line as the output, just press
the Enter key to get the prompt on a line by itself.
You have now set up different ways to interact with CDP Services: