
Step 1. Create a topic in Kafka so that producers can enqueue data to the topic and consumers can dequeue data from it.

• Create a new topic ‘telecom_test’ using the following command:

kafka-topics --create --zookeeper ip-10-1-1-204.ap-south-1.compute.internal:2181 --replication-factor 3 --partitions 3 --topic telecom_test
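As an optional sanity check, the new topic's partition and replication settings can be verified with kafka-topics --describe against the same ZooKeeper quorum used above:

# Describe the topic to confirm 3 partitions and replication factor 3
kafka-topics --describe --zookeeper ip-10-1-1-204.ap-south-1.compute.internal:2181 --topic telecom_test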

• Create a Kafka console producer using the following command:

kafka-console-producer --topic telecom_test --broker-list ip-10-1-1-204.ap-south-1.compute.internal:9092,ip-10-1-2-40.ap-south-1.compute.internal:9092,ip-10-1-2-24.ap-south-1.compute.internal:9092

• Create a Kafka console consumer using the following command:

kafka-console-consumer --topic telecom_test --bootstrap-server ip-10-1-1-204.ap-south-1.compute.internal:9092
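Before wiring in Flume, the topic can be smoke-tested end to end: lines typed into the producer web-shell should appear in the consumer web-shell. The sample lines below are placeholders:

# In the producer web-shell, type a few test lines and press Enter after each:
test message 1
test message 2
# Each line should be echoed almost immediately by the console consumer in the other web-shell.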

Step 2. Configure a Flume agent with Kafka as the source, a memory channel, and HDFS as the sink.

• Create a telecom_test.conf file with the following configuration:

agent1.sources = kafka-source
agent1.channels = memory-channel
agent1.sinks = hdfs-sink

agent1.sources.kafka-source.type = org.apache.flume.source.kafka.KafkaSource
agent1.sources.kafka-source.zookeeperConnect = ip-10-1-1-204.ap-south-1.compute.internal:2181
agent1.sources.kafka-source.topic = telecom_test
agent1.sources.kafka-source.channels = memory-channel
agent1.sources.kafka-source.interceptors = i1
agent1.sources.kafka-source.interceptors.i1.type = timestamp
agent1.sources.kafka-source.kafka.consumer.timeout.ms = 100

agent1.channels.memory-channel.type = memory
agent1.channels.memory-channel.capacity = 10000
agent1.channels.memory-channel.transactionCapacity = 1000

agent1.sinks.hdfs-sink.type = hdfs
agent1.sinks.hdfs-sink.hdfs.path = /user/sharmaansh83edu/telecom_test
agent1.sinks.hdfs-sink.hdfs.rollInterval = 5
agent1.sinks.hdfs-sink.hdfs.rollSize = 0
agent1.sinks.hdfs-sink.hdfs.rollCount = 0
agent1.sinks.hdfs-sink.hdfs.fileType = DataStream
agent1.sinks.hdfs-sink.channel = memory-channel

• Place the conf file on the FTP server in the /home/<username> directory.
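The HDFS sink writes to the hdfs.path configured above, so that directory should exist and be writable by the user running the agent. Assuming the same path as in the configuration, it can be pre-created with:

# Create the target directory for the HDFS sink (if it does not already exist)
hdfs dfs -mkdir -p /user/sharmaansh83edu/telecom_test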

Step 3. Start the Flume agent and test the output to HDFS.

• Start the Flume agent using the following command in a new web-shell:

flume-ng agent --conf conf --conf-file telecom_test.conf --name agent1 -Dflume.root.logger=INFO,console
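If the agent needs to keep running while the pipeline is tested from other web-shells, one option (a sketch using standard shell tools; flume_agent.log is an arbitrary file name, not part of the original setup) is to run the same command in the background:

# Run the agent in the background and capture its console output in a log file
nohup flume-ng agent --conf conf --conf-file telecom_test.conf --name agent1 -Dflume.root.logger=INFO,console > flume_agent.log 2>&1 &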
Step 4. Test the complete pipeline:
• Send sample messages from the Kafka console producer.

• Verify that the messages are received in HDFS (see the commands below).
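The files written by the HDFS sink can then be listed and inspected to confirm the messages made it through. This assumes the hdfs.path from the configuration and the sink's default FlumeData file prefix:

# List the files rolled by the HDFS sink
hdfs dfs -ls /user/sharmaansh83edu/telecom_test
# Print their contents; the messages typed into the producer should appear here
hdfs dfs -cat /user/sharmaansh83edu/telecom_test/FlumeData.*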
