Pipeline Nifi Aws Elk

Uploaded by

BigData Lille

Build a Data Pipeline in AWS using NiFi, Spark, and ELK Stack

Agenda
In this project, we build a data pipeline that combines AWS services and Apache tools:
Apache NiFi, Apache Spark, AWS S3, an Amazon EMR cluster, Amazon OpenSearch, Logstash,
and Kibana. We fetch data from an API using Apache NiFi, transform it, and load it into
an AWS S3 bucket. Logstash then ingests the data from the S3 bucket into Amazon
OpenSearch, and Kibana reads the data from Amazon OpenSearch to visualize it. Alongside
this, we also analyze the data using PySpark.

Tech stack:
➔ Language: Python
➔ Package: PySpark
➔ Services: Apache NiFi, AWS EC2, Apache Spark, AWS S3, Amazon EMR cluster, Amazon
OpenSearch, Logstash, Kibana

Apache NiFi:
Apache NiFi is a data logistics tool that automates the movement of data between
disparate systems. It provides real-time control, making it simple to manage the transfer
of data between any source and any destination, and it buffers all queued data.

Amazon EMR cluster:

Amazon EMR is a managed cluster platform that simplifies running big data frameworks
such as Apache Hadoop and Apache Spark on AWS to process and analyze very large volumes
of data. Using these frameworks and related open-source projects, you can process data
for analytics and business intelligence workloads.

Amazon OpenSearch:

OpenSearch is a distributed, open-source search and analytics suite used for a variety of
use cases, including online search, log analytics, and real-time application monitoring.
OpenSearch offers a highly scalable system for fast access and response to large volumes
of data, and its integrated visualisation tool, OpenSearch Dashboards, makes it simple for
users to explore their data.

Logstash:

Logstash is a lightweight, open-source, server-side data processing pipeline that lets you
collect data from many sources, transform it on the fly, and send it to the destination of your choice.
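The collect, transform, and send stages can be illustrated with a small Python sketch. This is not how Logstash is implemented or configured; it only mimics the three stages on a CSV sample and produces newline-delimited JSON in the shape accepted by the OpenSearch `_bulk` API. The function name `csv_to_bulk_ndjson`, the index name `people`, and the sample columns are hypothetical and chosen for illustration only.

```python
import csv
import io
import json


def csv_to_bulk_ndjson(csv_text, index_name):
    """Turn CSV text into a newline-delimited JSON body for a _bulk request."""
    lines = []
    # Input stage: read each CSV row as a dict keyed by the header names.
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Transform stage: convert the (hypothetical) id column to an integer.
        row["id"] = int(row["id"])
        # Output stage: one action line, then one document line, per record.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(row))
    # A bulk body is terminated by a trailing newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample data standing in for a CSV object read from S3.
sample_csv = "id,name\n1,Ada\n2,Linus\n"
bulk_body = csv_to_bulk_ndjson(sample_csv, "people")
print(bulk_body)
```

In the actual pipeline, Logstash reads from S3 and writes to Amazon OpenSearch itself; the sketch is only meant to make the shape of the data at each stage concrete.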

Kibana:

Kibana is a tool for data visualisation and exploration that is used for operational intelligence
use cases, log and time-series analytics, and application monitoring. The popular analytics and
search engine Elasticsearch is tightly integrated with Kibana, making Kibana the go-to tool for
viewing Elasticsearch data.

Key Takeaways:
● Understanding the project overview
● Understanding the Data Pipeline
● Understanding the flow of the Data Pipeline
● Create AWS EC2 instance
● Install Apache NiFi on EC2 instance
● Fetch data from an API
● Understanding Apache NiFi tool
● Transform data in Apache NiFi
● Convert data from JSON to CSV using Apache NiFi
● Create AWS S3 bucket
● Transfer data from Apache NiFi to AWS S3 bucket
● Understand ELK stack
● Understand the use of OpenSearch, Logstash and Kibana
● Install Logstash
● Ingest data from AWS S3 into Amazon OpenSearch
● Visualize data in Kibana
● Perform data analysis using PySpark
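
To make the JSON-to-CSV step in the list above more concrete, here is a minimal sketch in plain Python. In the project this conversion is done inside an Apache NiFi flow, not in Python, so the code is only an illustration of what the transformation does to the data. The function name `json_records_to_csv`, the sample payload, and its field names are all hypothetical.

```python
import csv
import io
import json


def json_records_to_csv(json_text, fieldnames):
    """Convert a JSON array of flat objects into CSV text with a header row."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys that are not listed in fieldnames;
    # restval="" writes an empty cell when a record lacks a listed key.
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",
        restval="",
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


# Hypothetical API response; the field names are made up for illustration.
sample_json = '[{"id": 1, "name": "Ada", "city": "Lille"}, {"id": 2, "name": "Linus"}]'
csv_text = json_records_to_csv(sample_json, ["id", "name", "city"])
print(csv_text)
```

The second record has no `city` value, so its last cell is left empty rather than causing an error. That is one reasonable way to handle uneven API records, though the right choice depends on the data.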
