
Big Data

Hadoop: Secondary NameNode (SNN), JobTracker (JT) & TaskTracker (TT)


[Diagram: the NameNode (NN) maps blocks x1–x4 to DataNodes dn1–dn4, considering 1. location, 2. network traffic, and 3. data redundancy]

SECONDARY NAME NODE
The Secondary NameNode is a helper node in Hadoop.

Public Key Encryption
• Uses 2 keys:
• a public key
• a private key

[Diagram: a message exchanged between parties A and B; plaintext HAI encrypted to ciphertext KHW]
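As a toy illustration of the two-key idea above, here is a minimal RSA sketch in Python. The primes, exponents, and message value are standard textbook parameters chosen for illustration only; real systems use very large keys and padding schemes.

```python
# Toy RSA sketch (illustrative only; not secure at this key size).
p, q = 61, 53
n = p * q                # public modulus
phi = (p - 1) * (q - 1)  # Euler's totient of n
e = 17                   # public exponent, coprime with phi
d = pow(e, -1, phi)      # private exponent (modular inverse of e)

def encrypt(m: int) -> int:
    """Party A encrypts with B's public key (e, n)."""
    return pow(m, e, n)

def decrypt(c: int) -> int:
    """Party B decrypts with the private key (d, n)."""
    return pow(c, d, n)

message = 65
cipher = encrypt(message)
recovered = decrypt(cipher)
```

Only the private key can undo what the public key did, which is why the public key can be shared openly.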
Architecture of SNN

[Diagram: NameNode (NN), Secondary NameNode (SNN), and DataNode DN-1]
RACK AWARENESS
Data Replication
• HDFS is designed to handle large-scale data in a distributed
environment
• Hardware or software failures, or network partitions, can occur
• Replication is therefore needed for fault tolerance
• Replication factor
• Decided by users, and can be dynamically tuned
• The HDFS default replication factor is 3; single-node
(pseudo-distributed) setups typically use 1
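The replication-factor invariant can be sketched in Python. The round-robin placement below is an assumption for illustration; real HDFS placement is rack-aware and load-aware, but the guarantee is the same: each block lands on `replication_factor` distinct DataNodes.

```python
import itertools

def place_replicas(blocks, datanodes, replication_factor=3):
    """Simplified replica placement: assign each block to
    `replication_factor` distinct DataNodes, round-robin.
    Assumes replication_factor <= number of DataNodes."""
    placement = {}
    nodes = itertools.cycle(datanodes)
    for block in blocks:
        targets = []
        while len(targets) < replication_factor:
            dn = next(nodes)
            if dn not in targets:
                targets.append(dn)
        placement[block] = targets
    return placement

plan = place_replicas(["x1", "x2"], ["dn1", "dn2", "dn3", "dn4"])
```

With 4 DataNodes and a factor of 3, any single node failure still leaves two live copies of every block.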
Rack Awareness
• Benefits of implementing Rack Awareness in our Hadoop cluster:
• With the rack awareness policy, we store replicas in different
racks, so the failure of a single rack cannot lose our data.
• Rack awareness helps to maximize network bandwidth because
data blocks are preferentially transferred within a rack.
• It also improves cluster performance and provides high data
availability.
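The policy behind these benefits can be sketched directly. This is a simplified, deterministic version of HDFS's default 3-replica placement (the rack and node names are illustrative assumptions):

```python
def rack_aware_targets(racks, writer_node):
    """Simplified sketch of HDFS's default 3-replica placement:
    replica 1 on the writer's node, replica 2 on a node in a
    different rack, replica 3 on a second node in replica 2's rack."""
    writer_rack = next(r for r, nodes in racks.items() if writer_node in nodes)
    other_rack = next(r for r in racks if r != writer_rack)
    second, third = racks[other_rack][0], racks[other_rack][1]
    return [writer_node, second, third]

racks = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
targets = rack_aware_targets(racks, "dn1")
# Two racks hold copies, so losing an entire rack never loses all replicas,
# while two of the three replicas stay in one rack to limit cross-rack traffic.
```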
Secondary Name Node
• The only purpose of the Secondary NameNode is to perform
periodic checkpoints. The Secondary NameNode periodically
downloads the current NameNode image and edit log files, merges
them into a new image, and uploads the new image back to the
(primary and only) NameNode.
• The default checkpoint interval is one hour. It can be set as low as
one minute on highly busy clusters where many write operations are
being performed.
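The checkpoint merge described above can be sketched as replaying an edit log on top of the last filesystem image. The dict-based image and the operation names are illustrative simplifications, not the real fsimage format:

```python
def checkpoint(fsimage, edits):
    """Sketch of an SNN checkpoint: replay the edit log on top of the
    last fsimage to build a new fsimage; the NameNode can then start
    a fresh, empty edit log."""
    new_image = dict(fsimage)  # copy so the old image stays intact
    for op, path, value in edits:
        if op == "create":
            new_image[path] = value
        elif op == "delete":
            new_image.pop(path, None)
    return new_image

image = {"/data/a": "blk_1"}
edit_log = [("create", "/data/b", "blk_2"), ("delete", "/data/a", None)]
new_image = checkpoint(image, edit_log)
```

Without this merge, the edit log would grow unboundedly and NameNode restarts would get slower and slower.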
• Data Node Failure
• If a DataNode fails, the NameNode knows which blocks it held,
creates replicas of those blocks on other live nodes, and unregisters
the dead node.
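That repair step can be sketched as follows (a simplified simulation; node and block names are illustrative assumptions):

```python
def rereplicate(block_map, dead_node, alive_nodes, rf=3):
    """Sketch of NameNode re-replication: drop the dead node from every
    block's replica list, then add live nodes until rf copies exist again."""
    for holders in block_map.values():
        if dead_node in holders:
            holders.remove(dead_node)
        for dn in alive_nodes:
            if len(holders) >= rf:
                break  # replication factor already restored
            if dn not in holders:
                holders.append(dn)
    return block_map

blocks = {"x1": ["dn1", "dn2", "dn3"], "x2": ["dn2", "dn3", "dn4"]}
repaired = rereplicate(blocks, "dn2", ["dn1", "dn3", "dn4"])
```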

• Data Integrity
• Corruption may occur during network transfer, from hardware failure, etc.
HDFS applies checksum checking to the contents of files and stores the
checksums in the HDFS namespace. If a checksum does not match after
fetching a block, the client drops it and fetches another replica from a
different machine.
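This fetch-and-verify loop can be sketched in a few lines. Note the hash choice is an assumption for illustration; HDFS actually uses CRC32 checksums over small chunks, not SHA-256 over whole blocks:

```python
import hashlib

def checksum(data: bytes) -> str:
    """Checksum of a block's contents (sha256 here purely for illustration)."""
    return hashlib.sha256(data).hexdigest()

def fetch_valid_block(replicas, expected):
    """Try replicas in order; skip any whose contents fail the checksum."""
    for data in replicas:
        if checksum(data) == expected:
            return data
    raise IOError("all replicas corrupt")

good = b"block contents"
expected = checksum(good)       # stored in the namespace at write time
corrupt = b"block contemts"     # simulated bit-flip in one replica
result = fetch_valid_block([corrupt, good], expected)
```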
Architecture of JT & TT

[Diagram: a job (X = A + B) is submitted to the JobTracker (JT), which distributes tasks to TaskTrackers (TT)]
JobTracker
• JobTracker is the daemon service for submitting and tracking
MapReduce jobs in Hadoop.
• JobTracker performs the following actions in Hadoop:
• It accepts MapReduce jobs from client applications
• Talks to the NameNode to determine data locations
• Locates available TaskTracker nodes
• Submits the work to the chosen TaskTracker node
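Why the JobTracker asks the NameNode for data locations first: it prefers to run a task on a node that already holds the block. A minimal sketch of that preference (node names are illustrative assumptions):

```python
def choose_tasktracker(block_locations, free_tasktrackers):
    """Sketch of the JobTracker's locality preference: pick a free
    TaskTracker co-located with the data if possible, else any free one."""
    for tt in free_tasktrackers:
        if tt in block_locations:
            return tt  # data-local assignment: no network copy of the block
    return free_tasktrackers[0] if free_tasktrackers else None
```

For example, if the block lives on dn2 and dn3 and only dn1 and dn3 are free, the job goes to dn3; if no data-local node is free, any free node is used.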
TaskTracker
• A TaskTracker node accepts map, reduce, or shuffle operations from
a JobTracker.
• It is configured with a set of slots; these indicate the number of
tasks that it can accept.
• The JobTracker looks for a free slot to assign a task.
• The TaskTracker notifies the JobTracker of task success status.
• The TaskTracker also sends heartbeat signals to the JobTracker to
confirm its availability, and it reports the number of free slots it
currently has.
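The slot-and-heartbeat mechanism above can be sketched as a small simulation. The class, method names, and greedy policy are illustrative assumptions, not the real Hadoop API:

```python
class TaskTrackerState:
    """The JobTracker's view of one TaskTracker (simplified sketch)."""
    def __init__(self, name, slots):
        self.name = name
        self.free_slots = slots

    def heartbeat(self, free_slots):
        # Heartbeats both signal liveness and report current free slots.
        self.free_slots = free_slots

def assign_tasks(trackers, tasks):
    """Greedy sketch: hand each task to whichever tracker reports
    the most free slots, decrementing a slot per assignment."""
    assignment = {}
    for task in tasks:
        tt = max(trackers, key=lambda t: t.free_slots)
        if tt.free_slots == 0:
            break  # no capacity anywhere; remaining tasks must wait
        assignment[task] = tt.name
        tt.free_slots -= 1
    return assignment

tt1, tt2 = TaskTrackerState("tt1", 2), TaskTrackerState("tt2", 1)
schedule = assign_tasks([tt1, tt2], ["t1", "t2", "t3", "t4"])
```

With three slots total, only three of the four tasks are placed; the fourth waits until a heartbeat reports a freed slot.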
PROBLEMS
