• Please check your network connection and audio before the class to ensure a smooth session
• All participants will be on mute by default. You will be unmuted when requested or as needed
• Please use the “Questions” panel on your webinar tool to interact with the instructor at any point during the class
• Please keep the support phone number handy (US: 1855 818 0063 (toll free), India: +91 90191 17772) and raise tickets from the LMS in case of any issues with the tool
• Logging off and rejoining will most often resolve tool-related issues
[Figure: small table duplicated across Node 1 and Node 2]
Blog: http://www.edureka.in/blog/map-side-vs-join/
[Figure: Distributed Cache; labels: Mapper, Record, Big Table Data, Output]
Counters
▪ Counters are used to gather information about the data we are analysing, such as how many types of records were processed or how many invalid records were found while running the job.
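The idea of counting valid and invalid records can be sketched in plain Java without a cluster. This is only an illustration: the `RecordType` enum, the `count` helper, and the "exactly two comma-separated fields" validity rule are assumptions made for the example, not part of the course material. In an actual Hadoop mapper the same increment would typically be written as `context.getCounter(RecordType.INVALID).increment(1)`.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class RecordCounters {
    // Hypothetical counter names; a real job would define its own enum.
    enum RecordType { VALID, INVALID }

    // Tallies records by type, mimicking what a mapper does when it
    // increments a job counter for every record it sees.
    static Map<RecordType, Long> count(List<String> records) {
        Map<RecordType, Long> counters = new EnumMap<>(RecordType.class);
        for (RecordType type : RecordType.values()) {
            counters.put(type, 0L);
        }
        for (String record : records) {
            // Assumed rule: a valid record has exactly two comma-separated fields.
            boolean valid = record.split(",", -1).length == 2;
            counters.merge(valid ? RecordType.VALID : RecordType.INVALID, 1L, Long::sum);
        }
        return counters;
    }

    public static void main(String[] args) {
        Map<RecordType, Long> counters = count(List.of("1,alice", "2,bob", "garbage"));
        System.out.println(counters);
    }
}
```

Using an enum for the counter names keeps them typo-safe and groups related counters together, which is also how enum-based counters are reported in a job's summary.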
HDFS – Hadoop Distributed Cache
Distributed Cache can be used to distribute simple, read-only data/text files and/or more complex types such as archives, jars, etc. via the JobConf.
Files are copied only once per job and should not be modified by the application or externally while the job is executing.
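A map-side join built on a cached small table can be sketched in plain Java as follows. This is a simulation under stated assumptions, not Hadoop code: the class name, the helper methods, and the comma-separated `key,value` layout are all hypothetical. In a real job the small file would first be registered with the job (for example through `Job.addCacheFile(URI)` in the newer API), and each mapper would load its local copy once during setup before processing records.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapSideJoinSketch {
    // Builds the in-memory lookup that each mapper would load once
    // from its local, read-only copy of the small table.
    static Map<String, String> loadSmallTable(List<String> lines) {
        Map<String, String> lookup = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            if (parts.length == 2) {
                lookup.put(parts[0], parts[1]);
            }
        }
        return lookup;
    }

    // Joins each big-table record against the lookup (inner join).
    // No shuffle or reduce phase is needed because the small table
    // is already available on every node.
    static List<String> join(List<String> bigTable, Map<String, String> lookup) {
        List<String> output = new ArrayList<>();
        for (String record : bigTable) {
            String[] parts = record.split(",", 2);
            if (parts.length < 2) {
                continue; // skip malformed records
            }
            String match = lookup.get(parts[0]);
            if (match != null) {
                output.add(parts[0] + "," + parts[1] + "," + match);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, String> departments =
                loadSmallTable(List.of("D1,Sales", "D2,Engineering"));
        List<String> employees = List.of("D1,alice", "D2,bob", "D3,carol");
        join(employees, departments).forEach(System.out::println);
    }
}
```

The lookup is never modified after loading, which matches the requirement that cached files stay read-only while the job is executing.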