
Installing and Configuring Java and Eclipse - 

To learn Java for Hadoop, you will first need to install Eclipse and Java. 

Eclipse is an Integrated Development Environment (IDE) used for building
applications in languages like Java, C, C++, C#, etc. It was built from the
ground up as a platform for language tooling: on its own, Eclipse offers
little end-user functionality. Instead, it is designed to integrate robustly
with each operating system and to provide a common user interface model. The
Eclipse platform is composed of plug-ins. For example, the JDT (Java
Development Tools) project allows Eclipse to be used as a Java IDE.

System Requirements for Installing Java: Now that you know that learning Java
for Hadoop will help you gain expertise in this technology, let us start from
the beginning. Since Eclipse and Java can be installed on any of the major
operating systems, let us look at the system requirements for installing Java:
Java for Windows : Windows 7, Windows 8 or Windows 10; 64-bit OS; 128 MB RAM;
124 MB of disk space for the JRE and 2 MB for Java Update. The minimum
processor is a Pentium 2 at 266 MHz. Supported browsers are Internet
Explorer 9 and above, or Firefox.
Java for Mac OS X : An Intel-based Mac running Mac OS X 10.8.3+ or 10.9+. You
need administrator privileges for installation and a 64-bit browser, either
Safari or Firefox.
These are the requirements for Java 8.
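
Once Java is installed, a quick way to confirm that it works is to compile and
run a very small program. The sketch below is only an illustration: the class
name InstallCheck is hypothetical, and it simply prints two standard system
properties of whichever JVM runs it.

```java
// Hypothetical helper class, not part of any library.
class InstallCheck {
    // Returns the version string of the JVM that is running this program.
    static String javaVersion() {
        return System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println("Java version: " + javaVersion());
        System.out.println("Operating system: " + System.getProperty("os.name"));
    }
}
```

Saved as InstallCheck.java, it can be compiled with javac and run with java
from a terminal, or run directly from Eclipse as a Java application.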

Arrays - Arrays are container objects, a data structure in Java, that hold a
fixed number of elements of a single type. Or, as you may have studied in
Math, you can define an array as a collection of variables of one type. The
length of an array is fixed when the array is created. Each item or variable
in the array is called an ‘element’. Arrays are a very powerful concept in
programming. Since the goal is to analyse data, arrays provide a good base on
which large data sets can be broken down and categorized with assigned values.
Get Started with Arrays in Java through this "Learn Java for Hadoop
Tutorial:Arrays"
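
As a minimal sketch of the points above (fixed length, a single element type,
access by index), consider the following. The class name ArrayBasics and the
sample values are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical example class.
class ArrayBasics {
    // Adds up every element of an int array.
    static int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // The length (5) is fixed at creation; every element is an int.
        int[] readings = new int[5];
        for (int i = 0; i < readings.length; i++) {
            readings[i] = (i + 1) * 10; // elements are addressed by index
        }
        System.out.println(Arrays.toString(readings)); // [10, 20, 30, 40, 50]
        System.out.println(sum(readings));             // 150
    }
}
```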
Objects and Classes - Java is an object-oriented programming language, in
which programs are designed using objects and classes. An object is defined
as a physical as well as logical entity, whereas a class is just a logical
entity. For example, any object that we see around us has a state, a
behaviour and an identity. A class can be defined as a template that
describes the type of the object, its state and its behaviour. A group of
objects having common properties constitutes a class.
Get Started with Classes and Objects in Java through this "Learn Java for
Hadoop Tutorial:Classes and Objects"
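
To make state, behaviour and identity concrete, here is a small sketch. The
Employee class and its fields are hypothetical and chosen only for
illustration.

```java
// Hypothetical class: the template that describes state and behaviour.
class Employee {
    // State: each object carries its own values for these fields.
    private final String name;
    private int recordsProcessed;

    Employee(String name) {
        this.name = name;
    }

    // Behaviour: methods operate on the object's own state.
    void process(int records) {
        recordsProcessed += records;
    }

    String getName() {
        return name;
    }

    int getRecordsProcessed() {
        return recordsProcessed;
    }
}

class EmployeeDemo {
    public static void main(String[] args) {
        Employee a = new Employee("Asha");
        Employee b = new Employee("Asha");
        a.process(100);
        System.out.println(a.getRecordsProcessed()); // 100
        System.out.println(b.getRecordsProcessed()); // 0
        // Identity: two objects with the same name are still distinct objects.
        System.out.println(a == b); // false
    }
}
```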
Control Flow Statements - In Java, the statements inside any source file are
executed in order, i.e. from top to bottom. Control flow statements are
commands that break up this execution pattern. Using control flow statements,
you can choose which blocks of code in your source file are executed.
If-then-else is the most basic and popular control flow statement. If you
want a particular block of code to be executed only if a certain condition is
‘true’, put it in the if clause; when the condition evaluates to ‘false’, the
block in the else clause is executed instead.

These statements in Java are crucial for data analysis and for writing
MapReduce jobs suitable for conditional big data analysis.
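
A minimal sketch of an if-then-else chain follows. The class name, the method
classify and the temperature thresholds are assumptions made up for this
example.

```java
// Hypothetical example class.
class ControlFlowDemo {
    // Only one of the three branches runs for a given input.
    static String classify(int temperature) {
        if (temperature > 30) {
            return "hot";
        } else if (temperature < 10) {
            return "cold";
        } else {
            return "mild";
        }
    }

    public static void main(String[] args) {
        int[] temperatures = {35, 5, 20};
        for (int t : temperatures) {
            System.out.println(t + " -> " + classify(t));
        }
    }
}
```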

Interfaces and Inheritance - An interface is a point of interaction that
allows different systems or programs to work with each other. It is similar
to a person interacting with a computer, where we type in commands or
instructions for the computer by way of the keyboard. Here, the keyboard is
an interface. Similarly, in programming, different groups of programmers
should be able to write code that other groups can use without specific
instructions. Programmers need a contract that lays out the rules of software
interaction.
Interfaces are such “contracts”, which allow each group of programmers to
write their code even if they do not know how the other group is writing its
code. In a programming language, an interface is a service contract between a
library that implements the services and the code that calls those services.

For example, let’s say the programmer wants to call the I/O service - the Java
program will obtain the I/O (input/output) services by creating objects from
classes in the Java class library and calling their methods. The public
classes and methods of the library form its interface in the broad sense (its
API). In the Java language itself, an interface is a reference type that can
contain only constants, default methods, static methods, method signatures
and nested types.
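
The sketch below shows what such a contract can look like in Java 8 or later.
The names RecordFilter and PrefixFilter are hypothetical; the point is only
that calling code depends on the interface, not on how a particular class
implements it.

```java
// Hypothetical contract: any filter must say whether it accepts a record.
interface RecordFilter {
    int MIN_LENGTH = 1; // constant (implicitly public static final)

    boolean accept(String record); // method signature, no body

    // Default method: implementing classes inherit this behaviour.
    default RecordFilter negate() {
        return record -> !accept(record);
    }

    // Static method: called on the interface itself.
    static RecordFilter nonEmpty() {
        return record -> record.length() >= MIN_LENGTH;
    }
}

// One possible implementation, written without knowing who will call it.
class PrefixFilter implements RecordFilter {
    private final String prefix;

    PrefixFilter(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public boolean accept(String record) {
        return record.startsWith(prefix);
    }
}

class FilterDemo {
    public static void main(String[] args) {
        RecordFilter errors = new PrefixFilter("ERROR");
        System.out.println(errors.accept("ERROR disk full"));          // true
        System.out.println(errors.negate().accept("ERROR disk full")); // false
        System.out.println(RecordFilter.nonEmpty().accept(""));        // false
    }
}
```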

Every class in Java except Object has exactly one direct superclass, because
each class is derived from another class (from Object, if no other class is
named). The derived class, or subclass, inherits the fields and methods of
its superclass, also called the base class. This is known as inheritance, and
it allows classes to be organized in a hierarchy.

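The concept of inheritance is simple yet very useful. Say you want to create
a new class, but you know that there is an existing class in the Java class
library that already has some of the properties, methods and code that you
need. In that case, you can derive your new class from the existing class and
reuse its fields and methods instead of writing them yourself.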

Get Started with understanding the concept of Inheritance and implementation
of interfaces in Java through this "Learn Java for Hadoop Tutorial:Inheritance
and Interfaces"
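
A minimal sketch of inheritance follows, with hypothetical classes Sensor and
TemperatureSensor. The subclass reuses the field and method of its superclass
and adds its own data.

```java
// Hypothetical base class (superclass).
class Sensor {
    private final String id;

    Sensor(String id) {
        this.id = id;
    }

    String getId() {
        return id;
    }

    String describe() {
        return "Sensor " + id;
    }
}

// Derived class (subclass): inherits getId() and describe() from Sensor.
class TemperatureSensor extends Sensor {
    private final double celsius;

    TemperatureSensor(String id, double celsius) {
        super(id); // reuse the superclass constructor
        this.celsius = celsius;
    }

    // Override: extend the inherited behaviour instead of rewriting it.
    @Override
    String describe() {
        return super.describe() + " reads " + celsius + " C";
    }
}

class InheritanceDemo {
    public static void main(String[] args) {
        Sensor s = new TemperatureSensor("t1", 21.5);
        System.out.println(s.getId());    // t1 (inherited method)
        System.out.println(s.describe()); // Sensor t1 reads 21.5 C
    }
}
```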
Exception Handling
The mechanism for handling runtime malfunctions is referred to as Exception
Handling. The block of Java code that handles the exception is known as the
Exception Handler. When an exception occurs, the normal flow of the program is
disrupted, and the program terminates abruptly if the exception is not
handled. Exceptions occur for various reasons: hardware failures, programmer
error, a file that needs to be opened cannot be found, resource exhaustion,
etc.

The Throwable class is at the top of the exception class hierarchy.

There are three types of Exceptions which come under it -

1.Checked Exception

2.Unchecked Exception

3.Error

Checked Exception: These kinds of exceptions can be anticipated and handled by
the programmer. The compiler checks at compile time that they are either
caught or declared in the method's throws clause.
Unchecked Exception: These are RuntimeException and its subclasses. The
compiler does not require them to be caught or declared; they surface only at
runtime.
Error: Errors indicate serious problems that a program usually cannot recover
from or handle in its code. In most cases the only way out is to terminate
execution of the program.
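
The difference between checked and unchecked exceptions can be sketched as
follows. The class and method names are hypothetical; FileReader,
FileNotFoundException, IOException and ArithmeticException are standard Java
classes.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical example class.
class ExceptionDemo {
    // Checked: the compiler insists that these exceptions are caught
    // here or declared in a throws clause.
    static String tryOpen(String path) {
        try (FileReader reader = new FileReader(path)) {
            return "opened";
        } catch (FileNotFoundException e) {
            return "missing"; // the exception handler for a missing file
        } catch (IOException e) {
            return "io-error"; // e.g. a failure while closing the file
        }
    }

    // Unchecked: ArithmeticException extends RuntimeException, so the
    // compiler does not require this try/catch; it is optional.
    static int safeDivide(int a, int b) {
        try {
            return a / b;
        } catch (ArithmeticException e) {
            return 0; // fallback chosen only for this sketch
        }
    }

    public static void main(String[] args) {
        System.out.println(tryOpen("no-such-file-for-demo.txt"));
        System.out.println(safeDivide(6, 3));
        System.out.println(safeDivide(1, 0));
    }
}
```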
Serialization
Serialization is a mechanism in which an object is represented as a sequence
or stream of bytes. The stream of bytes contains information about the type of
the object and the kind of data stored in it. This information can be used to
recreate the object in memory; this reverse process is known as
deserialization. The whole process is JVM independent: an object can be
serialized on one platform and deserialized on a completely different
platform.

Two classes contain the methods for serializing and deserializing an object:

1) ObjectInputStream

2) ObjectOutputStream

The ObjectOutputStream class serializes objects and primitive data types. The
ObjectInputStream class deserializes objects and primitive data types that
have been serialized using ObjectOutputStream.
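
A minimal round trip through these two classes might look like the sketch
below. GeoPoint and SerializationDemo are hypothetical names, and the bytes
are kept in memory here rather than written to a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class; it must implement Serializable to be written.
class GeoPoint implements Serializable {
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;

    GeoPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class SerializationDemo {
    // Serialization: object -> stream of bytes.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Deserialization: stream of bytes -> object recreated in memory.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new GeoPoint(3, 4));
        GeoPoint copy = (GeoPoint) deserialize(data);
        System.out.println(copy.x + ", " + copy.y); // 3, 4
    }
}
```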

Collections-

An object that groups multiple elements into a single unit is called a
Collection. A collection object in Java holds references to other objects. It
is used to store, retrieve, manipulate, and communicate aggregate data.

All collections frameworks contain the following:

1) Interfaces : These are abstract data types that represent collections.
Interfaces usually form a hierarchy in object-oriented languages. They allow
collections to be manipulated independently of the details of their
representation. Interfaces include Set, List, Queue, SortedSet, Enumeration,
Map, Map.Entry, Deque etc.
 

2) Implementations : These are concrete implementations of the collection
interfaces, i.e. they are reusable data structures. Commonly used
implementations include ArrayList, Vector, LinkedList, PriorityQueue,
HashSet, LinkedHashSet, TreeSet etc.

3) Algorithms: Computations like searching and sorting of data on objects
which implement collection interfaces are performed using algorithms.
Algorithms are polymorphic in nature, i.e. the same method can be used on
many different implementations of the appropriate collection interface.
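
The three parts listed above can be seen together in a short sketch. The class
CollectionsDemo and the sample words are made up; the interfaces,
implementations and the Collections.sort algorithm are from java.util.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical example class.
class CollectionsDemo {
    // Interfaces (List, Set) in the signature and variable types,
    // implementations (HashSet, ArrayList) behind them.
    static List<String> sortedUnique(List<String> words) {
        Set<String> unique = new HashSet<>(words); // a Set drops duplicates
        List<String> result = new ArrayList<>(unique);
        Collections.sort(result); // algorithm: works on any List
        return result;
    }

    // Counts how often each word occurs; TreeMap keeps keys sorted.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hadoop", "java", "hadoop", "hive");
        System.out.println(sortedUnique(words)); // [hadoop, hive, java]
        System.out.println(countWords(words));   // {hadoop=2, hive=1, java=1}

        // The same sort method, applied to a different List implementation.
        List<String> linked = new LinkedList<>(words);
        Collections.sort(linked);
        System.out.println(linked); // [hadoop, hadoop, hive, java]
    }
}
```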

Spending a few hours on Java basics will act as a great catalyst for learning
Hadoop.

If you are interested in becoming a Hadoop developer, but you are concerned
about mastering Java concepts for Hadoop, then you can talk to one of our
career counsellors. Please send an email to rahul@dezyre.com
