
Bharath

Email Id: rmehta@aksinfotech.net


Phone No: +17325701393

Professional summary:
 Over 8 years of strong experience in software development using Big Data, Hadoop, Apache Spark, Java/J2EE, Scala and
Python technologies.
 Strong understanding of Software Development Lifecycle (SDLC)
 Experience in extracting data from log files and copying it into HDFS using Flume.
 Developed test classes using MRUnit to check input and output.
 Installing, configuring and managing of Hadoop Clusters and Data Science tools.
 Managing the Hadoop distribution with Cloudera Manager, Cloudera Navigator, Hue.
 Good Knowledge and understanding of Hadoop Architecture and various components in Hadoop ecosystems - HDFS,
Map Reduce, Pig, Sqoop and Hive.
 Hands on experience in developing Map Reduce programs using Apache Hadoop for analyzing the Big Data
 Experience administering and configuring NoSQL Databases like Cassandra, MongoDB etc.
 Knowledge of managing Kafka clusters; created several topologies to support real-time processing requirements.
 Good working knowledge of creating Hive tables and using HiveQL for data analysis to meet business
requirements.
 Experience working with JAVA, J2EE, JDBC, ODBC, JSP, Java Eclipse, MS SQL Server.
 Extensive experience with SQL, PL/SQL and database concepts.
 Expertise in debugging, optimization and performance tuning of Oracle and Java, with strong knowledge of Oracle 11g and SQL.
 Experience working with distributions such as MapR, Hortonworks and Cloudera.
 Experience working with NoSQL databases such as HBase and MongoDB.
 Experience in managing and reviewing Hadoop log files.
 Experience testing MapReduce programs using MRUnit, JUnit and EasyMock.
 Experienced in performing real time analytics on HDFS using HBase
 Exposure to working with DataFrames and optimizing jobs to meet SLAs.
 Developed custom MapReduce programs and Pig Latin scripts for data analysis and data cleaning.
 Good understanding of end-to-end content lifecycle, web content management, content publishing/deployment, and
delivery processes.
 Hands on experience with build tools like ANT, Maven

Education:
Bachelor's degree in Electronics and Communication Engineering, JNTU Kakinada University, Kakinada, India.

Technical Skills:

Big Data & Hadoop: Hadoop, MapReduce, HDFS, HBase, Hive, Pig, Oozie, Sqoop, Spark, Impala, Zookeeper, Flume, Kafka, Cloudera, MongoDB, AWS
Programming Languages: Java JDK 1.7/1.8, SQL, PL/SQL, Scala
Java/J2EE Technologies: Servlets, JSP, JSTL, JDBC, JMS, JNDI, RMI, EJB, JFC/Swing, AWT, Applets, Multi-threading, Java Networking
Frameworks: Struts 2.x/1.x, Spring 2.x, Hibernate 3.x
IDEs: Eclipse 3.x, IntelliJ
Web Technologies: JSP, JavaScript, jQuery, AJAX, XML, XSLT, HTML, DHTML, CSS
Web Services: SOAP, REST, WSDL
XML Tools: JAXB, Apache Axis, Altova XMLSpy
Methodologies: Agile, Scrum, RUP, TDD, OOAD, SDLC
Modeling Tools: UML, Visio
Testing Technologies/Tools: JUnit
Database Servers: Oracle 8i/9i/10g, DB2, SQL Server 2000/2005/2008, MySQL
Version Control: CVS, SVN
Build Tools: ANT, Maven
Platforms: Windows 2000/98/95/NT4.0, UNIX

AT&T, Dallas, TX October 2015 – Present


Sr. Hadoop Developer

Description: AT&T collects and analyzes large amounts of customer data 24×7 from several data points: websites,
mobiles, data cards, social media and bill payments. Data from these data points can be structured or, in a
few cases, unstructured. All of this data is collected, aggregated and analyzed in the Hadoop cluster to find customer usage patterns
that help make cross-sell and up-sell business decisions and devise targeted marketing strategies.

Responsibilities:
 Gathering business requirements from the Business Partners and Subject Matter Experts.
 Used Zookeeper for providing coordination services to the cluster
 Documented the requirements, including the existing code to be implemented using Spark, Hive, HDFS and
Solr.
 Experience in using Zookeeper technologies.
 Expertise in integrating Kafka with Spark streaming for high speed data processing.
 Used various Spark Transformations and Actions for cleansing the input data
 Load and transform large sets of structured, semi structured and unstructured data.
 Scheduled Oozie workflows to run multiple Hive and Pig jobs that run independently based on time and data
availability.
 Worked with data delivery teams to set up new Hadoop users, including setting up Linux users, setting up
Kerberos principals and testing HDFS, Hive, Pig and MapReduce access for the new users.
 Used Sqoop tool to load data from RDBMS into HDFS.
 Involved in loading data from UNIX file system to HDFS
 Worked on code reviews, bug fixes and documentation.
 Developed Pig UDFs to pre-process the data for analysis.
 Created producer, consumer and Zookeeper setup for Kafka replication.
 Migrated data from Hadoop to an AWS S3 bucket using DistCp. Also migrated data between new and old clusters using
DistCp.
 Good knowledge of writing scripts using shell, Python and Perl on Linux.
 Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS), Teradata
and vice versa.
 Hands on knowledge of writing code in Scala
 Hands on experience in large scale data processing using Spark
 Responsible for developing data pipeline by implementing Kafka producers and consumers and configuring brokers
 Familiar with installation of Oozie workflow engine to run multiple Hive and Pig jobs that run independently with time
and data availability.
 Worked in AWS environment for development and deployment of Custom Hadoop Applications.
 Streamed data in real time using Spark with Kafka.
 Used Pig UDFs written in Python and Java, and sampled large data sets.
 Performed POCs using the latest technologies such as Spark, Kafka and Scala.
 Worked on migrating the old Java stack to the Typesafe stack using Scala for backend programming.
 Deployed Qlik Sense and Zeppelin dashboards for a Hadoop data ingestion project at HPE (Ambari, MySQL, Qlik Sense
hub, Apache Zeppelin).
 Created Qlik Sense and Apache Zeppelin dashboards of Oozie and Falcon data ingestion jobs for efficient monitoring.

 Established relationship between various data and processing elements on a Hadoop environment using Apache Falcon
 Used Apache Falcon framework to simplify data pipeline processing and management on Hadoop clusters
 Involved in Analysis, Design, Development, Integration and Testing of application modules and followed agile methodology.
 Continuously monitored and managed Hadoop cluster using Cloudera Manager.
 Developed Scala programs to perform data scrubbing for unstructured data
 Led three internal AWS initiatives, including a lift and shift of the existing production system and architecting the next
generation microservices-based crediting system using native AWS technologies such as Lambda, Step Functions, S3 and
EC2.

Environment: Java, Hortonworks, Hadoop, HDFS, Hive, Tez, Pig, Sqoop, Hue, HBase, Kafka, Storm, Oozie, Zookeeper,
YARN, MapReduce, HCatalog, Avro, Parquet, Tableau, JSP, Oracle, Teradata, SQL, Log4J, RAD, WebSphere, Eclipse, AJAX,
JavaScript, jQuery, CSS3, SVN, PuTTY, FTP, Linux, Cron jobs, Shell Script, SQL Developer, Ambari, Falcon, Denodo

Express Scripts, Franklin Lakes, NJ August 2014 – September 2015


Sr. Hadoop Developer

Description: The main purpose of this project is to store data in a Hadoop cluster and extract it using Hive and Pig
Latin for processing according to user requirements.
Responsibilities:
 Installed and configured HDFS, Hadoop Map Reduce, developed various Map Reduce jobs in Java for data cleaning and
preprocessing.
 Analyzed various RDDs using Scala and Python with Spark.
 Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
 Experience in implementing Spark RDDs in Scala.
 Involved in data ingestion into HDFS using Sqoop and Flume from variety of sources.
 Responsible for managing data from various sources.
 Worked on the conversion of existing MapReduce batch applications for better performance.
 Performed big data analysis using Pig and user-defined functions (UDFs).
 Worked on both external and managed Hive tables for optimized performance.
 Developed Hive scripts to meet analysts' requirements.
 Stored, processed and analyzed large data sets to extract valuable insights.
 Extensively used Pig to communicate with Hive using HCatalog and with HBase using handlers.
 Used Apache Spark and Scala language to find patients with similar symptoms in the past and medications used for them
to achieve best results.
 Worked with the AWS RDS service using MySQL and Oracle databases.
 Worked on deployment to EC2 instances.
 Worked on Hadoop clusters in EMR with the Hadoop components Pig, Hive, Sqoop and MapReduce.
 Used S3 to download data into Spark and to upload images from the Spark cluster in AWS.
 Used the DynamoDB NoSQL database to store invoice transactions on AWS.
 Used AWS Redshift to store data and wrote queries to retrieve data faster from Redshift.
 Migrated ETL processes from MySQL to Hive to test ease of data manipulation.
 Familiarity with NoSQL databases such as MongoDB and Cassandra.
 Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
 Analyzed the customer data by performing Hive queries to know user behavior.
 Developed mappings using Informatica to load data from sources such as relational tables, flat files and Oracle tables into
the target Oracle tables.
 Worked with data delivery teams to set up new Hadoop users, including setting up Linux users, setting up Kerberos
principals and testing HDFS and Hive access.
 Coordinated with the QA lead on development of the test plan, test cases, test code and actual testing; responsible for
allocating defects and ensuring they were resolved.
 Involved in testing and deployment of the application on WebLogic Application Server during the integration and QA testing
phase.

Environment: HDFS, Pig, Hive, HBase, Sqoop, Spark, Oozie, Flume, Kafka, Linux Shell Scripting, Java, J2EE, JSP, JSF, Eclipse,
Maven, SQL, HTML, XML, XSLT, Oracle, MySQL, Ajax/JavaScript, web services API, PuTTY

EOG Resources, Houston, TX September 2013 – July 2014


Hadoop Developer

Description: EOG Resources mainly collects data from various sources such as Free Wheels, DFPP, Omniture and Digital
Data. The data is cleaned up and ingested into Hadoop. The main purpose of the Big Data platform is to store large amounts of
data in HDFS daily and provide clean data to the Business Intelligence team for analysis.

Responsibilities:
 Hadoop installation, configuration of multiple nodes in Cloudera platform.
 Set up and optimized Standalone-System/Pseudo-Distributed/Distributed Clusters.
 Developed simple to complex MapReduce streaming jobs.
 Used Impala to query the Hadoop data stored in HDFS.
 Manage and review Hadoop log files
 Worked extensively on creating Oozie workflows for scheduling different jobs of hive, map reduce and shell scripts.
 Worked on migrating tables in SQL to Hive using Sqoop.
 Implemented Kafka messaging services to stream large data and insert into database.
 Created Hive tables, wrote Hive UDFs and loaded data into Hive.
 Imported data into HDFS and Hive from other data systems by using Sqoop.
 Installed Oozie Workflow engine to run multiple Hive and Pig Jobs.
 Worked on partitioning and bucketing Hive tables and running scripts in parallel to reduce their run
time.
 Configured Hadoop system files to accommodate new sources of data and updated the existing Hadoop
cluster configuration.
 Involved in loading data from UNIX file system to HDFS.
 Analyzed data using Pig Latin, Hive QL, HBase and custom MapReduce programs in Java
 Wrote Solr queries for various search documents.
 Involved in gathering business requirements and prepared detailed specifications, following project guidelines, for the
programs to be developed.
 Actively participated in code reviews and meetings and resolved technical issues.
 Worked on Control-M to run Informatica and Hadoop jobs in parallel.

Environment: HDFS, Pig, Hive, HBase, Sqoop, Spark, Oozie, Flume, Kafka, AWS, Linux Shell Scripting, Linux, Java, J2EE,
JSP, JSF, Eclipse, Maven, SQL, HTML, XML, XSLT, Oracle, MySQL, Ajax/JavaScript, web services API.

Intermountain Healthcare (IHC), Salt Lake City, Utah December 2012 – August 2013
Hadoop Developer

Description: Intermountain is one of the leading health care companies in the United States, helping people to live
healthier. To meet the requirements of growing market applications and several use cases, an Enterprise Data Lake
powered by Hadoop is under development. It will provide services and meet data requirements
for the majority of upcoming applications.

Responsibilities:
 Involved in loading data from UNIX file system to HDFS.
 Installed and configured Hadoop Map Reduce, HDFS and Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
 Wrote MapReduce code to convert unstructured data into structured data and to insert data from HDFS
into HBase.
 Used Pig and Hive in the analysis of data.
 Used Pig's complex data types for handling data.
 Worked with NoSQL databases and queried the data stored in them.
 Experienced in managing and reviewing Hadoop log files.
 Extracted files from CouchDB through Sqoop, placed them in HDFS and processed them.
 Worked on Kafka to produce the streamed data into topics and consumed that data.
 Created the data model for Hive tables.
 Developed Linux shell scripts for creating reports from Hive data.
 Involved in creating Hive tables, loading data into them and writing Hive queries to analyze the data.
 Implemented Pig jobs to clean, parse and structure the event data to facilitate effective downstream analysis.
 Built re-usable Pig UDFs for business requirements which enabled developers to use these UDFs in data parsing and
aggregation.
 Created Hive Managed Tables and External tables and loaded the transformed data to those tables.
 Implemented Hive’s Dynamic partitions and Hive Buckets depending on the downstream business requirements.
 Used Oozie to automate interdependent Hadoop jobs.

Environment: Java 7, Eclipse IDE, Hive, HBase, MapReduce, Oozie, Sqoop, Pig, Spark, Flume, Impala, MySQL,
PL/SQL, Kafka, Linux

MetLife, Bridgewater, NJ October 2010 – November 2012


Sr. Java Developer

Description: Help2 Clinical Desktop (CD) is a web application based on J2EE architecture. It provides web-based access to
inpatient and outpatient data. Users can review as well as update or enter new data through CD. Many clinicians use CD daily
for accessing patient data, which aids in making important clinical decisions. CD is mainly composed of a “shell” or “core”,
and various modules running inside of the “shell”. Intended audiences for this document are mainly programmers.

Responsibilities:
 Developed the application under JEE architecture and designed dynamic, browser-compatible user
interfaces using JSP, custom tags, HTML, CSS and JavaScript.
 Used Spring, Hibernate, and Web Services Frameworks.
 Developed and deployed SOA/Web Services (SOAP and RESTful) using Eclipse IDE.
 Developed the user interface using JSP, JSP tag libraries and JavaScript to simplify the complexities of the application.
 Implemented Model View Controller (MVC) architecture using Jakarta Struts frameworks at presentation tier.
 Contributed significantly in designing the Object Model for the project as senior developer and Architect.
 Responsible for development of Business Services.
 Deployed J2EE applications on WebSphere application server by building and deploying EAR files using ANT scripts.
 Involved in the testing and integrating of the program at the module level.
 Worked with production support team in debugging and fixing various production issues.
 Used stored procedures and Triggers extensively to develop the Backend business logic in Oracle database.
 Involved in performance improvement and bug fixing.
 Involved in code reviews and designed prototypes.
 Responsible for preparing the foundation of web projects by coding specific Java data objects, Java source files, XML files
and SQL statements designed for graphical presentation, data manipulation and security.
 Used Oracle WebLogic application server to deploy application.

Environment: Java 1.5, JSP, AJAX, XML, Spring 3.0, Hibernate 2.0, Web Services, WebSphere 7.0, JUnit, Oracle 10g, SQL,
PL/SQL, log4j, RAD 7.0/7.5, ClearCase, UNIX, HTML, CSS, JavaScript

Prosum Technologies, Bangalore, India June 2009 – September 2010


Java Developer

Description: The aim of the project was to develop a web-based application to track and manage data associated with
employee information. The project involved analyzing, developing, testing and supporting both
front-end and back-end solutions that meet the business needs.

Responsibilities:

 Worked with the business community to define business requirements and analyze the possible technical solutions.
 Requirement gathering, Business Process flow, Business Process Modeling and Business Analysis.
 Responsible for system analysis, design and development using J2EE architecture.
 Interacted with Business Analysts for requirement gathering.
 Developed and deployed UI layer logic for sites using JSP, XML, JavaScript, HTML/DHTML and Ajax.
 Developed custom tags for table utility component
 Used various Java, J2EE APIs including JDBC, XML, Servlets, and JSP.
 Carried out integration testing & acceptance testing
 Involved in Java application testing and maintenance in development and production.
 Participated in the team meetings and discussed enhancements, issues and proposed feasible solutions.
 Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development and unit testing.
 Involved in mentoring specific projects in application of the new SDLC based on the
Agile Unified Process, especially from the project management, requirements and architecture perspectives.
 Designed and developed Views, Model and Controller components implementing MVC
 Designed, developed and documented stored procedures and functions for the project and reviewed the code
for errors.

Environment: JDK 1.3, J2EE, JDBC, Servlets, JSP, XML, XSL, CSS, HTML, DHTML, JavaScript, UML, Eclipse 3.0, Tomcat 4.1,
MySQL