
Abhilash Reddy L

Sr. Data Engineer | Email: lallapatiabhilashreddy91@gmail.com | Ph: (559) 644-3122


BigData | Cloud | DevOps

Certified professional data engineer with more than 7 years of IT experience across multiple client domains
and technology spectrums. Experienced in building highly scalable, large-scale applications using Cloud,
BigData, DevOps, and Spring Boot technologies. Comfortable working in Agile as well as Waterfall
environments, with experience in multi-cloud, migration, and scalable application projects.

Professional Summary

BigData

 Experience working with various Hadoop distributions like Cloudera, Hortonworks and MapR.
 Expert in ingesting batch data for incremental loads from various RDBMS sources using Apache Sqoop.
 Developed scalable applications for real-time ingestions into various databases using Apache Kafka.
 Developed Pig Latin scripts and MapReduce jobs for large data transformations and loads.
 Experience in using optimized data formats like ORC, Parquet and Avro.
 Experience in building optimized ETL data pipelines using Apache Hive and Spark.
 Implemented various optimizing techniques in Hive scripts for data crunching and transformations.
 Experience in building ETL scripts in Impala for faster access in the reporting layer.
 Built Spark data pipelines with various optimization techniques using Python and Scala.
 Experience in loading transactional and delta loads into NoSQL databases like HBase.
 Developed various automation flows using Apache Oozie, Azkaban, and Airflow (a sketch follows this list).
 Experience in working with NoSQL Databases like HBase, Cassandra and MongoDB.
 Experience with integration tools like Talend and NiFi for ingesting batch and streaming data.
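
The automation-flow bullet above references Airflow; the following is a minimal, hypothetical Airflow sketch of such a flow: a daily DAG chaining a Sqoop incremental import with a Spark transformation. The DAG id, JDBC connection, table, paths, and script name are illustrative assumptions rather than actual project values.

```python
# Illustrative Airflow DAG: daily Sqoop incremental import followed by a
# Spark transformation. All names and paths are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_ingest_example",        # assumed DAG name
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(
        task_id="sqoop_incremental_import",
        bash_command=(
            "sqoop import --connect jdbc:mysql://db-host/sales "  # assumed source
            "--table orders --incremental append "
            "--check-column id --last-value 0 "
            "--target-dir /data/raw/orders"                       # assumed HDFS dir
        ),
    )

    transform = BashOperator(
        task_id="spark_transform",
        bash_command="spark-submit /jobs/transform_orders.py",    # assumed script
    )

    # Run the import first, then the Spark transformation.
    ingest >> transform
```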

Cloud

 Experience working with various cloud platforms like AWS, Azure and GCP.
 Developed various ETL applications using Databricks Spark distributions and Notebooks.
 Implemented streaming applications to consume data from Event Hub and Pub/Sub.
 Developed various scalable big data applications in Azure HDInsight for ETL services.
 Experience in building data pipelines using Azure Data Factory and Azure Databricks.
 Developed scalable applications using AWS tools like Redshift, DynamoDB, Kinesis.
 Worked on building pipelines using Snowflake for extensive data aggregations.
 Working knowledge on GCP tools like BigQuery, Pub/Sub, Cloud SQL, and Cloud functions.
 Experience in visualizing reporting data using tools like Power BI and Google Analytics.

DevOps

 Experience in building continuous integration and deployments using Jenkins, Drone, Travis CI.
 Expert in building containerized apps using tools like Docker, Kubernetes and Terraform.
 Developed reusable application libraries using Docker containers.
 Experience in building metrics dashboards and alerts using Grafana and Kibana.
 Expert in Java and Scala build tools like Maven (POM) and SBT for application development.
 Experience in working with tools like GitHub, GitLab and SVN for code repository.
 Expert in writing various YAML scripts for automation purposes.

Education

 Master of Science from University of Central Missouri, USA (2016)


 Bachelor of Engineering from Jawaharlal Institute of Technology and Sciences (2012)

Experience Summary

Client: Five Below | Location: Philadelphia, Pennsylvania


Designation: Sr. Data Engineer | Duration: February 2020 – Present

Responsibilities

 Experience in migrating existing legacy applications into optimized data pipelines using Spark with
Scala and Python, supporting testability and observability.
 Experience in developing scalable real-time applications for ingesting clickstream data using Kafka
Streams and Spark Streaming.
 Developed optimized and tuned ETL operations in Hive and Spark scripts using techniques such as
partitioning, bucketing, vectorization, serialization, and tuning executor memory and count (see the sketch after this section).
 Worked on Talend integrations to ingest data from multiple sources into Data Lake.
 Developed an MVP for exporting data to Snowflake to evaluate usage and migration benefits.
 Experience in automating end-to-end Hadoop jobs using Oozie applications in an optimized way.
 Implemented cloud integrations to GCP and Azure for bi-directional flow setups for data migrations.
 Developed various scripting functionality using Shell Script and Python.
 Developed APIs for quick real-time lookup on top of HBase tables for transactional data.
 Built Jupyter notebooks using PySpark for extensive data analysis and exploration.
 Implemented code coverage and integrations using Sonar for improving code testability.
 Pushed application logs and data stream logs to the Kibana server for monitoring and alerting purposes.
 Worked on migrating data from HDFS to Azure HDInsight and Azure Databricks.
 Experience designing solutions with Azure tools like Azure Data Factory, Azure Data Lake, Azure SQL,
Azure SQL Data Warehouse, and Azure Functions.
 Migrated existing processes and data from our on-premises SQL Server and other environments to
Azure Data Lake.
 Implemented multiple modules in microservices to expose data through RESTful APIs.
 Implemented various optimization techniques for Spark applications for improving performance.
 Developed Jenkins and Drone pipelines for continuous integration and deployment purposes.
 Built SFTP integrations using various VMware solutions for external vendor onboarding.
 Developed automated file transfer mechanisms using Python from MFT and SFTP to HDFS.
Technologies: HDFS, Hive, Spark, Oozie, Python, Scala, Shell, Talend, Snowflake, Azure, Azure
HDInsight, Databricks, Grafana, Jenkins, Azure Data Lake, Azure SQL
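
A minimal PySpark sketch of the partitioning, bucketing, and executor-sizing techniques referenced in the tuning bullet above. The table names, column names, and configuration values are illustrative assumptions, not the actual project settings.

```python
# Illustrative PySpark job: partitioned, bucketed Hive write with explicit
# executor sizing. Names and values are hypothetical examples.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("orders_etl_example")                  # hypothetical job name
    .config("spark.executor.instances", "8")        # example executor count
    .config("spark.executor.memory", "4g")          # example executor memory
    .config("spark.sql.shuffle.partitions", "200")  # example shuffle parallelism
    .enableHiveSupport()
    .getOrCreate()
)

# Read a raw staging table (hypothetical name) and apply a light filter.
orders = spark.table("staging.orders_raw").filter("order_status IS NOT NULL")

# Write partitioned by date and bucketed by customer id to reduce shuffle
# cost on downstream joins; column names are assumptions for illustration.
(
    orders.write
    .mode("overwrite")
    .partitionBy("order_date")
    .bucketBy(16, "customer_id")
    .sortBy("customer_id")
    .saveAsTable("curated.orders")
)
```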

Client: FINRA | Location: Washington, DC


Designation: Sr. Data Engineer | Duration: November 2018 – January 2020

Responsibilities

 Experience in building PySpark applications for ingesting data from various sources into the Data Lake.
 Developed scalable streaming applications using Kafka and Spark Streaming for real time ingestions.
 Built Spring Boot microservices as part of real-time data ingestion pipelines, and ingested data to the
Data Lake using Spark Kafka consumers developed with the Java high-level APIs.
 Experience with data migration from traditional RDBMS to Big Data and Cloud using tools such as
Sqoop, Spark JDBC connectors and AWS Glue.
 Extensively used Spark Core and Spark SQL for data processing, data enrichment, and for generating
reports as per business user requirements.
 Developed custom UDFs in Hive and Spark.
 Exposure to performance tuning for Hive scripts and Spark applications.
 Experience in designing data solutions in AWS including data distributions and partitions,
scalability, disaster recovery and high availability.
 Experience in monitoring and optimizing data solutions in AWS including usage of AWS
CloudWatch.
 Used AWS Lambda functions as triggers for event-based Glue jobs (a sketch follows this section).
 Experience working with AWS Glue components such as data catalog, crawlers and developing
scripts in Glue using Spark and Python.
 Used AWS Athena for ad-hoc querying.
 Developed numerous ETL operations using Redshift and Glue for business analysis purposes.
 Implemented several modules in microservices application for streaming pipeline using Kafka.
 In depth understanding and proficiency in automation of cloud platforms and data platforms.
 Experience in automating end to end production and development jobs using Airflow.
 Developed end-to-end CI/CD deployment pipelines using Jenkins.

Technologies: PySpark, Kafka, Spark, Sqoop, Hive, AWS, AWS Glue, Redshift, Airflow, Jenkins, Grafana,
Python, Shell, Microservices, Java, RESTful APIs
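
A minimal sketch of the Lambda-triggered Glue pattern referenced above, assuming an S3 object-created event as the trigger; the Glue job name and argument key are hypothetical.

```python
# Illustrative AWS Lambda handler that starts a Glue job when a new object
# lands in S3. The Glue job name and argument keys are hypothetical.
import boto3

glue = boto3.client("glue")

def lambda_handler(event, context):
    # Pull the bucket and key of the S3 object that triggered the event.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    # Kick off the (hypothetical) Glue ETL job with the new file as input.
    response = glue.start_job_run(
        JobName="ingest_to_datalake",                        # assumed job name
        Arguments={"--input_path": f"s3://{bucket}/{key}"},  # assumed argument key
    )
    return {"JobRunId": response["JobRunId"]}
```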

Client: Staples | Location: Framingham, MA


Designation: Sr. Data Engineer | Duration: September 2017 – October 2018

Responsibilities

 Worked on building and developing ETL pipelines using Spark-based applications.
 Worked on migrating RDBMS data into Data Lake applications.
 Built optimized Hive and Spark jobs for data cleansing and transformations.
 Developed Spark Scala applications in an optimized way to complete on time.
 Worked on various optimization techniques in Hive for data transformations and loading.
 Expert in handling dynamic schema evolution using formats like Avro.
 Built APIs on top of HBase data to expose quick lookups to external teams.
 Experience in building Impala scripts for quick data retrieval, exposed through Tableau.
 Experience in developing various Oozie actions for automation purposes.
 Developed a monitoring platform for our jobs in Kibana and Grafana.
 Developed real-time log aggregations on Kibana for analyzing data.
 Worked on developing NiFi pipelines for extracting data from external sources.
 Developed Jenkins pipelines for data pipeline deployments.
 Worked on building different modules in scalable Spring Boot applications.
 Developed Docker containers for automating runtime environments for various applications.
 Expert in building ingestion pipelines for reading real-time data from Kafka (see the sketch after this section).
 Worked on a PoC to set up Talend environments and custom libraries for different pipelines.
 Developed various Python and shell scripts for various operations.
 Worked in an Agile environment with various teams and projects in fast-paced settings.

Technologies: HDP, HDFS, Hive, Spark, Oozie, NiFi, Kibana, Grafana, Talend, Sqoop, Kafka, Scala, Python,
Shell, Spring Boot, Avro
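
A minimal Spark Structured Streaming sketch of the Kafka ingestion pattern referenced above. The broker address, topic, and output paths are hypothetical placeholders.

```python
# Illustrative Spark Structured Streaming ingestion from Kafka into HDFS.
# Broker, topic, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka_ingest_example").getOrCreate()

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")  # assumed broker
    .option("subscribe", "clickstream")                 # assumed topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers key/value as binary; cast to strings before writing out.
events = raw.select(
    col("key").cast("string"),
    col("value").cast("string"),
    col("timestamp"),
)

query = (
    events.writeStream
    .format("parquet")
    .option("path", "/data/landing/clickstream")            # assumed HDFS path
    .option("checkpointLocation", "/checkpoints/clickstream")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```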

Client: Netpay Advance | Location: Kansas


Designation: Data Engineer | Duration: January 2016 – August 2017

Responsibilities

 Worked on optimized file formats like Avro, ORC and Parquet.
 Developed Oozie automations using custom MapReduce, Pig, Hive, and Sqoop actions.
 Built reusable Hive UDF libraries that business users can reuse across queries.
 Expertise in performance tuning on Hive queries, Joins and different configuration parameters
to improve query response time.
 Created partitions and buckets based on state for further processing using bucket-based Hive joins.
 Used Cassandra CQL with the Java APIs to retrieve data from Cassandra tables (see the sketch after this section).
 Developed applications on Spark as part of a next-gen platform implementation.
 Implemented Data Ingestion in real time processing using Kafka.
 Developed Data pipeline using Kafka and Storm to store Data into HDFS.
 Used Apache Maven extensively while developing MapReduce programs.
 Extensively worked on Pig scripts and Pig UDFs to perform ETL activities.
 Worked on data cleansing scripts using Apache Pig for data preprocessing steps.
 Prepared technical design documents, documented requirements, and obtained client sign-off.
 Developed workflow in Oozie to automate the tasks.
 Collected Logs data from web servers and loaded into HDFS using Flume.

Technologies: CDH, MapReduce, HDFS, Pig, Hive, Spark, Python, Scala, Java, Bash, Cassandra, Kafka,
Jenkins, Storm, Oozie, SQL
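
A minimal sketch of the Cassandra lookup referenced above. The project used CQL through the Java APIs; this sketch shows the equivalent call with the Python driver, and the keyspace, table, and column names are assumptions.

```python
# Illustrative point lookup against a Cassandra table using the Python driver.
# Contact point, keyspace, table, and columns are hypothetical.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])          # assumed contact point
session = cluster.connect("payments")     # assumed keyspace

# Prepared statement for repeated point lookups by primary key.
stmt = session.prepare(
    "SELECT loan_id, amount, status FROM loans WHERE loan_id = ?"
)

row = session.execute(stmt, ["LN-1001"]).one()   # example key for illustration
if row:
    print(row.loan_id, row.amount, row.status)

cluster.shutdown()
```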
Client: HSBC | Location: Hyderabad, India
Designation: Data Engineer | Duration: August 2012 – June 2014

Responsibilities

 Developed Hive queries as per required analytics for report generation in QlikView.
 Involved in developing the Pig scripts to process the data coming from different sources.
 Developed custom MapReduce code in Java for data cleansing and crunching for further usage.
 Worked on data cleaning using Pig scripts and storing in HDFS.
 Worked on Pig user defined functions (UDF) using Java language for external functions.
 Scheduled jobs using Oozie to automate regularly executing processes.
 Worked on building custom UDFs in Hive using Java.
 Expertise in building custom alerts and validations using Python scripting (see the sketch after this section).
 Implemented PL/SQL stored procedures, functions, and triggers for the persistence layer.
 Used SVN for source code versioning and code repository.

Technologies: Cloudera, HDFS, MapReduce, Pig, Hive, SQL, Impala, Java, Python, QlikView, Oozie
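
A minimal sketch of the kind of Python validation and alerting script referenced above: check a daily row count against a threshold and email an alert when it falls short. The SMTP host, addresses, example values, and threshold are hypothetical.

```python
# Illustrative validation/alert script: flag a daily load whose row count
# falls below an expected minimum and send an email alert.
import smtplib
from email.message import EmailMessage

def send_alert(subject: str, body: str) -> None:
    # Compose and send a plain-text alert email (hosts/addresses assumed).
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = "etl-alerts@example.com"           # assumed sender
    msg["To"] = "data-team@example.com"              # assumed recipient
    msg.set_content(body)
    with smtplib.SMTP("smtp.example.com") as smtp:   # assumed SMTP host
        smtp.send_message(msg)

def validate_row_count(actual: int, expected_min: int = 10_000) -> None:
    # Raise an alert when the daily load falls below the expected minimum.
    if actual < expected_min:
        send_alert(
            "Daily load below threshold",
            f"Loaded {actual} rows; expected at least {expected_min}.",
        )

if __name__ == "__main__":
    validate_row_count(actual=8_500)   # example value for illustration
```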
