BigData Engineer
LinkedIn: https://www.linkedin.com/in/pranjal-soni-367844106
Professional Summary:
• 3.4 years of IT experience as a Big Data Developer, covering technical requirements, design, and
development of projects on the Hadoop and Spark platforms.
• Experience building data-ingestion pipelines and EDWs on Hadoop and Spark.
• Google Certified Professional Data Engineer.
• 1.5 years of working experience on Google Cloud Platform.
• Hands-on experience with major components of the Hadoop ecosystem: HDFS, Hive, Pig, HBase, Sqoop,
MapReduce, YARN, and Spark with Scala.
• Worked on the real-time messaging system Kafka with Spark Structured Streaming.
• Experience in end-to-end data pipeline implementations: data ingestion, data cleansing, data
processing, and data loading in Hadoop and Spark.
• Experience in data analytics on Google Cloud Platform: worked on Dataproc, Google Cloud
Storage, BigQuery, Bigtable, Dataflow, Apache Airflow, and Google Cloud Composer.
• Experience in data analytics on Azure: worked on Azure Databricks (Spark clusters and Spark
jobs) and Azure Data Lake Storage (ADLS).
• Experience importing and exporting data between HDFS and relational database systems
using Sqoop.
• Experience analyzing data using HiveQL, Spark SQL, Pig Latin, and custom MapReduce programs in
Java.
• Experience in Core Java, Scala, shell scripting, and Python.
• Worked with different storage file formats such as ORC, Parquet, and Avro.
• Experience in data analytics processing CSV, JSON, and fixed-length files.
• Implemented SCD 2 and CDC data pipelines.
• Implemented joins, SerDes, and user-defined functions in Hive.
• Worked on optimization and tuning of HiveQL and Spark SQL queries.
• Knowledge of job workflow scheduling and monitoring tools such as Azkaban, AutoSys, and Oozie.
• Experience with continuous integration and continuous deployment (CI/CD) build tools such as Jenkins.
• Experience in code management using version control with Git.
Cloud Certification:
• Google Certified Professional Data Engineer
Technical Skills:
Hadoop Technologies and Distributions: Cloudera Hadoop Distribution (CDH4, CDH5) and Hortonworks Data Platform (HDP)
Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, HBase, Spark, Kafka
NoSQL Databases: HBase, Bigtable
Programming: Scala, Core Java, Shell Scripting
Google Cloud Platform: Dataproc, BigQuery, Google Cloud Storage, Bigtable, Dataflow, Cloud Composer
Real-time Messaging System: Kafka with Spark Structured Streaming
Microsoft Azure: Databricks and Azure Data Lake Storage
RDBMS: Oracle, MySQL, Netezza, Teradata
Version Control System: Git
Professional Experience:
Period: July 2016 - till date
Employer: Datametica Solutions Pvt Ltd
Location: Pune, India
Designation: Big Data Engineer
Projects:
Education:
• Bachelor of Technology (B.Tech) in CSE from the Institute of Technology, Central University of Bilaspur (Chhattisgarh),
with an 85% aggregate, in 2016.
Personal Details:
• Address: A2-201, Ganga Orchad Society, Mundhawa, Pune - 411036.
• Date of Birth: 1st May, 1994
• Marital Status: Unmarried
• Languages Known: English, Hindi
Declaration:
I hereby declare that the above information is true to the best of my knowledge.