
VINOD KUMAR SINGH - Senior Cloud Data Engineer

Princeton, New Jersey, USA | +1 (302) 261-5859 | vinodsingh900400@gmail.com | LinkedIn | GitHub

Professional Summary:
• IT professional with 11 years of experience in SQL, Python, Snowflake development, AWS data engineering, and BI (Power BI).
• Demonstrated expertise in cloud technologies, with a deep understanding of cloud-based data solutions.
• Skilled in optimizing data pipelines, enhancing performance, and ensuring data integrity.
• Proficient in writing SQL queries and creating stored procedures.
• Experienced in automating data pipelines using Snowflake streams and tasks.
• Knowledgeable in implementing Snowflake Data Warehouse, with a strong grasp of its architecture.
• Proficient in migrating data from various sources to Snowflake Data Warehouse.
• Experienced in SSIS, AWS Glue, and ETL/ELT processes.
• Strong background in the AWS platform, including S3, RDS, Redshift, AWS Glue, SQS, SNS, EC2, Lambda, VPC, ELB, IAM, Auto Scaling, CloudWatch, CloudTrail, and Security Groups.
• Collaborated effectively with cross-functional teams, including data engineers, developers, and system administrators, to investigate and resolve issues impacting data pipelines and ensure seamless data flow.
• Skilled in using and optimizing relational databases (e.g., Microsoft SQL Server, Oracle, MySQL) and columnar databases (e.g., Snowflake, Amazon Redshift).
• Designed and implemented efficient data pipelines using Python, AWS services (such as AWS Glue and Lambda), and Snowflake, enabling the seamless movement and transformation of data between various sources and the Snowflake data warehouse.
• Ensured data quality and integrity during ingestion by implementing robust data validation and cleansing mechanisms.
• Proficient in developing event-driven and scheduled AWS Lambda functions to trigger various AWS resources (see the sketch below).
• Experienced in Agile environments, consistently delivering efficient project outcomes.
• Knowledgeable in data warehousing concepts and OLTP/OLAP system analysis; adept at developing database schemas, including Star and Snowflake schemas, for relational and dimensional modeling.
• Provided 24/7 technical support for production and development environments.

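A minimal sketch of the kind of event-driven AWS Lambda function referenced above, assuming a hypothetical external stage (landing_stage), target table (raw_events), warehouse, and credentials supplied as environment variables; actual handlers varied by project.

    import json
    import os

    import snowflake.connector  # assumes snowflake-connector-python is packaged with the Lambda


    def handler(event, context):
        """Triggered by an S3 ObjectCreated notification; copies the new file into Snowflake."""
        record = event["Records"][0]["s3"]
        bucket = record["bucket"]["name"]
        key = record["object"]["key"]

        # Credentials come from Lambda environment variables (illustrative names).
        conn = snowflake.connector.connect(
            account=os.environ["SNOWFLAKE_ACCOUNT"],
            user=os.environ["SNOWFLAKE_USER"],
            password=os.environ["SNOWFLAKE_PASSWORD"],
            warehouse="LOAD_WH",
            database="RAW",
            schema="LANDING",
        )
        try:
            # COPY only the newly arrived file through an external stage mapped to the bucket.
            conn.cursor().execute(
                f"COPY INTO raw_events FROM @landing_stage/{key} "
                "FILE_FORMAT = (TYPE = 'JSON') ON_ERROR = 'CONTINUE'"
            )
        finally:
            conn.close()

        return {"statusCode": 200, "body": json.dumps({"loaded": f"s3://{bucket}/{key}"})}
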
TECHNICAL SKILLS
Languages & Frameworks: Python, T-SQL, PL/SQL, PGSQL, Scala, Object-Oriented Design Patterns, XML, ASP.NET (VB & C#), NoSQL, JavaScript, Java, jQuery, CSS
Machine Learning Libraries: Scikit-Learn, Pandas, Matplotlib, Seaborn, NumPy, SciPy
Web Services: REST API (Java), SOAP API (ASP.NET), OData API
Databases: MS SQL Server, Redshift, Snowflake, Databricks, Oracle 11g, MySQL, PostgreSQL
Version Control: Git, SVN, Bitbucket, TFS (Team Foundation Server)
IDEs (Language & Database): Anaconda Distribution, DBeaver, SQL Workbench, Aginity Workbench, VS Code, PyCharm, Toad, Visual Studio, Eclipse, NetBeans
Data Modeling Tools: Erwin Data Modeler, MySQL Workbench, SQL Server Management Studio
ETL Tools / Data Pipelines: Apache Airflow, MSBI, DBT
Web/Application Servers: Apache Tomcat, Microsoft IIS
Data Science Tools: Domino Data Science, DataKitchen, Science@Scale, EMR
Project Methodology: Agile Project Development
Reporting Tools: MSBI, Tableau, Power BI, PowerPoint, Excel
Other Tools: WinSCP, PuTTY, CloudBerry
Professional Experience:

Bayer - Whippany, NJ, Remote                                                                                              Jan 2022 to Present


Senior Snowflake/AWS Cloud Data Engineer
Responsibilities:
• Designed the architecture of ELT/ETL data pipelines in AWS Glue from source to target, following best practices; this covered batch processing, near real-time data ingestion, and micro-batch processing.
• Extracted data from multiple sources, including SFTP, flat files received via email, Redshift, S3 buckets, and APIs, and ingested it into Snowflake.
• Developed Python scripts to perform data quality checks and validation on incoming data, ensuring data accuracy and consistency within Snowflake (see the sketch below).
• Created resource monitors to manage data platform costs and track Snowflake credit consumption.
• Implemented dynamic data masking at query runtime for enhanced security.
• Shared data with other consumers by creating reader accounts.
• Migrated the enterprise data warehouse from a Denodo database to Snowflake, improving scalability and performance.
• Improved performance by implementing clustering keys, automating warehouse size adjustments, and tuning queries.
• Followed and implemented Snowflake best practices for securing data within the Snowflake environment.
Environment: Snowflake EDW, Python, PySpark, DBT, AWS Glue, AWS S3, SQS, SNS, Lambda, AWS EC2, Airflow.
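
A minimal sketch of the kind of Python data quality check referenced in the bullets above, assuming a hypothetical landing.orders table and credentials in environment variables; the actual rules and thresholds were project-specific.

    import os

    import snowflake.connector  # assumes snowflake-connector-python is installed


    def run_quality_checks():
        """Run simple validation queries against a landing table and return a list of failures."""
        checks = {
            # Rule name -> query returning the number of offending rows (illustrative rules).
            "null_order_ids": "SELECT COUNT(*) FROM landing.orders WHERE order_id IS NULL",
            "duplicate_order_ids": (
                "SELECT COUNT(*) FROM (SELECT order_id FROM landing.orders "
                "GROUP BY order_id HAVING COUNT(*) > 1)"
            ),
            "negative_amounts": "SELECT COUNT(*) FROM landing.orders WHERE amount < 0",
        }

        conn = snowflake.connector.connect(
            account=os.environ["SNOWFLAKE_ACCOUNT"],
            user=os.environ["SNOWFLAKE_USER"],
            password=os.environ["SNOWFLAKE_PASSWORD"],
            warehouse="QA_WH",
            database="RAW",
        )
        failures = []
        try:
            cur = conn.cursor()
            for name, sql in checks.items():
                bad_rows = cur.execute(sql).fetchone()[0]
                if bad_rows:
                    failures.append(f"{name}: {bad_rows} offending rows")
        finally:
            conn.close()
        return failures


    if __name__ == "__main__":
        problems = run_quality_checks()
        if problems:
            raise SystemExit("Data quality checks failed:\n" + "\n".join(problems))
        print("All data quality checks passed.")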

Pliant Therapeutics, CA, Remote Jan 2021 to Dec 2021


Snowflake Consultant
Responsibilities:
• Managed Role-Based Access Control (RBAC) while building the security model for the Snowflake Data Warehouse, and utilized resource monitors.
• Facilitated data sharing with customers by creating shares and granting usage permissions.
• Optimized the performance of the Snowflake Data Warehouse by leveraging dedicated and multi-clustered warehouses; this also involved using cluster keys to partition large tables.
• Configured Snowpipe for continuous data loading from AWS S3 buckets into Snowflake, using SQS event notifications to trigger Snowpipe.
• Conducted bulk data loading into Snowflake using the COPY command.
• Migrated the enterprise data warehouse from MS SQL Server to Snowflake, improving scalability and performance.
• Implemented streams and tasks for continuous ELT workflows to process recently changed data (CDC); see the sketch below.
• Ensured data security by implementing data masking when sharing confidential information with users.
• Migrated logic and data from a legacy database to Snowflake by creating new tables based on stakeholder requirements using SnowSQL.
• Created both internal and external stages and performed data transformations during the load process.
• Executed data transformations using AWS Glue and PySpark/Python scripts to ingest data into Snowflake.
• Utilized functions such as LATERAL FLATTEN to convert loaded JSON data into columns in Snowflake.
• Leveraged both Maximized and Auto-Scale multi-cluster warehouse modes.
• Employed temporary and transient tables for different datasets.
• Cloned production data for code modifications and testing.
• Facilitated user acceptance testing (UAT) by sharing sample data with customers.
• Utilized the Time Travel feature for data recovery purposes.
Environment: Snowflake EDW, Python, DBT, AWS Glue, AWS S3, AWS EC2, Tableau, MS SQL Server.
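
A minimal sketch of the streams-and-tasks CDC pattern referenced above, expressed as Snowflake SQL issued from Python; the database, warehouse, table, stream, and task names are hypothetical.

    import os

    import snowflake.connector  # assumes snowflake-connector-python is installed

    # Illustrative DDL: a stream that captures changes on a landing table, and a task that
    # merges those changes into the curated table on a schedule.
    STATEMENTS = [
        "CREATE OR REPLACE STREAM landing.orders_stream ON TABLE landing.orders",
        """
        CREATE OR REPLACE TASK landing.merge_orders_task
          WAREHOUSE = ELT_WH
          SCHEDULE = '5 MINUTE'
          WHEN SYSTEM$STREAM_HAS_DATA('LANDING.ORDERS_STREAM')
        AS
          MERGE INTO curated.orders AS tgt
          USING landing.orders_stream AS src
            ON tgt.order_id = src.order_id
          WHEN MATCHED THEN UPDATE SET tgt.amount = src.amount, tgt.updated_at = src.updated_at
          WHEN NOT MATCHED THEN INSERT (order_id, amount, updated_at)
            VALUES (src.order_id, src.amount, src.updated_at)
        """,
        "ALTER TASK landing.merge_orders_task RESUME",
    ]

    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        role="SYSADMIN",
        database="ANALYTICS",
    )
    try:
        cur = conn.cursor()
        for stmt in STATEMENTS:
            cur.execute(stmt)
    finally:
        conn.close()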

Pacira Pharmaceuticals Inc, CA, Remote Jan 2020 to Dec 2020


Cloud Data Engineer
Responsibilities:
• Designed and established an Enterprise Data Lake to support a wide range of use cases, including analytics, data processing, storage, and reporting of large and rapidly changing datasets.
• Implemented Azure services such as Azure Virtual Machines, Azure Functions, and Azure Data Factory to support data processing and workflows.
• Developed and optimized Databricks notebooks for data exploration and visualization (see the sketch below).
• Designed, developed, and maintained Microsoft SQL Server databases for data storage and retrieval.
• Managed and provisioned Azure Virtual Machines to run data-related workloads, optimizing performance and resource allocation.
• Created serverless Azure Functions to automate data processing tasks, improving efficiency and reducing operational costs.
• Designed and configured data pipelines in Azure Data Factory to orchestrate data workflows, including data ingestion, transformation, and loading.
• Utilized Azure Data Lake Storage for storing and managing large volumes of structured and unstructured data.
• Containerized applications and ETL processes using Docker for consistency and portability across environments.
• Maintained code repositories, including branching, merging, and resolving conflicts.
Environment: Azure, Databricks, MS SQL, Azure Blob Storage, Azure VM, Azure Functions, ADF, Azure Data Lake, ADFS, Python, MySQL, Docker, Git
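
A minimal sketch of a Databricks notebook cell of the kind used for the data exploration described above; the storage path and column names are hypothetical, and the `spark` and `display` objects are provided by the Databricks runtime.

    # PySpark notebook cell: read raw files from the data lake and profile them.
    from pyspark.sql import functions as F

    # Read raw CSV files into a DataFrame (illustrative ADLS path).
    orders = (
        spark.read.option("header", "true")
        .option("inferSchema", "true")
        .csv("abfss://raw@examplelake.dfs.core.windows.net/orders/")
    )

    # Basic profiling: row count, per-column null counts, and daily totals for visualization.
    print("rows:", orders.count())
    null_counts = orders.select(
        [F.sum(F.col(c).isNull().cast("int")).alias(c) for c in orders.columns]
    )
    daily_totals = (
        orders.groupBy(F.to_date("order_ts").alias("order_date"))
        .agg(F.sum("amount").alias("total_amount"))
        .orderBy("order_date")
    )

    display(null_counts)   # Databricks built-in display for tabular/plot output
    display(daily_totals)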

Bristol Myers Squibb - Princeton, NJ, Remote                                                                      Nov 2017 to Dec 2019


Redshift/AWS Cloud Data Engineer
Responsibilities:
• Developed and maintained data ingestion pipelines to ensure the timely and accurate transfer of data from AWS RDS PostgreSQL to Redshift using AWS Glue workflows (see the sketch below).
• Enhanced the performance of data processing in AWS Glue by implementing data partitioning strategies.
• Created and optimized PySpark jobs within AWS Glue to transform and process large-scale datasets, ensuring efficient data extraction, transformation, and loading.
• Utilized Python data processing libraries, such as Pandas and NumPy, for data transformation, cleansing, and enrichment as part of the ETL (Extract, Transform, Load) process.
• Implemented real-time streams to capture and track all changes made to the data, providing a reliable source of change data for further processing.
• Developed jobs and workflows to orchestrate the execution of queries based on the captured change data, ensuring efficient and synchronized data processing.
• Ensured data quality and integrity during the ingestion process by implementing robust data validation and cleansing mechanisms.
• Collaborated with stakeholders to identify and implement appropriate tags for different types of data, including source system, data sensitivity, and business unit.
• Conducted performance analysis and tuning of Redshift queries and data processing tasks to optimize query execution time and resource utilization.
• Implemented query and workload optimization techniques, such as query rewriting, sorting, partitioning, and materialized views, to enhance Redshift's performance.
• Collaborated with database administrators and data engineers to identify and resolve performance bottlenecks, ensuring efficient data processing and responsive query execution.
• Implemented comprehensive logging and error handling mechanisms within the data pipelines to facilitate troubleshooting and debugging.
Environment: Redshift, Python, AWS Glue, AWS RDS, AWS EC2, SQS, SNS, Tableau, Bitbucket, AWS EMR
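
A condensed sketch of an AWS Glue PySpark job of the kind described above: read an RDS PostgreSQL table from the Glue Data Catalog, apply a light transformation, and load it into Redshift. The catalog database, table, connection, and bucket names are hypothetical.

    import sys

    from awsglue.context import GlueContext
    from awsglue.dynamicframe import DynamicFrame
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from pyspark.sql import functions as F

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Source: an RDS PostgreSQL table registered in the Glue Data Catalog by a crawler.
    source = glue_context.create_dynamic_frame.from_catalog(
        database="rds_catalog_db", table_name="public_orders"
    )

    # Light transformation with Spark, then back to a DynamicFrame for the Redshift sink.
    df = source.toDF().dropDuplicates(["order_id"]).withColumn("load_date", F.current_date())
    target = DynamicFrame.fromDF(df, glue_context, "orders_clean")

    # Sink: Redshift through a Glue JDBC connection, staging files in S3.
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=target,
        catalog_connection="redshift-connection",
        connection_options={"dbtable": "analytics.orders", "database": "dev"},
        redshift_tmp_dir="s3://example-glue-temp/orders/",
    )

    job.commit()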

BioMarin Pharmaceutical Inc - Pune, Offshore                                                                      Apr 2016 to Oct 2017


Database Consultant
Responsibilities:
• Conducted performance tuning, which involved removing redundant logic, optimizing slow-running code, applying optimizer hints, creating suitable indexes, and optimizing SQL queries.
• Integrated data from various sources, ensuring data accuracy and consistency, and handled complex data transformations and workflows.
• Implemented Change Data Capture (CDC) to track source data changes, storing transaction logs for comprehensive auditing and enabling fast issue resolution through log tables and alerting (see the sketch below).
• Developed interactive dashboards that combine multiple visualizations and allow users to explore data, make data-driven decisions, and gain insights.
• Optimized Tableau reports and dashboards for performance, including data extraction, load times, and query execution.
• Managed and configured cloud resources, ensuring security and compliance, optimizing resource utilization, monitoring and troubleshooting, and collaborating with teams to develop and deploy cloud-based solutions.
Environment: MS SQL 2017, SSMS, Informatica, Azure, Tableau
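
A minimal sketch of the SQL Server Change Data Capture polling referenced above, issued from Python via pyodbc; the server, database, and dbo.Orders capture instance are hypothetical, and CDC is assumed to be already enabled on the table.

    import pyodbc  # assumes the pyodbc package and an installed ODBC Driver for SQL Server

    # Connection details are illustrative.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};SERVER=sqlhost;DATABASE=SalesDB;"
        "UID=etl_user;PWD=example_password"
    )
    cur = conn.cursor()

    # Fetch every change captured for dbo.Orders; 'dbo_Orders' is the default capture
    # instance name created when CDC is enabled on that table.
    cur.execute(
        """
        DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn('dbo_Orders');
        DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();
        SELECT __$operation, order_id, amount
        FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');
        """
    )
    for operation, order_id, amount in cur.fetchall():
        # __$operation codes: 1 = delete, 2 = insert, 3 = update (before), 4 = update (after).
        print(operation, order_id, amount)

    conn.close()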

IDBI BANK - Mumbai, Onsite                                                                                            Mar 2015 to Mar 2016


Database Developer
Project Name: MIS Data Warehouse
Responsibilities:
• Managed database administration tasks on SQL Server 2012, including database maintenance, backup and restore operations, and performance tuning.
• Designed and implemented complex SQL Server 2012 stored procedures, triggers, and functions to support critical business processes.
• Developed and maintained PL/SQL procedures and packages for data manipulation and business logic implementation on Oracle 11g.
• Designed and developed Informatica ETL (Extract, Transform, Load) processes to efficiently extract data from diverse sources, perform required transformations, and load the processed data into SQL Server databases.
• Implemented data cleansing and transformation logic within Informatica to ensure data quality and consistency.
• Created and maintained SSIS job schedules and automated data integration tasks, improving overall efficiency.
• Engaged in the deployment, configuration, and troubleshooting of Informatica workflows, ensuring the smooth and efficient transfer of data.
• Created dynamic and visually engaging Power BI reports and dashboards tailored for business users, offering valuable insights into essential key performance indicators.
• Created and developed Informatica workflows and mappings to facilitate complex data transformations and integrations, enabling advanced data analysis and reporting.
• Leveraged Power BI to conduct data mining and predictive analytics, delivering valuable insights to inform decision-making processes.
Environment: SQL Server 2012, Oracle 11g, Informatica, Azure, Power BI

Bank of America - Mumbai, Offshore                                                                                 Sep 2012 to Mar 2015


MS-SQL Developer
Project Name: SBP Data Warehouse
Responsibilities:
• Gathered requirements from the business team and translated them into technical architecture and design documents.
• Conducted estimation, analysis, and solution documentation for creating dashboards using SSRS.
• Played a key role in analysis, design, and dashboard development in SSRS by establishing information links from MS SQL to display data in reports.
• Designed and generated dashboards and reports with various chart types, including bar charts, scatter plots, map charts, pie charts, cross tables, and graphical tables in SSRS; implemented global and local filtering schemes.
• Created dynamic visualizations using advanced features such as trellis views, scatter plots, 3D scatter plots, map charts, tree maps, text areas, input fields, lists, drop-down lists, properties, document properties, on-demand data, and in-database functionality.
• Developed visualizations with complex SSRS features, including calculations and functions.
• Managed code migration across environments (Dev > QA > Prod).
• Designed and developed information links, data sources, columns, joins, filtering, and procedures.
• Communicated problems, solutions, updates, and project status to the project manager in a timely and regular manner.
• Conducted testing and debugging of dashboards for quality assurance.
• Acted as the primary point of contact for production support issue resolution.
• Analyzed and documented processes, providing swift resolutions for typical production issues, especially those critical to the process.
Environment: SQL Server 2008, SSIS, SSRS, SSAS

Education:
• Bachelor's in Computer Science Engineering, KIIT, Bhubaneswar, Odisha, India, 2012
