I’m a data engineer with 8+ years of software development experience and strong fundamentals in clean code and software design principles. I design and architect data pipelines using modern data engineering techniques and technologies such as Spark, Kafka, Apache Airflow, Databricks, and cloud platforms (AWS, Azure, Snowflake, GCP) to enable seamless data flow for teams. I also have experience in data cleansing, building ML models, and realizing AI strategy from design to code using Pandas, NumPy, Flask, scikit-learn, and NLP techniques. For visualization, I use Tableau, Power BI, Metabase, and Python visualization libraries.
Apart from this, I was a Professor at Centennial College and the Montreal College of Information Technology, teaching Advanced Databases, Python, Tableau, and Software Development.
Work Experience
Professor
Centennial College
January 2022 to April 2022
- Advanced Databases
- Software Development Project
Professor
Montreal College of Information Technology
November 2019 to December 2021
- Data Analysis with Tableau (Tableau Desktop)
- Python (Django & Flask)
Data Developer
Montreal College of Information Technology
July 2020 to June 2021
Project: Data Integration and Reporting
Description: Data from different departments – Academics, Finance, and Marketing – needed to flow seamlessly between systems, with dashboards for each department to support effective decision-making.
Tech Stack: Python (Flask, scikit-learn, PySpark) | Jupyter | Hadoop, HDFS
Responsibilities:
- Created an ETL pipeline to synchronize data across the academic, marketing, and finance departments and stored it in a destination database
- Created RESTful service endpoints using Flask to make data accessible across departments
- Developed Tableau dashboards for each department to monitor its KPIs
- Built a Django site providing access to internal course reviews and modelled the review data
- Set up Hadoop, HDFS, and Hive in-house for the institution's academic and administrative use
- Created workflows to synchronize jobs using Airflow to migrate data from various systems
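The department-sync pattern described above can be illustrated with a minimal extract-transform-load sketch. This is a hypothetical illustration, not the production pipeline: the real version ran as Airflow tasks against live databases, and all names and records here are invented.

```python
# Hypothetical sketch of the extract-transform-load pattern used to
# synchronize departmental data into one destination store.

def extract(department_rows):
    """Pull raw records for one department (stubbed as a list of dicts)."""
    return list(department_rows)

def transform(rows):
    """Normalize field names and drop incomplete records."""
    cleaned = []
    for row in rows:
        if row.get("id") is None:  # skip records missing a primary key
            continue
        cleaned.append({"id": row["id"], "dept": row["dept"].lower()})
    return cleaned

def load(rows, destination):
    """Upsert records into the destination store, keyed by id."""
    for row in rows:
        destination[row["id"]] = row
    return destination

# Run the pipeline once per department feed.
warehouse = {}
for dept_rows in (
    [{"id": 1, "dept": "Academics"}, {"id": None, "dept": "Academics"}],
    [{"id": 2, "dept": "Finance"}],
):
    load(transform(extract(dept_rows)), warehouse)
```

In the production setup, each of the three stages maps naturally onto its own scheduled Airflow task, which is what made cross-department synchronization repeatable.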
Data Scientist
Bombardier Transport
January 2020 to March 2020
Project: Applying Predictive Analytics to the maintenance of train fleets (Aventra | NAT)
Description: A large fleet of trains generates sensor data across many cities. Predictive analytics solutions were delivered to those teams to help them understand and gain insights from that valuable data, which engineering teams later used for maintenance.
Tech Stack: Python (scikit-learn, Pyplot, PySpark) | Jupyter Notebook | Azure
Responsibilities:
- Automated data pre-processing with pandas to calculate thresholds for highly correlated signals
- Wrote scripts to extract data from large CSV files using PySpark
- Built Airflow jobs from scratch to migrate on-prem data to cloud
- Applied software engineering best practices to clean the code and prepare it for analysis
- Analyzed time-series data with Plotly and forecasting models to identify patterns and correlations
- Designed a pipeline architecture to consume data from engineering teams and process it for analysis
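The signal-threshold step above can be sketched in pure Python for illustration: compute pairwise Pearson correlations between sensor channels and flag the pairs above a cutoff. The actual work used pandas on real fleet telemetry; the sensor names, values, and the 0.9 cutoff here are assumed for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlated_pairs(signals, threshold=0.9):
    """Return signal-name pairs whose |correlation| exceeds the threshold."""
    names = sorted(signals)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(signals[a], signals[b])) > threshold:
                pairs.append((a, b))
    return pairs

# Illustrative sensor channels (not real fleet data).
sensors = {
    "temp": [1.0, 2.0, 3.0, 4.0],
    "vibration": [2.1, 4.0, 6.2, 7.9],  # tracks temp closely
    "pressure": [5.0, 1.0, 4.0, 2.0],   # unrelated
}
flagged = correlated_pairs(sensors)
```

Flagging correlated channels first keeps the downstream threshold analysis focused on signals that carry redundant information.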
Scrum Master
CDK Global
November 2015 to December 2017
Responsibilities:
- Facilitated Scrum ceremonies and coached teams in project management best practices
- Created Kanban boards for releases and bug fixes to enable smooth releases after sprints
Education
Montreal College of Information Technology - Montréal, QC
March 2019 to March 2020
Skills
• Languages: Python, JavaScript
• Front End: Vue.js
• Databases: SQL, Hive, MongoDB, DynamoDB, CosmosDB, PostgreSQL
• Packages: PySpark, Kafka, Plotly, Flask, scikit-learn, Pandas
• Streaming: Kafka, Spark
• Data Warehouse: Redshift, Snowflake
• Data Migration: AWS Data Pipeline, Google Cloud Dataflow
• Workflow Orchestration: Prefect, Apache Airflow
• Cloud: AWS, Azure, Snowflake, Databricks, GCP
• Machine Learning: Supervised, Unsupervised, NLP