
Shreyak

SUMMARY:
• Over 5 years' experience in machine learning / data science across the Automobile, Banking, Retail, and
Manufacturing industries. Developed deep learning / machine learning models using Artificial Neural
Networks (ANN), Support Vector Machines (SVM), Linear and Logistic Regression, and related methods
for various applications, including speech recognition with deep learning Long Short-Term Memory
(LSTM) models.
• Data-driven and highly analytical, with working knowledge of statistical modeling approaches
and methodologies (Clustering, Segmentation, Dimensionality Reduction, Regression Analysis,
Hypothesis Testing, Time Series Analysis, Decision Trees, Random Forests), business rules, and the
ever-evolving regulatory environment.
• IBM Certified ML/DS expert
• Strong skills in statistical methodologies such as A/B testing, experimental
design, hypothesis testing, and ANOVA.
• Knowledge of time series analysis using AR, MA, ARIMA, ARCH, and GARCH models (a minimal
illustrative sketch follows this summary).
• Proficient in developing linear/non-linear/heuristic optimization algorithms to solve various
business problems.
• Proficient in Machine Learning techniques (LDA, Decision Trees, Linear and Logistic Regression,
Random Forest, SVM, Bayesian methods, XGBoost, K-Nearest Neighbors, Clustering) and Deep Learning
techniques (CNNs, RNNs), and in Statistical Modeling for Forecasting/Predictive Analytics,
Segmentation methodologies, regression-based models, and Ensembles.
• Knowledge of the CRISP-DM methodology for predictive modeling.
• Proficient in Predictive Modeling, Data Mining methods, Factor Analysis, ANOVA, Hypothesis
Testing, the normal distribution, and other advanced statistical and econometric techniques.
• Experience with large structured and unstructured datasets, data visualization, data
acquisition, and predictive modeling.
• Experience in Cloud, Big Data, DevOps, Analytics, Business Intelligence, Data Mining, Machine
Learning, algorithm development, distributed computing, and programming and scripting languages.
• Extensively worked with Python 3.6 (NumPy, Pandas, Matplotlib, NLTK, spaCy, and scikit-learn).
• Strong foundational knowledge of Azure and/or AWS cloud services and the cloud
ecosystem.
• Skilled in writing SQL queries for various RDBMS such as Microsoft SQL Server, MySQL,
PostgreSQL, Teradata, and Oracle, and in using NoSQL databases such as MongoDB, HBase,
and Cassandra to handle unstructured data.
• Strong experience in the Software Development Life Cycle (SDLC), including Requirements Analysis,
Design Specification, and Testing, in both Waterfall and Agile methodologies.
• Expertise in designing web crawlers (Selenium) for data gathering and application of LDA.
• Experienced in manipulating, merging, and restructuring big-data datasets using
Hadoop, SAS, Python, R, and SQL, and in building predictive models on them.
• Experienced in working with databases such as MongoDB, Oracle, and other SQL and NoSQL stores,
writing complex SQL including stored procedures, triggers, joins, and subqueries.
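
A minimal illustrative sketch of the time series workflow referenced above, assuming statsmodels and a synthetic series (the data, ARIMA order, and forecast horizon are hypothetical, not from any client project):

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA

    # Hypothetical monthly series standing in for real project data.
    idx = pd.date_range("2020-01-01", periods=48, freq="MS")
    y = pd.Series(100 + 0.5 * np.arange(48) + np.random.default_rng(0).normal(0, 2, 48), index=idx)

    # Fit an ARIMA(p, d, q) model; the (1, 1, 1) order is an assumption for illustration.
    result = ARIMA(y, order=(1, 1, 1)).fit()

    # Forecast the next 6 periods with confidence intervals.
    print(result.get_forecast(steps=6).summary_frame())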

TECHNICAL SKILLS:
Big Data: Spark, AWS, EMR, S3, Kinesis, Lambda, Step Functions (state machines), SageMaker, IAM,
DynamoDB, Hive, Genie, SQS, SNS, EC2, Spark SQL, Kafka, Sqoop, YARN, HDP, Machine Learning, AI,
Cassandra, Druid, Redshift, Hadoop, Azure
Languages: Python, R, UNIX scripting, C++/C, SQL.
Skills: Computer Vision, UI/Visualization, Natural Language Processing, Web, Machine Learning.
Python: NLTK, spaCy, matplotlib, NumPy, Pandas, SAS, scikit-learn, Keras, statsmodels, SciPy.
Version Control: GitHub, Git, SVN, Bitbucket, SourceTree, Mercurial.
IDE: Jupyter Notebook.
Project Tools: Trello, Kanban, Confluence.
Data Stores: Hadoop HDFS, RDBMS, SQL and NoSQL databases, and data lakes; proficient in
working with Hadoop and Cloudera Hadoop.
Data Query and Manipulation: Hive, Impala, ETL, Spark SQL, Scala, and MapReduce.
Cloud Data Systems: Azure, Google Cloud, AWS (Redshift, Kinesis, EMR).
Machine Learning Methods: Classification, regression, prediction, dimensionality reduction, density
estimation, and clustering, applied to problems that arise in retail, manufacturing, market science,
finance, and banking.

PROFESSIONAL EXPERIENCE:

Client: Iteris, Inc.
Location: Oakland, CA July 2019 to Present
Role: PRINCIPAL DATA SCIENTIST/MACHINE LEARNING ENGINEER

Responsibilities:
• Lead and direct the technical team, providing SME support and assessing technical approach, database
access, data cleaning, data analytics, feature creation/extraction, model selection and ensembling,
and performance metrics and visualization.
• Perform complex pattern recognition on IoT time series data and forecast demand using
ARMA and ARIMA models and exponential smoothing for multivariate time series.
• Led the implementation of an end-to-end machine learning pipeline that reduced computation-
related costs for training and deployment by 90% on two core AI models, while improving recall
at the desired precision threshold by up to 10%.
• Reduced log-loss error to below 1.0 on a text classification problem using machine
learning and deep learning algorithms.
• Implement ML/NLP solutions for clients, including models for process optimization, image
detection, fraud detection, and text/sentiment analysis.
• Design and train NLP models for text, speech, and natural language using NLTK,
Word2Vec, spaCy, and Gensim.
• Use a linear-programming-based CPLEX model to support better decisions, improving operational
efficiency by 32% and reducing costs by 16%.
• Developed ML/NLP pipelines over large structured and unstructured datasets; performed statistical
modeling in a Big Data architecture with strong understanding and proficiency in predictive
modeling techniques.
• Tackled a highly imbalanced dataset using undersampling with ensemble methods, oversampling
with SMOTE, and cost-sensitive algorithms in Python with scikit-learn (see the sketch after this list).
• Deployed a PyTorch sentiment analysis model and created a gateway for accessing it from a
website; used t-SNE and bag-of-words, and deployed the model using Amazon SageMaker.
• Built a document classification system using convolutional neural networks in Keras,
achieving 95% classification accuracy.
• Developed a sentiment analysis model to gauge user sentiment about the product using
machine learning algorithms and deep learning RNNs.
• Wrote scripts in Python using Apache Spark and the Elasticsearch engine to feed
dashboards visualized in Grafana.
• Used AWS SageMaker to quickly build, train, and deploy machine learning models.
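
A minimal sketch of the imbalanced-data handling mentioned above, assuming scikit-learn and the imbalanced-learn package with synthetic data (the dataset, class ratio, and model are illustrative, not from the client engagement):

    from collections import Counter
    from imblearn.over_sampling import SMOTE
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split

    # Synthetic, highly imbalanced binary dataset (roughly 95:5) standing in for the real data.
    X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

    # Oversample only the training split so the test set stays untouched.
    X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)
    print("class counts after SMOTE:", Counter(y_res))

    # A cost-sensitive ensemble as a complementary tactic: class_weight re-weights errors.
    clf = RandomForestClassifier(class_weight="balanced", random_state=42)
    clf.fit(X_res, y_res)
    print(classification_report(y_test, clf.predict(X_test)))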

Environment: NumPy, seaborn, SciPy, NLTK, scikit-learn, SAS, Python, C#, Hadoop, SQL, TensorFlow,
ETL, SSIS, Tableau, R.

Client: Zokos
Location: Boston, MA Nov 2018 to July 2019
Role: DATA SCIENTIST

Responsibilities:

• Directed and provided the vision and design for a robust, flexible, and scalable business
intelligence (BI) solution.
• Implemented predictive analytics and machine learning algorithms to forecast key metrics,
delivered as dashboards on AWS (S3/EC2) and a Django platform for the company's core
business.
• Developed models and produced optimization software based on CPLEX, resulting in significant
savings for the business.
• Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
• Assisted the senior data scientist with text mining on customer review/comment data, using topic
modeling and sentiment classification with deep learning algorithms such as CNN, RNN, LSTM, and
GRU in Python to improve the corresponding financial products.
• Applied logistic regression in Python and SAS to understand the relationships between different
attributes of the dataset and the causal relationships among them.
• Designed and developed data ingestion, aggregation, and advanced analytics in a Hadoop
environment from MySQL and Oracle databases.
• Under the supervision of the Sr. Data Scientist, performed data transformations for rescaling
and normalizing variables.
• Tuned currently running models using a combination of grid search and randomized search, and
deployed them to production using H2O Sparkling Water (see the sketch after this list).
• Worked with SQL, SQL*Plus, Oracle PL/SQL stored procedures, triggers, and SQL queries, and
loaded data into Data Warehouses/Data Marts.
• Identified real-time KPIs, statistical analytics, baselining, and notifications to drive better actions
and decisions.
• Wrote Pig scripts for ETL jobs to acquire data from multiple sources and convert it into a
uniform format.
• Involved in designing, building, installing, configuring, and supporting Hadoop.
• Translated complex functional and technical requirements into detailed design.
• Performed analysis of vast data stores and uncovered insights.
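
A minimal sketch of the two-stage tuning approach mentioned above, using scikit-learn's randomized and grid search (the estimator, parameter ranges, and data are illustrative assumptions; the H2O Sparkling Water deployment step is not shown):

    from scipy.stats import randint
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

    X, y = make_classification(n_samples=2000, random_state=0)  # stand-in data

    # Coarse pass: randomized search over a wide parameter space.
    coarse = RandomizedSearchCV(
        GradientBoostingClassifier(random_state=0),
        param_distributions={"n_estimators": randint(50, 400), "max_depth": randint(2, 6)},
        n_iter=20, cv=3, random_state=0,
    ).fit(X, y)

    # Fine pass: grid search in a neighborhood of the best coarse parameters.
    n = coarse.best_params_["n_estimators"]
    fine = GridSearchCV(
        GradientBoostingClassifier(random_state=0),
        param_grid={"n_estimators": [max(50, n - 50), n, n + 50],
                    "max_depth": [coarse.best_params_["max_depth"]]},
        cv=3,
    ).fit(X, y)
    print("best params:", fine.best_params_, "CV score:", round(fine.best_score_, 3))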

Environment: Hortonworks HDP 2.3.1, Python, NLTK, CNN, TensorFlow, Pandas, REST services,
scikit-learn, Hadoop, ETL, Oracle, SAS, MySQL, Spark, Scala, XML, and JSON.

Client: AU Small Finance Bank
Location: Jaipur, India Jan 2015 to Jan 2018
Role: DATA SCIENTIST/MACHINE LEARNING ENGINEER

Responsibilities:

• Ran multiple GLMs (linear, logistic) to quantify the relationships between factors
affecting customer retention and financial markers, boosting market share by 7% and
increasing platform users by 35%.
• Built forecasting models by applying ARIMA and produced statistical analyses on big data.
• Developed predictive models and strategies for effective fraud detection in credit and
customer banking activities using K-means clustering in Python.
• Analyzed and processed complex data sets using advanced querying, visualization, and analytics
tools.
• Created a recommendation system based on customer purchasing history using machine
learning algorithms such as K-NN and association rule mining (ARM).
• Developed a dimensional model based on Star and Snowflake schemas to build the data
warehouse.
• Built the system in Python using text mining (tokenizing, POS tagging, lemmatizing, TF-IDF,
GloVe) and classified text using logistic regression, decision trees, ensemble techniques, SVM
(linear, radial), and deep learning algorithms (single/multi-layer perceptron, CNN); see the
sketch after this list.
• Built machine learning models including SVM, random forest, and XGBoost to score and identify
potential new business cases with Python scikit-learn.
• Designed Context Flow Diagrams, Structure Charts, and ER diagrams.
• Worked on database features and objects such as partitioning, change data capture, indexes,
views, and indexed views to develop an optimal physical data model.
• Tested Complex ETL Mappings and Sessions based on business user requirements and business
rules to load data from source flat files and RDBMS tables to target tables.
• Generated ad-hoc SQL queries using joins, database connections and transformation rules to
fetch data from legacy SQL Server database systems.
• Reviewed business requirements and analyzed data sources from Excel/Oracle SQL Server for
design, development, testing, and production rollout of reporting and analysis projects.
• Designed the data model and analyzed data for online transactional processing (OLTP) and
online analytical processing (OLAP) systems.
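
A minimal sketch of the TF-IDF text classification pipeline described above, assuming scikit-learn and a toy corpus (the texts and labels are invented purely for illustration):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    # Toy labeled corpus standing in for the bank's real text data.
    texts = ["loan approved quickly", "card was blocked again",
             "great service experience", "payment failed twice"]
    labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative (hypothetical labels)

    # TF-IDF features feeding a logistic regression classifier, as in the bullet above.
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
        ("clf", LogisticRegression()),
    ])
    pipeline.fit(texts, labels)
    print(pipeline.predict(["service was great", "card payment failed"]))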

Environment: Python, Erwin, SQL Server 2005, PL/SQL, SQL, SVM, T-SQL, ETL, OLAP, OLTP, SAS,
Oracle 9i, DQ Analyzer, XML, and ClearQuest.

Education
Master's in Machine Learning and Artificial Intelligence, LJMU
Master's in Business Analytics (Major: Applied Statistics), HULT
Bachelor's in Mechanical Engineering (Major: Automation), JECRC University
