10+ years of IT experience, with 6+ years of progressive experience emphasizing Data Analytics, Text Mining,
Machine Learning, Statistical Modeling, Predictive Modeling, and Natural Language Processing (NLP).
Expertise in transforming business requirements into analytical models, designing algorithms, building models,
and developing data mining and reporting solutions that scale across massive volumes of structured and
unstructured data.
Data-driven and highly analytical, with experience using statistical procedures and Machine Learning
algorithms such as ANOVA, Clustering, Classification, Regression, and Time Series Analysis to analyze
data for model building.
Proficient at building robust Machine Learning and Deep Learning models, including Convolutional Neural
Networks (CNNs), Recurrent Neural Networks (RNNs), and LSTMs, using TensorFlow and Keras. Adept at analyzing
large datasets using Apache Spark, PySpark, Spark ML, and Amazon Web Services (AWS).
Experience with data visualization using tools such as ggplot, Matplotlib, Seaborn, and Tableau, and with
using Tableau to publish and present dashboards and storylines on web and desktop platforms.
Worked on Text Mining and Sentiment Analysis to extract insight from unstructured data across various
platforms, including clinical and insurance datasets.
Natural Language Understanding: Sentiment Analysis, Custom Analyzers, Entity Analysis, Word Embeddings.
Natural Language Processing (NLP): LSA, LDA, TF-IDF, Markov Models, Tokenizers, Analyzers, POS Tagging.
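As an illustration of the TF-IDF weighting listed above, a minimal pure-Python sketch (function names and the toy corpus are my own, not from any project here):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents."""
    n = len(docs)
    # document frequency: number of documents containing each term
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights

docs = [["crash", "report", "icy", "road"],
        ["crash", "report", "clear", "road"],
        ["sunny", "day", "no", "incident"]]
w = tf_idf(docs)
```

A term like "icy", which appears in only one document, receives a higher weight than the more common "report", which is the point of the IDF factor.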
Developed algorithms based on deep-dive statistical analysis and predictive analytics, applying machine
learning and data mining techniques to forecast company product sales with 95% accuracy.
Improved the data mining process, resulting in a 20% decrease in the time required to infer insights from
customer data used to develop marketing strategies.
Built the system utilizing NLP knowledge including text mining, regex, bag-of-words, TF-IDF, Word2Vec, PCA,
LSTMs, cosine similarity, sentiment analysis, and information extraction.
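The cosine similarity mentioned above can be sketched with plain NumPy (the example vectors are invented bag-of-words counts, purely illustrative):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two term vectors (1.0 = same direction)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# bag-of-words counts over a shared vocabulary
doc1 = [2, 1, 0, 1]
doc2 = [2, 1, 0, 0]   # shares most terms with doc1
doc3 = [0, 0, 3, 0]   # shares none
```

Documents with overlapping vocabulary score near 1.0; disjoint documents score 0, which is why the measure works for matching similar texts regardless of their lengths.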
Apply linear models, machine learning algorithms, time series forecasting, and optimization methods to
understand and predict events impacting various business operations.
Built a data processing pipeline and performed data cleaning, feature scaling, and feature engineering using
the Pandas and NumPy packages in Python 3.7.
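A short sketch of the kind of feature scaling and engineering described above, using Pandas and NumPy (the column names and values are invented for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age":    [22, 35, 58, 41],
    "income": [30_000, 52_000, 81_000, 47_000],
})

# feature scaling: min-max normalization of every column to [0, 1]
scaled = (df - df.min()) / (df.max() - df.min())

# simple feature engineering: log-transform a right-skewed column
df["log_income"] = np.log(df["income"])
```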
Enhanced models using techniques such as forecasting and simulation, and optimized models leveraging
Gradient Boosting with XGBoost.
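The gradient-boosting idea behind XGBoost can be sketched from scratch: repeatedly fit a weak learner to the residuals and add it to the ensemble. This is a toy squared-loss version with decision stumps, not the actual XGBoost library:

```python
import numpy as np

def fit_stump(X, r):
    """Find the single-feature threshold split that best fits residuals r."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:   # thresholds that leave both sides non-empty
            left = X[:, j] <= t
            pred = np.where(left, r[left].mean(), r[~left].mean())
            err = ((r - pred) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, r[left].mean(), r[~left].mean())
    return best[1:]

def gradient_boost(X, y, rounds=30, lr=0.3):
    """Additively fit stumps to residuals (the negative gradient of squared loss)."""
    pred = np.full(len(y), y.mean())
    for _ in range(rounds):
        j, t, lv, rv = fit_stump(X, y - pred)
        pred = pred + lr * np.where(X[:, j] <= t, lv, rv)
    return pred

X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.array([1, 1, 1, 1, 1, 5, 5, 5, 5, 5], dtype=float)
pred = gradient_boost(X, y)
```

Each round shrinks the residual by the learning rate, so after a few dozen rounds the ensemble fits this step function almost exactly; XGBoost adds regularization, second-order gradients, and much faster tree construction on top of the same idea.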
Performed data cleaning and feature selection using the MLlib package in PySpark and worked with deep
learning frameworks.
Built an Artificial Neural Network using TensorFlow in Python to identify the probability of customers
canceling their connections.
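The bullet above uses TensorFlow; as a framework-free illustration of the underlying idea, here is a one-hidden-layer network trained by gradient descent on a toy churn-style dataset (all data, shapes, and names are invented for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(p, y):
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# toy features: [tenure_years, monthly_charge]; label 1 = customer canceled
X = np.array([[0.5, 90], [1.0, 80], [6.0, 30], [8.0, 20],
              [0.2, 95], [7.0, 25], [0.8, 85], [5.5, 35]], float)
y = np.array([1, 1, 0, 0, 1, 0, 1, 0], float)
X = (X - X.mean(0)) / X.std(0)                       # standardize features

W1 = rng.normal(0, 0.5, (2, 4)); b1 = np.zeros(4)    # hidden layer (4 units)
W2 = rng.normal(0, 0.5, 4);      b2 = 0.0            # output neuron

losses = []
for _ in range(500):
    h = np.tanh(X @ W1 + b1)             # forward pass
    p = sigmoid(h @ W2 + b2)             # cancellation probability
    losses.append(log_loss(p, y))
    g = p - y                            # gradient of log-loss w.r.t. the logit
    gh = np.outer(g, W2) * (1 - h ** 2)  # backprop through tanh
    W2 -= 0.1 * h.T @ g / len(y); b2 -= 0.1 * g.mean()
    W1 -= 0.1 * X.T @ gh / len(y); b1 -= 0.1 * gh.mean(0)
```

In Keras/TensorFlow the same structure would be a `Dense(4, activation="tanh")` layer followed by `Dense(1, activation="sigmoid")`, trained with binary cross-entropy.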
Created Data Quality Scripts using SQL and Hive to validate successful data loads and the quality of the data.
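Data-quality checks of the kind described above boil down to simple aggregate queries; here is a self-contained sketch using SQLite as a stand-in (the table, columns, and values are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loads (id INTEGER, amount REAL, loaded_at TEXT)")
conn.executemany(
    "INSERT INTO loads VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-01"), (2, None, "2024-01-01"), (3, 12.5, "2024-01-02")],
)

# check 1: the expected number of rows actually landed
row_count = conn.execute("SELECT COUNT(*) FROM loads").fetchone()[0]

# check 2: no NULLs in a mandatory column
null_amounts = conn.execute(
    "SELECT COUNT(*) FROM loads WHERE amount IS NULL").fetchone()[0]
```

Against Hive the queries would be the same shape, typically run per partition after each load and compared against the source system's counts.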
Developed interactive dashboards, created various Ad Hoc reports for users in Tableau by connecting various data
sources.
Designed and implemented a graph-based approach using Neo4j, allowing for sophisticated real-time product
recommendations with a smaller code base.
Familiarity with language models for Text Mining and Language Processing libraries (spaCy, NLTK, Stanford
NLP), leveraging them to operationalize a chatbot for automated answering systems using Watson Assistant.
Made recommendations that helped improve production factors and supported better financial decisions using
data and text mining techniques.
Technology: Linear and Non-Linear Models, Statistical Analysis, Text Analysis, Python, Jupyter, R, NumPy,
SciPy, Neo4j, Tableau, AWS S3
Cognizant, Consulting | Chicago, IL Sept’2014 – May’2018
Data Engineer
Development, application and implementation of data-driven software solutions and data science projects in
pharmaceutical research (e.g. prototyping, data analysis, statistical modelling, machine learning).
Utilized Apache Spark with Python to develop and execute Big Data Analytics and Machine Learning
applications; executed Machine Learning use cases under Spark ML and MLlib.
Interpreted and solved business problems using data analysis, data mining, optimization tools, machine
learning techniques, and statistics. Designed and developed NLP models for sentiment analysis.
Used Data Warehousing concepts including the Ralph Kimball and Bill Inmon methodologies, OLAP, OLTP, Star
Schema, Snowflake Schema, Fact Tables, and Dimension Tables. Modernized the data streaming processes,
resulting in a 25% redundancy reduction.
Led the implementation of statistical algorithms and operators on Hadoop and SQL platforms, utilizing
optimization techniques, linear regression, K-means clustering, Naive Bayes, and other approaches.
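The K-means clustering named above is short enough to sketch in full (plain Lloyd's algorithm in NumPy; the two-blob data is synthetic, for illustration only):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Lloyd's algorithm: assign points to the nearest centroid, recompute, repeat."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # distance from every point to every centroid
        d = np.linalg.norm(X[:, None] - centers[None, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# two well-separated blobs of 20 points each
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (20, 2)),
               np.random.default_rng(2).normal(5, 0.1, (20, 2))])
labels, centers = kmeans(X, 2)
```

On Hadoop/Spark the same two steps (assign, then recompute means) map naturally onto a map phase and a reduce phase, which is why K-means parallelizes well.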
Technology: Statistical Analysis (Supervised & Unsupervised Learning), Natural Language Processing, Python,
Jupyter, R, NumPy, SciPy, AWS, GCP, Apache Spark
Tata Consultancy Services, Banking and Finance | Minneapolis, MN Jan’2010 – Sept’2014
Application Engineer
Designed and implemented multi-tier applications using Java, J2EE, JDBC, JSP, JSTL, HTML, JSF, Struts,
Hibernate, JavaScript, Servlets, JavaBeans, CSS, EJB 3.0, XSLT, AJAX with RichFaces, and Liferay.
Experience in solving software design issues by applying design patterns including Model-View-Controller (MVC),
Singleton Pattern, Proxy Pattern, Factory Pattern, Abstract Factory Pattern, DAO Pattern and Command Pattern.
Worked on creating batch jobs using Autosys as the job scheduler, with technologies including SQL Invoker,
UNIX shell scripting, and Core Java.
Experience in database design, development, and maintenance of SQL queries using Joins and Stored Procedures using
Oracle PL/SQL.
Developed batch processes for financial reporting applications and modules using Perl and UNIX shell scripts
on an Oracle database with partitions and sub-partitions.
Technology: Java/J2EE, Struts, Spring MVC, Hibernate, JavaScript, AngularJS, IBM MQ, Autosys, REST API,
Oracle SQL/PL-SQL
ACADEMIC EXPERIENCE
Chicago Crash Analysis / Data Mining | Chicago, IL Oct’2019 – Dec’2019
Chicago Crash Analysis – “Alert Today, Alive Tomorrow”:
Performed data manipulation, normalization, and data preparation, including Exploratory Analysis, Feature
Engineering, and predictive modelling, on the raw dataset.
Tackled a highly imbalanced dataset using under-sampling, oversampling with SMOTE, and cost-sensitive
algorithms in Python Scikit-learn.
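SMOTE synthesizes new minority-class points by interpolating between neighbors; the simpler random-oversampling variant below (not SMOTE itself) shows the rebalancing idea with plain NumPy, on invented data:

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until all classes are balanced."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    Xs, ys = [X], [y]
    for c, n in zip(classes, counts):
        if n < target:
            idx = rng.choice(np.flatnonzero(y == c), target - n, replace=True)
            Xs.append(X[idx]); ys.append(y[idx])
    return np.vstack(Xs), np.concatenate(ys)

X = np.arange(20, dtype=float).reshape(10, 2)
y = np.array([0] * 9 + [1])      # 9:1 imbalance
Xb, yb = random_oversample(X, y)
```

SMOTE improves on this by placing synthetic points along line segments between minority neighbors instead of copying rows verbatim, which reduces overfitting to the duplicated examples.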
Used Pandas, NumPy, Seaborn, SciPy, Matplotlib, Scikit-learn, and NLTK in Python to develop various machine
learning algorithms, and utilized algorithms such as linear regression, multivariate regression, Naive
Bayes, Random Forests, K-means, and KNN for data analysis in Python and R.
Demonstrated experience in the design and implementation of statistical models, predictive models,
enterprise data models, metadata solutions, and data life cycle management in both RDBMS and Big Data
environments.
Designed, built, and deployed a set of Python modelling APIs for traffic analytics that integrate multiple
machine learning techniques for driver-behavior prediction and support multiple crash segmentations.
Leveraged supervised and unsupervised data mining techniques on crash data to predict crashes, and improved
the predictions for integration with maps in AI-enabled vehicles based on each terrain’s crash probability.
Built an advanced algorithm to identify the most crash-prone locations within Chicagoland, using both
collaborative and content-based filtering approaches with Python and OpenRefine.
Explored different regression models (Linear Regression, Lasso Regression) and other machine learning models
such as SVM to perform forecasting. Used classification techniques including Random Forest and Logistic
Regression to quantify the likelihood of each crash.
Applied a boosting method, Extreme Gradient Boosting (XGBoost), to the predictive model to improve its
efficiency.