10+ years of IT experience, with 6+ years of progressive experience emphasizing Data Analytics, Text Mining,
Machine Learning, Statistical Modeling, Predictive Modeling, and Natural Language Processing (NLP).
Expertise in transforming business requirements into analytical models, designing algorithms, building models,
and developing data mining and reporting solutions that scale across massive volumes of structured and
unstructured data.
Data-driven and highly analytical, with experience using statistical procedures and Machine Learning
algorithms such as ANOVA, Clustering, Classification, Regression, and Time Series Analysis to analyze
data for model building.
Proficient at building robust Machine Learning and Deep Learning models, including Convolutional Neural
Networks (CNNs), Recurrent Neural Networks (RNNs), and LSTMs, using TensorFlow and Keras. Adept at analyzing
large datasets using Apache Spark, PySpark, Spark ML, and Amazon Web Services (AWS).
Experience with data visualization using tools such as ggplot, Matplotlib, Seaborn, and Tableau, and with
using Tableau to publish and present dashboards and storylines on web and desktop platforms.
Worked on Text Mining and Sentiment Analysis to extract insight from unstructured data across various
platforms, including clinical and insurance datasets.
Natural Language Understanding: Sentiment Analysis, Custom Analyzers, Entity Analysis, Word Embeddings.
Natural Language Processing (NLP): LSA, LDA, TF-IDF, Markov Models, Tokenizers, Analyzers, POS Tagging.
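As an illustration of the TF-IDF weighting listed above, a minimal pure-Python sketch (function names and the toy corpus are my own, not from any project here):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents."""
    n = len(docs)
    # document frequency: number of documents containing each term
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights

docs = [["crash", "report", "icy", "road"],
        ["crash", "report", "clear", "road"],
        ["sunny", "day", "no", "incident"]]
w = tf_idf(docs)
```

A term like "icy", which appears in only one document, receives a higher weight than the more common "report", which is the point of the IDF factor.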
Developed algorithms based on deep-dive statistical analysis and predictive analytics, applying machine
learning and data mining techniques to forecast company product sales with 95% accuracy.
Improved the data mining process, resulting in a 20% decrease in the time required to infer insights from
customer data used to develop marketing strategies.
Built the system utilizing NLP knowledge including text mining, regex, bag-of-words, TF-IDF, Word2Vec, PCA,
LSTMs, cosine similarity, sentiment analysis, and information extraction.
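The cosine similarity mentioned above can be sketched with plain NumPy (the example vectors are invented bag-of-words counts, purely illustrative):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two term vectors (1.0 = same direction)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# bag-of-words counts over a shared vocabulary
doc1 = [2, 1, 0, 1]
doc2 = [2, 1, 0, 0]   # shares most terms with doc1
doc3 = [0, 0, 3, 0]   # shares none
```

Documents with overlapping vocabulary score near 1.0; disjoint documents score 0, which is why the measure works for matching similar texts regardless of their lengths.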
Apply linear models, machine learning algorithms, time series forecasting, and optimization methods to
understand and predict events impacting various business operations.
Built a data processing pipeline and performed data cleaning, feature scaling, and feature engineering using
the Pandas and NumPy packages in Python 3.7.
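A short sketch of the kind of feature scaling and engineering described above, using Pandas and NumPy (the column names and values are invented for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age":    [22, 35, 58, 41],
    "income": [30_000, 52_000, 81_000, 47_000],
})

# feature scaling: min-max normalization of every column to [0, 1]
scaled = (df - df.min()) / (df.max() - df.min())

# simple feature engineering: log-transform a right-skewed column
df["log_income"] = np.log(df["income"])
```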
Enhanced models using techniques such as forecasting and simulation, and optimized models leveraging
Gradient Boosting with XGBoost.
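The gradient-boosting idea behind XGBoost can be sketched from scratch: repeatedly fit a weak learner to the residuals and add it to the ensemble. This is a toy squared-loss version with decision stumps, not the actual XGBoost library:

```python
import numpy as np

def fit_stump(X, r):
    """Find the single-feature threshold split that best fits residuals r."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:   # thresholds that leave both sides non-empty
            left = X[:, j] <= t
            pred = np.where(left, r[left].mean(), r[~left].mean())
            err = ((r - pred) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, j, t, r[left].mean(), r[~left].mean())
    return best[1:]

def gradient_boost(X, y, rounds=30, lr=0.3):
    """Additively fit stumps to residuals (the negative gradient of squared loss)."""
    pred = np.full(len(y), y.mean())
    for _ in range(rounds):
        j, t, lv, rv = fit_stump(X, y - pred)
        pred = pred + lr * np.where(X[:, j] <= t, lv, rv)
    return pred

X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.array([1, 1, 1, 1, 1, 5, 5, 5, 5, 5], dtype=float)
pred = gradient_boost(X, y)
```

Each round shrinks the residual by the learning rate, so after a few dozen rounds the ensemble fits this step function almost exactly; XGBoost adds regularization, second-order gradients, and much faster tree construction on top of the same idea.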
Performed data cleaning and feature selection using the MLlib package in PySpark and worked with deep
learning frameworks.
Built an Artificial Neural Network using TensorFlow in Python to identify the probability of customers
canceling their connections.
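The bullet above uses TensorFlow; as a framework-free illustration of the underlying idea, here is a one-hidden-layer network trained by gradient descent on a toy churn-style dataset (all data, shapes, and names are invented for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(p, y):
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# toy features: [tenure_years, monthly_charge]; label 1 = customer canceled
X = np.array([[0.5, 90], [1.0, 80], [6.0, 30], [8.0, 20],
              [0.2, 95], [7.0, 25], [0.8, 85], [5.5, 35]], float)
y = np.array([1, 1, 0, 0, 1, 0, 1, 0], float)
X = (X - X.mean(0)) / X.std(0)                       # standardize features

W1 = rng.normal(0, 0.5, (2, 4)); b1 = np.zeros(4)    # hidden layer (4 units)
W2 = rng.normal(0, 0.5, 4);      b2 = 0.0            # output neuron

losses = []
for _ in range(500):
    h = np.tanh(X @ W1 + b1)             # forward pass
    p = sigmoid(h @ W2 + b2)             # cancellation probability
    losses.append(log_loss(p, y))
    g = p - y                            # gradient of log-loss w.r.t. the logit
    gh = np.outer(g, W2) * (1 - h ** 2)  # backprop through tanh
    W2 -= 0.1 * h.T @ g / len(y); b2 -= 0.1 * g.mean()
    W1 -= 0.1 * X.T @ gh / len(y); b1 -= 0.1 * gh.mean(0)
```

In Keras/TensorFlow the same structure would be a `Dense(4, activation="tanh")` layer followed by `Dense(1, activation="sigmoid")`, trained with binary cross-entropy.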
Created Data Quality Scripts using SQL and Hive to validate successful data loads and the quality of the data.
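Data-quality checks of the kind described above boil down to simple aggregate queries; here is a self-contained sketch using SQLite as a stand-in (the table, columns, and values are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loads (id INTEGER, amount REAL, loaded_at TEXT)")
conn.executemany(
    "INSERT INTO loads VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-01"), (2, None, "2024-01-01"), (3, 12.5, "2024-01-02")],
)

# check 1: the expected number of rows actually landed
row_count = conn.execute("SELECT COUNT(*) FROM loads").fetchone()[0]

# check 2: no NULLs in a mandatory column
null_amounts = conn.execute(
    "SELECT COUNT(*) FROM loads WHERE amount IS NULL").fetchone()[0]
```

Against Hive the queries would be the same shape, typically run per partition after each load and compared against the source system's counts.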
Developed interactive dashboards, created various Ad Hoc reports for users in Tableau by connecting various data
sources.
Designed and implemented a graph-based approach using Neo4j, allowing for sophisticated real-time product
recommendations with a smaller code base.
Familiarity with language models for Text Mining and Language Processing libraries (spaCy, NLTK, Stanford
NLP), leveraging them to operationalize a chatbot for automated answering systems using Watson Assistant.
Made recommendations that helped improve production factors and supported better financial decisions using
data and text mining techniques.
Technology: Linear and Non-Linear Models, Statistical Analysis, Text Analysis, Python, Jupyter, R, NumPy,
SciPy, Neo4j, Tableau, AWS S3
Cognizant, Consulting | Chicago, IL Sept’2014 – May’2018
Data Engineer
Development, application and implementation of data-driven software solutions and data science projects in
pharmaceutical research (e.g. prototyping, data analysis, statistical modelling, machine learning).
Utilized Apache Spark with Python to develop and execute Big Data Analytics and Machine Learning
applications; executed Machine Learning use cases under Spark ML and MLlib.
Interpreted and solved business problems using data analysis, data mining, optimization tools, machine
learning techniques, and statistics. Designed and developed NLP models for sentiment analysis.
Used Data Warehousing concepts including the Ralph Kimball and Bill Inmon methodologies, OLAP, OLTP, Star
Schema, Snowflake Schema, Fact Tables, and Dimension Tables. Modernized the data streaming processes,
resulting in a 25% redundancy reduction.
Led the implementation of statistical algorithms and operators on Hadoop and SQL platforms, utilizing
optimization techniques, linear regression, K-means clustering, Naive Bayes, and other approaches.
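The K-means clustering named above is short enough to sketch in full (plain Lloyd's algorithm in NumPy; the two-blob data is synthetic, for illustration only):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Lloyd's algorithm: assign points to the nearest centroid, recompute, repeat."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # distance from every point to every centroid
        d = np.linalg.norm(X[:, None] - centers[None, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# two well-separated blobs of 20 points each
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (20, 2)),
               np.random.default_rng(2).normal(5, 0.1, (20, 2))])
labels, centers = kmeans(X, 2)
```

On Hadoop/Spark the same two steps (assign, then recompute means) map naturally onto a map phase and a reduce phase, which is why K-means parallelizes well.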
Technology: Statistical Analysis (Supervised & Unsupervised Learning), Natural Language Processing, Python,
Jupyter, R, NumPy, SciPy, AWS, GCP, Apache Spark
Tata Consultancy Services, Banking and Finance | Minneapolis, MN Jan’2010 – Sept’2014
Application Engineer
Designed and implemented multi-tier applications using Java, J2EE, JDBC, JSP, JSTL, HTML, JSF, Struts,
Hibernate, JavaScript, Servlets, JavaBeans, CSS, EJB 3.0, XSLT, AJAX with RichFaces, and Liferay.
Experience in solving software design issues by applying design patterns including Model-View-Controller (MVC),
Singleton Pattern, Proxy Pattern, Factory Pattern, Abstract Factory Pattern, DAO Pattern and Command Pattern.
Worked on creating batch jobs using Autosys as the job scheduler, with technologies including SQL Invoker,
UNIX shell scripting, and Core Java.
Experience in database design, development, and maintenance of SQL queries using Joins and Stored Procedures using
Oracle PL/SQL.
Developed batch processes for financial reporting applications and modules using Perl and UNIX shell scripts
on an Oracle database with partitions and sub-partitions.
Technology: Java/J2EE, Struts, Spring MVC, Hibernate, JavaScript, AngularJS, IBM MQ, Autosys, REST API,
Oracle SQL/PL-SQL
ACADEMIC EXPERIENCE
Chicago Crash Analysis / Data Mining | Chicago, IL Oct’2019 – Dec’2019
Chicago Crash Analysis – “Alert Today, Alive Tomorrow”:
Performed data manipulation, normalization, and data preparation, including Exploratory Analysis, Feature
Engineering, and predictive modelling, on the raw dataset.
Tackled a highly imbalanced dataset using under-sampling, oversampling with SMOTE, and cost-sensitive
algorithms in Python Scikit-learn.
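SMOTE synthesizes new minority-class points by interpolating between neighbors; the simpler random-oversampling variant below (not SMOTE itself) shows the rebalancing idea with plain NumPy, on invented data:

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until all classes are balanced."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    Xs, ys = [X], [y]
    for c, n in zip(classes, counts):
        if n < target:
            idx = rng.choice(np.flatnonzero(y == c), target - n, replace=True)
            Xs.append(X[idx]); ys.append(y[idx])
    return np.vstack(Xs), np.concatenate(ys)

X = np.arange(20, dtype=float).reshape(10, 2)
y = np.array([0] * 9 + [1])      # 9:1 imbalance
Xb, yb = random_oversample(X, y)
```

SMOTE improves on this by placing synthetic points along line segments between minority neighbors instead of copying rows verbatim, which reduces overfitting to the duplicated examples.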
Used Pandas, NumPy, Seaborn, SciPy, Matplotlib, Scikit-learn, and NLTK in Python to develop various machine
learning algorithms, and utilized algorithms such as linear regression, multivariate regression, Naive
Bayes, Random Forests, K-means, and KNN for data analysis in Python and R.
Demonstrated experience in the design and implementation of statistical models, predictive models,
enterprise data models, metadata solutions, and data life cycle management in both RDBMS and Big Data
environments.
Designed, built, and deployed a set of Python modelling APIs for traffic analytics that integrate multiple
machine learning techniques for driver-behavior prediction and support multiple crash segmentations.
Leveraged supervised and unsupervised data mining techniques on crash data to predict crashes, and improved
the predictions for integration with maps in AI-enabled vehicles based on each terrain’s crash probability.
Built an advanced algorithm to identify the most crash-prone locations within Chicagoland, using both
collaborative and content-based filtering approaches with Python and OpenRefine.
Explored different regression models (Linear Regression, Lasso Regression) and other machine learning models
such as SVM to perform forecasting. Used classification techniques including Random Forest and Logistic
Regression to quantify the likelihood of each crash.
Applied a boosting method, Extreme Gradient Boosting (XGBoost), to the predictive model to improve its
efficiency.