Professional Documents
Culture Documents
New York, NY
SUMMARY:
Professional qualified Data Scientist/Data Analyst with over 6 years of experience in Data Science
and Analytics including Machine Learning, Data Mining and Statistical Analysis
Involved in the entire data science project life cycle and actively involved in all the phases including
data extraction, data cleaning, statistical modeling and data visualization with large data sets of
structured and unstructured data
Experienced with machine learning algorithm such as logistic regression, random forest, XGboost,
KNN, SVM, neural network, linear regression, lasso regression and k - means
Implemented Bagging and Boosting to enhance the model performance
Experience in Deep Learning(CNN, RNN, DNN) with an understanding of RNN, LSTM
Proficiency in a Neural Network library TensorFlow, Keras
Strong skills in statistical methodologies such as A/B test, experiment design, hypothesis test,
ANOVA
Extensively worked on Python 3.5/2.7 (Numpy, Pandas, Matplotlib, NLTK and Scikit-learn)
Experience in implementing data analysis with various analytic tools, such as Anaconda 4.0Jupiter
Notebook 4.X,R 3.0(ggplot2, Caret, dplyr) and Excel 2010/2013
Strong experience in Big Data technologies like Spark 1.6, Sparksql, pySpark, Hive 1.X, AWS(S3,
Redshift, EC2)
Solid ability to write and optimize diverse SQL queries, working knowledge of RDBMS like SQL Server
2008
Experience invisualization tools like, Tableau 9.X, 10.X for creating dashboards
Excellent understanding Agile and Scrum development methodology
Used the version control tools like Git 2.X
Passionate about gleaning insightful information from massive data assets and developing a culture
of sound, data-driven decision making
Ability to maintain a fun, casual, professional and productive team atmosphere
TECHNICAL SKILLS:
Programming Language: Logistics Regression, Naive Bayes, Decision Tree, Random Forest, KNN, Linear
Regression, Lasso, Ridge, SVM, Regression Tree, XGboost, K - means Python 2.7/3.5 (Numpy, Scipy,
Pandas, Seaborn, scikit learn, NLTK), R 3.0, SAS 9.1, SQL
https://www.hireitpeople.com/resume-database/87-sql-developers-resumes/71521-data-scientist-resume-new-york-ny 1/4
16/12/2021, 12:53 Data Scientist Resume New York NY - Hire IT People - We get IT done
Analytic Tools: Hadoop Ecosystem Anaconda 4.0 (Jupyter NoteBook 4.X, Spyder), Rstudio Hadoop 2.X,
Spark 1.6+ (Pyspark, Sparksql, MLlib), MapReduce, Hive 1.X,
DataBase: Data Visualization SQL Server 2008/2014, MongoDB 3.2 Tableau 9.4, Python - matplotlib
PROFESSIONAL EXPERIENCE:
Confidential, New York NY
Data Scientist
Responsibilities:
Environment: Python 3.5(Numpy. Pandas, Scikit-learn), Tensorflow, Keras, R, Jupyter Notebook 4.3, Spark
(Sparksql, Mllib), AWS, Tableau 9.4
Confidential , New York, NY
Data Scientist
Responsibilities:
Collected historical data and third party data from different data source with Big Data tools
Conducted data exploratory analysis to identify where the P&C insurer has been successful in
writing new business in the past
Worked on data cleaning and ensured data quality, consistency, integrity using Pandas, Numpy
Worked on outliers identification with box-plot, studentized, K-means clustering using Pandas,
Numpy
Participated in features engineering such as feature intersection generating, feature normalize
andlabel encoding with Scikit-learn preprocessing
Performed univariate and multivariate analysis on the data to identify any underlying pattern in the
data and associations between the variables
https://www.hireitpeople.com/resume-database/87-sql-developers-resumes/71521-data-scientist-resume-new-york-ny 2/4
16/12/2021, 12:53 Data Scientist Resume New York NY - Hire IT People - We get IT done
Built the machine learning model include: SVM, random forest, XGboost to score and identify the
potential new business case with Python Scikit-learn
Used F-Score, Precision, Recall, and A/B testing to evaluate Model performance
Designed rich data visualizations to model data into human-readable form with Tableau 9.2
Environment: Python 2.7(Numpy. Pandas, Scikit-learn), R(ggplot2, Shiny), Jupyter Notebook 4.0, SQL
Server 2008, Tableau 9.2
Confidential
Data Analyst
Responsibilities:
Built next-generation prediction model for personal loan, which utilized multiple sources of
information to produce optimal loan risk assessment
Rebuilt personal loan valuation model, reconstructed statistical drivers, provided data science
consulting support to business teams
Selected models implemented include: logistic regression, decision trees, random forest, SVM with
Python 3.3 Scikit-learn
Used F-Score, Precision, Recall, and A/B testing to evaluate Model performance
Designed rich data visualizations to model data into human-readable form (map, Tableau, etc.)
Environment: Python 3.3, Jupyter Notebook 3.0, Rstudio, Git 1.7, SQL Server 2008, Excel 2007
Confidential
Data Analyst
Responsibilities:
Worked on data transformation and accessed raw marketing data in varied formats with different
methods for analyzing and processing
Worked with BI team in data investigation, responsible for interpreting variables and data
visualization
Tested the models for problems like goodness of fit, over-fitting, multicollinearity, residual normality,
etc
Involved in variable selection, model optimization and model selection
Prepared model instructions and detailed report for managers, including explaining variables and
scenario tests of predictive models
Worked on data transformation and accessed raw marketing data in varied formats with different
methods for analyzing and processing
Translated business needs into data analysis, business intelligence data sources and reporting
solution for different types of Client
https://www.hireitpeople.com/resume-database/87-sql-developers-resumes/71521-data-scientist-resume-new-york-ny 3/4
16/12/2021, 12:53 Data Scientist Resume New York NY - Hire IT People - We get IT done
Join tables by employing SQL inner join, intersect, union all and union syntax
Used complex query statements like, sub queries, correlated queries, derived tables, case functions
to insert the data depending.
Developed an A/B test dashboard with data from three different testing platforms
https://www.hireitpeople.com/resume-database/87-sql-developers-resumes/71521-data-scientist-resume-new-york-ny 4/4