CAREER OBJECTIVE
To pursue a position as a Data Scientist / Big Data Analyst in a
challenging environment, leveraging my technical skill set and creativity
to bring value to the organization.
To explore and carry out assigned responsibilities with the utmost
degree of commitment and dedication, culminating successfully
to the satisfaction of the organization.
Data-driven analyst with experience in data analysis and statistical
methods.
EDUCATION
PGP (Big Data Optimization) | INSOFE, Hyderabad | Jan 2018 - Aug 2018 | 70%
EXPERIENCE
Methodology:
After fetching the data, the next step was to conduct various
pre-processing steps, analyze the data, and bring it to a cleaned state.
Feature selection was conducted using Step AIC, VIF and regularization
techniques. PCA was used for dimensionality reduction.
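The VIF screening step can be sketched with a small NumPy routine (a minimal illustration on synthetic data, not the original project code): a feature's VIF is 1/(1 - R²) from regressing that feature on all the others, and values far above roughly 5-10 flag collinearity.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of feature matrix X.

    VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
    column j on all the other columns (with an intercept).
    """
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out.append(1.0 / max(1 - r2, 1e-12))
    return np.array(out)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.05 * rng.normal(size=200)   # nearly collinear with x1
scores = vif(np.column_stack([x1, x2, x3]))
# x1 and x3 show inflated VIFs; x2 stays near 1
```

In a screening pass, columns whose VIF exceeds the chosen threshold are dropped or combined before modeling.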
Starting with the linear regression algorithm, I further explored various
other algorithms such as SVM, random forest, neural nets, and ensemble models.
Model performance was evaluated using error metrics (MSE, RMSE).
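As a minimal sketch of the regression-and-error-metric step (synthetic data and ordinary least squares via NumPy, standing in for whichever library the project actually used):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=300)

# simple 80/20 train/test split
split = 240
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

# ordinary least squares with an intercept term
A_tr = np.column_stack([np.ones(len(X_tr)), X_tr])
beta, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)

# evaluate on held-out data with MSE and RMSE
A_te = np.column_stack([np.ones(len(X_te)), X_te])
pred = A_te @ beta
mse = np.mean((y_te - pred) ** 2)
rmse = np.sqrt(mse)
```

Since the synthetic noise has scale 0.1, a well-fit model's RMSE lands close to that value.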
Platforms used:
Jupyter Notebook: Data extraction, pre-processing, and model building
were performed in Python.
Tableau: Used to gain insights from the data and visualize them on
dashboards.
Methodology:
Fetched the data from an RDBMS (SQL) with the help of the Apache Sqoop
tool and uploaded it into HDFS. Created schemas and mapped the data
using Apache Spark.
The next step was to conduct various pre-processing steps, analyze the
data, join various data frames, and gain insights from the data.
Feature selection was conducted, and the data was scaled and brought to
a cleaned format.
Using Apache Spark ML, the data was modeled with various algorithms,
including linear classification, SVM, random forest, neural nets,
and ensemble models.
Model performance was evaluated using accuracy and recall
(confusion matrix).
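The confusion-matrix evaluation can be illustrated with a small NumPy sketch (toy labels, not project data): accuracy is the trace over the total, and per-class recall is the diagonal entry over its row sum.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] = number of samples with true class i predicted as j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
y_pred = np.array([0, 0, 0, 1, 1, 1, 1, 1, 0, 0])

cm = confusion_matrix(y_true, y_pred, 2)
accuracy = np.trace(cm) / cm.sum()      # correct predictions / all
recall_pos = cm[1, 1] / cm[1].sum()     # recall for class 1
```

Spark ML exposes the same quantities through its evaluator classes; the arithmetic behind them is exactly this.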
Platforms used:
Jupyter Notebook: Pre-processing and model building were performed in
Python.
Hadoop Ecosystem: Apache Sqoop, HDFS, Spark, Hive, SparkSQL, Spark
ML, SQL.
Tableau: Used to gain insights from the data and visualize them on
dashboards.
Methodology:
Fetched the data and performed various preprocessing steps, such as
resizing the images, arranging them in the ideal folder structure, and
labeling them with their categories.
Built a basic convolutional neural network, trained and tested the model,
and obtained the best model through hyperparameter tuning.
Applied the transfer learning technique with VGG net to achieve the best
accuracy and predict on the validation data.
Model performance was evaluated using accuracy and recall
(confusion matrix).
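The "one folder per class" labeling convention mentioned above can be sketched with the standard library (a throw-away toy directory tree; `derive_labels` is a hypothetical helper name, not from the original project):

```python
import tempfile
from pathlib import Path

def derive_labels(root):
    """Map each image file name to the name of its parent folder.

    Assumes the common 'one folder per class' layout, e.g.
    root/cats/img001.jpg -> label 'cats'.
    """
    labels = {}
    for path in sorted(Path(root).rglob("*.jpg")):
        labels[path.name] = path.parent.name
    return labels

# build a tiny throw-away directory tree to demonstrate the layout
root = Path(tempfile.mkdtemp())
for cls in ("cats", "dogs"):
    d = root / cls
    d.mkdir()
    (d / f"{cls}_0.jpg").touch()

labels = derive_labels(root)
```

Deep-learning loaders such as Keras's directory-based utilities infer class labels from exactly this kind of layout.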
Platforms used:
Jupyter Notebook: Pre-processing and model building were performed in
Python.
4. Text mining on data obtained from Twitter and sentiment
analysis:
Aim: To extract data from Twitter using APIs and perform sentiment
analysis to obtain insights from the data.
Methodology:
Fetched the data from Twitter using APIs and preprocessed it into a
Python pandas table format.
Using the TextBlob library in Python, analyzed the data and computed a
sentiment score for each individual tweet.
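TextBlob assigns each text a polarity score in [-1, 1]. As a rough, self-contained stand-in for that per-tweet scoring (a toy hand-made lexicon, not TextBlob's actual lexicon or API), the idea looks like:

```python
# Toy word-polarity lexicon; TextBlob ships a much larger one.
POLARITY = {"good": 0.7, "great": 0.8, "love": 0.5,
            "bad": -0.7, "terrible": -1.0, "hate": -0.8}

def sentiment(tweet):
    """Average the polarity of known words; unknown words are ignored."""
    words = [w.strip(".,!?").lower() for w in tweet.split()]
    scores = [POLARITY[w] for w in words if w in POLARITY]
    return sum(scores) / len(scores) if scores else 0.0

tweets = ["I love this great phone!",
          "terrible battery, bad screen",
          "shipped today"]
scores = [sentiment(t) for t in tweets]
# positive, negative, and neutral tweets score >0, <0, and 0 respectively
```

With TextBlob itself, the equivalent call is `TextBlob(tweet).sentiment.polarity` applied to each row of the tweets table.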
Platforms used:
Jupyter Notebook: Pre-processing and model building were performed in
Python.
CAREER HISTORY
SKILLS
Personal Information