
Candidate3

Data Analyst
Email: Location: London – UK Phone: +44

A Data Analyst with extensive knowledge of data analysis processes, software engineering practices and database
skills, familiar with Agile project delivery. Proficient in using data analysis tools, technologies,
algorithms and processes to validate data. Has worked in fast-paced environments with a consistent focus on
quality delivery.

PROFESSIONAL SUMMARY:

● Over 2.5 years of experience in the IT Industry as a Data Analyst

● Highly efficient in Data Analysis, Machine Learning, Data mining with large data sets of Structured and
Unstructured data, Data Acquisition, Data Validation, Predictive modeling, Data Visualization, Web
Scraping. Adept at programming in Python.

● Proficient in managing the entire data analysis project life cycle and actively involved in all phases of
the project life cycle, including data acquisition, data cleaning, data engineering, feature scaling, feature
engineering, statistical modeling (decision trees, regression models, clustering), dimensionality reduction
using Principal Component Analysis and Factor Analysis, testing and validation, and data visualization.

● Deep understanding of statistical modeling, multivariate analysis, model testing, problem
analysis, model comparison, and validation.

● Expertise in transforming business requirements into analytical models, designing algorithms, building
models, developing data mining and reporting solutions that scale across a massive volume of structured
and unstructured data.

● Skilled in data manipulation and data preparation, including describing data
contents, computing descriptive statistics, regex, split and combine, remap, merge, subset, reindex,
melt and reshape.

● Worked with Agile practices and delivered projects using the Scrum methodology; experienced in
participating in Scrum calls, Sprint planning and Backlog refinement.

● Experience in using packages such as Pandas, NumPy, Seaborn, SciPy, Matplotlib and Scikit-learn.

● Extensive experience in Text Analytics, generating data visualizations using Python and creating
dashboards using tools like Power BI.

● Good knowledge of Proofs of Concept (PoCs) and gap analysis; gathered the necessary data for analysis from
different sources and prepared it for data exploration.

● Good industry knowledge, analytical and problem-solving skills, and the ability to work well both within a team
and individually.

● Experience in designing visualizations using Power BI and in publishing and presenting
dashboards on desktop platforms.

● Experience with data analytics, reporting, graphs, scales and PivotTables.

● Highly skilled in using the visualization tool Power BI.

● Extracted data from database sources such as SQL Server, and regularly used JIRA
and other internal issue trackers for project development.

● Highly creative, innovative, committed, intellectually curious, business savvy with good communication and
interpersonal skills.

EDUCATION QUALIFICATION
Post-Graduation – Computer Science, University of East London
Bachelor’s degree – Electronics and Communications Engineering, Jawaharlal Nehru Institute of Technology, Telangana, India

TECHNICAL SKILLS
Statistical Methods: Distributions, Central Tendency, Dispersion, Random Variables and Correlation
Database: SQL Server, MongoDB
Languages: Python, JavaScript, SQL
Reporting Tools: Power BI
Libraries: Pandas, NumPy, SciPy, Scikit-learn, Statsmodels, Plotly
Data Visualization: Matplotlib, Seaborn
Cloud Knowledge: AWS

PROJECTS
PROJECT: PATIENT HEALTH COST CLIENT: EMIS HEALTH Sep – 2020 – Till Date

ROLE: DATA ANALYST

Project Description: A UK nationwide survey of hospital costs, conducted by the UK Agency for Healthcare,
consists of hospital records of inpatient samples. The given data is restricted to the city of London and relates to
patients in the 40-60 age group. The agency wants to analyze the data to research healthcare costs
and their utilization.
The goals of this project are:

● To record patient statistics, the agency wants to find the age category of people who most frequently used the
facility and had the maximum expenditure.

● To rank diagnoses and treatments by severity and identify expensive treatments, the agency wants
to find the diagnosis-related group with the maximum hospitalization and expenditure.

● To make sure that there is no malpractice, the agency needs to analyze whether the race of the patient is related to
hospitalization costs.

● To properly utilize the costs, the agency has to analyze the severity of the hospital costs by age and gender for
proper allocation of resources.
Responsibilities:

● Evaluated business requirements and prepared specifications that follow the project
guidelines required to develop written programs.

● Performed data exploration to analyze patterns and to select features using Python SciPy.

● Built a data analysis model using Python SciPy to classify customers into different
target groups.

● Participated in data acquisition with the Data Engineering team to extract historical and
real-time data using SQL Server and Excel.

● Performed data enrichment jobs to handle missing values, normalize data, and
select features.

● Developed a Python script for data cleaning and pre-processing.

● Used Agile methodology and the Scrum process for project development.

● Created metadata and a data dictionary for future data use and data refreshes for the
same client.

● Ran SQL scripts and created indexes and stored procedures for data analysis.

● Participated in all phases of data mining: data collection, data cleaning,
developing models, validation, and visualization.

● Extracted data from SQL Server and prepared data for exploratory analysis

● Built machine learning classification models such as Random Forest.

CLIENT: PROJECT:

ROLE: Data Analyst Oct - 2017 – Aug - 2019

Project Description:

● Communicated and coordinated with other departments to collect business
requirements.

● Built scalable and deployable learning models.

● Used analytics libraries such as Scikit-learn.

● Extensively used Python's multiple data science packages like Pandas, NumPy,
Matplotlib, Seaborn, SciPy, Scikit-learn.

● Performed Exploratory Data Analysis to find trends and clusters.

● Built models using techniques like Regression, Time Series forecasting, and
Clustering.

● Worked on data that was a combination of unstructured and structured data from
multiple sources and automated the cleaning using Python scripts.

● Extensively performed large data read/writes to and from csv and excel files using
pandas.

● Improved loan prediction performance by using random forest and gradient
boosting for feature selection with Python Scikit-learn.

● Implemented a machine learning model (logistic regression) with Python Scikit-learn.

● Performed data validation using univariate and multivariate analysis.

● Worked extensively with the data governance team to maintain data models,
metadata and dictionaries.

● Used Python to pre-process data and look for insights.

● Iteratively rebuilt models to handle changes in data, refining them over
time.

● Created and published multiple dashboards and reports using Power BI.

● Gained expertise in Data Visualization using Matplotlib, Seaborn and Plotly.

Interests & Hobbies


Interests: Actively participated in several fund-raising activities for orphaned children through my
university.
Hobbies: Yoga, reading about developments in Data Science, and Artificial Intelligence
documentaries.
