
Data Analytics

The process of INSPECTING, CLEANING & TRANSFORMING, and MODELING
data for business decision making.
Steps involved
Step 1: Data Cleaning
• Reshaping data without changing its values
• Removing errors
• Validating
• Standardizing
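A minimal Python sketch of the cleaning steps above, assuming a small hypothetical list of customer records (the field names, sample values, and the age range check are illustrative and not from the slides). It standardizes text fields, validates a numeric field, and sets aside rows that fail validation.

```python
# Hypothetical raw records; names and values are illustrative only.
raw_rows = [
    {"name": "  alice ", "country": "us", "age": "34"},
    {"name": "BOB", "country": "US ", "age": "abc"},  # invalid age
    {"name": "Carol", "country": "in", "age": "29"},
]


def standardize(row):
    """Reshape text fields (whitespace, casing) without changing their meaning."""
    return {
        "name": row["name"].strip().title(),
        "country": row["country"].strip().upper(),
        "age": row["age"].strip(),
    }


def is_valid(row):
    """Validate: age must be a whole number in an assumed plausible range."""
    return row["age"].isdigit() and 0 < int(row["age"]) < 120


standardized = [standardize(r) for r in raw_rows]
clean_rows = [r for r in standardized if is_valid(r)]
rejected_rows = [r for r in standardized if not is_valid(r)]

print("clean:", clean_rows)
print("rejected:", rejected_rows)
```

Rejected rows are kept in a separate list rather than discarded, so they can be reviewed or corrected at the source.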
Step 2: Visualize Data
• Data mining & querying
• Data Modeling
• Basic db design
• Workflow diagrams & visualization
Step 3: Identifying Data
• identifying Data type
• Recreating or calculating new data
Step 4: Identifying Data
• Existing Data
• Non Existing Data

Step 5: Learning Syntax

Step 6: Interpreting Existing Data
• Using JOINS
• Creating workflow diagrams
Working with Business Data
• Knowing data entry rules
• Timing of data
• Creating a data dictionary
• Creating notes in a README file
• Building charts & visuals
• Creating pivots for better readability of data
• Removing duplicates
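Two of the tasks above, removing duplicates and building a pivot, can be sketched in plain Python. The sales rows below are hypothetical, and the sketch assumes that an exact repeated row is a data-entry error rather than a legitimate second sale.

```python
from collections import defaultdict

# Hypothetical sales entries (region, product, amount); one row is an exact duplicate.
sales = [
    ("North", "Widget", 100),
    ("North", "Gadget", 150),
    ("South", "Widget", 200),
    ("North", "Widget", 100),  # duplicate entry
    ("South", "Gadget", 50),
]

# Remove exact duplicates while keeping the original order.
unique_sales = list(dict.fromkeys(sales))

# Pivot: rows are regions, columns are products, cells are summed amounts.
pivot = defaultdict(lambda: defaultdict(int))
for region, product, amount in unique_sales:
    pivot[region][product] += amount

for region in sorted(pivot):
    print(region, dict(pivot[region]))
```

In practice the data entry rules decide what counts as a duplicate; a row with a unique order ID would never be dropped this way.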
Data Fluency
• The ability to read and write data products
• Changing formats of data
• Identifying patterns in data
• Creating tables & charts
• Understanding Correlation & causation
Return on investment (ROI)
• Return on investment is the ratio of the profit or loss made in a fiscal
year to the amount invested
• Pareto Principle: by focusing on the 20% of work that matters most
to your client, you will produce 80% of your project's results.
• Profit per unit
• Sales per unit
• Problem per unit
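The ratios above reduce to simple divisions. The sketch below uses hypothetical fiscal-year figures, and it assumes "problem per unit" means the number of reported problems divided by units sold; the function names are illustrative.

```python
def roi(net_profit, investment):
    """Return on investment: net profit (or loss) divided by the amount invested."""
    if investment == 0:
        raise ValueError("investment must be non-zero")
    return net_profit / investment


def per_unit(total, units):
    """Any 'per unit' metric: a total divided by the number of units."""
    if units == 0:
        raise ValueError("units must be non-zero")
    return total / units


# Hypothetical fiscal-year figures (not from the slides).
net_profit = 15_000
investment = 50_000
revenue = 80_000
units_sold = 1_000
problems_reported = 20  # assumed meaning of "problem per unit"

roi_value = roi(net_profit, investment)
profit_per_unit = per_unit(net_profit, units_sold)
sales_per_unit = per_unit(revenue, units_sold)
problems_per_unit = per_unit(problems_reported, units_sold)

print(f"ROI: {roi_value:.0%}")
print(f"Profit per unit: {profit_per_unit:.2f}")
print(f"Sales per unit: {sales_per_unit:.2f}")
print(f"Problems per unit: {problems_per_unit:.3f}")
```

A negative net profit gives a negative ROI, which matches the "profit or loss" wording in the definition.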
Security protocols
• General Data Protection Regulation (GDPR)
• California Consumer Privacy Act (CCPA)
• Professional regulations
• Organizational policies
Collecting Data
Open Source Data
• https://data.govt
• https://data.gov.in
• https://opendata.Utah.gov
• https://data.un.org
• www.nationmaster.com
• www.statemaster.com
• https://trends.google.correlate.com
• https://finance.yahoo.com

Or
Third Party Data Supplier
Assess quality of data

• Specialization / Segregation of combined data


• Dealing with outliers
• Missing values
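Two of the quality checks above can be sketched with the standard library. The measurements are hypothetical, and both choices here are assumptions rather than the only options: missing values are filled with the median, and outliers are flagged with the common 1.5 × IQR rule.

```python
import statistics

# Hypothetical measurements; None marks a missing value.
values = [12, 15, None, 14, 13, 95, 16, None, 14]

present = [v for v in values if v is not None]
missing_count = len(values) - len(present)

# Fill missing values with the median of the observed values.
median = statistics.median(present)
filled = [median if v is None else v for v in values]

# Flag outliers: anything beyond 1.5 * IQR from the quartiles.
q1, _, q3 = statistics.quantiles(present, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in present if v < low or v > high]

print("missing:", missing_count)
print("filled:", filled)
print("outliers:", outliers)
```

Whether an outlier is an error or a genuine extreme value is a judgment call that depends on how the data was collected.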
Mathematical operations in Excel
Symbol   Operation         Example          Result
+        Addition          =2+3             5
-        Subtraction       =9-2             7
*        Multiplication    =6*7             42
/        Division          =9/3             3
^        Exponentiation    =4^2             16
()       Parentheses       =(2+4)/3         2
&        Concatenation     ="Z"&100         "Z100"
:        Range             =SUM(F5:F7)      6
space    Range Intersect   =A1:A3 A2:C2     A2
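For readers who move between Excel and code, the same operations can be checked in Python. This is only an illustrative mapping: the cell values for F5:F7 are assumed to be 1, 2 and 3 so that the sum comes to 6, and ranges are modeled as plain sets of cell names.

```python
results = {
    "addition": 2 + 3,
    "subtraction": 9 - 2,
    "multiplication": 6 * 7,
    "division": 9 / 3,                 # Python returns a float here
    "exponentiation": 4 ** 2,          # Excel writes this as 4^2
    "parentheses": (2 + 4) / 3,
    "concatenation": "Z" + str(100),   # Excel's & converts the number to text itself
}

# Range: assume cells F5:F7 hold 1, 2 and 3 (hypothetical values).
cells = {"F5": 1, "F6": 2, "F7": 3}
results["range_sum"] = sum(cells[f"F{row}"] for row in range(5, 8))

# Range intersect: the cells shared by A1:A3 and A2:C2.
range_a = {"A1", "A2", "A3"}
range_b = {"A2", "B2", "C2"}
results["intersection"] = range_a & range_b

for name, value in results.items():
    print(name, "=", value)
```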
Visualizing the Data

• Bar Charts
• Grouped bar charts
• Pie chart
• Dot plots
• Box plots
• Histograms
Types of Data
• Nominal
• Ordinal
• Interval
• Ratio
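The four types differ in which summaries make sense. The sketch below pairs each type with a suitable summary; all sample values are hypothetical.

```python
import statistics

# Hypothetical examples of each type of data.
blood_types = ["A", "O", "O", "B", "O"]                   # nominal: labels, no order
sizes = ["small", "large", "medium", "small", "medium"]   # ordinal: ordered labels
temps_c = [18.0, 21.5, 19.0]                              # interval: differences matter, no true zero
weights_kg = [50.0, 75.0, 100.0]                          # ratio: true zero, ratios matter

# Nominal: the most frequent label (mode) is a valid summary.
most_common_blood_type = statistics.mode(blood_types)

# Ordinal: rank the labels, then take the middle one (median).
size_order = {"small": 0, "medium": 1, "large": 2}
ranked = sorted(sizes, key=size_order.get)
median_size = ranked[len(ranked) // 2]

# Interval: means and differences are valid.
mean_temp = statistics.mean(temps_c)
temp_range = max(temps_c) - min(temps_c)

# Ratio: "twice as heavy" is a meaningful statement.
weight_ratio = weights_kg[2] / weights_kg[0]

print(most_common_blood_type, median_size, mean_temp, temp_range, weight_ratio)
```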
Covariance
Covariance is a statistical measure of how two
variables move together, for example the
prices of two assets.
• When two stocks tend to move together,
they have a positive covariance
• When they move inversely,
the covariance is negative.

• Cov(X, Y) = Σ (X_i - μ)(Y_i - ν) / (n - 1)
• μ = mean of X
• ν = mean of Y
• n = number of paired observations
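A minimal sketch of the sample covariance, dividing the summed products of deviations by n - 1. The price series are hypothetical and chosen so that one pair moves together and the other moves inversely; the function name is illustrative.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of (x - mean_x) * (y - mean_y), divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("both series must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two paired observations are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical daily closing prices.
stock_a = [10.0, 11.0, 12.0, 13.0, 14.0]
stock_b = [20.0, 22.0, 24.0, 26.0, 28.0]  # rises when A rises
stock_c = [30.0, 29.0, 28.0, 27.0, 26.0]  # falls when A rises

cov_ab = sample_covariance(stock_a, stock_b)
cov_ac = sample_covariance(stock_a, stock_c)

print("Cov(A, B):", cov_ab)  # positive: the two move together
print("Cov(A, C):", cov_ac)  # negative: the two move inversely
```

The sign shows the direction of the relationship; the size depends on the units of the data, so it is not a measure of strength on its own.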
DATA VISUALIZATION

• 4×4 Model for Knowledge Content

• A guide to getting people to engage with your website or online
content

• To make sure your content stands out.


Key factors in 4x4 Model

• Channel your audience

• Know your audience

• Let them know the background story

• Provides better communication


Visualization of Data
• Comparison: Bar, Pie, Line
• Composition (how individual parts make up the whole of something): Stacked Bar, Mekko, Stacked Column
• Distribution of your data: Line, Column, Bar
• Analyzing trends: Line, Dual-Axis Line, Column
• Relationship between value sets: Scatter Plot, Bubble, Line
• Know your data: sample size & methodology, correlation vs causation
(Example charts pictured on the slides: Mekko, Stacked Bar, Bubble Chart, Dual-Axis Line)
RATIO Function
Data Formats
• CSV File

• JSON File

• XML File
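The same records can be stored in all three formats. The sketch below parses a hypothetical two-row dataset from each format with Python's standard library; the field names and values are illustrative.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The same hypothetical records in three formats.
csv_text = "id,name,score\n1,Asha,91\n2,Ben,78\n"
json_text = '[{"id": 1, "name": "Asha", "score": 91}, {"id": 2, "name": "Ben", "score": 78}]'
xml_text = """<students>
  <student id="1"><name>Asha</name><score>91</score></student>
  <student id="2"><name>Ben</name><score>78</score></student>
</students>"""

# CSV: every field arrives as text, so numbers are converted explicitly.
from_csv = [
    {"id": int(row["id"]), "name": row["name"], "score": int(row["score"])}
    for row in csv.DictReader(io.StringIO(csv_text))
]

# JSON: numbers and strings keep their types.
from_json = json.loads(json_text)

# XML: values live in attributes and element text, both as strings.
from_xml = [
    {
        "id": int(student.get("id")),
        "name": student.findtext("name"),
        "score": int(student.findtext("score")),
    }
    for student in ET.fromstring(xml_text).findall("student")
]

print(from_csv)
print(from_csv == from_json == from_xml)
```

CSV suits flat tables, while JSON and XML can also represent nested records.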
Thank you
