
Identifying Patterns

LORA J. STA. MARIA


Patterns in data are commonly described in terms of:
center, spread, shape, and unusual features.

Some common distributions have special descriptive labels, such as
symmetric, bell-shaped, skewed, etc.

Center

Graphically, the center of a distribution is located at the median of the distribution. This is the
point in a graphic display where about half of the observations are on either side. In the chart
below, the height of each column indicates the frequency of observations. Here, the
observations are centered over 4.

Spread
The spread of a distribution refers to the variability of the data. If
the observations cover a wide range, the spread is larger. If the
observations are clustered around a single value, the spread is
smaller.
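As a minimal sketch of center and spread, the snippet below compares two small samples. The data and the `describe` helper are hypothetical, chosen only so that both samples share a median of 4 while differing in variability.

```python
from statistics import median, pstdev

# Hypothetical observations; both samples are centered on 4.
clustered = [3, 4, 4, 4, 4, 5]
spread_out = [0, 2, 4, 4, 6, 8]

def describe(values):
    """Return (median, range, population standard deviation)."""
    return median(values), max(values) - min(values), pstdev(values)

center_a, range_a, sd_a = describe(clustered)
center_b, range_b, sd_b = describe(spread_out)

print(center_a, range_a, round(sd_a, 2))
print(center_b, range_b, round(sd_b, 2))
```

The two samples have the same center, but the second has the larger range and standard deviation, which is what "larger spread" means here.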

Shape
The shape of a distribution is described
by the following characteristics.

• Symmetry
• Number of peaks
• Skewness
• Uniform

• Symmetry. When it is graphed, a symmetric distribution can be
divided at the center so that each half is a mirror image of the other.

• Number of peaks. Distributions can have few or many peaks. Distributions
with one clear peak are called unimodal, and distributions with two clear
peaks are called bimodal. When a symmetric distribution has a single peak
at the center, it is referred to as bell-shaped.

• Skewness. When they are displayed graphically, some distributions have
many more observations on one side of the graph than the other.
Distributions with fewer observations on the right (toward higher values) are
said to be skewed right; and distributions with fewer observations on the
left (toward lower values) are said to be skewed left.

• Uniform. When the observations in a set of data are equally spread
across the range of the distribution, the distribution is called a uniform
distribution. A uniform distribution has no clear peaks.
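One way to put a number on the skewness described above is the moment coefficient of skewness: positive for a tail toward higher values (skewed right), negative for a tail toward lower values (skewed left), and zero for a perfectly symmetric sample. The sketch below is illustrative only; the `skewness` helper and the sample data are assumptions, not part of the slides.

```python
from statistics import fmean, pstdev

def skewness(values):
    """Moment coefficient of skewness: mean cubed deviation / sd**3."""
    mu = fmean(values)
    sd = pstdev(values)
    return fmean([(v - mu) ** 3 for v in values]) / sd ** 3

right_skewed = [1, 1, 2, 2, 2, 3, 3, 4, 9]    # long tail toward higher values
left_skewed = [10 - v for v in right_skewed]  # mirror image: tail toward lower values
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed), ("left", left_skewed), ("symmetric", symmetric)]:
    print(name, round(skewness(data), 3))
```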

Unusual Features
Sometimes, statisticians refer to unusual features in a set of data. The two
most common unusual features are gaps and outliers.

• Gaps. Gaps refer to areas of a distribution where there are no observations.
The figure below has a gap; there are no observations in the middle of the
distribution.

• Outliers. Sometimes, distributions are characterized by extreme values
that differ greatly from the other observations. These extreme values are
called outliers. The figure below illustrates a distribution with an outlier.
Except for one lonely observation (the outlier on the extreme right), all of
the observations fall between 0 and 4. As a "rule of thumb", an extreme
value is often considered to be an outlier if it is at least 1.5
interquartile ranges below the first quartile (Q1), or at least 1.5
interquartile ranges above the third quartile (Q3).
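The 1.5 interquartile range rule of thumb can be sketched in a few lines. The data below is hypothetical (values between 0 and 4 plus one extreme value), and `iqr_outliers` is an illustrative helper name. Quartiles can be computed under several conventions, so borderline values may be classified differently by other tools.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return values at least k IQRs below Q1 or at least k IQRs above Q3."""
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v <= low or v >= high]

data = [0, 1, 1, 2, 2, 2, 3, 3, 4, 12]  # hypothetical observations
print(iqr_outliers(data))
```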

Regression Analysis

What is Regression Analysis?
Regression analysis is a quantitative research method used when a
study involves modelling and analysing several variables, where the
relationship includes a dependent variable and one or more
independent variables.

Purpose

Typically, a regression analysis is done for one of two purposes: to
predict the value of the dependent variable for individuals
for whom some information concerning the explanatory variables is
available, or to estimate the effect of some explanatory
variable on the dependent variable.

Linear Regression Analysis
Linear regression is one of the most widely known modeling techniques, as
it is among the first regression analysis methods people pick up when
learning predictive modeling. Here, the dependent variable is
continuous, the independent variables can be continuous or
discrete, and the relationship is modeled with a straight regression line.
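As a minimal sketch of fitting a straight regression line, the snippet below applies ordinary least squares to one independent variable. The `fit_line` helper and the hours/scores data are hypothetical, used only for illustration.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

hours = [1, 2, 3, 4, 5]        # hypothetical independent variable
scores = [52, 55, 61, 64, 68]  # hypothetical continuous dependent variable
slope, intercept = fit_line(hours, scores)
print(f"score = {intercept:.2f} + {slope:.2f} * hours")
print("prediction for 6 hours:", round(intercept + slope * 6, 1))
```

The fitted line serves both purposes of regression: the slope estimates the effect of the explanatory variable, and the equation predicts the dependent variable for new values.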

Logistic Regression Analysis
Logistic regression is commonly used to estimate the probability
of event=Success versus event=Failure. It is used whenever the dependent
variable is binary, such as 0/1, True/False, or Yes/No. In survey
analysis, for example, it suits close-ended questions with two possible
answers.
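The idea of modeling a binary outcome as a probability can be sketched with a one-variable logistic model fitted by plain gradient descent. Everything here is an illustrative assumption: the data, the helper names `sigmoid` and `fit_logistic`, and the learning rate and step count. Statistical packages normally use more robust fitting procedures.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y=1 | x) = sigmoid(b + w*x) by gradient descent on log-loss."""
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(b + w * x) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]  # hypothetical predictor
passed = [0, 0, 0, 1, 0, 1, 1, 1]                 # binary outcome (0/1)
w, b = fit_logistic(hours, passed)
for h in (1.0, 2.25, 4.0):
    print(h, round(sigmoid(b + w * h), 3))
```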

Polynomial Regression Analysis
Polynomial regression is commonly used to analyze curvilinear
data, which arises when the power of an independent variable is
more than 1. In this regression analysis method, the best-fit line is
not a straight line but a curve fitted to the data
points.
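A minimal sketch of fitting such a curve: the snippet below solves the least-squares normal equations for a degree-2 polynomial using only the standard library. The helper names `solve` and `fit_polynomial` and the data are hypothetical; in practice a numerical library routine such as `numpy.polyfit` is the usual choice.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_polynomial(xs, ys, degree=2):
    """Least-squares coefficients [c0, c1, ...] for c0 + c1*x + c2*x**2 + ..."""
    k = degree + 1
    # Normal equations: (V^T V) c = V^T y, with V[i][j] = xs[i] ** j
    ata = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(ata, aty)

xs = [-2, -1, 0, 1, 2, 3]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]  # hypothetical curved data
print([round(c, 6) for c in fit_polynomial(xs, ys)])
```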

Stepwise Regression Analysis


This is a semi-automated process in which a statistical model is built by
adding or removing variables based on the t-statistics of their
estimated coefficients. Used carefully, stepwise regression is a practical way
to narrow down candidate predictors, and it is most useful when you
are working with a large number of independent variables.

Ridge Regression Analysis
Ridge regression builds on the ordinary least squares method by adding a
squared (L2) penalty on the coefficients. It is used to analyze data with
multicollinearity (data where independent variables are highly
correlated).
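To illustrate the effect on highly correlated predictors, the sketch below solves the ridge equations for two nearly identical predictors, once with no penalty (plain least squares) and once with a penalty. The data, the helper name `ridge_two_predictors`, the penalty value 10, and the omission of an intercept are all simplifying assumptions.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y for two predictors, no intercept."""
    a = sum(v * v for v in x1) + lam
    d = sum(v * v for v in x2) + lam
    c = sum(u * v for u, v in zip(x1, x2))
    r1 = sum(u * v for u, v in zip(x1, y))
    r2 = sum(u * v for u, v in zip(x2, y))
    det = a * d - c * c
    return (d * r1 - c * r2) / det, (a * r2 - c * r1) / det  # Cramer's rule

# Hypothetical data: x2 is almost a copy of x1 (multicollinearity).
x1 = [1, 2, 3, 4, 5]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
y = [2.0, 4.1, 6.1, 7.9, 10.2]

ols = ridge_two_predictors(x1, x2, y, lam=0.0)     # no penalty
ridge = ridge_two_predictors(x1, x2, y, lam=10.0)  # with penalty
print("least squares:", [round(b, 3) for b in ols])
print("ridge:        ", [round(b, 3) for b in ridge])
```

With the penalty, the coefficients shrink and are shared more evenly between the two correlated predictors, instead of one predictor absorbing most of the effect.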

Lasso Regression Analysis


Lasso (Least Absolute Shrinkage and Selection Operator) is similar to
ridge regression; however, it penalizes the absolute values of the
coefficients (an L1 penalty) instead of the squared values penalized in
ridge regression.

Elastic Net Regression Analysis


Elastic net is a mixture of the ridge and lasso regression models, trained
with both L1 and L2 norm penalties.
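For reference, the ridge, lasso, and elastic net methods are commonly written as the penalized least squares objectives below. The notation is an assumption and does not come from the slides: y is the vector of dependent-variable values, X the matrix of independent variables, β the coefficient vector, and λ, λ₁, λ₂ non-negative tuning parameters. Some texts scale the terms differently.

```latex
\begin{align*}
\text{Ridge:}\quad & \min_{\beta}\ \lVert y - X\beta \rVert_2^2 + \lambda \sum_{j} \beta_j^2 \\
\text{Lasso:}\quad & \min_{\beta}\ \lVert y - X\beta \rVert_2^2 + \lambda \sum_{j} \lvert \beta_j \rvert \\
\text{Elastic net:}\quad & \min_{\beta}\ \lVert y - X\beta \rVert_2^2 + \lambda_1 \sum_{j} \lvert \beta_j \rvert + \lambda_2 \sum_{j} \beta_j^2
\end{align*}
```

Setting the penalty parameters to zero recovers ordinary least squares in all three cases.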

CREDITS

Special thanks to all the people who made and released
these awesome resources for free:
◉ Presentation template by SlidesCarnival
◉ Photographs by Unsplash
