Data Visualization

The document provides an overview of data visualization, explaining its purpose, advantages, and various techniques such as exploratory data analysis (EDA) and visual encoding. It details different types of visualizations including histograms, scatter plots, bar charts, box plots, and pie charts, along with their respective functions in R programming. Additionally, it highlights how visualization aids in understanding complex data, enhances communication, and improves analytical efficiency.

Uploaded by ishwariborkar18
• DATA VISUALIZATION

• Visualization:
• The process of transforming raw data into meaningful information and insights through a visual representation.
• Data Visualization:
• -It is the graphical representation of information and data.
• -It maps the original (numeric) data to graphic elements (lines, points).
• -It uses visual elements such as charts, graphs, and maps.
Advantages of Visualization:
1.Simplifies Complex Data: Visualizations make complex datasets easier to interpret and understand by presenting information in a clear and concise manner.
2.Enhances Communication: They allow information to be shared quickly and effectively across diverse audiences.
3.Engages Audience: Visual elements are more engaging and can capture the attention of the audience better than text or tables alone.
4.Analytical Efficiency: Visual tools help analysts quickly identify key insights, reducing the time needed to analyze data.
5.Quick Insights: Allows faster identification of patterns, trends, and outliers.
6.Error Detection: Makes it easier to spot errors in the data that would affect the analysis.
7.Increases Productivity: Reduces time spent on data analysis and interpretation by making insights more immediately apparent.
Introduction to Exploratory Data Analysis:
• -EDA is the process of examining the data to understand it and extract insights from it.
• -It investigates the dataset to discover patterns and anomalies and to form hypotheses based on that understanding.
• -EDA involves generating summary statistics for the numerical data in the dataset and creating graphical representations that make the data easier to understand.
• -EDA is a critical process of performing an initial investigation of the data to discover patterns and check assumptions with the help of summary statistics and graphical representations.
EDA involves a combination of the following methods:
•Univariate visualization of, and summary statistics for, each field in the raw dataset.
•Bivariate visualization and summary statistics for assessing the relationship between each variable in the dataset and the target variable of interest.
•Multivariate visualizations to understand interactions between different fields in the data.
•Dimensionality reduction to identify the fields that account for the most variance between observations and to allow processing of a reduced volume of data.
•Clustering of similar observations in the dataset into differentiated groupings; collapsing the data into a few groups makes patterns of behavior easier to identify.
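The methods listed above can be sketched on R's built-in iris dataset. This is a minimal illustration rather than a full EDA workflow; the choice of iris, the assumed number of clusters (3), and the variable names (num, r, pca, km) are assumptions made for the example.

```r
# Built-in dataset: 150 flowers, 4 numeric measurements, 1 species label
num <- iris[, 1:4]

# Univariate: summary statistics and a histogram for one field
print(summary(num$Sepal.Length))
hist(num$Sepal.Length, main = "Sepal length", xlab = "cm")

# Bivariate: correlation and scatter plot of two fields
r <- cor(num$Petal.Length, num$Petal.Width)
plot(num$Petal.Length, num$Petal.Width,
     xlab = "Petal length", ylab = "Petal width")

# Multivariate: all pairwise scatter plots, colored by species
pairs(num, col = as.integer(iris$Species))

# Dimensionality reduction: principal components on scaled data
pca <- prcomp(num, scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)  # share of variance per component

# Clustering: k-means with an assumed k = 3
set.seed(1)
km <- kmeans(scale(num), centers = 3)
print(table(km$cluster, iris$Species))  # compare clusters with known species
```

Each step corresponds to one bullet: summary() and hist() for a single field, cor() and plot() for a pair, pairs() for several fields at once, prcomp() for dimensionality reduction, and kmeans() for clustering.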
• DATA VISUALIZATION & VISUAL ENCODING:
• -Data visualization can illustrate complex data relationships and patterns with simple designs consisting of lines, shapes, and colors.
• -Visual encoding is used to map data into visual structures, thereby building an image on the screen.
• Data visualization can help to:
1.Identify Outliers in Data: Data visualization makes it easy to spot outliers, the data points that look different from the rest.
ex. In a chart, an outlier might be a dot that is far away from the other dots, so it can be seen quickly.
2.Enhance Collaboration: Advanced visualization tools make it easier for teams to go through reports together for faster decision making.
3.Make Business Analysis Easy: Correct data visualization techniques support sales prediction, product promotion, and the analysis of customer behavior.
4.Improve Response Time
5.Achieve Greater Simplicity
6.Visualize Patterns More Easily
• VISUAL ENCODING:
• -Translating the data into a visual element on a chart or map through position, shape, size, symbol, and color.
• -It is the way in which data is mapped into a visual structure, upon which we build the images on a screen.
• What is the visualization graph supposed to display?

• Distribution
• Relationship
• Comparison
• Connection
• Composition
• Location
Distribution Visualizations
• These show how data is spread (range, shape, center, and variability).
| Type | Description | R Function/Package |
|---|---|---|
| Histogram | Shows frequency distribution of a variable | hist(), geom_histogram() |
| Density Plot | Smoothed version of a histogram | density(), geom_density() |
| Boxplot | Shows median, quartiles, outliers | boxplot(), geom_boxplot() |
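As a small sketch of the "smoothed histogram" idea, the base R density() function can be drawn on top of a histogram of the same data. The simulated data (rnorm with an assumed mean of 50 and standard deviation of 10) and the variable names are illustrative assumptions.

```r
# Simulated numeric data (assumption: 200 values, mean 50, sd 10)
set.seed(42)
x <- rnorm(200, mean = 50, sd = 10)

# Kernel density estimate: a smoothed version of the histogram
d <- density(x)

# Histogram on the density scale so both share the same y-axis
hist(x, freq = FALSE, col = "lightgray",
     main = "Histogram with density curve", xlab = "Value")
lines(d, col = "red", lwd = 2)
```

Setting freq = FALSE makes the bar areas sum to 1, which is what allows the density curve to be overlaid meaningfully.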

• Histogram:
• -A graphical display of data using bars of different heights.
• -Gives an accurate representation of the distribution of numeric data.
• -A histogram groups values into 'bins', each covering a range of values.
• -In R, a histogram is created with the hist() function.
• -The first argument is the numeric data; the breaks argument controls the bins.
• (by default, R chooses the number of bins using Sturges' formula)
• Syntax:
hist(v, main, xlab, xlim, ylim, breaks, col, border)

v <- c(19, 23, 11, 5, 16, 21, 32, 14, 19, 27, 39)
# Create the histogram.
hist(v, xlab = "No. of Articles", col = "green",
     border = "black", xlim = c(0, 50),
     ylim = c(0, 5), breaks = seq(0, 50, by = 10))
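To see how the bins translate into bar heights, the same vector can be passed to hist() with plot = FALSE, which returns the bin counts instead of drawing. This is a small sketch; the variable name h is an assumption.

```r
v <- c(19, 23, 11, 5, 16, 21, 32, 14, 19, 27, 39)

# Compute the histogram without drawing it
h <- hist(v, breaks = seq(0, 50, by = 10), plot = FALSE)

print(h$breaks)  # bin edges: 0 10 20 30 40 50
print(h$counts)  # values per bin: 1 5 3 2 0
print(h$mids)    # bin midpoints: 5 15 25 35 45
```

By default each bin is closed on the right, so a value of exactly 20 would be counted in the 10 to 20 bin.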
Scatter Plot
• A scatter plot is a set of points, each representing an individual observation, positioned along the horizontal and vertical axes.
• When the values of two variables are plotted along the X-axis and Y-axis, the pattern of the resulting points reveals any correlation between them.
• We can create a scatter plot in R using the plot() function.
• Syntax:
plot(x, y, main, xlab, ylab, xlim, ylim, axes)

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)
plot(x, y,
     main = "Custom Scatter Plot",
     xlab = "X Axis Label",
     ylab = "Y Axis Label",
     xlim = c(0, 6),
     ylim = c(0, 12),
     axes = TRUE)
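The correlation that a scatter plot reveals visually can also be quantified. The sketch below uses hypothetical data (the y values are made up to be roughly linear in x), computes the Pearson correlation with cor(), and overlays a fitted line from lm().

```r
# Hypothetical, roughly linear data
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Pearson correlation coefficient (close to 1 for a strong positive trend)
r <- cor(x, y)
print(r)

# Scatter plot with solid blue points and a least-squares line
plot(x, y, pch = 19, col = "blue",
     main = "Scatter plot with fitted line",
     xlab = "x", ylab = "y")
fit <- lm(y ~ x)
abline(fit, col = "red")
```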
Additional arguments to plot():
pch = 19: Uses a solid circle for each data point
col = "blue": Colors the points blue
pch stands for "plotting character".

| pch Value | Symbol | Description |
|---|---|---|
| 1 | ○ | Open circle |
| 2 | △ | Open triangle |
| 3 | + | Plus sign |
| 4 | × | Cross |
| 5 | ◇ | Open diamond |
| 6 | ▽ | Open inverted triangle |
| 15 | ■ | Filled square |
| 16 | ● | Filled circle |
| 17 | ▲ | Filled triangle |
| 18 | ◆ | Filled diamond |
| 19 | ● | Solid circle (most common for points) |
| 20 | • | Smaller filled circle |
# Step 1: Create sample student data
student_data <- data.frame(
  math = c(78, 85, 92, 70, 88, 76, 95, 67, 80, 90),
  science = c(75, 82, 89, 72, 90, 78, 94, 65, 83, 91),
  gender = c("Male", "Female", "Female", "Male", "Female",
             "Male", "Male", "Female", "Female", "Male"))
# Step 2: Assign colors and point shapes based on gender
colors <- ifelse(student_data$gender == "Male", "blue", "red")
shapes <- ifelse(student_data$gender == "Male", 19, 17)
# Step 3: Create scatter plot
plot(student_data$math, student_data$science,
     main = "Math vs Science Scores by Gender",
     xlab = "Math Score",
     ylab = "Science Score",
     col = colors,
     pch = shapes, axes = TRUE)
Bar Chart

• A bar chart is a graphical display of data using bars of different heights (or lengths).
• It is mainly used to show counts or summaries of categorical data (like fruits, gender, brands).
• Each bar represents a category, and the height shows the value (count, sum, etc.) for that category.
• Syntax:
barplot(height, names.arg, col, main, xlab, ylab)
ggplot(data, aes(x, y)) + geom_bar(stat = "identity")

library("ggplot2")
df <- data.frame(Category = c("A", "B", "C"), Value = c(10, 20, 15))
ggplot(df, aes(x = Category, y = Value, fill = Category)) +
  geom_bar(stat = "identity") +
  scale_fill_manual(values = c("A" = "red", "B" = "green", "C" = "blue"))
# Step 1: Create the data
student_counts <- c(12, 18, 7)
class_names <- c("Class A", "Class B", "Class C")
# Step 2: Plot bar chart
barplot(student_counts,
        names.arg = class_names,               # Add class names directly
        col = "orange",                        # Bar color
        main = "Number of Students in Each Class",
        xlab = "Class",
        ylab = "Number of Students",
        ylim = c(0, max(student_counts) + 5),  # Extra space at top
        border = "black")                      # Optional: border around bars
# names.arg already labels the bars on the x-axis, so no separate axis() call
# is needed (bar midpoints are not located at 1, 2, 3).
months <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun")
product_A <- c(150, 200, 180, 220, 210, 230)
product_B <- c(120, 160, 190, 210, 200, 220)
# Combine into a matrix (products as rows, months as columns)
sales_data <- rbind(Product_A = product_A, Product_B = product_B)
barplot(sales_data,
        beside = TRUE,                 # grouped bars
        col = c("skyblue", "orange"),
        names.arg = months,
        main = "Monthly Sales Comparison",
        xlab = "Month",
        ylab = "Sales Units",
        ylim = c(0, max(sales_data) + 50))
# Add legend (fill colors match the bar colors above)
legend("topright",
       legend = c("Product A", "Product B"),
       fill = c("skyblue", "orange"))
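When the raw data is a categorical vector rather than precomputed totals, the counts for a bar chart can be obtained with table(). The sketch below uses a hypothetical vector of fruit names; it also uses the bar midpoints returned by barplot() to place a count label above each bar.

```r
# Hypothetical categorical observations
fruits <- c("apple", "banana", "apple", "cherry", "banana", "apple")

# Count how often each category occurs
counts <- table(fruits)
print(counts)

# barplot() returns the x positions of the bar midpoints
mids <- barplot(counts,
                col = "steelblue",
                main = "Fruit counts",
                xlab = "Fruit", ylab = "Count",
                ylim = c(0, max(counts) + 1))

# Write each count just above its bar
text(mids, as.vector(counts), labels = as.vector(counts), pos = 3)
```

Using the returned midpoints is the reliable way to position anything on top of the bars, since they are not placed at whole-number x positions.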
• Retinal variables play a major role in representing data that involves 3 or more variables. For example:
• 1.Shape: circles, ovals, diamonds, and rectangles can signify different types of data and are easily told apart by the eye.
• 2.Size: used for quantitative data; a smaller size indicates a lower value, while a bigger size indicates a higher value.
• 3.Color: saturation decides the intensity of a color and can be used to differentiate visual elements from their surroundings by displaying different scales of value.
• 4.Orientation: (vertical, horizontal, slanted) helps in signifying data trends such as an upward or downward trend.
• 5.Texture: shows differentiation among data and is mainly used for data comparison.
• 6.Angle: provides a sense of proportion, and this characteristic can help an analyst or data scientist make better data comparisons.
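As a sketch of how retinal variables carry extra variables in one chart, the example below plots four variables at once: two on the axes, one as point size, and one as both shape and color. The data frame and its column names are hypothetical.

```r
# Hypothetical data: two axis variables plus age (size) and group (shape, color)
people <- data.frame(
  height = c(150, 160, 165, 172, 180, 175),
  weight = c(50, 58, 63, 70, 82, 77),
  age    = c(12, 15, 18, 25, 40, 33),
  group  = c("A", "B", "A", "B", "A", "B"))

# Shape and color encode the categorical variable
shapes <- ifelse(people$group == "A", 16, 17)
colors <- ifelse(people$group == "A", "blue", "red")

# Size encodes the quantitative variable, rescaled to the range 1 to 3
sizes <- 1 + 2 * (people$age - min(people$age)) /
  (max(people$age) - min(people$age))

plot(people$height, people$weight,
     pch = shapes, col = colors, cex = sizes,
     main = "Height vs weight (size = age, shape/color = group)",
     xlab = "Height (cm)", ylab = "Weight (kg)")
legend("topleft", legend = c("Group A", "Group B"),
       pch = c(16, 17), col = c("blue", "red"))
```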
Box Plot
• A box plot is a good way to show many important features of quantitative (numerical) data.
• It shows the median of the data. This is the middle value of the data and one type of average value.
• It also shows the range and the quartiles of the data. This tells us how spread out the data is.
Syntax
boxplot(x,
        main = "Title",
        xlab = "X-axis label",
        ylab = "Y-axis label",
        col = "color")
• In the example box plot of winners' ages, the median is the red line through the middle of the 'box'. It sits just above 60 on the number line, so the middle value of age is about 60 years.
• The left side of the box is the 1st quartile. This is the value that separates the first quarter, or 25% of the data, from the rest. Here, this is 51 years.
• The right side of the box is the 3rd quartile. This is the value that separates the first three quarters, or 75% of the data, from the rest. Here, this is 69 years.
• The distance between the sides of the box is called the inter-quartile range (IQR). This tells us where the 'middle half' of the values lies. Here, half of the winners were between 51 and 69 years old.
• The ends of the lines (whiskers) extending from the box to the left and right are the minimum and maximum values in the data when there are no outliers; R's boxplot() draws points farther than 1.5 × IQR from the box as separate outlier points. The distance between the minimum and maximum is called the range.
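The quantities a box plot displays can be computed directly. The sketch below uses a small illustrative vector (the variable names are assumptions) and also shows that, with R's default settings, a whisker can stop short of the maximum when a value lies more than 1.5 times the box length beyond the box.

```r
x <- c(5, 7, 8, 6, 9, 12, 15, 10, 7, 8)

med <- median(x)                  # middle value: 8
q   <- quantile(x, c(0.25, 0.75)) # 1st and 3rd quartiles: 7 and 9.75
iqr <- IQR(x)                     # inter-quartile range: 2.75
rng <- range(x)                   # minimum and maximum: 5 and 15

# Values that boxplot() actually draws: whisker ends, hinges, median
s <- boxplot.stats(x)
print(s$stats)  # 5 7 8 10 12 (upper whisker ends at 12, not 15)
print(s$out)    # 15 is drawn as a separate outlier point
```

The box edges from boxplot.stats() are "hinges", which can differ slightly from quantile() for small samples; here the upper hinge is 10 while the 3rd quartile from quantile() is 9.75.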
# Step 1: Create the data
data <- c(5, 7, 8, 6, 9, 12, 15, 10, 7, 8)

# Step 2: Create boxplot
boxplot(data,
        main = "Simple Boxplot",
        ylab = "Values",
        col = "lightblue")

scores <- c(78, 85, 67, 90, 82, 74, 88, 79, 69, 91, 86, 71, 80, 83, 77)
classes <- c("Class A", "Class A", "Class A",
             "Class B", "Class B", "Class B",
             "Class C", "Class C", "Class C",
             "Class A", "Class B", "Class C",
             "Class A", "Class B", "Class C")
# Create a data frame
data <- data.frame(score = scores, class = classes)
# Create boxplots of scores by class with colors and axis titles
boxplot(score ~ class, data = data,  # score on the y-axis, grouped by class on the x-axis
        col = c("lightblue", "lightgreen", "lightpink"),
        main = "Distribution of Test Scores by Class",
        xlab = "Class",
        ylab = "Test Scores")
Pie Chart
• A pie chart is a circular statistical graphic divided into slices to illustrate numerical proportions.
• Each slice (sector) shows the relative size of one part of the data.
• The circle is cut along its radii into segments that describe relative frequencies or magnitudes; it is also known as a circle graph.
• In R, the function pie() creates pie charts. It takes a vector of non-negative numbers as input.
• Syntax:
pie(x, labels, radius, main, col, clockwise)
data <- c(23, 56, 20, 63)
labels <- c("Mumbai", "Pune", "Chennai", "Bangalore")
pie(data, labels)
values <- c(25, 30, 20, 25)
labels <- c("Q1", "Q2", "Q3", "Q4")
colors <- c("red", "blue", "green", "yellow")
# Pie Chart
pie(
x = values,
labels = labels, radius = 1,
main = "Quarterly Sales Distribution",
col = colors,
clockwise = TRUE
)
data<- c(23, 56, 20, 63)
labels <- c("Mumbai", "Pune", "Chennai", "Bangalore")
pie(data, labels, main = "City pie chart",
col = rainbow(length(data)))
# Sample data
data <- c(45, 25, 15, 10, 5)
browsers <- c("Chrome", "Safari", "Firefox", "Edge", "Other")
# Label each slice with the browser name and its share
labels <- paste0(browsers, " (", data, "%)")
# Pie chart using all arguments
pie(x = data,
    labels = labels,
    radius = 1,
    main = "Browser Market Share (2025)",
    col = rainbow(length(data)),
    clockwise = TRUE)
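pie() sizes each slice by its share of the total, so the input values do not need to sum to 100. When the data are raw counts, percentage labels can be computed first. A minimal sketch, with the variable names pct and pie_labels as assumptions:

```r
counts <- c(23, 56, 20, 63)
cities <- c("Mumbai", "Pune", "Chennai", "Bangalore")

# Convert counts to percentages of the total, rounded to one decimal
pct <- round(100 * counts / sum(counts), 1)
print(pct)  # 14.2 34.6 12.3 38.9

# Build labels such as "Mumbai (14.2%)"
pie_labels <- paste0(cities, " (", pct, "%)")

pie(counts, labels = pie_labels,
    main = "City pie chart with percentages",
    col = rainbow(length(counts)))
```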
install.packages("plotrix")  # run once to install the package
library(plotrix)
brands <- c("Brand A", "Brand B", "Brand C", "Brand D", "Brand E")
market_share <- c(30, 25, 20, 15, 10)
# Create labels with brand and percentage
labels <- paste0(brands, " (", market_share, "%)")
# Create 3D pie chart
pie3D(market_share,
      labels = labels,
      explode = 0.1,   # separates slices slightly
      main = "Smartphone Market Share",
      col = rainbow(length(market_share)),
      labelcex = 0.8)
| Library | Purpose | Description | Common Functions | Explanation |
|---|---|---|---|---|
| ggplot2 | Custom static plots | Grammar-of-graphics-based plotting system, part of tidyverse. | ggplot(), geom_bar(), geom_point() | Best for building complex, layered visualizations (e.g., scatter + regression line) using a consistent syntax. Ideal for publication-quality plots. |
| plotly | Interactive visualizations | Builds interactive versions of static plots; integrates with ggplot2. | plot_ly(), ggplotly() | Hover, zoom, and dynamic charts in just a few lines. Great for dashboards and data exploration. |
| lattice | Multivariate data plots | Designed for plotting data conditioned on one or more variables (panel plots). | xyplot(), bwplot() | Built-in support for grouped data and multiple panels (e.g., plot by gender, region). Less flexible than ggplot2 but more concise for some tasks. |
| base R plotting | Quick and simple plots | Comes built into R; no need for additional libraries. | plot(), hist(), boxplot() | Great for quick visual checks and learning. Not as polished or customizable, but very fast. |

| Library | Purpose | Key Function for Bar Chart | Chart Type Support | Best For |
|---|---|---|---|---|
| ggplot2 | Static, layered visualizations | geom_bar(), geom_col() | Bar, Line, Pie, Scatter, etc. | Most widely used for custom, quality plots |
| plotly | Interactive charts | plot_ly(type = "bar") | Bar, Line, Scatter, etc. | Web, dashboard, hover/zoom support |
| base R | Basic, built-in plotting | barplot() | Bar, Histogram, Line, Boxplot | Quick, simple visuals |
| highcharter | Interactive charts (business) | hchart(type = "column") | Column/Bar, Line, Pie, Area | Business dashboards, finance apps |
| echarts4r | Animated & web-friendly plots | e_bar() | Bar, Pie, Timeline, Map | Interactive dashboards, storytelling |
