Plot

Graphical representation of data
Visualizing single variable

plot(data)
• The most commonly used plotting function in R is plot(). It is a generic function: it has many methods, and the one that is called depends on the type of object passed to plot().
• plot(x, y, ...) arguments
• x: the coordinates of points in the plot. Alternatively, a single plotting structure, function, or any R object with a plot method can be provided.
• y: the y coordinates of points in the plot; optional if x is an appropriate structure.
• ...: arguments to be passed to methods, such as graphical parameters (see par).
Many methods will accept the following arguments:
• type: what type of plot should be drawn. Possible types include
• "p" for points,
• "l" for lines,
• "b" for both,
• "c" for the lines part alone of "b",
• "o" for both ‘overplotted’,
• "h" for ‘histogram’ like (or ‘high-density’) vertical lines,
• "s" for stair steps.
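The effect of the type argument and of generic dispatch can be seen in a short sketch. The sine-curve data and the variable names below are illustrative assumptions, not part of the slides.

```r
# Twenty points on a sine curve (illustrative data, not from the slides)
x <- seq(0, 2 * pi, length.out = 20)
y <- sin(x)

# Draw the same data with four different values of `type`
plot_types <- c("p", "l", "b", "h")
old_par <- par(mfrow = c(2, 2))          # 2 x 2 grid of panels
for (tp in plot_types) {
  plot(x, y, type = tp, main = paste0("type = \"", tp, "\""))
}
par(old_par)                             # restore the previous layout

# plot() is generic: an object of class "density" is drawn by its own method
d <- density(y)
class(d)
plot(d, main = "Dispatched to the density method")
```

Calling methods(plot) lists the plot methods available in the current session.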
Dot and bar plot
• Dot charts and bar plots portray continuous values with labels from a discrete variable.
• A dot chart can be created in R with the function dotchart(x, labels=…), where x is a numeric vector and labels is a vector of categorical labels for x.
• A bar plot can be created with the barplot(height) function, where height is a vector or matrix.
Dot...
data(mtcars)
dotchart(mtcars$mpg, labels = row.names(mtcars), cex = 0.7,
         main = "Miles Per Gallon (MPG) of Car Models",
         xlab = "MPG")
dotchart(iris$Petal.Length, main = "IRIS data")
Dot chart
Bar plot
barplot(table(mtcars$cyl), main = "Distribution of Car Cylinder Counts",
        xlab = "Number of Cylinders")
barplot(table(iris$Petal.Length), main = "IRIS data")
barplot
Histogram
income <- rlnorm(4000, meanlog = 4, sdlog = 0.7)
hist(income, breaks = 500, xlab = "income", ylab = "freq", main = "histogram")
Density plot
• Density plots are usually a much more effective way to view the distribution of a variable. Create the plot using plot(density(x)), where x is a numeric vector.
d <- density(mtcars$mpg)
plot(d, main = "Kernel Density of Miles Per Gallon")
• stem(x) prints a stem-and-leaf display of a numeric vector:
> a <- c(10, 12, 15, 18, 20, 21, 33)
> stem(a)

  The decimal point is 1 digit(s) to the right of the |

  1 | 02
  1 | 58
  2 | 01
  2 |
  3 | 3
Cont..
income <- rlnorm(4000, meanlog = 4, sdlog = 0.7)
plot(density(log10(income), adjust = 0.5), main = "distribution")
rug(log10(income))
• rug() creates a one-dimensional density plot along the bottom of the graph to emphasize the distribution of the observations.
After applying the rug function
Multiple variables: examining 2 variables with regression
Cont..
• The regression line does not fit the data well.
• A linear regression model is not suitable for the relationship between the 2 variables in the above graph.
• A loess() curve can be used to fit a nonlinear line to the data.
• It fits the data better than linear regression.
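The data behind the graph is not included here, so the following is only a sketch on simulated data (the curved relationship, the seed, and all variable names are assumptions). It overlays a linear regression line and a loess curve on the same scatter plot.

```r
# Simulated, deliberately nonlinear data (hypothetical, not the slide's data)
set.seed(1)
x <- runif(200, min = 0, max = 10)
y <- sin(x) + 0.1 * x^2 + rnorm(200, sd = 0.5)

plot(x, y, main = "Linear regression vs. loess")

# Straight line from a linear regression model
linear_fit <- lm(y ~ x)
abline(linear_fit, col = "red")

# Smooth curve from a local regression (loess) fit
loess_fit <- loess(y ~ x)
ord <- order(x)                          # draw the curve from left to right
lines(x[ord], predict(loess_fit)[ord], col = "blue")

legend("topleft", legend = c("lm", "loess"),
       col = c("red", "blue"), lty = 1, bty = "n")
```

The span argument of loess() controls how much smoothing is applied; the default is used here.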
Dot chart – Multiple variables
Dot chart with 3 groups
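The code for this chart is not included, so the following is one possible sketch rather than the original. It assumes the mtcars data used earlier, with the number of cylinders as the grouping variable; the colors and variable names are illustrative.

```r
# Sort cars by mpg so the points within each group are ordered
cars <- mtcars[order(mtcars$mpg), ]

# Three groups: 4, 6 and 8 cylinders
cyl_group <- factor(cars$cyl)

# One color per group (illustrative choice)
group_colors <- c("red", "blue", "darkgreen")[as.integer(cyl_group)]

dotchart(cars$mpg, labels = row.names(cars), groups = cyl_group,
         color = group_colors, cex = 0.7,
         main = "MPG of Car Models, Grouped by Cylinders",
         xlab = "MPG")
```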
Bar plot- multiple variables
Bar plot
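Only the figure title is given here. A minimal sketch of a bar plot over two discrete variables, assuming the mtcars data and a gears-by-cylinders breakdown (both chosen for illustration), might look like this:

```r
# Cross-tabulate gears (rows) against cylinders (columns)
counts <- table(mtcars$gear, mtcars$cyl)

# beside = TRUE draws one bar per gear value, side by side within each
# cylinder count; the default (FALSE) would stack them instead
barplot(counts, beside = TRUE,
        legend.text = rownames(counts),
        args.legend = list(title = "Gears"),
        main = "Car Counts by Cylinders and Gears",
        xlab = "Number of Cylinders", ylab = "Count")
```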
Box Whisker plot
• Shows the distribution of a continuous variable for each value of a discrete variable.
install.packages("ggplot2")
library(ggplot2)
p <- ggplot(mtcars, aes(factor(cyl), mpg))
p + geom_boxplot() + geom_jitter()
Box Whisker –cars data set
Box whisker plot
Code..
Cont..
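• Box: spans from the first quartile (Q1) to the third quartile (Q3).
• Whiskers: extend from the box to the most extreme values within 1.5 × IQR of the quartiles, where IQR = Q3 − Q1. The upper whisker reaches at most Q3 + 1.5 × IQR and the lower whisker at least Q1 − 1.5 × IQR.
• Median: the line inside the box.
• Based on the above graph, ZIP codes 0 and 9 have higher household income, judging by their median values.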
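The quantities that a box-and-whisker plot draws can be computed directly. The sketch below uses mtcars$mpg as an assumed example variable (the household income data is not included here), and the variable names are illustrative.

```r
x <- mtcars$mpg

# The box: first quartile, median, third quartile
q1  <- unname(quantile(x, 0.25))
med <- median(x)
q3  <- unname(quantile(x, 0.75))
iqr <- q3 - q1

# Whisker limits: 1.5 * IQR beyond the quartiles
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

# Whiskers end at the most extreme observations inside the limits
lower_whisker <- min(x[x >= lower_fence])
upper_whisker <- max(x[x <= upper_fence])

# Observations beyond the limits are drawn as individual points
outliers <- x[x < lower_fence | x > upper_fence]

c(lower_whisker = lower_whisker, q1 = q1, median = med,
  q3 = q3, upper_whisker = upper_whisker)
```

Base R's boxplot.stats() computes similar values but uses hinges, which can differ slightly from the quartiles returned by quantile().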
Hex bin plot
• A scatter plot is not suitable for large data sets: the structure of the data becomes difficult to see.
• A hexbin plot combines the features of a scatter plot and a histogram.
• Shading represents the concentration of data in each hex bin.
Code..
install.packages("hexbin")
library(hexbin)
x <- rnorm(2000)
y <- rnorm(2000)
hbin <- hexbin(x, y, xbins = 40)
plot(hbin)
Hexbin plot
Hex bin plot
Scatter plot matrix
• A scatter plot matrix shows many scatter plots in a compact, side-by-side fashion.
• It visually represents multiple attributes of a data set, which helps to explore relationships and magnifies differences.
colors <- c("orange", "black", "yellow")
pairs(iris[1:4], main = "Fisher's Iris Dataset", pch = 21,
      bg = colors[unclass(iris$Species)])
par(xpd = TRUE)  # allow the legend to be drawn outside the plot region
legend(0.2, 0.02, horiz = TRUE, as.vector(unique(iris$Species)),
       fill = colors, bty = "n")
Scatterplot matrix
Data exploration vs. presentation
• Density plots: for data scientists (exploration)
• Histograms: for stakeholders (presentation)
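To compare the two views, the same variable can be drawn both ways side by side. This sketch reuses the simulated income variable from the histogram example; the seed, the number of breaks, and the panel titles are assumptions.

```r
# Simulated log-normal incomes, as in the histogram example
set.seed(42)
income <- rlnorm(4000, meanlog = 4, sdlog = 0.7)

old_par <- par(mfrow = c(1, 2))          # two panels side by side

# Histogram of the raw values
hist(income, breaks = 50, xlab = "income",
     main = "Histogram (presentation)")

# Density plot on a log10 scale with a rug of the observations
plot(density(log10(income)), xlab = "log10(income)",
     main = "Density plot (exploration)")
rug(log10(income))

par(old_par)                             # restore the previous layout
```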
