18CSE396T – DATA SCIENCE

Unit V : Session 1 : SLO 1 & SLO 2

SRM Institute of Science and Technology 1


DOCUMENTATION GOALS



KNITR PROCESS SCHEMATIC



KNITR
• You maintain a master file that contains both user-readable documentation and chunks of program source code.
• The document types supported by knitr include LaTeX, Markdown, and HTML.
• LaTeX format is a good choice for detailed typeset technical documents.
• Markdown format is a good choice for online documentation and wikis.
• Direct HTML format may be appropriate for some web applications.

KNITR
• knitr's main operation is called a knit.
• knitr extracts and executes all of the R code, then builds a new result document that assembles the contents of the original document plus pretty-printed code and results.
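
The knit operation described above can be tried on a very small document. The following is a minimal sketch, assuming the knitr package is installed; the document text, the chunk label addition, and the helper name knit_demo are invented for illustration.

```r
# Minimal sketch: knit a small Markdown document held in a character vector.
# Assumes the knitr package is installed; the document text is invented here.
knit_demo <- function() {
  if (!requireNamespace("knitr", quietly = TRUE)) {
    return(NA_character_)  # knitr is not installed, so there is nothing to knit
  }
  fence <- strrep("`", 3)  # the three-backtick marker that delimits a code chunk
  src <- c(
    "A simple example",
    "",
    paste0(fence, "{r addition, echo=TRUE}"),
    "1 + 2",
    fence
  )
  # knit() executes the chunk and returns the assembled document as one string
  knitr::knit(text = src, quiet = TRUE)
}

result <- knit_demo()
cat(result, "\n")
```

The returned text contains the original prose, the echoed code, and the printed result of the chunk, which is the assembly step the slide describes.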



A SIMPLE KNITR MARKDOWN EXAMPLE



A SIMPLE KNITR LATEX EXAMPLE

LaTeX is a powerful document preparation system suitable for publication-quality typesetting of both articles and entire books.




SOME USEFUL KNITR OPTIONS
cache: Controls whether results are cached. With cache=F (the default), the code chunk is always executed. With cache=T, the code chunk isn't executed if valid cached results are available from previous runs.
echo: Controls whether source code is copied into the document. With echo=T (the default), pretty-formatted code is added to the document. With echo=F, code isn't echoed (useful when you only want to display results).

SOME USEFUL KNITR OPTIONS
eval: Controls whether code is evaluated. With eval=T (the default), code is executed. With eval=F, it's not (useful for displaying instructions).
message: Set message=F to direct R message() commands to the console running R instead of to the document.


SOME USEFUL KNITR OPTIONS
results: A useful setting is results='hide', which suppresses output.
tidy: Controls whether source code is reformatted before being printed. You almost always want to set tidy=F, because reformatting can break code by mishandling R comments.

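
The options listed on these slides can also be set once as defaults for every chunk in a document. The following is a minimal sketch, assuming the knitr package is installed; the chosen values only illustrate the options named above and are not a recommendation.

```r
# Minimal sketch: set the chunk options discussed above as document-wide defaults.
# Assumes knitr is installed; the values are only an illustration.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(
    cache   = FALSE,   # always re-run chunks
    echo    = TRUE,    # show source code in the document
    eval    = TRUE,    # execute the code
    message = FALSE,   # send message() output to the console
    results = "hide",  # suppress printed results
    tidy    = FALSE    # leave source formatting untouched
  )
  # Read the settings back to confirm what is now in effect
  current <- knitr::opts_chunk$get(c("cache", "echo", "eval", "message", "results", "tidy"))
  str(current)
}
```

Options given in an individual chunk header still override these defaults for that chunk.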
DOCUMENTATION
buzzdata <- read.table(infile, header=F, sep=",")
rtest <- data.frame(truth=buzztest$buzz,
                    pred=predict(fmodel, newdata=buzztest))
print(accuracyMeasures(rtest$pred, rtest$truth))
## [1] "precision= 0.809782608695652 ; recall= 0.84180790960452"
##      pred
## truth   0   1
##     0 579  35
##     1  28 149
##   model accuracy     f1 dev.norm
## 1 model   0.9204 0.6817    4.401

DOCUMENTATION
Save prepared R environment.

% Another way to conditionally save, check for file.
% message=F is letting message() calls get routed to console instead
% of the document.
<<save,tidy=F,cache=F,message=F,eval=T>>=
fname <- 'thRS500.Rdata'
if(!file.exists(fname)) {
   save(list=ls(),file=fname)
   message(paste('saved',fname))           # message to running R console
   print(paste('saved',fname))             # print to document
} else {
   message(paste('skipped saving',fname))  # message to running R console
   print(paste('skipped saving',fname))    # print to document
}
paste('checked at',date())
system(paste('shasum',fname),intern=T)     # write down file hash
@

KNITR DOCUMENTATION



WRITING EFFECTIVE DOCUMENTS

Example:

# Return the pseudo logarithm of x, which is close to
# sign(x)*log10(abs(x)) for x such that abs(x) is large
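
The comment above documents a function whose body is not shown on the slide. One definition consistent with that description, offered here as an assumption rather than as the original code, is asinh(x/2)/log(10): for large abs(x), asinh(x/2) approaches sign(x)*log(abs(x)), and the value at zero is 0 instead of -Inf.

```r
# Return the pseudo logarithm of x, which is close to
# sign(x)*log10(abs(x)) for x such that abs(x) is large
# and which stays finite (equal to 0) at x = 0.
# Assumed definition for illustration: asinh(x/2) / log(10).
pseudoLog10 <- function(x) {
  asinh(x / 2) / log(10)
}

# Compare with the plain signed logarithm on a few values
test_values <- c(-1000, -10, 0, 10, 1000)
print(pseudoLog10(test_values))
print(sign(test_values) * log10(abs(test_values)))  # -Inf * 0 gives NaN at x = 0
```

The comment states what the function returns and when the approximation holds, which is the kind of information a reader cannot recover from the one-line body alone.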



THANK YOU



18CSE396T – DATA SCIENCE

Unit V : Session 2 : SLO 1 & SLO 2


DEPLOYING MODELS



PRESENTING YOUR RESULTS TO THE PROJECT SPONSOR
Steps in presenting the results:
• Summarize the motivation behind the project, and its goals.
• State the project's results.
• Back up the results with details, as needed.
• Discuss recommendations, outstanding issues, and possible future work.


THANK YOU

18CSE396T – DATA SCIENCE

Unit V : Session 3 : SLO 1 & SLO 2


PRESENTING YOUR RESULTS TO THE PROJECT SPONSOR
Steps in presenting the results:
• Summarize the motivation behind the project, and its goals.
• State the project's results.
• Back up the results with details, as needed.
• Discuss recommendations, outstanding issues, and possible future work.


SUMMARIZING THE PROJECT’S GOALS



THANK YOU

18CSE396T – DATA SCIENCE

Unit V : Session 4 : SLO 1 & SLO 2


PRESENTING YOUR MODEL TO END USERS

• Summarize the motivation behind the project, and its goals.
• Show how the model fits into the users' workflow (and how it improves that workflow).
• Show how to use the model.

PRESENTING YOUR MODEL TO OTHER DATA SCIENTISTS

• Introduce the problem.
• Discuss related work.
• Discuss your approach.
• Give results and findings.
• Discuss future work.


THANK YOU

18CSE396T – DATA SCIENCE

Unit V : Session 4 : SLO 1 & SLO 2


INTRODUCTION TO DATA ANALYSIS

• summary() can help analysts easily get an idea of the magnitude and range of the data.

• str() displays the internal structure of an R object and gives a quick overview of the rows and columns of the dataset.

str(airquality)
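
As a short illustration of the two functions above, the following sketch applies them to the airquality data frame that ships with R. The extra range() call is added here only to show one way of reading a single column's spread.

```r
# airquality ships with base R (datasets package)
data(airquality)

# Per-column minimum, quartiles, mean, maximum and count of missing values
summary(airquality)

# Compact structure: class, dimensions, column types and the first few values
str(airquality)

# The range of a single column, ignoring missing values
ozone_range <- range(airquality$Ozone, na.rm = TRUE)
print(ozone_range)
```

summary() is the quicker way to spot implausible magnitudes, while str() is the quicker way to confirm column types and the number of rows.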


INTRODUCTION TO DATA ANALYSIS

head(data, n) and tail(data, n)

head() outputs the first n elements of the dataset (rows, for a data frame), while tail() outputs the last n.
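
A brief sketch of head() and tail() on the built-in airquality data frame follows. The variable names are chosen here for illustration.

```r
# First and last three rows of the built-in airquality data frame
first_rows <- head(airquality, 3)
last_rows  <- tail(airquality, 3)
print(first_rows)
print(last_rows)

# A negative n drops rows instead: everything except the last 150 rows
all_but_last_150 <- head(airquality, -150)
print(all_but_last_150)
```

Looking at both ends of a dataset is a quick check that the file was read completely and that no header or footer rows slipped into the data.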


VISUALIZATION BEFORE ANALYSIS

• Data visualization gives us a clear idea of what the information means by giving it visual context through maps or graphs.
• This makes the data more natural for the human mind to comprehend and therefore makes it easier to identify trends, patterns, and outliers within large data sets.

VISUALIZATION BEFORE ANALYSIS

• Data visualization takes the raw data, models it, and delivers the data so that conclusions can be reached.
• In advanced analytics, data scientists create machine learning algorithms to better compile essential data into visualizations that are easier to understand and interpret.
• Visualized data gives stakeholders, business owners, and decision-makers a better basis for predicting sales volumes and future growth.


THANK YOU

18CSE396T – DATA SCIENCE

Unit V : Session 6 : SLO 1 & SLO 2


DIRTY DATA

Verify the data with domain knowledge, and decide the most appropriate approach to clean the data.

x <- c(1, 2, 3, NA, 4)
is.na(x)
[1] FALSE FALSE FALSE  TRUE FALSE
mean(x, na.rm=TRUE)
[1] 2.5

DIRTY DATA

The na.exclude() function returns the object with incomplete cases removed.

DF <- data.frame(x = c(1, 2, 3), y = c(10, 20, NA))
DF
  x  y
1 1 10
2 2 20
3 3 NA
DF1 <- na.exclude(DF)
DF1
  x  y
1 1 10
2 2 20

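
Before deciding how to clean a dataset, it can help to see how much of it is missing. The sketch below counts missing values in the built-in airquality data frame and checks that na.exclude() keeps exactly the complete rows; the variable names are chosen here for illustration.

```r
# Count missing values in each column of the built-in airquality data
na_counts <- colSums(is.na(airquality))
print(na_counts)

# Number of rows with no missing value in any column
n_complete <- sum(complete.cases(airquality))
print(n_complete)

# na.exclude() keeps exactly those complete rows
cleaned <- na.exclude(airquality)
print(nrow(cleaned))
```

If most of the missing values sit in one column, dropping whole rows may discard more data than necessary, which is where domain knowledge comes in.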
VISUALIZATION

• Visualization gives a holistic view of the data that may be difficult to grasp from the numbers and summaries alone.
• This visualization is a great example of how data visualization can help decision making.


VISUALIZATION

plot() is a generic function for plotting R objects.

plot(wine$pH, wine$quality)
plot(wine$quality)


VISUALIZING A SINGLE VARIABLE



THANK YOU

18CSE396T – DATA SCIENCE

Unit V : Session 7 : SLO 1 & SLO 2


EXAMINING MULTIPLE VARIABLES

• Scatter plot
• Dotchart and Barplot


SCATTER PLOT

Each point represents the values of two variables. One variable is chosen for the horizontal axis and another for the vertical axis.

The simple scatterplot is created using the plot() function.

Syntax

The basic syntax for creating a scatterplot in R is:

plot(x, y, main, xlab, ylab, xlim, ylim, axes)


SCATTER PLOT
• x is the data set whose values are the horizontal coordinates.
• y is the data set whose values are the vertical coordinates.
• main is the title of the graph.
• xlab is the label on the horizontal axis.
• ylab is the label on the vertical axis.
• xlim is the limits of the values of x used for plotting.
• ylim is the limits of the values of y used for plotting.
• axes indicates whether both axes should be drawn on the plot.

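
The parameters listed above can be seen together in one call. The following sketch uses the built-in mtcars data; the titles, labels, and axis limits are chosen here by hand for illustration.

```r
# Axis limits chosen by hand to leave a small margin around the data
x_limits <- c(1, 6)
y_limits <- c(10, 35)

# Scatterplot of car weight against fuel economy from the built-in mtcars data
plot(x = mtcars$wt, y = mtcars$mpg,
     main = "Weight vs. fuel economy",
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     xlim = x_limits,
     ylim = y_limits,
     axes = TRUE)
```

Points outside xlim or ylim are silently left off the plot, so it is worth checking the limits against the range of the data.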
DOT CHART

• The dotchart() function in R is used to create a dot chart of the specified data.


DOT CHART
Syntax:

dotchart(x, labels = NULL, groups = NULL,
         gcolor = par("fg"),
         color = par("fg"))

Parameters:

x: a numeric vector or matrix.
labels: a vector of labels for each point.
groups: a grouping variable indicating how the elements of x are grouped.
gcolor: color to be used for group labels and values.
color: the color(s) to be used for points and labels.

DOT CHART

# Dot chart of a single numeric vector
dotchart(mtcars$mpg, labels = row.names(mtcars),
         cex = 0.9, xlab = "mpg")
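
The groups, gcolor, and color parameters are not used in the example above. The sketch below shows one way they might be combined, again on mtcars; the choice of cylinder count as the grouping variable and the colors are assumptions made for illustration.

```r
# Sort cars by mpg so the dots within each group form a readable profile
cars <- mtcars[order(mtcars$mpg), ]

# Grouping variable: number of cylinders (4, 6 or 8)
cyl_groups <- factor(cars$cyl)

# One color per group, looked up by the group's level index
point_cols <- c("blue", "darkgreen", "red")[as.integer(cyl_groups)]

dotchart(cars$mpg,
         labels = row.names(cars),
         groups = cyl_groups,
         gcolor = "black",
         color  = point_cols,
         cex = 0.7, xlab = "mpg")
```

Grouping makes it easy to compare the spread of mpg within each cylinder class as well as between classes.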


BAR CHART

Barplot

• In a bar plot, data is represented in the form of rectangular bars, and the length of each bar is proportional to the value of the variable or column in the dataset.
• Both horizontal and vertical bar charts can be generated by setting the horiz parameter.

barplot(wine$quality, main = 'quality of wine', xlab = 'quality', col = 'green', horiz = TRUE)
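
Bar plots are often drawn from counts rather than raw values. The following sketch, using the built-in mtcars data instead of the wine data frame from the slide, tabulates a column first and then draws one bar per category; the labels are chosen here for illustration.

```r
# Count how many cars fall into each cylinder class
cyl_counts <- table(mtcars$cyl)
print(cyl_counts)

# One horizontal bar per class; barplot() returns the bar midpoints
bar_mids <- barplot(cyl_counts,
                    main = "Cars by number of cylinders",
                    xlab = "Number of cars",
                    ylab = "Cylinders",
                    col = "green",
                    horiz = TRUE)  # horiz = FALSE (the default) draws vertical bars
```

The returned midpoints are useful when text or extra marks need to be placed on top of individual bars.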


THANK YOU

18CSE396T – DATA SCIENCE

Unit V : Session 8 : SLO 1 & SLO 2


BOX AND WHISKER PLOT

Box and whisker plot

Just as the summary() command in R displays descriptive statistics for every variable in the dataset, a box plot does the same graphically, in the form of quartiles.

A box plot displays the five-number summary of a set of data. The five-number summary is the minimum, first quartile, median, third quartile, and maximum.

boxplot(wine[, 1:4], main = 'Multiple Box plots')
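
The five-number summary can also be computed directly. The sketch below uses a small made-up vector; note that R's fivenum() reports Tukey's hinges, which coincide with the quartiles for this vector but can differ slightly from quantile() on other data.

```r
# A small made-up sample
values <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# Tukey's five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(values)
print(five)

# boxplot() computes the same statistics; plot = FALSE returns them without drawing
box_stats <- as.vector(boxplot(values, plot = FALSE)$stats)
print(box_stats)
```

With the default settings the whisker ends reported by boxplot() stop at the most extreme points within 1.5 box lengths, so they match the minimum and maximum only when there are no outliers, as is the case here.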


BOX AND WHISKER PLOT
# Generate three vectors
x <- c(1,5,7,8,9,7,5,1,8,5,6,7,8,9,8,6,7,8,10,19,6,7,8,6,4,6)
y <- rnorm(50, mean=8, sd=2)
z <- rnorm(1000, mean=10, sd=1.8)

# We create a list of vectors and call boxplot() with it.
# range=0.0 causes the whiskers to extend up to the extreme points.
# varwidth=TRUE sets the box width proportional to the square root of the
# number of data points.
# Three box-and-whisker plots are drawn, for the x, y and z vectors.
alis <- list(x, y, z)
boxplot(alis, range=0.0, horizontal=FALSE,
        varwidth=TRUE, notch=FALSE, outline=TRUE,
        names=c("A","B","C"), boxwex=0.3, border=c("blue","blue","blue"),
        col=c("red","red","red"), xlab="Tissue type", ylab="Expression Level")

HEXBIN PLOT
library(hexbin)
library(grid)

# some data from the ?hexbin help


set.seed(101)
x <- rnorm(10)
y <- rnorm(10)

# hexbin
bin <- hexbin(x, y)

plot(bin)





THANK YOU

18CSE396T – DATA SCIENCE

Unit V : Session 9 : SLO 1 & SLO 2


SCATTER PLOT

• Each point represents the values of two variables.
• One variable is chosen for the horizontal axis and another for the vertical axis.

The simple scatterplot is created using the plot() function.

Syntax

The basic syntax for creating a scatterplot in R is:

plot(x, y, main, xlab, ylab, xlim, ylim, axes)

Following is the description of the parameters used:


SCATTER PLOT

x is the data set whose values are the horizontal coordinates.
y is the data set whose values are the vertical coordinates.
main is the title of the graph.
xlab is the label on the horizontal axis.
ylab is the label on the vertical axis.
xlim is the limits of the values of x used for plotting.
ylim is the limits of the values of y used for plotting.
axes indicates whether both axes should be drawn on the plot.


ANALYSIS OF VARIABLE OVER TIME

Time series analysis is a way of analyzing a sequence of data points collected over an interval of time.
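
In R, a sequence of observations with regular time stamps can be held in a ts object. The following is a minimal sketch with hypothetical monthly values; the numbers, the start date, and the variable names are made up for illustration.

```r
# Hypothetical monthly measurements for one year, starting January 2020
monthly_values <- c(12, 15, 14, 18, 21, 25, 27, 26, 22, 18, 14, 11)
series <- ts(monthly_values, start = c(2020, 1), frequency = 12)
print(series)

# Line plot with time on the horizontal axis
plot(series, main = "Monthly values over time", xlab = "Time", ylab = "Value")

# window() extracts a sub-period, here June through August 2020
summer <- window(series, start = c(2020, 6), end = c(2020, 8))
print(summer)
```

Storing the start date and frequency with the data lets plotting and subsetting functions work in calendar terms instead of raw index positions.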


THANK YOU
