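# Importing the dataset
dataset = read.csv('Salary_Data.csv')

# Splitting the dataset into the Training set and Test set
# install.packages('caTools')
library(caTools)
set.seed(123)  # fixed seed so the random split below is reproducible
# Assumption: the column name Salary and the 2/3 ratio are guesses;
# adjust both to your data.
split = sample.split(dataset$Salary, SplitRatio = 2/3)
training_set = subset(dataset, split == TRUE)
test_set = subset(dataset, split == FALSE)

# Feature Scaling
# training_set = scale(training_set)
# test_set = scale(test_set)

# Fitting a linear model to the Training set
# Assumption: lm() and the column names Salary and YearsExperience are
# guesses; only the data argument is certain.
regressor = lm(formula = Salary ~ YearsExperience,
               data = training_set)

# Plotting the Training set (black points) and the fitted line (blue)
# Assumption: the geoms and aesthetics are guesses; colours and axis labels are not.
# install.packages('ggplot2')
library(ggplot2)
ggplot() +
  geom_point(aes(x = training_set$YearsExperience, y = training_set$Salary),
             colour = 'black') +
  geom_line(aes(x = training_set$YearsExperience,
                y = predict(regressor, newdata = training_set)),
            colour = 'blue') +
  xlab('Years of experience') +
  ylab('Salary')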
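
The split, fit and predict steps that the script above outlines can be tried without the CSV file. The following is a minimal sketch, not the author's code: the data frame is synthetic, the column names YearsExperience and Salary are assumptions, and base R's sample() stands in for caTools::sample.split.

```r
# Synthetic stand-in for Salary_Data.csv (made-up numbers, not real data).
set.seed(123)
n <- 30
dataset <- data.frame(YearsExperience = round(runif(n, 1, 10), 1))
dataset$Salary <- 25000 + 9000 * dataset$YearsExperience + rnorm(n, sd = 5000)

# Random 2/3 : 1/3 split using base R instead of caTools::sample.split.
train_idx <- sample(seq_len(n), size = floor(2 * n / 3))
training_set <- dataset[train_idx, ]
test_set <- dataset[-train_idx, ]

# Fit on the training rows only, then predict the held-out rows.
regressor <- lm(Salary ~ YearsExperience, data = training_set)
y_pred <- predict(regressor, newdata = test_set)

# Put predictions next to the actual held-out salaries for comparison.
comparison <- data.frame(actual = test_set$Salary, predicted = y_pred)
print(head(comparison))
```

Because the seed is set once at the top, rerunning the whole block gives the same split and therefore the same fitted coefficients.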
Reasons for using the set.seed function
The need is the desire for reproducible results, which may come, for
example, from trying to debug your program, or of course from trying
to redo what it does:
These two results we will "never" reproduce as I just asked for something
"random":
R> sample(LETTERS, 5)
[1] "K" "N" "R" "Z" "G"
R> sample(LETTERS, 5)
[1] "L" "P" "J" "E" "D"
These two, however, are identical because I set the seed:
R> set.seed(42); sample(LETTERS, 5)
[1] "X" "Z" "G" "T" "O"
R> set.seed(42); sample(LETTERS, 5)
[1] "X" "Z" "G" "T" "O"
R>
There is a vast literature on all that; Wikipedia is a good start. In essence,
these RNGs are called Pseudo Random Number Generators because they are
in fact fully algorithmic: given the same seed, you get the same sequence.
And that is a feature and not a bug.
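
A minimal sketch of that last point, using only base R: resetting the seed replays the whole sequence of draws, not just the first one. The variable names are illustrative. The exact letters drawn depend on the R version (the default sampling algorithm changed in R 3.6.0), so no sampled output is shown.

```r
# Same seed, same sequence: two consecutive draws are replayed exactly.
set.seed(42)
first_run <- list(sample(LETTERS, 5), sample(LETTERS, 5))

set.seed(42)
second_run <- list(sample(LETTERS, 5), sample(LETTERS, 5))

identical(first_run, second_run)   # TRUE

# Without resetting the seed the generator keeps advancing,
# so the next draw comes from further along the sequence.
third_draw <- sample(LETTERS, 5)
identical(third_draw, first_run[[1]])
```

This is why a script that splits data at random typically calls set.seed once, before the first random call, rather than before every call.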