Welcome to Scribd!

Statistics II: Histograms in R: Richard Gill March 5, 2009

Uploaded by

0% found this document useful (0 votes)

15 views9 pages

This document discusses histograms in R and different methods for determining bin widths. It begins by generating random gamma distributed data and plotting a histogram of the data using R's default Sturges method for determining bin widths. It then plots the histogram again using the Scott and Freedman-Diaconis methods. An outlier is added to the data and histograms are plotted again using the different bin width methods. Finally, the Venables and Ripley "truehist" function is used to plot histograms with Scott and Freedman-Diaconis bin widths. The document concludes by suggesting drawing a frequency polygon as an exercise.

Original Description:

Histo

Original Title

histogram

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

15 views9 pages

Statistics II: Histograms in R: Richard Gill March 5, 2009

Uploaded by

engineering5

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 9

Search inside document

Statistics II: Histograms in R

Richard Gill
March 5, 2009
These scripts and notes illustrate the histogram as a density estimator. First
we make some data. And before that, I set the random seed, so that the results
will be perfectly reproducible.

> set.seed(11091951)
> x <- rgamma(100, 10)
Now we show the plain vanilla R histogram.

> hist(x)

15
10
0

Frequency

Histogram of x

15
x

http://www.math.leidenuniv.nl/gill/teaching/statistics

The area under the graph was equal to the number of observations. Using the
option prob=TRUE we make the area under the graph equal to one. On this
graph I will superimpose a plot of the true density of the data. Ill also fix the
axes to get a nicer plot. The bin-width is determined by Sturges method, the
default, it doesnt hurt to make that explicit.

hist(x, prob = T, xlim = c(0, 30), ylim = c(0, 0.15), breaks = "Sturges")
xd <- seq(from = 0, to = 30, length = 1000)
yd <- dgamma(xd, shape = 10)
lines(xd, yd)
abline(h = 0)

0.10

0.15

Histogram of x

0.00

0.05

Density

>
>
>
>
>

15
x

Now lets look at the results of the other bin-width algorithms. First, Scott:

> hist(x, prob = T, xlim = c(0, 30), ylim = c(0, 0.15), breaks = "Scott")
> lines(xd, yd)
> abline(h = 0)

0.00

0.05

Density

0.10

0.15

Histogram of x

15
x

And Freedman-Diaconis:

> hist(x, prob = T, xlim = c(0, 30), ylim = c(0, 0.15), breaks = "FD")
> lines(xd, yd)
> abline(h = 0)

0.00

0.05

Density

0.10

0.15

Histogram of x

15
x

Not much to see, right? But Ill draw these all again, but now with an outlier
added to the dataset. First, plain vanilla (Sturges):

> hist(c(x, 50), prob = T, xlim = c(0, 50), ylim = c(0, 0.15),
+
breaks = "Sturges")
> lines(xd, yd)
> abline(h = 0)

0.00

0.05

Density

0.10

0.15

Histogram of c(x, 50)

30
c(x, 50)

Then Scott:

> hist(c(x, 50), prob = T, xlim = c(0, 50), ylim = c(0, 0.15),
+
breaks = "Scott")
> lines(xd, yd)
> abline(h = 0)

0.00

0.05

Density

0.10

0.15

Histogram of c(x, 50)

30
c(x, 50)

And Freedman-Diaconis

> hist(c(x, 50), prob = T, xlim = c(0, 50), ylim = c(0, 0.15),
+
breaks = "Freedman-Diaconis")
> lines(xd, yd)
> abline(h = 0)

0.00

0.05

Density

0.10

0.15

Histogram of c(x, 50)

30
c(x, 50)

Venables and Ripley think they have a much better histogram. Lets take a
look. Scott:

0.05

0.10

0.15

library(MASS)
truehist(c(x, 50), prob = T, xlim = c(0, 50), ylim = c(0, 0.15),
nbins = "Scott")
lines(xd, yd)
abline(h = 0)

0.00

>
>
+
>
>

30
c(x, 50)

And Freedman-Diaconis

0.00

0.05

0.10

0.15

> truehist(c(x, 50), prob = T, xlim = c(0, 50), ylim = c(0, 0.15),
+
nbins = "Freedman-Diaconis")
> lines(xd, yd)
> abline(h = 0)

c(x, 50)

Exercise. Figure out how to draw a frequency polygon (join the mid-points of
histogram bars by straight lines...). Theoretically you can use a larger bin-width,
at least, if the true density is twice differentiable.

CS1 R Summary Sheets
Document26 pages
CS1 R Summary Sheets
Pranav Sharma
No ratings yet
Data Analysis and Statistics For Geography Environmental Science and Engineering 1st Acevedo Solution Manual
Document11 pages
Data Analysis and Statistics For Geography Environmental Science and Engineering 1st Acevedo Solution Manual
annthompsondmrtjywzsg
100% (49)
Survival Models Solution Chapter 2
Document27 pages
Survival Models Solution Chapter 2
Nayaz Maudarbucus
100% (8)
Problems Chap3
Document27 pages
Problems Chap3
Mohamed Taha
No ratings yet
The Mathematica Handbook
From Everand
The Mathematica Handbook
Martha L Abell
Rating: 5 out of 5 stars
5/5 (1)
Working With Only Normal Curves
Document2 pages
Working With Only Normal Curves
Alexis Tran
No ratings yet
36-225 - Introduction To Probability Theory - Fall 2014: Solutions For Homework 1
Document6 pages
36-225 - Introduction To Probability Theory - Fall 2014: Solutions For Homework 1
Nick Lee
No ratings yet
DSS - Fuzzy (Literature)
Document26 pages
DSS - Fuzzy (Literature)
Meliana Aesy
No ratings yet
Statistika Ekonomi II
Document6 pages
Statistika Ekonomi II
berliana setyawati
No ratings yet
Chapter 6
Document23 pages
Chapter 6
Majd Abukharmah
No ratings yet
Lab Fat
Document7 pages
Lab Fat
Mujeeb Javed
No ratings yet
Regression Analysis
Document280 pages
Regression Analysis
A.Benhari
100% (1)
Sa 123
Document13 pages
Sa 123
Mehak Agarwal
No ratings yet
Regresion Logistica R
Document38 pages
Regresion Logistica R
anon_305782103
No ratings yet
Mathematica - Fitting Penner's Skin Scattering
Document8 pages
Mathematica - Fitting Penner's Skin Scattering
c0de517e.blogspot.com
No ratings yet
11 Binomial Distribution
Document21 pages
11 Binomial Distribution
D-497 Neha Malviya
No ratings yet
Lecture 2 - R Graphics PDF
Document68 pages
Lecture 2 - R Graphics PDF
AnsumanNath
No ratings yet
Experiment 3
Document13 pages
Experiment 3
wallace kito
0% (1)
Problem:: EX:22 X-Bar, R, and S Chart
Document14 pages
Problem:: EX:22 X-Bar, R, and S Chart
delvishjohn0
No ratings yet
6PAM1052 Stats Practical
Document25 pages
6PAM1052 Stats Practical
Bhuvanesh Srini
No ratings yet
Revit Formulas
Document6 pages
Revit Formulas
Anonymous ea87TW
No ratings yet
R Lecture#3
Document24 pages
R Lecture#3
Muhammad Hamdan
No ratings yet
s11209112 Lab 1 Report
Document9 pages
s11209112 Lab 1 Report
Rahul Narayan
No ratings yet
Statistical Computing: Set - Seed (1001) N 100 X Rlnorm (N)
Document11 pages
Statistical Computing: Set - Seed (1001) N 100 X Rlnorm (N)
Manjinder Singh
No ratings yet
STAT-2450 Assignment 1: Name:, Student ID: B00
Document9 pages
STAT-2450 Assignment 1: Name:, Student ID: B00
Dushyant Tara
No ratings yet
Linear Congruential Generator
Document9 pages
Linear Congruential Generator
Sourav Sarkar
No ratings yet
Tugas Statel Individu
Document17 pages
Tugas Statel Individu
Nur Hafizah
No ratings yet
R Examples
Document56 pages
R Examples
Animesh Dubey
No ratings yet
Absolute Extrema (Continued) Steps To Finding The Absolute Extrema On A Closed Interval (A, B) For A Function y F (X)
Document8 pages
Absolute Extrema (Continued) Steps To Finding The Absolute Extrema On A Closed Interval (A, B) For A Function y F (X)
Syed Mohammad Askari
No ratings yet
Lab Tutorial 9: Regression Modelling: 9.1 Fitting Linear Models: Linear Regression
Document4 pages
Lab Tutorial 9: Regression Modelling: 9.1 Fitting Linear Models: Linear Regression
Pragyan Nanda
No ratings yet
R Notes 03
Document5 pages
R Notes 03
dfrgthfrgnh drfthbfgnh
No ratings yet
Homework 1: Statistics 109 Due February 17, 2019 at 11:59pm EST
Document23 pages
Homework 1: Statistics 109 Due February 17, 2019 at 11:59pm EST
BrianLe
No ratings yet
Assignment 5 - Logistic - Regression
Document7 pages
Assignment 5 - Logistic - Regression
Mariano Manu
No ratings yet
Clase 1 - Control en Matlab
Document10 pages
Clase 1 - Control en Matlab
Eric Mosvel
No ratings yet
Midterm Codes
Document8 pages
Midterm Codes
Maro
No ratings yet
Revit Basic Formula
Document3 pages
Revit Basic Formula
Dadii Cris
No ratings yet
ListaEstatisticaProbabilidade 1
Document5 pages
ListaEstatisticaProbabilidade 1
Adrielly Isly
No ratings yet
Computational Techniques in Statistics: Exercise 1
Document5 pages
Computational Techniques in Statistics: Exercise 1
وجدان الشدادي
No ratings yet
Design of Experiments. Montgomery DoE
Document6 pages
Design of Experiments. Montgomery DoE
studycam
No ratings yet
Answer For Homework 10
Document59 pages
Answer For Homework 10
NayLowMay
No ratings yet
Atlab For Chemical Engineer: Computer Programming Second Class Part
Document47 pages
Atlab For Chemical Engineer: Computer Programming Second Class Part
Slem Hamed
No ratings yet
Plotting Graphs, Surfaces and Curves in Matlab: Plotting A Graph Z F (X, Y) or Level Curves F (X, Y) C
Document3 pages
Plotting Graphs, Surfaces and Curves in Matlab: Plotting A Graph Z F (X, Y) or Level Curves F (X, Y) C
Shweta Sridhar
No ratings yet
Teorema Del Limite Central
Document14 pages
Teorema Del Limite Central
modestoros
No ratings yet
17bec0069 Assessment-5
Document6 pages
17bec0069 Assessment-5
Jyoti Saxena
No ratings yet
Sudoku Using Constraint Satisfaction
Document12 pages
Sudoku Using Constraint Satisfaction
Tyler Derdun
No ratings yet
Set6s 03
Document7 pages
Set6s 03
Akbar Nafis
No ratings yet
Antwerpen2014sessie5 (Regression)
Document42 pages
Antwerpen2014sessie5 (Regression)
Sahoo SK
No ratings yet
Revit Formulas
Document6 pages
Revit Formulas
Phaneendra
No ratings yet
Add On Problems 2013 - Algebra 1
Document5 pages
Add On Problems 2013 - Algebra 1
James King
No ratings yet
MATLAB Examples - Mathematics
Document53 pages
MATLAB Examples - Mathematics
Rajesh Dommeti
No ratings yet
Cambridge Books Online
Document14 pages
Cambridge Books Online
Anonymous fVtXn0Jfgs
No ratings yet
Matlab
Document2 pages
Matlab
ranv
No ratings yet
Genfit
Document2 pages
Genfit
Mark Anthony Caro
No ratings yet
MathStudio Manual
Document188 pages
MathStudio Manual
Sebastian Rojas
100% (1)
Ali
Document31 pages
Ali
Muhammad Asim Muhammad Arshad
No ratings yet
Lecture 3: Numerical Error: Zhenning Cai August 26, 2019
Document5 pages
Lecture 3: Numerical Error: Zhenning Cai August 26, 2019
Liu Jianghao
No ratings yet
Perhitungan Model Propagasi
Document5 pages
Perhitungan Model Propagasi
Sarah Fadillah
No ratings yet
HW 2 Chap 2
Document6 pages
HW 2 Chap 2
César Andrés Villacrés
No ratings yet
Midterm
Document1 page
Midterm
孫利東
No ratings yet
Pre-Calculus Essentials
From Everand
Pre-Calculus Essentials
Ernest Woodward
No ratings yet