07/29/2014

http://www.netnam.vn/unescocourse/statistics/stat_frm.htm
MODULE “STATISTICAL DATA ANALYSIS”

By Dr. Dang Quang A and Dr. Bui The Hong
Institute of Information Technology
Hoang Quoc Viet Road, Cau Giay, HANOI

Preface
Statistics is the science of collecting, organizing and interpreting numerical and non-numerical
facts, which we call data.

The collection and study of data are important in the work of many professions, so training in the science of statistics is valuable preparation for a variety of careers, for example as economists, financial advisors, businessmen, engineers, or farmers.

Knowledge of probability and statistical methods is also useful for informatics specialists in fields such as data mining, knowledge discovery, neural networks, and fuzzy systems.

Whatever else it may be, statistics is first and foremost a collection of tools used for converting
raw data into information to help decision makers in their work.
The science of data - statistics - is the subject of this course.

Chapter 1 is an introduction to the statistical analysis of data. Chapters 2 and 3 deal with statistical methods for presenting and describing data. Chapters 4 and 5 introduce the basic concepts of probability and probability distributions, which are the foundation for our study of statistical inference in later chapters. Sampling and sampling distributions are the subject of Chapter 6. The remaining seven chapters discuss statistical inference: methods for drawing conclusions from properly produced data. Chapter 7 deals with estimating the characteristics of a population by observing the corresponding characteristics of a sample. Chapters 8 to 13 describe some of the most common methods of inference: drawing conclusions about means, proportions, and variances from one and two samples, about relations in categorical data, and about regression, correlation, and analysis of variance. Every chapter includes examples to illustrate the concepts and methods presented. The course also involves the use of computer packages such as SPSS and STATGRAPHICS.

Audience

This tutorial is an introductory course in statistics, intended mainly for students and for users such as engineers, economists, and managers who need to use statistical methods in their work. Many aspects will also be useful for computer trainers.

Objectives

Understanding statistical reasoning
Mastering basic statistical methods for analyzing data, such as descriptive and inferential methods
Ability to use methods of statistics in practice with the help of computer software

Entry requirements

High school algebra course (+ elements of calculus)
Elementary computer skills
CONTENTS
Chapter 1 Introduction

1.1 What is Statistics
1.2 Populations and samples
1.3 Descriptive and inferential statistics
1.4 Brief history of statistics
1.5 Computer software for statistical analysis

Chapter 2 Data presentation

2.1 Introduction
2.2 Types of data
2.3 Qualitative data presentation
2.4 Graphical description of qualitative data
2.5 Graphical description of quantitative data: Stem and Leaf displays
2.6 Tabulating quantitative data: Relative frequency distributions
2.7 Graphical description of quantitative data: histogram and polygon
2.8 Cumulative distributions and cumulative polygons
2.9 Summary
2.10 Exercises

Chapter 3 Data characteristics: descriptive summary statistics

3.1 Introduction
3.2 Types of numerical descriptive measures
3.3 Measures of location (or measures of central tendency)
3.4 Measures of data variation
3.5 Measures of relative standing
3.6 Shape
3.7 Methods for detecting outliers
3.8 Calculating some statistics from grouped data
3.9 Computing descriptive summary statistics using computer software
3.10 Summary
3.11 Exercises

Chapter 4 Probability: Basic concepts

4.1 Experiment, Events and Probability of an Event
4.2 Approaches to probability
4.3 The field of events
4.4 Definitions of probability
4.5 Conditional probability and independence
4.6 Rules for calculating probability
4.7 Summary
4.8 Exercises

Chapter 5 Basic Probability distributions

5.1 Random variables
5.2 The probability distribution for a discrete random variable
5.3 Numerical characteristics of a discrete random variable
5.4 The binomial probability distribution
5.5 The Poisson distribution
5.6 Continuous random variables: distribution function and density function
5.7 Numerical characteristics of a continuous random variable
5.8 Normal probability distribution
5.10 Exercises

Chapter 6 Sampling Distributions

6.1 Why the method of sampling is important
6.2 Obtaining a Random Sample
6.3 Sampling Distribution
6.4 The sampling distribution of x̄: the Central Limit Theorem
6.5 Summary
6.6 Exercises

Chapter 7 Estimation

7.1 Introduction
7.2 Estimation of a population mean: Large-sample case
7.3 Estimation of a population mean: small sample case
7.4 Estimation of a population proportion
7.5 Estimation of the difference between two population means
7.6 Estimation of the difference between two population means: Matched pairs
7.7 Estimation of the difference between two population proportions
7.8 Choosing the sample size
7.9 Estimation of a population variance
7.10 Summary
7.11 Exercises

Chapter 8 Hypothesis Testing

8.1 Introduction
8.2 Formulating Hypotheses
8.3 Types of errors for a Hypothesis Test
8.4 Rejection Regions
8.5 Summary
8.6 Exercises

Chapter 9 Applications of Hypothesis Testing

9.1 Introduction
9.2 Hypothesis test about a population mean
9.3 Hypothesis tests of population proportions
9.4 Hypothesis tests about the difference between two population means
9.5 Hypothesis tests about the difference between two proportions
9.6 Hypothesis test about a population variance
9.7 Hypothesis test about the ratio of two population variances