CVEN2002/2702

This lecture

1. Introduction
   1.1 What is statistics?
   1.2 The statistical process
   1.3 Populations and samples
   1.4 Random sampling
2. Descriptive Statistics
   2.1 Introduction
   2.2 Graphical representations

Additional reading: Section 1.1 in the textbook

CVEN2002/CVEN2702 (Statistics)

Dr Joanna Wang

1. Introduction

Some quotes

"I am not much given to regret, so I puzzled over this one a while. Should have taken much more statistics in college, I think."
Max Levchin, PayPal Co-founder, Slide Founder, 2010

"I keep saying that the sexy job in the next 10 years will be statisticians, and I'm not kidding."
Hal Varian, Chief Economist at Google, 2009

What is statistics?

In order to learn about something, we must first collect observations, referred to as data.

Definition

Statistics is the science of the (i) collection, (ii) processing, (iii) analysis, and (iv) interpretation of data.

In short, statistics is the art of learning from data, and it allows us to gain new insights into the behaviour of many phenomena.
- Statistical concepts and methods are not only useful but are often vital in understanding the world around us.
- Further, statistics allows us to turn observational evidence into information for decision making, which is probably its most important aspect.

CVEN2002/CVEN2702 (Statistics) Dr Joanna Wang Session 2, 2013 - Week 1

In engineering, this includes diversified tasks like
- calculating the average length of the downtimes of a computer
- predicting the reliability of a launch vehicle
- evaluating the effectiveness of commercial products
- studying the vibrations of airplane wings
- checking whether the level of lead in the water supply is within safety standards
- determining the strength of supports for generators at a power plant
- collecting and presenting data on the number of persons attending seminars on solar energy
- ...

Statistics is mainly concerned with numbers and figures, and is therefore a branch of mathematics.

Statistics considers the presence of randomness, uncertainty and variation, which are everywhere in real life:
- If each computer had exactly the same length of downtime,
- if the level of lead was exactly identical everywhere and at all times in the water supply,
- if each seminar attracted the same number of people,
- and if those values were known with absolute accuracy,
then a single observation would reveal all desired information, and we would not need statistics :-(

Statistics

Statistics allows us to describe, understand and control the variability insofar as possible and to take this uncertainty into account when making judgements and decisions.

Cloud seeding is the attempt to change the amount of precipitation that falls from clouds by dispersing substances into the air that serve as cloud condensation nuclei. The usual intent is to increase precipitation (rain or snow).

1. A natural question may be: "Does cloud seeding using a given substance (say, silver nitrate) really work?" (research question)

2. First, we should observe the amount of precipitation that falls from seeded clouds, as well as from unseeded clouds. (experiment, collection of data)

For our experiment, we observe 52 clouds, 26 of which were chosen at random and seeded with silver nitrate. The following rainfall amounts (in acre-feet) are recorded:

Unseeded clouds:
1202.6 830.1 372.4 345.5 321.2 244.3 163.0 147.8 95.0 87.0 81.2 68.5 47.3
41.1 36.6 29.0 28.6 26.3 26.1 24.4 21.7 17.3 11.5 4.9 4.9 1.0

Seeded clouds:
2745.6 1697.8 1656.0 978.0 703.4 489.1 430.0 334.1 302.8 274.7 274.7 255.0 242.5
200.7 198.6 129.6 119.0 118.3 115.3 92.4 40.6 32.7 31.4 17.5 7.7 4.1

These values are our data.

Of course, we observe variability: we could not expect each cloud (seeded or not) to give exactly the same amount of rain!

What can we do with those numbers?

3. We should present the data so that they are readily comprehensible. This includes graphical representations

[Figure: dotplots of the rainfall amounts for seeded and unseeded clouds on a common horizontal scale]

as well as numerical summary measures:
average prec. seeded = 441.98
average prec. unseeded = 164.59
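The two averages can be reproduced directly from the recorded rainfall values. The following is a minimal Python sketch, not part of the original lecture; the variable names are chosen here for illustration only.

```python
from statistics import mean

# Rainfall (acre-feet) recorded for the 26 unseeded and 26 seeded clouds.
unseeded = [1202.6, 830.1, 372.4, 345.5, 321.2, 244.3, 163.0, 147.8, 95.0,
            87.0, 81.2, 68.5, 47.3, 41.1, 36.6, 29.0, 28.6, 26.3, 26.1,
            24.4, 21.7, 17.3, 11.5, 4.9, 4.9, 1.0]
seeded = [2745.6, 1697.8, 1656.0, 978.0, 703.4, 489.1, 430.0, 334.1, 302.8,
          274.7, 274.7, 255.0, 242.5, 200.7, 198.6, 129.6, 119.0, 118.3,
          115.3, 92.4, 40.6, 32.7, 31.4, 17.5, 7.7, 4.1]

# Numerical summary measure: the arithmetic mean of each group.
mean_seeded = mean(seeded)
mean_unseeded = mean(unseeded)

print(f"average prec. seeded   = {mean_seeded:.2f}")
print(f"average prec. unseeded = {mean_unseeded:.2f}")
```

A single number per group is easy to compare, but it hides the large spread visible in the raw values, which is why the summary is shown alongside a plot rather than instead of one.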

At first sight, seeded clouds seem to give more precipitation than unseeded clouds.

Careful! We must take into account the possibility of chance:
- Due to chance only, the 26 seeded clouds might be the clouds that would have given more rainfall anyway.
- Due to chance only, the 26 unseeded clouds might be the clouds that would have given less rainfall anyway.

Can we really conclude that the observed higher amount of rainfall for seeded clouds is due to seeding? Or is it possible that the seeding was not responsible for that, but rather that the higher rainfall amount was just a chance occurrence?

We have only observed 52 clouds. If we had observed 52 (or more) other clouds, would we observe different rainfall amounts? Can we really generalise what we are seeing on a particular data set beyond that data set? How risky is it?
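One way to get a feel for the "due to chance only" argument is to simulate it: pool the 52 rainfall values, repeatedly split them at random into two groups of 26, and see how often a random split produces a difference in averages at least as large as the one observed. This random-relabelling (permutation) idea is not a method presented in this lecture; it is only an illustrative sketch, and the function name, the number of shuffles and the seed are assumptions made here.

```python
import random

unseeded = [1202.6, 830.1, 372.4, 345.5, 321.2, 244.3, 163.0, 147.8, 95.0,
            87.0, 81.2, 68.5, 47.3, 41.1, 36.6, 29.0, 28.6, 26.3, 26.1,
            24.4, 21.7, 17.3, 11.5, 4.9, 4.9, 1.0]
seeded = [2745.6, 1697.8, 1656.0, 978.0, 703.4, 489.1, 430.0, 334.1, 302.8,
          274.7, 274.7, 255.0, 242.5, 200.7, 198.6, 129.6, 119.0, 118.3,
          115.3, 92.4, 40.6, 32.7, 31.4, 17.5, 7.7, 4.1]


def average(values):
    return sum(values) / len(values)


def chance_proportion(group_a, group_b, n_shuffles=10_000, seed=0):
    """Return the observed difference in averages (a minus b) and the
    proportion of random relabellings with a difference at least as large."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    observed = average(group_a) - average(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the labels were assigned by chance alone
        diff = average(pooled[:n_a]) - average(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_shuffles


observed_diff, proportion = chance_proportion(seeded, unseeded)
print(f"observed difference in averages: {observed_diff:.2f}")
print(f"proportion of random splits at least as extreme: {proportion:.4f}")
```

A small proportion suggests that chance alone rarely produces such a gap; a large one suggests the observed difference is unremarkable. Turning this intuition into a formal procedure is the subject of statistical inference.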

4. We should analyse and interpret the data, bearing in mind that the observed features may be consequences of chance only.

This part of statistics is called inferential statistics or statistical inference (Chapters 6-12). It addresses questions such as:
- How far can we go in generalising from an observed data set?
- Are such generalisations reasonable or justifiable?
- Do we need to collect more data?

Some of the most important problems in inference concern the appraisal of the risks and consequences of making wrong decisions. Risks are often appraised by calculating probabilities of some events occurring. We will discuss probability theory in more detail in Chapters 3-5.

5. Finally, we should draw conclusions from our investigations, that is, we should answer the question: "Does cloud seeding using silver nitrate result in more rainfall than no seeding?"

The points (1) to (5) in the above example form the typical procedure for statistical inference:
1. Set clearly defined goals for the investigation; formulate the research question.
2. Decide what data are required/appropriate and how to collect them; collect the data.
3. Display, describe and summarise the data in an efficient way; check for any unusual data features.
4. Choose and apply appropriate statistical methods to extract useful information from the data.
5. Interpret the information, draw conclusions and communicate the results to others.

Fact

Every step in this process requires understanding statistical principles and concepts as well as knowledge and skills in statistical methods.

1. An experiment conducted at the University of Melbourne suggests that there may be a difference in pain threshold for blonds and brunettes.

2. A group of 19 subjects was divided into light blond, dark blond, light brunette and dark brunette groups, and a pain threshold score was measured for each subject.

[Figure: dotplots of the pain threshold scores for the four groups (L Blond, D Blond, L Brunette, D Brunette) on a common horizontal scale]

4. Pain threshold seems to increase with lighter hair colour, but is this effect real or just due to chance? (The number of observations is quite small: we have 4 or 5 observations per hair colour.)

5. Depending on what we have observed, we conclude either "It is clear from the data that pain tolerance is related to hair colour" or "The data do not allow us to conclude that pain tolerance is related to hair colour".

Remark: In the latter case we won't say "pain tolerance is not related to hair colour". It might still be related, but with such a small number of observations we cannot be sure, and it would be too risky to affirm that it is.

Population

- Usually, we are interested in obtaining information about a total collection of elements, which is referred to as the population.
- The elements are often called individuals (or units).
- Given the research question, we have observed some characteristic for each individual. This characteristic, which could be quantitative or qualitative, is called a variable.

Example 1

In Example 1 (cloud seeding), the population consists of all the clouds in the sky. An individual is a cloud and the variable of interest is the amount of rainfall.

Example 2

In Example 2 (pain tolerance), the population consists of all blonds and brunettes of the world. An individual is one of those people and the variable of interest is the pain threshold score.

Sample

- It is often physically impossible or infeasible from a practical standpoint to obtain data on the whole population. Think also of very expensive, very time-consuming, or destructive experiments.
- In most situations, we can only observe a subset of the population, that is, we must work with only partial information.
- The subset of the population which is effectively observed is called the sample.
- The data are the measurements that are actually collected over the sample in the course of the investigation.

Note: sometimes, we may use "population" to designate the set of all potential measurements and "sample" to designate the subset of measurements actually observed (i.e., the data).

In Example 1, the sample consists of the 52 clouds whose rainfall amounts have been recorded. (We might also consider that we have two samples: 26 seeded clouds and 26 unseeded clouds.)

In Example 2, the sample consists of the 19 persons whose pain threshold scores have been measured.

Fact

The distinction between the data actually acquired (the sample) and the vast collection of all potential observations (the population) is a key to understanding statistics.

Sampling

- The process of selecting the sample is called sampling.
- If the sample is to be informative about the total population, it must be representative of that population.
  - Suppose you are interested in the average height of UNSW students: would you select the sample from the UNSW basketball team?
  - Suppose you are interested in the average age of UNSW students: would you select a sample made up of postgraduate students only?
- The quality of the data is paramount in a statistical study. Your results are only as good as your data!
- Sampling must be done carefully, impartially and objectively.

Random sampling

- In practice, the only sampling scheme that guarantees the sample to be representative of the population is random sampling.
- The individuals of the sample are selected in a totally random fashion, without any other prior consideration.
- A non-random selection often results in a sample that is inherently biased toward some values as opposed to others.
- We must not attempt to deliberately choose the sample according to some criteria.
- Instead, we should just leave it up to chance to obtain a sample which correctly covers the underlying population.
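The contrast between a random and a deliberately chosen sample can be illustrated with a small simulation. The sketch below uses an entirely hypothetical population of student heights (simulated numbers, not real UNSW data); the population size, sample size and seed are assumptions made for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed (an assumption) so the run is repeatable

# Hypothetical population: heights (cm) of 10,000 students, simulated only
# for illustration.
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Simple random sample: every individual has the same chance of selection.
random_sample = rng.sample(population, 100)

# Non-random sample: the 100 tallest individuals, a stand-in for sampling
# only from a basketball team.
biased_sample = sorted(population)[-100:]

population_mean = mean(population)
random_mean = mean(random_sample)
biased_mean = mean(biased_sample)

print(f"population average height:    {population_mean:.1f} cm")
print(f"random sample average height: {random_mean:.1f} cm")
print(f"biased sample average height: {biased_mean:.1f} cm")
```

The random sample's average lands close to the population average, while the hand-picked sample overstates it considerably, no matter how many such individuals are measured.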

Once a random sample is chosen, we can use statistical inference to draw conclusions about the entire population, taking the randomness into account (using probabilities). This is not possible if the sample is not random! Information drawn from a non-random sample cannot, as a rule, be generalised to larger populations.

Fact

The statistical procedures presented in this course may not be valid when applied to non-random samples.
Never unquestioningly accept samples without knowing how the data have been generated / collected / observed.

2. Descriptive Statistics

2.1 Introduction

Earlier, we defined statistics as the art of learning from data. However, statistical data, obtained from surveys, experiments, or any series of measurements, are often so numerous that they are virtually useless unless they are condensed.

Data should be presented in ways that facilitate their interpretation and subsequent analysis.

The aspect of statistics which deals with organising, describing and summarising data is called descriptive statistics.

Essentially, descriptive statistics tools consist of

1. graphical representations
2. numerical summary measures

Types of variables

There are essentially two types of variables:

1. categorical (or qualitative) variables: take a value that is one of several possible categories (no numerical meaning). E.g. gender, hair colour, field of study, status, etc.
2. numerical (or quantitative) variables: naturally measured as a number for which arithmetic operations are meaningful. E.g. height, age, temperature, pressure, salary, etc.

Attention: sometimes categorical variables are disguised as quantitative variables. For example, one might record gender information coded as 0 = Male, 1 = Female. It remains a categorical variable: it is not naturally measured as a number.
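A short sketch can make the warning concrete. The records below are hypothetical, invented only for illustration: arithmetic on the codes runs without complaint, but only the per-category counts describe the variable itself.

```python
from collections import Counter
from statistics import mean

# Hypothetical records using the coding 0 = Male, 1 = Female.
gender_codes = [0, 1, 1, 0, 1, 1, 0, 1]
labels = {0: "Male", 1: "Female"}

# The software happily averages the codes, but the result is not a gender;
# it depends entirely on the arbitrary choice of codes.
average_of_codes = mean(gender_codes)

# An appropriate summary for a categorical variable: counts per category.
counts = Counter(labels[code] for code in gender_codes)

print(f"average of the codes: {average_of_codes}")
for category, count in counts.items():
    print(f"{category}: {count}")
```

Recoding the same data as 1 = Male, 2 = Female would change the average but leave the counts untouched, which is a quick check that the variable is categorical.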

categorical variables

- ordinal: there is a clear ordering of the categories. E.g. salary class (low, medium, high), opinion (disagree, neutral, agree), etc.
- nominal: there is no intrinsic ordering to the categories. E.g. gender, hair colour, etc.

numerical variables

- discrete: the variable can only take a finite (or countably infinite) number of distinct values. E.g. number of courses you are enrolled in, number of persons in a household, etc.
- continuous: the variable can take any value in an entire interval on the real line (uncountable). E.g. height, weight, temperature, time to complete a task, etc.

Graphical representations

"A picture is worth a thousand words"

Graphical representations are often the most effective way to quickly obtain a feel for the essential characteristics of the data.

Fact

Any good statistical analysis of data should always begin with plotting the data.

Plots often reveal useful information and open paths of inquiry. They might also highlight the presence of irregularities or unusual observations (outliers).

Dotplot

- A dotplot is an attractive summary of numerical data when the data set is reasonably small.
- Each observation is represented by a dot above the corresponding location on a horizontal measurement scale.
- When a value occurs more than one time, there is a dot for each occurrence and these dots are stacked vertically.

Dotplot: example

Example

In 1987, for the first time, physicists observed neutrinos from a supernova that occurred outside of our solar system. At a site in Kamiokande, Japan, the following times (in seconds) between neutrinos were recorded:

0.107; 0.196; 0.021; 0.283; 0.179; 0.854; 0.58; 0.19; 7.3; 1.18; 2

Draw a dotplot:

[Figure: dotplot of the times between neutrinos; horizontal axis: time (s)]

Note that the largest observation is extremely different to the others. Such an observation is called an outlier. It could be:
- a recording error?
- missed observations?
- a real observation?
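A rough dotplot of these times can be drawn in plain text. The following is a minimal sketch rather than a standard plotting routine: the function name `text_dotplot`, the 60-column width and the use of `*` for a dot are all assumptions made here. Because the horizontal scale has a limited number of columns, observations that are close together fall in the same column and are stacked, in the same way that repeated values are stacked in a hand-drawn dotplot.

```python
# Times (in seconds) between neutrinos, as listed in the example.
times = [0.107, 0.196, 0.021, 0.283, 0.179, 0.854, 0.58, 0.19, 7.3, 1.18, 2]


def text_dotplot(values, width=60):
    """Return a text dotplot: one '*' per observation, stacked vertically
    when several observations fall in the same column."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero if all values are equal
    # Locate each observation on the horizontal measurement scale.
    columns = [round((v - lo) / span * (width - 1)) for v in values]
    heights = [0] * width
    for column in columns:
        heights[column] += 1
    rows = []
    for level in range(max(heights), 0, -1):
        row = "".join("*" if h >= level else " " for h in heights)
        rows.append(row.rstrip())
    rows.append("-" * width)  # the horizontal axis
    low_label, high_label = f"{lo:g}", f"{hi:g}"
    rows.append(low_label.ljust(width - len(high_label)) + high_label)
    return "\n".join(rows)


plot = text_dotplot(times)
print(plot)
print("time (s)")
```

Most dots pile up at the left end of the scale, and the single dot at the far right is the outlier discussed above.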

Objectives

Now you should be able to:
- identify the role that statistics can play in the engineering problem-solving process
- discuss how variability affects the data collected and used for making engineering decisions
- discuss how probability theory is used in engineering and science
- discuss the importance of random sampling
- understand the importance of graphical representations, construct and interpret a dotplot
