y

© All Rights Reserved


Measurements are prone to errors, so every technique for data analysis must take measurement error into account. Experimental errors are sometimes unavoidable, and their size depends on the accuracy of the instrument. Consider the measurement of length. When we measure the length of a table as 5 meters, we are comparing the table with a standard that is 1 meter long, and there is always some uncertainty in this comparison. It depends on the accuracy of the scale used. If the length lies between 5 and 6 meters and the scale has no subdivisions of a meter marked on it, the measurement is not accurate. For a more accurate measurement, use a scale on which the meter is subdivided into centimeters, so that the length of the table can be measured to the nearest centimeter, say 5 meters and 3 centimeters. Experimentally determined quantities always have errors to varying degrees, and the reliability of conclusions drawn from such data depends on taking those errors into consideration in calculations. Minimizing errors by adopting accurate measurement scales, estimating the errors, and applying the principles of error propagation in calculations are important in all sciences to prevent deceptive and confusing interpretation of facts.
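As an illustration of error propagation, the sketch below applies the usual quadrature rules for independent random errors: absolute errors combine in quadrature for a sum, and relative errors combine in quadrature for a product. These rules, the function names, and the table dimensions are assumptions chosen for the example; they are not taken from the text above.

```python
import math

def propagate_sum(err_a, err_b):
    """Absolute error of a + b (or a - b) for independent errors."""
    return math.hypot(err_a, err_b)

def propagate_product(a, err_a, b, err_b):
    """Absolute error of a * b for independent errors (a and b nonzero)."""
    relative = math.hypot(err_a / a, err_b / b)
    return abs(a * b) * relative

# Hypothetical table: length and width each read to the nearest centimeter.
length_m, err_length_m = 5.03, 0.01
width_m, err_width_m = 2.00, 0.01

perimeter_half = length_m + width_m
err_perimeter_half = propagate_sum(err_length_m, err_width_m)

area = length_m * width_m
err_area = propagate_product(length_m, err_length_m, width_m, err_width_m)

print(f"length + width = {perimeter_half:.2f} ± {err_perimeter_half:.3f} m")
print(f"area           = {area:.2f} ± {err_area:.3f} m^2")
```

The error in the area is dominated by the width, because the same 1 cm uncertainty is a larger fraction of the shorter side.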

Experimental and measurement errors always create uncertainty in the final data. This uncertainty is handled by following the rules of significant figures and by specifying the range of error within which each given value can vary. Each reading is uncertain within this range. The error expressed in the units of the measurement is known as the absolute error. The same error expressed as a percentage of the measured value is called the relative error.

For example, the temperature of a solution may be reported as 37 ± 3 °C. Here, ± 3 °C is the range within which the reading is uncertain, and this is the absolute error. Expressed as a percentage of the reading, the same error is the relative error: 3/37 is about 8.1%, so 37 ± 3 °C can be written as 37 ± 8.1%.
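The conversion from absolute to relative error is a single division. The following minimal sketch uses a reading of 37 with an absolute error of 3, for which the ratio 3/37 comes to roughly 8.1%; the function name is hypothetical.

```python
def relative_error_percent(value, absolute_error):
    """Return the absolute error as a percentage of the measured value."""
    if value == 0:
        raise ValueError("relative error is undefined for a zero reading")
    return abs(absolute_error) / abs(value) * 100.0

reading_c = 37.0    # measured temperature in degrees Celsius
abs_error_c = 3.0   # absolute error, in the same unit as the reading

rel_error = relative_error_percent(reading_c, abs_error_c)
print(f"{reading_c:g} ± {abs_error_c:g} °C  =  {reading_c:g} ± {rel_error:.1f}%")
```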

C. Types of Errors

Systematic Errors
Random Errors

When an error affects all measurements in the same way, it is called a systematic error. In most cases the cause of this error is known, and introducing a correction factor can minimize it. For example, if a watch is five minutes fast (an error of +5 minutes), we can subtract five minutes from the time shown to get the correct time. Similarly, a balance that shows an error of 0.5 g can be corrected for that error if the fact is known. If an error occurs for unknown reasons, it is called a random error or an accidental error. This type of error can be detected by repeating the experiment under the same conditions: if repeated runs give different values although the experimental conditions have not changed, random errors are present. These errors can be quantified and minimized by applying methods of statistical analysis.

The results or data of an experiment should be reliable and reproducible. The term precision refers to the reliability and reproducibility of results, and indicates the degree to which the data is free from random errors. The term accuracy refers to the overall quality of the data. When both systematic and random errors are at a minimum or almost zero and the results are reproducible, we refer to the data as accurate.
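The difference between the two error types can be sketched with simulated balance readings. The true mass, the size of the random noise, and the number of repeats below are made-up values for illustration; only the 0.5 g offset echoes the balance example. A known offset is removed by subtraction, while the random part is quantified by the spread of repeated readings.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_MASS_G = 10.0     # assumed true mass of the sample
KNOWN_OFFSET_G = 0.5   # systematic error: the balance reads 0.5 g too high
NOISE_SD_G = 0.05      # assumed size of the random error
N_REPEATS = 20

# Each simulated reading carries the same offset plus a fresh random error.
readings = [
    TRUE_MASS_G + KNOWN_OFFSET_G + random.gauss(0.0, NOISE_SD_G)
    for _ in range(N_REPEATS)
]

# The systematic error is known, so a correction factor removes it.
corrected = [r - KNOWN_OFFSET_G for r in readings]

# The random error shows up as scatter between repeats.
mean_corrected = statistics.mean(corrected)
spread = statistics.stdev(corrected)
standard_error = spread / N_REPEATS ** 0.5

print(f"mean of corrected readings: {mean_corrected:.3f} g")
print(f"spread between repeats:     {spread:.3f} g")
print(f"standard error of the mean: {standard_error:.3f} g")
```

Subtracting the offset shifts every reading by the same amount, so it changes the mean but leaves the spread untouched. Averaging more repeats reduces the standard error but cannot remove an uncorrected offset.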

D. Statistical Analysis

Data is the set of results obtained from an experiment. Data is a crude form of information. Information is the communication of knowledge, and knowledge is established or proved fact supported by evidence or data. But data by itself is not knowledge. Data can be converted into knowledge systematically, in the sequence data, information, facts, knowledge.

Data becomes information when it becomes relevant to solving your specific problem. Information becomes facts when the data can support the information. Facts become knowledge when they are useful in the successful explanation of the problem, phenomenon, or process. Statistics plays an important role in the systematic conversion of data into knowledge. It is the science that helps you make decisions under uncertainty on a numerical and measurable scale. This decision making should be based on the data, not on personal views and beliefs. Statistical analysis of data involves the study of the laws of probability; the collection, organization, and presentation of data; data properties; relationships among data; and so on.

Data can be of two types: qualitative data and quantitative data.

Data such as color, size, or any other attribute of a population that is not computable by arithmetic relations is considered qualitative data. Such attributes are the markers by which we can identify an individual or process, or the group or class to which it belongs. They are called categorical variables.

Quantitative data, by contrast, are numerical and computable by arithmetic relations. The statistical analysis described here applies mainly to this type of data. Quantitative data can be discrete or continuous. Discrete data are countable, for example the number of unripe fruits among the fruits in a basket or box. When the parameters are measurable and expressed on a continuous scale, the data are called continuous, for example the weight of tissues used in an experiment.

Statistical analysis of data includes a number of steps. The first step is to measure or to count. Measuring or counting is the connection between reality and the data: a set of data is a representation of reality on a numerical or measurable scale. If the analyst is involved in collecting the data, it is called primary data; otherwise, it is called secondary data.

Data, whether discrete or continuous, can be in any one of the following forms: Nominal, Ordinal, Interval, and Ratio (NOIR).

Under conditions of uncertainty, decision making depends largely on the statistical analysis of data for probabilistic risk assessment of the decision. Figure 1 is a graphical representation of constructing statistical models for decision making under conditions of uncertainty.

Figure 1. The statistical thinking process in decision making under uncertainties.

Statistics is a set of mathematical methods used to collect, analyze, present, and interpret data in order to reach a conclusion about a problem. These methods are now used in a wide variety of professions to solve complex experimental problems. They help decision makers, managers, and administrators in politics, business, and economics arrive at correct and better decisions about uncertain states of affairs. Advances in computer technology and software have greatly simplified statistical analysis, and a great deal of statistical information is available in today's economic and socio-political environments. There are very efficient software packages with extensive data-handling capabilities, which can routinely handle data sets ranging from very small to very elaborate. Even though computers assist in the statistical analysis, the analysis focuses mainly on the outcome: its ability to support correct predictions and decisions.

Statistical analysis involves four main steps:

Definition of (understanding) the problem;
Data collection or its compilation;
Analyzing the data; and
Final assessment and reporting of results.

Defining the problem: A clear vision of the problem is a prerequisite. A correct definition of the problem helps in collecting exactly the type of data needed for analysis.

Collecting data: The data has to be collected from a specific group or population, so the population about which we are trying to make an inference must also be clearly defined. Sampling and experimental design are required for precise collection of data. Designing the way data is collected is an important part of statistical data analysis, even though improvements in computational statistics have simplified the process of data collection.

Defining the population and the sample are two important aspects of statistical analysis.

(a) Population: the set of all elements of interest in an experiment or study.
(b) Sample: a subset of a population.

In statistics, we select a small, well-defined sample and then extend the inference to the whole population. This is known as inductive reasoning. Its main purpose is to test a hypothesis about a population: inference about the population is obtained from the information contained in a sample.

Analyzing the data: Data is grouped or classified and analyzed by suitable methods to convert it into results.

Reporting the results: Finally, the results are expressed in a suitable form such as tables, graphs, or a set of percentages. Since only a small collection or sample has been examined and not the entire population, the results should reflect the uncertainty through probability statements, intervals of values, and errors.
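One common way to report a sample result as an interval of values is a confidence interval for the mean. The sketch below is a minimal version that assumes a normal approximation (for small samples a Student t quantile would usually be preferred); the function name and the sample of table-length readings are hypothetical.

```python
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / n ** 0.5
    # Two-sided quantile of the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mean, mean - z * standard_error, mean + z * standard_error

# Hypothetical repeated readings of a length, in meters.
sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 5.1, 4.9]

mean, lower, upper = mean_confidence_interval(sample)
print(f"mean = {mean:.2f} m, 95% interval = [{lower:.2f}, {upper:.2f}] m")
```

Reporting the interval alongside the mean makes clear that the figure comes from a sample and not from the entire population.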

Data has to be analyzed and converted into a result that conveys the proper information or knowledge. The data that we obtain may come from small groups or samples that represent the entire population. Because of time and cost constraints, samples are often the only realistic way to obtain data. For the convenience of statistical analysis, data can be classified into two categories, cross-sectional and time series data:

Cross-sectional data - Data collected at the same time or at approximately the same point in time.
Time series data - Data collected at different time intervals over a specific time period.

The data may be collected from existing sources or from new observations or experiments designed to produce new data. In experimental studies a number of factors influence the process. First, the variable of interest is identified, and then the other variables or factors are controlled so that data can be collected on how they influence the variable of interest. In observational studies no such control is attempted; a survey is the most common type of observational study.

F. Data Analysis

In statistics, there are two main categories of data analysis: exploratory methods and confirmatory methods. Exploratory methods use simple arithmetic calculations to analyze the data and easy-to-draw pictures to summarize it. Confirmatory methods draw on probability theory to answer specific questions. Probability is important in decision making because it provides a means for measuring, expressing, and analyzing the uncertainties linked with future events.

The data that is recorded on a data sheet goes through three stages:

Coding: The data is transferred, if necessary, onto coded sheets.
Typing: The data is typed and stored by at least two independent data-entry persons.
Editing: The data is compared with the independently entered data to check for errors.

When the data is recorded or entered into the data sheet or computer, the following types of errors are possible:

Recording errors
Typing errors
Transcription errors (incorrect copying)
Inversion errors (for example, 123.45 typed as 123.54)
Repetition errors
Deliberate errors
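The editing stage, in which two independently typed copies are compared, can be sketched as follows. Values are kept as strings so that the typed characters can be compared directly. The function names and the sample entries are hypothetical, and the inversion check only recognizes a single swap of two adjacent characters.

```python
def find_entry_mismatches(first_entry, second_entry):
    """Return (index, first, second) for every record that differs."""
    if len(first_entry) != len(second_entry):
        raise ValueError("both entries must contain the same number of records")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(first_entry, second_entry))
        if a != b
    ]

def looks_like_inversion(a, b):
    """True if b is a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )

# Hypothetical values typed by two independent data-entry persons.
typist_one = ["123.45", "98.6", "7.02", "15.0"]
typist_two = ["123.54", "98.6", "7.03", "15.0"]

mismatches = find_entry_mismatches(typist_one, typist_two)
for index, a, b in mismatches:
    kind = "possible inversion" if looks_like_inversion(a, b) else "other mismatch"
    print(f"record {index}: {a!r} vs {b!r} ({kind})")
```

A mismatch only shows that at least one of the two copies is wrong; the original data sheet still has to be consulted to decide which.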

G. Trends

Experimental data is displayed in a suitable graphical form to analyze the trends of variation among the variables. In certain cases the values are highly variable and fluctuate around a mean value. This phenomenon is called scatter, and the resulting distribution of values around the mean is often described by a Gaussian distribution. For example, if we plot blood glucose levels as a function of time, we may get a scattered distribution, and a line drawn through all the values will fluctuate strongly.

The following are the main mathematical models used for testing the distribution of variables.

Normal

Application: It is the basic distribution of statistics and an appropriate model for many physical phenomena. Many applications arise from the central limit theorem: the average of n observations approaches a normal distribution as n increases, irrespective of the form of the original distribution, under quite general conditions.
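The behavior described by the central limit theorem can be checked with a small simulation. The sketch below averages draws from a uniform distribution, which is far from bell-shaped, and compares the spread of the averages with the value the theorem predicts. The sample sizes and the seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

N_OBS = 30         # observations averaged in each experiment (assumed)
N_REPEATS = 2000   # number of repeated experiments (assumed)

def mean_of_uniform_draws(n_obs):
    """Average of n_obs draws from the uniform distribution on [0, 1)."""
    return sum(random.random() for _ in range(n_obs)) / n_obs

means = [mean_of_uniform_draws(N_OBS) for _ in range(N_REPEATS)]

center = statistics.fmean(means)
spread = statistics.stdev(means)

# A uniform variable on [0, 1) has variance 1/12, so the mean of
# N_OBS draws should have standard deviation sqrt(1 / (12 * N_OBS)).
expected_spread = (1.0 / (12.0 * N_OBS)) ** 0.5

# For a normal distribution about 68% of values lie within one
# standard deviation of the center.
within_one_sd = sum(abs(m - center) <= spread for m in means) / N_REPEATS

print(f"center of the averages: {center:.3f} (theory 0.500)")
print(f"spread of the averages: {spread:.4f} (theory {expected_spread:.4f})")
print(f"fraction within one sd: {within_one_sd:.2f} (normal: about 0.68)")
```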

Example: Distribution of physical measurements, intelligence test scores, product dimensions, average temperatures, etc. Many methods of statistical analysis presume a normal distribution. The generalized Gaussian distribution has a probability density function (pdf) with a shape parameter n: if n = 1 it is the Laplacian distribution, and if n = 2 it is the Gaussian distribution. This distribution approximates the data reasonably well in some image coding applications.
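For reference, the generalized Gaussian density is commonly written in the form below. The symbols μ (location) and α (scale) are assumed names that do not appear in the text; n is the shape parameter mentioned above, and Γ is the gamma function.

```latex
f(x) = \frac{n}{2\,\alpha\,\Gamma(1/n)}
       \exp\!\left[-\left(\frac{|x-\mu|}{\alpha}\right)^{n}\right]

% n = 1: f(x) = \frac{1}{2\alpha}\exp\!\left(-\frac{|x-\mu|}{\alpha}\right)
%        (Laplacian)
% n = 2: f(x) = \frac{1}{\alpha\sqrt{\pi}}\exp\!\left(-\frac{(x-\mu)^{2}}{\alpha^{2}}\right)
%        (Gaussian with variance \sigma^{2} = \alpha^{2}/2)
```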

independent uniform random variable.

Log-normal

Application: It represents a random variable whose logarithm follows a normal distribution. It is a model for processes arising from many small multiplicative errors and is appropriate when the value of an observed variable is a random proportion of the previously observed value.

Example: Distribution of various biological phenomena, distribution of sizes from a breakage process, distribution of income size, life distribution of some transistor types, etc. Where the data are log-normally distributed, the geometric mean is a better descriptor of the data than the arithmetic mean. The more closely the data follow a log-normal distribution, the closer the geometric mean is to the median, and log re-expression then produces a symmetrical distribution. The ratio of two independent log-normally distributed variables is also log-normal.
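The claim about the geometric mean can be illustrated with simulated log-normal data. The sketch below draws values whose logarithm is standard normal (an arbitrary choice of parameters) and compares the arithmetic mean, the geometric mean, and the median.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

# Simulated data: the logarithm of each value is normal with mean 0 and sd 1.
values = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]

arith_mean = statistics.fmean(values)
geo_mean = statistics.geometric_mean(values)
median = statistics.median(values)

# Log re-expression: the logarithms should look symmetric,
# so their mean and median should nearly coincide.
logs = [math.log(v) for v in values]
log_mean = statistics.fmean(logs)
log_median = statistics.median(logs)

print(f"arithmetic mean: {arith_mean:.3f}")
print(f"geometric mean:  {geo_mean:.3f}")
print(f"median:          {median:.3f}")
print(f"mean and median of the logs: {log_mean:.3f}, {log_median:.3f}")
```

The long right tail pulls the arithmetic mean well above the median, while the geometric mean stays close to it.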

Poisson

Application: It is commonly used in quality control, reliability, queuing theory, etc. If events take place independently at a constant rate, it gives the probability of exactly x occurrences during a given period of time. It may also represent the number of occurrences over constant areas or volumes. It is frequently used as an approximation to the binomial distribution.

Example: Used to represent the distribution of the number of defects in a piece of material, customer arrivals, insurance claims, incoming telephone calls, radiation emitted, etc.
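The Poisson probability of exactly x occurrences, and its use as an approximation to the binomial distribution, can be sketched as follows. The rate of 2 defects per piece and the binomial parameters n = 1000 and p = 0.002 are invented values, chosen so that n times p equals the Poisson rate.

```python
import math

def poisson_pmf(x, rate):
    """Probability of exactly x occurrences when the mean count is rate."""
    if x < 0 or rate <= 0:
        raise ValueError("x must be non-negative and rate must be positive")
    return math.exp(-rate) * rate ** x / math.factorial(x)

RATE = 2.0  # assumed average number of defects per piece of material

for x in range(5):
    print(f"P(exactly {x} defects) = {poisson_pmf(x, RATE):.4f}")

# Approximation to the binomial: many trials, small success probability.
n, p, x = 1000, 0.002, 3
binomial_prob = math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)
poisson_prob = poisson_pmf(x, n * p)
print(f"binomial: {binomial_prob:.5f}, Poisson approximation: {poisson_prob:.5f}")
```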

Geometric

Application: It gives the probability of the number of independent trials, each with the same probability of success, required to achieve the first success.

Example: It can be used in quality control, reliability, and other industrial situations.
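A minimal sketch of the geometric probability follows. It uses the convention that k counts the trial on which the first success occurs, so k starts at 1; some texts count the failures before the first success instead. The inspection scenario and the value of p are hypothetical.

```python
def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k (k = 1, 2, ...)."""
    if k < 1 or not 0.0 < p <= 1.0:
        raise ValueError("k must be at least 1 and p must lie in (0, 1]")
    return (1.0 - p) ** (k - 1) * p

# Hypothetical inspection: each item is defective with probability 0.2,
# and "success" means finding the first defective item.
P_DEFECTIVE = 0.2

for k in range(1, 6):
    print(f"P(first defective item is item {k}) = {geometric_pmf(k, P_DEFECTIVE):.4f}")
```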

Binomial

Application: It gives the probability of exactly x successes in n independent trials, when the probability of success p on a single trial is constant. It is used frequently in quality control, reliability, survey sampling, and other industrial problems.
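The binomial probability can be computed directly from its definition. The sketch below also adds the cumulative probability of at most x successes, which is a typical quantity in acceptance sampling; the batch size and defect probability are invented for the example.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    if not 0 <= x <= n or not 0.0 <= p <= 1.0:
        raise ValueError("need 0 <= x <= n and 0 <= p <= 1")
    return math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """Probability of at most x successes in n independent trials."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

# Hypothetical quality-control check: 10 items are sampled and each is
# defective with probability 0.1, independently of the others.
N_ITEMS, P_DEFECTIVE = 10, 0.1

exactly_two = binomial_pmf(2, N_ITEMS, P_DEFECTIVE)
at_most_one = binomial_cdf(1, N_ITEMS, P_DEFECTIVE)

print(f"P(exactly 2 defective) = {exactly_two:.4f}")
print(f"P(at most 1 defective) = {at_most_one:.4f}")
```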

Chi-square

In the chi-square distribution, the probability distribution curve extends over the positive side of the axis and has a long right tail. The form of the curve depends on the number of degrees of freedom. The chi-square distribution is mainly used in chi-square tests for association. Chi-square tests are tests of statistical significance and are widely used in bivariate tabular association analysis. The hypothesis tested is whether or not two populations differ in some characteristic or aspect of their behavior, based on two random samples. This procedure is also known as the Pearson chi-square test. The chi-square test is also used to see whether an observed distribution agrees with a particular theoretical distribution. The test statistic is calculated by comparing the observed data with the data expected under that distribution.
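The comparison of observed and expected counts can be sketched for a two-way table as follows. The function name and the counts are hypothetical, every expected count is assumed to be greater than zero, and the decision uses the standard 5% critical value for one degree of freedom (3.841) rather than an exact p-value.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical counts: two samples (rows) classified by a yes/no trait (columns).
observed_counts = [
    [30, 10],
    [20, 40],
]

statistic, dof = chi_square_statistic(observed_counts)
CRITICAL_5_PERCENT_DOF_1 = 3.841  # chi-square critical value, 1 degree of freedom

print(f"chi-square = {statistic:.3f} with {dof} degree(s) of freedom")
if statistic > CRITICAL_5_PERCENT_DOF_1:
    print("the two samples differ at the 5% significance level")
else:
    print("no significant difference at the 5% significance level")
```

A large statistic means the observed counts are far from what independence would predict, which is evidence of association between the row and column classifications.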
