
# STATISTICS

Statistics is described as a mathematical science that pertains to the collection, analysis, interpretation or explanation, and presentation of data, or as a branch of mathematics concerned with collecting and interpreting data. Because of its empirical roots and its focus on applications, statistics is typically considered a distinct mathematical science rather than a branch of mathematics. Some of a statistician's tasks are less mathematical; for example, ensuring that data collection is undertaken in a way that produces valid conclusions, coding data, or reporting results in ways comprehensible to those who must use them.

STATISTIC

A number that represents a piece of information (such as how often something is done or how common something is).

TYPES OF STATISTICS

DESCRIPTIVE STATISTICS

Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple graphical analysis, they form the basis of virtually every quantitative analysis of data. Descriptive statistics are typically distinguished from inferential statistics. With descriptive statistics you are simply describing what is, or what the data show. With inferential statistics, you are trying to reach conclusions that extend beyond the immediate data alone.

INFERENTIAL STATISTICS

With inferential statistics, you are trying to reach conclusions that extend beyond the immediate data alone. For instance, we use inferential statistics to try to infer from the sample data what the population might think. Or we use inferential statistics to judge the probability that an observed difference between groups is a dependable one rather than one that might have happened by chance in this study. Thus, we use inferential statistics to make inferences from our data to more general conditions; we use descriptive statistics simply to describe what is going on in our data.
CENTRAL TENDENCY

A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data. As such, measures of central tendency are sometimes called measures of central location. They are also classed as summary statistics. The mean (often called the average) is most likely the measure of central tendency that you are most familiar with, but there are others, such as the median and the mode.

MEAN - The arithmetic mean is the most common measure of central tendency. It is simply the sum of the numbers divided by the number of numbers. The symbol "μ" is used for the mean of a population. The symbol "M" is used for the mean of a sample. The formula for M is shown below:

$$M = \frac{\sum X}{N}$$

where $\sum X$ is the sum of all the numbers in the sample and $N$ is the number of numbers in the sample.
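As a small illustration of the sample mean formula above, the following sketch computes M for a made-up list of numbers. The function name `sample_mean` and the data are illustrative assumptions, not part of the original notes.

```python
def sample_mean(values):
    """Return M = (sum of X) / N for a non-empty list of numbers."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean is undefined for an empty sample")
    return sum(values) / n


# Hypothetical sample of five observations.
scores = [2, 4, 4, 5, 10]
print(sample_mean(scores))  # 25 / 5 = 5.0
```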

MEDIAN - The median is the midpoint of a distribution: the same number of values lie above it as below it.

MODE - The mode is the most frequent value in the data set. This is the only central tendency measure that can be used with nominal data, which have purely qualitative category assignments.

SKEWNESS

Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point. For univariate data $Y_1, Y_2, \ldots, Y_N$, the formula for skewness is:

$$\text{Skewness} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^3}{(N-1)\,s^3}$$

where $\bar{Y}$ is the mean, $s$ is the standard deviation, and $N$ is the number of data points. The skewness for a normal distribution is zero, and any symmetric data should have a skewness near zero. Negative values for the skewness indicate data that are skewed left, and positive values indicate data that are skewed right. By skewed left, we mean that the left tail is long relative to the right tail. Similarly, skewed right means that the right tail is long relative to the left tail. Some measurements have a lower bound and are skewed right. For example, in reliability studies, failure times cannot be negative.

KURTOSIS

Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak. A uniform distribution would be the extreme case. For univariate data $Y_1, Y_2, \ldots, Y_N$, the formula for kurtosis is:

$$\text{Kurtosis} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^4}{(N-1)\,s^4}$$

where $\bar{Y}$ is the mean, $s$ is the standard deviation, and $N$ is the number of data points.

Grouped and Ungrouped Data

Grouped data are data that have been organized into classes. Because grouped data have been classified and some analysis has already been done, they are no longer raw. Ungrouped data have not been organized into groups; they are just a list of numbers.
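Starting from an ungrouped list of numbers, the median, mode, skewness, and kurtosis defined above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names and data are invented for the example, the standard deviation s is assumed to use the N-1 denominator (the notes do not say which one is meant), and `modes` returns every value tied for most frequent.

```python
import math
from collections import Counter


def median(values):
    """Middle value of the sorted data (average of the two middle values if N is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(values):
    """All values that occur most often; works for nominal (non-numeric) data too."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def sample_std(values):
    """Standard deviation s. Assumption: N-1 denominator."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((y - mean) ** 2 for y in values) / (n - 1))


def skewness(values):
    """Sum of cubed deviations from the mean, divided by (N-1) * s^3."""
    n = len(values)
    mean = sum(values) / n
    s = sample_std(values)
    return sum((y - mean) ** 3 for y in values) / ((n - 1) * s ** 3)


def kurtosis(values):
    """Sum of fourth-power deviations from the mean, divided by (N-1) * s^4."""
    n = len(values)
    mean = sum(values) / n
    s = sample_std(values)
    return sum((y - mean) ** 4 for y in values) / ((n - 1) * s ** 4)


symmetric = [1, 2, 3, 4, 5]          # hypothetical symmetric data
right_skewed = [1, 1, 2, 2, 3, 10]   # hypothetical data with a long right tail

print(median(symmetric), modes(["red", "blue", "red"]))
print(skewness(symmetric))           # zero for perfectly symmetric data
print(skewness(right_skewed) > 0)    # positive when the right tail is long
print(kurtosis(symmetric))
```

The symmetric list gives a skewness of zero, while the list with one large value gives a positive skewness, matching the "skewed right" description.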
COUNTING TECHNIQUES

There are many situations in which it would be too difficult or too tedious to list all of the possible outcomes in a sample space. Counting techniques are ways of counting the number of elements in a sample space without actually having to identify the specific outcomes.

Factorials

If n is a positive integer, then

n! = n (n-1) (n-2) ... (3)(2)(1)

n! = n (n-1)!

A special case is 0!:

0! = 1
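The factorial definition above translates directly into code. The sketch below uses a hypothetical helper named `factorial_of` and checks it against the recurrence n! = n (n-1)! and the special case 0! = 1; Python's standard library also provides `math.factorial`, which is used here only as a cross-check.

```python
import math


def factorial_of(n):
    """Return n! for a non-negative integer n, with 0! defined as 1."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    result = 1
    for k in range(2, n + 1):  # multiply 2 * 3 * ... * n
        result *= k
    return result


print(factorial_of(0))                           # 1
print(factorial_of(5))                           # 120
print(factorial_of(5) == 5 * factorial_of(4))    # True: n! = n (n-1)!
print(factorial_of(10) == math.factorial(10))    # True
```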

PERMUTATIONS

A permutation is an arrangement of objects without repetition where order is important.

Permutations using all the objects

A permutation of n objects, arranged into one group of size n, without repetition, and order being important is:

$${}_nP_n = P(n,n) = n!$$

Example: Find all permutations of the letters "ABC"

ABC ACB BAC BCA CAB CBA

PERMUTATIONS OF SOME OF THE OBJECTS

A permutation of n objects, arranged in groups of size r, without repetition, and order being important is:

$${}_nP_r = P(n,r) = \frac{n!}{(n-r)!}$$
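The two permutation formulas above can be checked by brute-force enumeration. The sketch below, using the letters "ABC", lists the arrangements with `itertools.permutations` and compares the counts with n! / (n-r)!. The helper name `n_permute_r` is an illustrative assumption.

```python
from itertools import permutations
from math import factorial


def n_permute_r(n, r):
    """P(n, r) = n! / (n - r)!: ordered arrangements of r out of n distinct objects."""
    if not 0 <= r <= n:
        raise ValueError("r must satisfy 0 <= r <= n")
    return factorial(n) // factorial(n - r)


letters = "ABC"

# All objects used (r = n).
all_three = ["".join(p) for p in permutations(letters)]
print(all_three)  # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']

# Groups of size r = 2.
two_letter = ["".join(p) for p in permutations(letters, 2)]
print(two_letter)  # ['AB', 'AC', 'BA', 'BC', 'CA', 'CB']

# The enumerated counts agree with the formulas.
print(len(all_three) == n_permute_r(3, 3))   # True: 3! = 6
print(len(two_letter) == n_permute_r(3, 2))  # True: 3! / 1! = 6
```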

Example: Find all two-letter permutations of the letters "ABC"

AB AC BA BC CA CB

Shortcut formula for finding a permutation

Assuming that you start at n and count down to 1 in your factorials,

P(n,r) = first r factors of n factorial

For example, P(5,2) = 5 * 4 = 20.

Distinguishable Permutations

Sometimes letters are repeated and not all of the permutations are distinguishable from each other.

Example: Find all permutations of the letters "BOB"

To help you distinguish, I'll write the second "B" as "b":

BOb BbO OBb ObB bBO bOB

If you just write "b" as "B", however:

BOB BBO OBB OBB BBO BOB

There are really only three distinguishable permutations here:

BOB BBO OBB

If a word has N letters, k of which are distinct, and you let n1, n2, n3, ..., nk be the frequencies of the k distinct letters, then the total number of distinguishable permutations is given by:

$$\frac{N!}{n_1!\, n_2!\, n_3! \cdots n_k!}$$

Consider the word "STATISTICS". Here is the frequency of each letter: S=3, T=3, A=1, I=2, C=1, for 10 letters in total.

$$\text{Permutations} = \frac{10!}{3!\,3!\,1!\,2!\,1!} = \frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{6 \cdot 6 \cdot 1 \cdot 2 \cdot 1} = 50400$$
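The "BOB" and "STATISTICS" counts can be reproduced with a short sketch that divides N! by the factorial of each letter's frequency. The function name `distinguishable_permutations` is an illustrative assumption; the brute-force check with a set is only practical for short words.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def distinguishable_permutations(word):
    """N! divided by the factorial of each distinct letter's frequency."""
    total = factorial(len(word))
    for frequency in Counter(word).values():
        total //= factorial(frequency)
    return total


# Formula versus brute force for a short word: duplicates collapse inside a set.
print(distinguishable_permutations("BOB"))         # 3
print(len(set(permutations("BOB"))))               # 3

# Letter frequencies and the count for the longer example.
print(dict(Counter("STATISTICS")))                 # S=3, T=3, A=1, I=2, C=1
print(distinguishable_permutations("STATISTICS"))  # 50400
```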

COMBINATIONS

A combination is a selection of objects without repetition where order is not important.

Note: The difference between a permutation and a combination is not whether there is repetition. Neither allows repetition, and if there is repetition, you cannot use the formulas for permutations or combinations. The only difference between the definitions of a permutation and a combination is whether order is important.

A combination of n objects, taken in groups of size r, without repetition, and order not being important is:

$${}_nC_r = C(n,r) = \frac{n!}{(n-r)!\; r!}$$
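To see how ignoring order changes the count, the sketch below enumerates the two-letter combinations of "ABC" and compares the result with the formula above. The helper name `n_choose_r` is an illustrative assumption.

```python
from itertools import combinations
from math import factorial


def n_choose_r(n, r):
    """C(n, r) = n! / ((n - r)! * r!): unordered selections of r out of n distinct objects."""
    if not 0 <= r <= n:
        raise ValueError("r must satisfy 0 <= r <= n")
    return factorial(n) // (factorial(n - r) * factorial(r))


pairs = ["".join(c) for c in combinations("ABC", 2)]
print(pairs)                           # ['AB', 'AC', 'BC']
print(len(pairs) == n_choose_r(3, 2))  # True: 3 combinations

# Each combination of size r corresponds to r! permutations,
# so C(n, r) = P(n, r) / r!.
p_3_2 = factorial(3) // factorial(3 - 2)
print(n_choose_r(3, 2) == p_3_2 // factorial(2))  # True: 6 / 2 = 3
```

Compared with the six two-letter permutations of "ABC", only three combinations remain because AB and BA count as the same selection.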

Another way to write a combination of n things, r at a time, is using the binomial notation:

$$\binom{n}{r}$$

Shortcut formula for finding a combination

Assuming that you start at n and count down to 1 in your factorials,

C(n,r) = first r factors of n factorial divided by the last r factors of n factorial

For example, C(5,2) = (5 * 4) / (2 * 1) = 10.

MULTIPLICATION RULE

The multiplication rule is a result used to determine the probability that two events, A and B, both occur. The multiplication rule follows from the definition of conditional probability. The result is often written as follows, using set notation:

$$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$

Where:

P(A) = probability that event A occurs
P(B) = probability that event B occurs
P(A ∩ B) = probability that event A and event B both occur
P(A | B) = the conditional probability that event A occurs given that event B has already occurred
P(B | A) = the conditional probability that event B occurs given that event A has already occurred

For independent events, that is, events which have no influence on one another, the rule simplifies to:

$$P(A \cap B) = P(A)\,P(B)$$

That is, the probability of the joint events A and B is equal to the product of the individual probabilities for the two events.
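The multiplication rule can be checked by counting outcomes directly. The sketch below uses two made-up scenarios that are not part of the original notes: drawing two cards without replacement (dependent events, A = first card is an ace, B = second card is an ace) and rolling two dice (independent events, A = first die shows 6, B = second die shows 6). Exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import permutations, product

# Dependent events: two cards drawn without replacement from a 52-card deck.
# Only "ace" versus "other" matters here, so the deck is simplified.
deck = ["ace"] * 4 + ["other"] * 48
draws = list(permutations(range(52), 2))  # ordered pairs of distinct card positions
both_aces = sum(1 for i, j in draws if deck[i] == "ace" and deck[j] == "ace")
p_joint_counted = Fraction(both_aces, len(draws))

p_a = Fraction(4, 52)          # P(A): first card is an ace
p_b_given_a = Fraction(3, 51)  # P(B | A): second is an ace, given the first was
p_joint_rule = p_b_given_a * p_a

print(p_joint_counted, p_joint_rule)    # 1/221 1/221
print(p_joint_counted == p_joint_rule)  # True

# Independent events: two rolls of a fair six-sided die.
rolls = list(product(range(1, 7), repeat=2))
both_sixes = sum(1 for first, second in rolls if first == 6 and second == 6)
p_dice_counted = Fraction(both_sixes, len(rolls))
p_dice_rule = Fraction(1, 6) * Fraction(1, 6)  # P(A) * P(B)

print(p_dice_counted, p_dice_rule)      # 1/36 1/36
```

In the card example the second factor must be the conditional probability 3/51, because removing an ace changes the deck; in the dice example the first roll does not affect the second, so the plain product applies.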