
UB 0203

D R H J H M AD I H AH K H AL I D

MATHEMATICS AS A LANGUAGE
Mathematicians need to be clear and concise when they communicate. The language of mathematics is better at communicating quantitative information than day to day language. How best do I communicate my work? The answer is to use a combination of written phonetic words, graphical representation of information, and certain symbolic conventions of mathematics. The challenge of the mathematician is not simply thinking up harder and harder proofs, but the challenge of finding ways to communicate information.

WHAT IS STATISTICS?
What do you think of when you hear the word statistics? Think of a general question that could be answered with statistics. How would you carry out the process in order to answer your question? Be as specific as possible. What is a random event? Give an example of something that happens randomly and something that does not.

HISTORY OF STATISTICS
The history of statistics can be said to start around 1749. Over time, there have been changes to the interpretation of what the word statistics means. In early times, the meaning was restricted to information about states. This was later extended to include all collections of information of all types, and later still it was extended to include the analysis and interpretation of such data. In modern terms, "statistics" means both sets of collected information, as in income distribution and temperature records, and analytical work which requires statistical inference.

RELATION WITH PROBABILITY


The relation between statistics and probability theory developed rather late, however. In the 19th century, statistics increasingly used probability theory, whose initial results were found in the 17th and 18th centuries, particularly in the analysis of games of chance (gambling). By 1800, astronomy used probability models and statistical theories, particularly the method of least squares. Early probability theory and statistics were systematized and extended by Laplace; following Laplace, probability and statistics have been in continual development. In the 19th century, social scientists used statistical reasoning and probability models to advance the new sciences of experimental psychology and sociology; physical scientists used statistical reasoning and probability models to advance the new sciences of thermodynamics and statistical mechanics. The development of statistical reasoning was closely associated with the development of inductive logic and the scientific method.

STATISTICS IN EVERYDAY LIFE


1. Weather Forecasts. Do you watch the weather forecast sometime during the day? How do you use that information? Have you ever heard the forecaster talk about weather models? These computer models are built using statistics that compare prior weather conditions with current weather to predict future weather.
2. Emergency Preparedness. What happens if the forecast indicates that a hurricane is imminent or that tornadoes are likely to occur? Emergency management agencies move into high gear to be ready to rescue people. Emergency teams rely on statistics to tell them when danger may occur.
3. Predicting Disease. News reports often cite statistics about a disease. If the reporter simply reports the number of people who either have the disease or who have died from it, it's an interesting fact but it might not mean much to your life. But when statistics become involved, you have a better idea of how that disease may affect you.
4. Medical Studies. Scientists must show a statistically valid rate of effectiveness before any drug can be prescribed. Statistics are behind every medical study you hear about.
5. Political Campaigns. Whenever there's an election, the news organizations consult their models when they try to predict who the winner is. Candidates consult voter polls to determine where and how they campaign. Statistics play a part in who your elected government officials will be.
6. Insurance. You know that in order to drive your car you are required by law to have car insurance. If you have a mortgage on your house, you must have it insured as well. The rate that an insurance company charges you is based upon statistics from all drivers or homeowners in your area.
7. Stock Market. Another topic that you hear a lot about in the news is the stock market. Stock analysts also use statistical computer models to forecast what is happening in the economy.

Note: Try to think about where YOU encounter statistics in YOUR life.

STATISTICS AS PROBLEM SOLVING


Four things make a problem statistical: the way in which you ask the question, the role and nature of the data, the particular ways in which you examine the data, and the types of interpretations you make from the investigation. A statistics problem typically contains four components:
1. Ask a question. Asking a question gets the process started. It's important to ask a question carefully, with an understanding of the data you will use to find your answer.
2. Collect appropriate data. Collecting data to help answer the question is an important step in the process. You obtain data by measuring something, so your measurement methods must be chosen with care. Sampling is one way to collect data; experimentation is another.
3. Analyze the data. Data must be organized, summarized, and represented properly in order to provide good answers to statistical questions. Also, the data you collect usually vary (i.e., they are not all the same), and you will need to account for the sources of this variation.
4. Interpret the results. After you analyze your data, you must interpret it in order to provide an answer, or answers, to the original question.

EXAMPLE
Ask a question: Are men typically taller than women? Do men typically have longer arm spans than women?

a. Examine the 24 measurements for height and arm span. You'll notice that they are not all the same. What is the source of this variation? Can you explain why there are differences?
b. Suppose your goal was to prove that men are typically taller than women. Does this data prove that conclusion? Why or why not? Talk about error and bias. What can you do to reduce these? Sampling?

STATISTICAL REASONING
Statistical reasoning is the way people reason with statistical ideas and make sense of statistical information.

All students should be able to do the following:
- Formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them
- Select and use appropriate statistical methods to analyze data
- Develop and evaluate inferences and predictions that are based on data
- Understand and apply basic concepts of probability

For data analysis and statistics, students are expected to do the following:
- Formulate questions, design studies, and collect data about a characteristic shared by two populations or different characteristics within one population
- Select, create, and use appropriate graphical representations of data
- Find, use, and interpret measures of center and spread, including mean and interquartile range
- Discuss and understand the correspondence between data sets and their graphical representations, especially histograms, stem and leaf plots, box plots, and scatter plots
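The measures of center and spread named above (mean and interquartile range) can be sketched in a few lines of Python. This is only an illustration: the arm-span values are made up, the helper name `mean_and_iqr` is hypothetical, and the quartiles use the "inclusive" convention of the standard `statistics` module, so other textbook conventions may give slightly different quartiles.

```python
import statistics

def mean_and_iqr(values):
    """Return (mean, interquartile range) for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.mean(values), q3 - q1

# Hypothetical arm-span measurements in centimetres (not from the slides).
arm_spans = [150, 155, 160, 162, 165, 170, 172, 180]

center, spread = mean_and_iqr(arm_spans)
print(f"mean = {center}, IQR = {spread}")
```

The mean summarizes where the data are centered, while the interquartile range describes how widely the middle half of the data is spread.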

PRE-TEST
Consider topics of interest to you that involve collecting data about a characteristic shared by two populations. Formulate five questions that involve collecting qualitative (categorical) data and five questions that involve collecting quantitative (numerical) data. For each question, identify the type of data that will be collected and an appropriate way to display the data (e.g., line graph, bar graph, histogram, pie chart, stem and leaf plot, box plot).

PRESENTATION OF DATA
Statistical representation is the science/art of using data to describe the world around us. There are numerous ways of constructing statistical representations. The proper representation depends upon the nature of the data and the particular issues being addressed. A combination of methods is often appropriate, e.g. tables, charts and graphs. Statistical representations include pictograms, bar graphs, line graphs, pie charts, histograms and box plots.

WHY USE GRAPHS AND CHARTS?


Graphs:
- are quick and direct
- highlight the most important facts
- facilitate understanding of the data
- can convince readers
- can be easily remembered

Knowing what type of graph to use with what type of information is crucial. Depending on the nature of the data some graphs might be more appropriate than others. Yet, a graph is not always the most appropriate tool to present information. Sometimes text or a data table can provide a better explanation to the readers and save you considerable time and effort.

GRAPHS: FOUR GUIDELINES


Define your target audience. Ask yourself the following questions to help you understand more about your audience and what their needs are:
Who is your target audience? What do they know about the issue? What do they expect to see? What do they want to know? What will they do with the information?

Determine the message(s) to be transmitted. Ask yourself the following questions to figure out what your message is and why it is important:
What do the data show? Is there more than one main message? What aspect of the message(s) should be highlighted? Can all of the message(s) be displayed on the same graphic?

Determine the nature of the message(s). Consider the following instructions and their appropriate terms when labelling the graph or describing features of it in accompanying text:

If your graph will...                      Use the following terms...
describe components                        share of, percent of the, smallest, the majority of
compare items                              ranking, larger than, smaller than, equal to
establish a time series                    change, rise, growth, increase, decrease, decline, fluctuation
determine a frequency                      range, concentration, most of, distribution of x and y by age
analyse relationships in data              increase with, decrease with, vary with, despite, correspond to, relate to
do any combination of the above actions    e.g., 'percentage of dropouts among the 15 to 24 age group has increased because of....'

Experiment with different types of graphs and select the most appropriate.
- pie chart (description of components)
- horizontal bar graph (comparison of items and relationships, time series)
- vertical bar graph (comparison of items and relationships, time series, frequency distribution)
- line graph (time series and frequency distribution)
- scatterplot (analysis of relationships)
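As a rough aid for experimenting, the pairings above can be written as a lookup table that returns the graph types listed for a given purpose. This is a minimal sketch: the names `GRAPH_USES` and `candidate_graphs` are hypothetical, and the purpose strings are copied from the list above rather than taken from any standard vocabulary.

```python
# Graph types and the purposes listed for each in the guideline above.
GRAPH_USES = {
    "pie chart": ["description of components"],
    "horizontal bar graph": ["comparison of items and relationships", "time series"],
    "vertical bar graph": ["comparison of items and relationships", "time series",
                           "frequency distribution"],
    "line graph": ["time series", "frequency distribution"],
    "scatterplot": ["analysis of relationships"],
}

def candidate_graphs(purpose):
    """Return the graph types whose listed uses include the given purpose."""
    return [graph for graph, uses in GRAPH_USES.items() if purpose in uses]

print(candidate_graphs("time series"))
print(candidate_graphs("analysis of relationships"))
```

The result is only a shortlist. Choosing among the candidates still depends on the audience and the message, as the earlier guidelines describe.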

Pictographs
A pictograph uses picture symbols to convey the meaning of statistical information. Pictographs should be used carefully because the graphs may, either accidentally or deliberately, misrepresent the data. This is why a graph should be visually accurate.

Figure 1. Number of students who like chocolate chip cookies best

Pie charts
A pie chart is a way of summarizing a set of categorical data or displaying the different values of a given variable (e.g., percentage distribution). This type of chart is a circle divided into a series of segments. Each segment represents a particular category. The area of each segment is the same proportion of a circle as the category is of the total data set. Pie charts usually show the component parts of a whole. Often you will see a segment of the drawing separated from the rest of the pie in order to emphasize an important piece of information.
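Because each segment takes up the same share of the circle as its category takes of the total, the percentage and the central angle of every segment follow directly from the counts. The sketch below shows the arithmetic. The function name `pie_segments` and the survey counts are made up for illustration and are not the data behind any figure in these slides.

```python
def pie_segments(counts):
    """Map each category to (percentage of total, central angle in degrees)."""
    total = sum(counts.values())
    return {
        category: (n / total * 100, n / total * 360)
        for category, n in counts.items()
    }

# Hypothetical survey counts (not from the slides).
preferences = {"rock": 10, "pop": 20, "rap": 5, "country": 5}

segments = pie_segments(preferences)
for category, (percent, angle) in segments.items():
    print(f"{category}: {percent:.1f}% of the circle, {angle:.1f} degrees")
```

The percentages always sum to 100 and the angles to 360, which is a quick check that the segments really are parts of one whole.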

Figure 2. Music preferences in young adults 14 to 19

Figure 1. Student and faculty response to the poll 'Should Avenue High School adopt student uniforms?'

Bar Chart
A bar graph may be either horizontal or vertical. The important point to note about bar graphs is their bar length or height: the greater their length or height, the greater their value. Bar graphs are one of the many techniques used to present data in a visual form so that the reader may readily recognize patterns or trends. Bar graphs usually present categorical and numeric variables grouped in class intervals. They consist of an axis and a series of labeled horizontal or vertical bars. The bars depict frequencies of different values of a variable or simply the different values themselves. The numbers on the x-axis of a bar graph or the y-axis of a column graph are called the scale.

When developing bar graphs, draw a vertical or horizontal bar for each category or value. The height or length of the bar will represent the number of units or observations in that category (frequency) or simply the value of the variable. Select an arbitrary but consistent width for each bar as well.

There are three types of graphs used to display time series data: horizontal bar graphs, vertical bar graphs and line graphs. All three of these types of graphs work well when you need to compare values. However, in general, data comparisons are best represented vertically.
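The rule that a bar's length represents its value can be sketched with a plain-text horizontal bar graph. This is an illustration only: `text_bar_chart` is a hypothetical helper, the category values are made up, and the sketch assumes all values are positive.

```python
def text_bar_chart(data, width=20, symbol="#"):
    """Return one text line per category, with bar length proportional to value.

    Assumes every value is positive; the largest value gets `width` symbols.
    """
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar = symbol * round(value / largest * width)
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return lines

# Hypothetical frequencies for three categories (not from the slides).
chart = text_bar_chart({"A": 10, "B": 20, "C": 40})
print("\n".join(chart))
```

Every bar uses the same symbol and the same scale, which mirrors the advice above to keep bar widths consistent so that only length carries information.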

Number of police officers in Crimeville, 1993 to 2001

Number of students at Diversity College who are immigrants, by last country of permanent residence

Internet use at Redwood Secondary School, by sex, 1995 to 2002

Drug use by 15-year-old students in one school, by gender

Example: Comparing several places or items
Figure 5 is an example of a double horizontal bar graph. Hillary sampled an equal number of boys and girls at her high school and asked them to pick the one snack food they liked the most from the following list: popcorn, chips, chocolate bars, crackers, pretzels, cookies, ice cream, fruit, candy, vegetables. She created a graph to display the results of her survey. Examine Figure 5, and answer the following questions:
- What comparison does this graph show?
- Which snack food was least preferred by girls?
- Which snack food was preferred by substantially more boys than girls?
- Which snack foods were preferred by more girls than boys?
- Which snack food was preferred equally by both boys and girls?

Figure 5: Preferred snack choices of students at Hillary's high school

Example: Inappropriate use of bar graphs
Vertical bar graphs are an excellent choice to emphasize a change in magnitude. The best information for a vertical bar graph is data dealing with the description of components, frequency distribution and time-series statistics. A horizontal bar graph may be more effective than a line graph when there are fewer time periods or segments of data. If you want to compare more than 9 or 10 items, use a line graph instead. Figure 6 is an example of when a line graph should be used instead of a horizontal bar graph.

Car types produced in Global City, January

Other bar graphs

Stacked Bar Chart - Campbell High Triathlon, percentage of time spent on each event, by competitor

Dot Graphs - Rates of enrolment by program, Academia University

Smoking frequency of ...-year-olds on the ...arkview Secondary School track and field team

Earnings in Utopia, by sex

Line graphs
Line graphs are more popular than all other graphs combined because their visual characteristics reveal data trends clearly and these graphs are easy to create. Line graphs, especially useful in the fields of statistics and science, are one of the most common tools used to present data.

A line graph is a visual comparison of how two variables, shown on the x- and y-axes, are related or vary with each other. It shows related information by drawing a continuous line between all the points on a grid. Line graphs compare two variables: one is plotted along the x-axis (horizontal) and the other along the y-axis (vertical). The y-axis in a line graph usually indicates quantity (e.g., dollars, litres) or percentage, while the horizontal x-axis often measures units of time. As a result, the line graph is often viewed as a time series graph. For example, if you wanted to graph the height of a baseball pitch over time, you could measure the time variable along the x-axis, and the height along the y-axis.

Although they do not present specific data as well as tables do, line graphs are able to show relationships more clearly than tables do. Line graphs can also depict multiple series, and they are usually the best candidate for time series data and frequency distributions. Bar and column graphs and line graphs share a similar purpose. The column graph, however, reveals a change in magnitude, whereas the line graph is used to show a change in direction.

In summary, line graphs:
- show specific values of data well
- reveal trends and relationships between data
- compare trends in different groups of a variable

Graphs can give a distorted image of the data. If inconsistent scales on the axes of a line graph force data to appear in a certain way, then a graph can even reveal a trend that is entirely different from the one intended. This means that the intervals between adjacent points along the axis may be dissimilar, or that the same data charted in two graphs using different scales will appear different.
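The effect of dissimilar intervals along the axis can be checked numerically. In the sketch below, made-up values grow by the same amount every year, but the observations are unevenly spaced in time. Plotting them at evenly spaced positions, as happens when years are treated as plain labels, makes the growth appear to slow down. The helper name `slopes` and all numbers are hypothetical.

```python
def slopes(xs, ys):
    """Change in y per unit of x between each pair of consecutive points."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

# Hypothetical series: unevenly spaced years, constant growth of 10 per year.
years = [1990, 1995, 2000, 2001, 2002]
values = [100, 150, 200, 210, 220]

# Correct scale: x positions are the actual years.
true_slopes = slopes(years, values)

# Inconsistent scale: points drawn at equally spaced positions 0, 1, 2, ...
even_positions = list(range(len(years)))
distorted_slopes = slopes(even_positions, values)

print("per year:        ", true_slopes)
print("per plotted step:", distorted_slopes)
```

On the correct scale every segment has the same slope, while on the evenly spaced axis the first two segments look five times steeper than the last two, which is a trend the data do not contain.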

Plotting a trend over time

Comparing two related variables

Using correct scale

Number of guilty crime offenders, Grishamville

Multiple line graph Cell phone use in Anytowne, 1996 to 2002

Histograms
The histogram is a popular graphing tool. It is used to summarize discrete or continuous data that are measured on an interval scale. It is often used to illustrate the major features of the distribution of the data in a convenient form. A histogram divides up the range of possible values in a data set into classes or groups. For each group, a rectangle is constructed with a base length equal to the range of values in that specific group, and an area proportional to the number of observations falling into that group. This means that the rectangles will be drawn of non-uniform height.

A histogram has an appearance similar to a vertical bar graph, but when the variables are continuous, there are no gaps between the bars. When the variables are discrete, however, gaps should be left between the bars. Figure 1 is a good example of a histogram. A vertical bar graph and a histogram differ in these ways: in a histogram, frequency is measured by the area of the column, while in a vertical bar graph, frequency is measured by the height of the bar.

Histogram characteristics
Generally, a histogram will have bars of equal width, although this is not the case when class intervals vary in size. Choosing the appropriate width of the bars for a histogram is very important. As you can see in the example above, the histogram consists simply of a set of vertical bars. Values of the variable being studied are measured on an arithmetic scale along the horizontal x-axis. The bars are of equal width and correspond to the equal class intervals, while the height of each bar corresponds to the frequency of the class it represents. The histogram is used for variables whose values are numerical and measured on an interval scale. It is generally used when dealing with large data sets (greater than 100 observations). A histogram can also help detect any unusual observations (outliers) or any gaps in the data.

Frequency polygon
Cumulative Frequency Polygon (s-curve, ogive)
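The statement that a histogram measures frequency by area rather than height matters most when class intervals vary in size. In that case the height of each rectangle is the frequency divided by the class width (often called the frequency density). The following sketch uses made-up class boundaries and counts, and `histogram_heights` is a hypothetical helper name.

```python
def histogram_heights(class_edges, frequencies):
    """Return the bar height for each class so that area equals frequency.

    class_edges holds the boundaries of consecutive classes, so it has one
    more entry than frequencies.
    """
    heights = []
    for i, frequency in enumerate(frequencies):
        width = class_edges[i + 1] - class_edges[i]
        heights.append(frequency / width)
    return heights

# Hypothetical classes with unequal widths (not from the slides).
edges = [0, 10, 20, 40, 80]
counts = [5, 20, 30, 20]

heights = histogram_heights(edges, counts)
print(heights)
```

With these numbers the third class holds the most observations, yet the second class has the tallest bar because its observations are packed into a narrower interval. When all classes have equal width, height and frequency are proportional and the distinction disappears.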

Figure 1. Distribution of salaries of the Acme Corporation

Figure 2. Distribution of salaries of the Acme Corporation

WHEN IS IT NOT APPROPRIATE TO USE A GRAPH?


data are very dispersed

too few data (one, two or three data points)

data are very numerous

data show little or no variations
