What is Statistics?

Statistics is difficult to define in a few words, since its dimensions, scope, functions, uses and importance keep changing over time. Facts and figures about any phenomenon, whether they relate to population, production, income, expenditure, sales, births, deaths or any other quantitative measure of phenomena or events, are called statistics.

What are statistical methods?

The methods of analyzing statistical data are called statistical methods. There are two methods:

i. Experimental method
ii. Observational method

What is Inferential Statistics?

Inferential statistics deals with techniques for analyzing data, making estimates and drawing conclusions from limited information obtained from a sample, and for testing the reliability of those estimates. It provides the basis for the predictions, forecasts and estimates that are used to transform information into knowledge. For example, suppose we want an idea of the percentage of illiterates in our country. We take a sample from the population and find the proportion of illiterates in the sample. With the help of probability, this sample proportion enables us to draw an inference about the population proportion. Such a study belongs to inferential statistics.
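The illiteracy example can be sketched in a few lines of Python. This is only an illustration: the sample data are made up, the function name `estimate_proportion` is hypothetical, and the interval uses the normal approximation, which is assumed to be reasonable only for fairly large samples.

```python
import math

def estimate_proportion(sample, z=1.96):
    """Estimate a population proportion from a sample of 0/1 values.

    Returns the sample proportion and an approximate 95% confidence
    interval based on the normal approximation (an assumption).
    """
    n = len(sample)
    p_hat = sum(sample) / n                        # sample proportion
    std_err = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of p_hat
    margin = z * std_err
    return p_hat, (max(0.0, p_hat - margin), min(1.0, p_hat + margin))

# Made-up sample of 400 people: 1 = illiterate, 0 = literate.
sample = [1] * 120 + [0] * 280
p_hat, (low, high) = estimate_proportion(sample)
print(f"sample proportion = {p_hat:.3f}")
print(f"approximate 95% interval = ({low:.3f}, {high:.3f})")
```

The interval expresses how far the population proportion might plausibly lie from the sample proportion; a larger sample gives a narrower interval.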

What is Statistical Forecasting?

Statistical forecasting concentrates on using the past to predict the future by identifying trends, patterns and business drivers within the data to develop a forecast. This forecast is referred to as a statistical forecast because it uses mathematical formulas to identify the patterns and trends, while testing the results for mathematical reasonableness and confidence.
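One of the simplest ways to use past values to predict the future is to fit a straight-line trend by least squares and extend it. The sketch below assumes equally spaced time periods and a roughly linear trend; the sales figures and the function names are hypothetical.

```python
def fit_linear_trend(values):
    """Least-squares line y = intercept + slope * t for t = 0, 1, ..., n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope

def forecast(values, steps_ahead=1):
    """Extend the fitted trend line steps_ahead periods past the last value."""
    intercept, slope = fit_linear_trend(values)
    t_future = len(values) - 1 + steps_ahead
    return intercept + slope * t_future

# Made-up yearly sales figures (at least two values are needed).
sales = [100, 110, 120, 130, 140]
print(forecast(sales))      # forecast for the next year
print(forecast(sales, 2))   # forecast for the year after that
```

Real forecasting methods also account for seasonality and check how well the fitted pattern matches the data; this sketch shows only the trend-fitting step.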

What are the major functions of Statistics?

Some of its important functions are given below:

1. It presents facts in a definite form.
2. It simplifies a mass of figures.
3. It helps in presenting complex data in a suitable tabular, diagrammatic and graphic form for an easy and clear comprehension of the data.
4. It facilitates comparison.
5. It helps in formulating and testing hypotheses.
6. It helps in making decisions.
7. It helps in the formulation of suitable policies.

What is the scope of statistics?

Statistics plays a vital role in every field of human activity. It has an important role in determining the existing position of per capita income, unemployment, population growth rate, housing, schooling, medical facilities, etc. in a country. Statistics now holds a central position in almost every field, such as industry, commerce, trade, physics, chemistry, economics, mathematics, biology, botany, psychology and astronomy, so its application is very wide.

What are the limitations of statistics?

The important limitations of statistics are:

1. Statistical laws are true on average. Statistics are collections of facts, so a single observation is not a statistic; statistics deals with groups and aggregates only.
2. Statistical methods are best applicable to quantitative data.
3. Statistical methods cannot be applied to heterogeneous (assorted) data.
4. If sufficient care is not exercised in collecting, analyzing and interpreting the data, statistical results might be misleading.
5. Only a person who has an expert knowledge of statistics can handle statistical data efficiently.
6. Some errors are possible in statistical decisions. Inferential statistics in particular involves certain errors, and we do not know whether an error has been committed or not.

What is data?

The term data refers to groups of information that represent the qualitative or quantitative attributes of a variable or set of variables.

What are the sources of Data?

The data can be collected from:

i. Direct field operation, such as a census
ii. Already published data

Data can be broadly categorized into two types depending on their source:

Primary Data:

Primary data are first-hand information collected, compiled and published by an organization for some purpose. They are the most original data in character and have not undergone any sort of statistical treatment.

Example: Population census reports are primary data because they are collected, compiled and published by the population census organization.

Secondary Data:

Secondary data are second-hand information that has already been collected by someone (an organization) for some purpose and is available for the present study. Secondary data are not pure in character and have undergone statistical treatment at least once.

Example: The economic survey of England is secondary data because its figures are collected by more than one organization, such as the Bureau of Statistics, the Board of Revenue, the banks, etc.

How to design a questionnaire?

The steps required to design a questionnaire include:

1. Defining the objectives of the survey
2. Determining the sampling group
3. Writing the questionnaire

Pretesting a questionnaire

To determine the effectiveness of your survey questionnaire, it is necessary to pretest it before actually using it. Pretesting can help you determine the strengths and weaknesses of your survey concerning question format, wording and order.

There are two types of survey pretests: participating and undeclared.

Participating pretests require that you tell respondents that the pretest is a practice run. Rather than asking the respondents simply to fill out the questionnaire, participating pretests usually involve an interview setting in which respondents are asked to explain their reactions to question form, wording and order. This kind of pretest will help you determine whether the questionnaire is understandable.

When conducting an undeclared pretest, you do not tell respondents that it is a pretest. The survey is given just as you intend to conduct it for real. This type of pretest allows you to check your choice of analysis and the standardization of your survey.

According to Converse and Presser (1986), if researchers have the resources to do more than one pretest, it might be best to use a participating pretest first, then an undeclared pretest.

Editing of Data: After collecting the data, from either a primary or a secondary source, the next step is editing. Editing means examining the collected data to discover any errors and mistakes before presenting them. It has to be decided beforehand what degree of accuracy is wanted and what extent of error can be tolerated in the inquiry. The editing of secondary data is simpler than that of primary data.

How can the data be presented?

Data can be presented as:

i. Tabulated form, such as a frequency distribution table
ii. Charts: diagrams and graphs

Types of data classification

The process of arranging data into homogeneous groups or classes according to some common characteristic present in the data is called classification.

Geographical classification: The data are classified by geographical region or location, such as states, provinces, cities, countries, etc.

Chronological classification: The data are classified or arranged by their time of occurrence, such as years, months, weeks, days, etc. Example: time series data.

Qualitative base: The data are classified according to some quality or attribute, such as sex, religion, literacy, intelligence, etc.

Quantitative base: The data are classified by quantitative characteristics, such as height, weight, age, income, etc.

Tabulation of data

Parts of a table:

Table number: Each table should be numbered. Practices differ as to where the number is placed. The table number makes the table easy to refer to.

The title: A title is the main heading, written in capitals and shown at the top of the table. It must explain the contents of the table and throw light on the table as a whole. Different parts of the heading can be separated by commas; no full stop should be used in the title.

Caption: The vertical headings and subheadings of the columns are called column captions. The horizontal headings and subheadings of the rows are called row captions.

Stub: The space where the row headings are written is called the stub. Stubs are the designations of the rows, or row headings.

The body: It is the main part of the table, which contains the numerical information classified with respect to row and column captions.

Head notes: A statement given below the title and enclosed in brackets, usually describing the units of measurement, is called a head note.

Foot notes: A foot note appears immediately below the body of the table and provides further explanation.

Types of table:

Simple table: In a simple table only one characteristic is shown. Hence this type of table is also known as a one-way table.

Complex table: In a complex table two or more characteristics are shown. Complex tables enable full information to be incorporated and facilitate a proper consideration of all related facts.

General purpose table: General purpose tables, also known as reference tables or repository tables, provide information for general use or reference.

Special purpose table: Special purpose tables, also known as summary or analytical tables, provide information for a particular discussion. They show the relationship between two different groups of figures.

Charting data

Diagrams: A diagram is a two-dimensional, geometric, symbolic representation of information.

Graphs: In statistics, a graph represents the relationship between variables as points, lines or curves plotted against coordinate axes. (The mathematical sense of the word, an abstract set of vertices joined in pairs by edges and drawn as dots connected by lines or curves, belongs to graph theory and is not the one used here.)

Types of diagrams:

One-dimensional diagrams, e.g. bar diagrams: These diagrams consist of bars or thick lines of which only the length matters, not the width. When a large number of observations are to be compared, they are the only form that can be used effectively.

Two-dimensional diagrams: Here the length as well as the width of the bars is considered. Thus the area of the bar represents the given data.

Pictograms: A pictogram is an ideogram that conveys its meaning through its pictorial resemblance to a physical object.

Cartograms: A cartogram is a map in which some thematic mapping variable, such as travel time or Gross National Product, is substituted for land area or distance.

Types of bar diagrams

Simple bar diagrams: A simple bar chart is used to represent data involving only one variable, classified on a spatial, quantitative or temporal basis.

Sub-divided bar diagrams: A sub-divided or component bar chart is used to represent data in which the total magnitude is divided into different components.

Multiple bar diagrams: In a multiple bar diagram two or more sets of inter-related data are represented. A multiple bar diagram facilitates comparison between more than one phenomenon.

Percentage bar diagrams: A sub-divided bar chart may be drawn on a percentage basis. To do so, we express each component as a percentage of its respective total. In the first step, bars of length equal to 100 are drawn for each class; in the second step, they are sub-divided in proportion to the percentages of their components.

Deviation bar diagrams: They represent net quantities, which can have positive or negative values.

Broken bars: In certain types of data there may be wide variations in values. In order to gain space for the smallest bars, the largest bars may be broken.

In the figure below a simple bar diagram is shown:

[Figure: simple bar diagram, "Batting Average of Shakib Al Hasan in different years"; bars for 2008, 2009, 2010 and 2011 on a scale from 0 to 45.]
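The two steps of a percentage bar diagram described above can be sketched numerically: each component is converted to a percentage of its own bar's total, so that every bar has length 100. The cost figures and the function name `percentage_components` are hypothetical.

```python
def percentage_components(components):
    """Express each component as a percentage of the bar's total."""
    total = sum(components.values())
    return {name: 100 * value / total for name, value in components.items()}

# Made-up cost breakdowns for two years (any common unit).
costs_2010 = {"materials": 50, "labour": 30, "overheads": 20}
costs_2011 = {"materials": 90, "labour": 45, "overheads": 15}

for year, costs in (("2010", costs_2010), ("2011", costs_2011)):
    shares = percentage_components(costs)
    print(year, {name: round(share, 1) for name, share in shares.items()})
```

Because both bars are scaled to 100, the diagram compares the composition of the totals rather than their absolute sizes.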

Pie diagrams: A pie chart (or circle graph) is a circular chart divided into sectors, illustrating proportion. In a pie chart, the arc length of each sector (and consequently its central angle and area) is proportional to the quantity it represents. When angles are measured in turns, a given percentage corresponds to the same number of centiturns (hundredths of a turn). Together, the sectors create a full disk. The chart is named for its resemblance to a pie which has been sliced. The diagram below is a pie diagram:

[Figure: pie diagram, "No. of Students in Different Faculties in NSU"; sectors: BBA, English, Physics, Economics.]
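Since each sector's central angle is proportional to the quantity it represents, the angles follow directly from the data. In the sketch below the student counts are made up for illustration (the original figure's values are not available), and `sector_angles` is a hypothetical helper.

```python
def sector_angles(quantities):
    """Central angle in degrees for each sector of a pie chart."""
    total = sum(quantities.values())
    return {name: 360 * value / total for name, value in quantities.items()}

# Made-up student counts per faculty.
students = {"BBA": 400, "English": 150, "Physics": 100, "Economics": 150}

for faculty, angle in sector_angles(students).items():
    share = 100 * students[faculty] / sum(students.values())
    print(f"{faculty}: {angle:.1f} degrees ({share:.2f}% of the disk)")
```

The angles always add up to 360 degrees, which is the "full disk" formed by the sectors together.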

Graphs

i) Graphs of time series or line graphs: A time-series graph is a line graph in which time is measured on the horizontal axis and the variable being observed is measured on the vertical axis. It is of two types:

a) Range chart: It is a method of showing the range of variation, i.e. the minimum and maximum values of a variable. In the figure below a range chart is shown:

[Figure: Range Chart for Missing Doses / Nursing Unit, ICUs]

b) Band graph: It is a type of line graph which shows the total for successive time periods broken up into sub-totals for each of the component parts of the total. In other words, a band graph shows how and in what proportion the individual items comprising the aggregate are distributed. In the figure below a band graph is shown:

[Figure: band graph, "Taste Band Clearance Profile"]

ii) Graphs of frequency distribution:

a) Histogram: A histogram shows tabulated frequencies as adjacent rectangles erected over class intervals (bins), each with an area proportional to the frequency of the observations in the interval. In the figure below a histogram is shown:

[Figure: histogram, "Salary of Employees in a company"]
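The frequencies behind a histogram can be sketched by counting how many observations fall into each class interval. This example assumes equal-width classes in which the upper limit is excluded from the class; the salary figures and the function name `class_frequencies` are hypothetical.

```python
def class_frequencies(values, lower, width, num_classes):
    """Count observations in equal-width classes [lower, lower + width), ...

    An observation equal to a class's upper limit is counted in the
    next class. Values outside the covered range are ignored.
    """
    counts = [0] * num_classes
    for v in values:
        index = int((v - lower) // width)
        if 0 <= index < num_classes:
            counts[index] += 1
    return [(lower + i * width, lower + (i + 1) * width, count)
            for i, count in enumerate(counts)]

# Made-up monthly salaries (in thousands).
salaries = [12, 18, 25, 27, 31, 35, 35, 40, 44, 52]
table = class_frequencies(salaries, lower=10, width=10, num_classes=5)

for low, high, count in table:
    print(f"{low:>3} to under {high:<3} | {'*' * count}")
```

With equal class widths, the height of each rectangle (here the number of stars) is proportional to the class frequency, so area and height tell the same story.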

b) Frequency polygon: It is a graph of a frequency distribution, obtained by plotting the class frequencies against the class mid-points and joining the points with straight lines. It has more than four sides. It is particularly effective in comparing two or more frequency distributions.

[Figure: frequency polygon, "Annual Transaction Count"]

c) Smoothed frequency curve: It can be drawn through the various points of the polygon. The curve is drawn freehand in such a manner that the area included under the curve is approximately the same as that of the polygon.

[Figure: Smoothed Frequency Curve]

d) Cumulative frequency curves or ‘Ogive’: Data may be expressed using a single line. An ogive (a cumulative line graph) is best used when you want to display the total at any given time. The relative slopes from point to point will indicate greater or lesser increases; for example, a steeper slope means a greater increase than a more gradual slope. An ogive, however, is not the ideal graphic for showing comparisons between categories, because it simply combines the values in each category and thus indicates an accumulation, a growing or lessening total. If you simply want to keep track of a total and your individual values are periodically combined, an ogive is an appropriate display.

[Figure: Ogive]

Frequency Distribution: In statistics, a frequency distribution is a tabulation of the values that one or more variables take in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval, and in this way the table summarizes the distribution of values in the sample.

Name of the Family    No. of Members in Family
Turin                 8
Lemon                 5
Tulish                4
Nayim                 9

Table: No. of Members in different Families
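A frequency distribution, and the cumulative frequencies that an ogive plots, can be sketched with the Python standard library. The list of family sizes below is made up for illustration and is larger than the four families in the table above.

```python
from collections import Counter
from itertools import accumulate

# Made-up number of members in each of ten families.
family_sizes = [4, 5, 4, 8, 9, 5, 4, 5, 5, 8]

freq = Counter(family_sizes)                  # value -> number of occurrences
values = sorted(freq)                         # distinct values in ascending order
frequencies = [freq[v] for v in values]
cumulative = list(accumulate(frequencies))    # running totals for the ogive

print("Members  Frequency  Cumulative")
for v, f, c in zip(values, frequencies, cumulative):
    print(f"{v:>7}  {f:>9}  {c:>10}")
```

The last cumulative frequency always equals the number of observations, which is a quick check that no value was lost in the tabulation.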

Formation of frequency distribution:

a) Grouped data: Data which have been arranged in groups or classes rather than showing all the original figures.
b) Ungrouped data: Data that have not been organized into groups.
c) Class limits: These are the lowest and the highest values that can be included in the class.
d) Class interval: The span of the class, which is the difference between the upper limit and the lower limit, is known as the class interval.
e) Class frequency: The number of observations corresponding to a particular class is known as the class frequency.
f) Class mid-point: It is the value lying half-way between the lower and the upper class limits of a class interval.
g) Exclusive method: When the class intervals are so fixed that the upper limit of one class is the lower limit of the next class (e.g. 10-20, 20-30), it is known as the ‘exclusive method’ of classification. An observation equal to the upper limit is placed in the next class.
h) Inclusive method: Under the ‘inclusive method’ of classification (e.g. 10-19, 20-29), the upper limit of one class is included in that class itself.

Principles of classification:

1. The number of classes should preferably be between 5 and 15.
2. As far as possible, one should avoid odd values of class intervals, e.g. 3, 7, 11, etc.
3. The starting point, i.e. the lower limit of the first class, should be 0, 5 or a multiple of 5.
4. To ensure continuity and to get the correct class interval, we should adopt the ‘exclusive’ method of classification. However, where the ‘inclusive’ method has been adopted, it is necessary to make an adjustment to determine the class interval and to have continuity.
5. Whenever possible, all classes should be of the same size.
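The adjustment for inclusive classes mentioned above is commonly made by taking half the gap between the upper limit of one class and the lower limit of the next, then subtracting it from every lower limit and adding it to every upper limit. The sketch below assumes that convention and equally spaced classes; the class limits and function names are hypothetical.

```python
def to_class_boundaries(inclusive_classes):
    """Convert inclusive class limits into continuous class boundaries."""
    # Correction factor: half the gap between one class's upper limit
    # and the next class's lower limit (assumed equal for all classes).
    gap = inclusive_classes[1][0] - inclusive_classes[0][1]
    half = gap / 2
    return [(low - half, high + half) for low, high in inclusive_classes]

def mid_point(lower, upper):
    """Value half-way between the lower and upper limits of a class."""
    return (lower + upper) / 2

# Made-up inclusive classes.
classes = [(10, 19), (20, 29), (30, 39)]
boundaries = to_class_boundaries(classes)

for (low, high), (b_low, b_high) in zip(classes, boundaries):
    print(f"{low}-{high}: boundaries {b_low} to {b_high}, "
          f"interval {b_high - b_low}, mid-point {mid_point(low, high)}")
```

After the adjustment, the upper boundary of each class coincides with the lower boundary of the next, so the classes are continuous and the class interval comes out as 10 rather than 9.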

Course code: BUS172 Course Title: Introduction To Statistics

Prepared By Sazzad Hossain Lemon Dept. of APECE (2008-09 Session) University of Dhaka lemon.apece.du@gmail.com

