
Data flows like flood water!! Database is like a Check dam!

Data flows from sources like:
• Village Level
• Mandal Level
• District Level
• State Level
• Region Level
• Country Level
• Global Level

• A database is a collection of records (or data files) combined and treated as a unit for information retrieval

Statistical Databases!

• We have statistical databases on various aspects like
  • Food grains
  • Blood Banks
  • Tax Payers
  • Agricultural Output
  • Share Market
  and many more…

What is Data Analysis?

Making Figures Speak (the truth!)

• The DATA should be converted into information (reports) by applying Data Analysis Tools
• Examining data for its relevance
• Preparation of tables
• Graphic display of information
• Estimating the unknown
  Example: Agricultural output by crop-cutting experiments
• Establishing functional relationship between causes and effect
• Computing the growth rates
• Understanding the trends and making forecasts
  … and many more!
• Preparing a document stating the methodology and interpreting the results

How to do?

• The Common and Old Method
  • Physical counting of cases from data sheets
  • Hand calculations
  • Reference to statistical books for formulae
  • Bypassing complex calculations and reporting the easy-to-do things alone!

• The Contemporary Method
  • Get data into the computer
  • Use a statistical software
  • Prepare document using a Word Processor

A survey on health insurance

• A new health insurance scheme is introduced by a company for its employees
• The management wishes to know the reaction of its employees to the new scheme
• Opinions were collected from 50 employees on several aspects like
  • Age, Gender, Marital Status, Education level, Present arrangements for health check up, monthly income and Concept Rating

Collection of data with suitable coding

• A questionnaire has been designed and used for collecting data
• Opinions were sought on a five point scale (multiple choice, tick one only)
• Coding of responses is as follows.
  • Extremely interested     5
  • Interested               4
  • Indifferent              3
  • Not interested           2
  • Not at all interested    1

Coding for personal factors

• Age – (initially no coding)
  • actual years
• Gender
  • Male      M
  • Female    F
• Marital Status
  • Married   M
  • Single    S
• Monthly income
  • Less than Rs.1000      1
  • Rs.1000 to Rs.2999     2
  • Rs.3000 to Rs.4999     3
  • Rs.5000 & above        4

Coding for personal factors

• Present Arrangement
  • Private doctor-own expenses        1
  • Government/Corporate Hospitals     2
  • Partial reimbursement              3
  • Full reimbursement                 4
• Education
  • Below Higher Secondary    1
  • Higher Secondary          2
  • Graduation                3
  • Post-graduation           4
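The coding scheme on these slides can be written down as simple lookup tables. The sketch below is only an illustration: the code values come from the slides, while the dictionary names, the `encode` helper and the sample respondent are assumptions made for the example.

```python
# Code books as listed on the slides; the names below are illustrative.
RATING = {
    "Extremely interested": 5,
    "Interested": 4,
    "Indifferent": 3,
    "Not interested": 2,
    "Not at all interested": 1,
}
INCOME = {
    "Less than Rs.1000": 1,
    "Rs.1000 to Rs.2999": 2,
    "Rs.3000 to Rs.4999": 3,
    "Rs.5000 & above": 4,
}
ARRANGEMENT = {
    "Private doctor-own expenses": 1,
    "Government/Corporate Hospitals": 2,
    "Partial reimbursement": 3,
    "Full reimbursement": 4,
}
EDUCATION = {
    "Below Higher Secondary": 1,
    "Higher Secondary": 2,
    "Graduation": 3,
    "Post-graduation": 4,
}


def encode(answers):
    """Turn one filled-in questionnaire (a dict of text answers) into codes."""
    return {
        "age": answers["age"],  # actual years, no coding
        "income": INCOME[answers["income"]],
        "arrangement": ARRANGEMENT[answers["arrangement"]],
        "education": EDUCATION[answers["education"]],
        "rating": RATING[answers["rating"]],
    }


# Hypothetical respondent, not taken from the survey.
respondent = {
    "age": 34,
    "income": "Rs.3000 to Rs.4999",
    "arrangement": "Partial reimbursement",
    "education": "Graduation",
    "rating": "Interested",
}
coded = encode(respondent)
print(coded)
```

Defining the code book once, before data entry starts, keeps every operator using the same numbers for the same answer.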

Analyze the Data!

• Analysis is based on the questions for which the data is expected to provide answers
• Some questions
  • Identify how many are interested in the new scheme and how many are either indifferent or not interested
  • Cross tabulate them along Gender, Age, Education, marital status etc
  • Is there any relationship between the income level and the type of response?
  • Identify the factors influencing the adoption of the new scheme
  • What else does the data say?
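The first two questions can be sketched in a few lines of Python. This is an illustration only: the six records are hypothetical, and treating ratings 4 and 5 as "interested" and ratings 1 to 3 as "indifferent or not interested" is an assumption based on the five point coding.

```python
from collections import Counter

# Hypothetical records: (gender code, concept rating on the 1-5 scale).
records = [("M", 5), ("F", 4), ("M", 3), ("F", 2), ("M", 4), ("F", 1)]


def interest_group(rating):
    """Collapse the five point rating into two groups."""
    return "interested" if rating >= 4 else "indifferent or not interested"


# One-way count: how many fall in each group.
one_way = Counter(interest_group(rating) for _, rating in records)

# Two-way count (cross tabulation): gender by group.
two_way = Counter((gender, interest_group(rating)) for gender, rating in records)

print(one_way)
print(two_way)
```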

• Data Entry – The First Step
• Analysis with Software – The Second Step

The physical structure of data

• The data collected from the field contains filled-in questionnaires or sheets
• Each sheet must have a serial number
• The sheets should be converted into a data file for use in a computer
• We can probably divide the work and make more than one file and assign the work to Data Entry Operators
• The Data Entry Design should be well planned and be common for all operators
• These data files can be pooled up if necessary to make a project-data-file

TAKING DATA FROM BOOK TO COMPUTER

• Data should be arranged as separate records, one for each individual (entity)
• The data should be numeric for carrying out any analysis
• Names and other labels will not go in for analysis but can be used for reporting
• Suitable coding should be defined before entering data in the computer

Software for data entry and data analysis

• There are many packages for data entry like
  • FoxPro
  • Lotus
  • MS-Excel
  • MS-Access
  • Oracle
  • On-line formats
• Packages for Statistical Analysis
  • SPSS
  • SAS
  • MINITAB
  • SYSTAT

A VISIT TO EXCEL

MAKING A DATA FILE

• Open Excel
• On the title bar of the Excel window the file name appears as Microsoft Excel Book1
• It usually contains three sheets named Sheet1, Sheet2 and Sheet3
• In Sheet1 start entering the data from cell A1
• Reserve the first row for column headings like Sno, Age, Gender etc
• Key in the data row wise or column wise (press ENTER key after each entry)
• Save the file with a suitable name in a Folder meant for this project
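The same layout, with headings in the first row and one record per row, can be sketched outside Excel as a CSV file. The example below is only an illustration: the two data rows are hypothetical, and an in-memory buffer stands in for a file saved in the project folder.

```python
import csv
import io

# First row is reserved for column headings, as on the slide.
header = ["Sno", "Age", "Gender"]
# Hypothetical records, one row per individual.
rows = [[1, 34, "M"], [2, 28, "F"]]

# Write the sheet; a real project would open a file in its own folder instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)

# Read it back: each record becomes a dict keyed by the column headings.
buffer.seek(0)
records = list(csv.DictReader(buffer))
print(records)
```

Note that values read back from a CSV file are text, so numeric columns need converting before analysis.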

A SAMPLE DATA SHEET
File Name: Food – Folder: D:\Statman

NOT THE CORRECT STYLE OF DATA ENTRY

THE RIGHT WAY!

DATA SHEET OF HEALTH INSURANCE

ANALYTICAL FEATURES IN EXCEL

• Finding sums
• Data sorting and Filtering
• Making one dimension tables
• Cross tabulations
• Creating different types of graphs
• Making abstracts from worksheets
• Changing the styles of presenting data
• Linking Excel report to a document

SOME TIPS IN DATA HANDLING

• Selecting a part of data
• Sorting
• Filtering
• Column width
• Cut, Copy & Paste
• Auto Fill
• Paste Special
• Freeze Panes
• Exporting Excel data to Word

DATA ANALYSIS PAK

• A free package of simple statistical tools is available in Excel
• It is called Data Analysis Pak (the Analysis ToolPak add-in)
• It provides for analyses like
  • Summary statistics
  • Comparison of groups
  • Correlations
  • Regression analysis
  • Statistical tests of hypothesis
  • … and many more

Data Prepared In Word Table

SNO 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
NAME GENDER RAJA. T G VASUMATHI. B B JAYA. R G VALLI. L G RAMAN. R G NEELIMA. D G ANDAL. N B MUREGESH. K G SIVARAJAN. S B PRADEEP. M B GANESH. D B DIVYA. B G GOPAL. S B VARADAN. R B BEENA. A G ACHUTAN. L B SASIKALA. M B ANITHA. M B PERUMAL. A B MUTHU. M G
CASTE SC SC ST OC OC OC BC BC BC OC OC BC BC SC ST BC ST ST BC SC
ENGLISH MATHS 60 27 55 44 46 54 35 47 20 46 54 50 63 46 54 52 35 40 25 36 28 40 64 56 37 45 63 44 56 52 45 48 50 46 35 38 52 50 41 55
SCIENCE 45 36 65 28 35 45 64 65 54 45 38 37 54 36 63 54 68 65 54 58

It is enough to copy the Word Table and Paste in Excel!

We have got it in Excel!

Soft Skill

Can we make a table of counts (frequencies) from this data?
WHY NOT? USE PIVOT TABLES OPTION

Skill

Make Frequency Tables!

• You can make one-way and two-way frequency tables from an Excel sheet
• Use Data menu and select the Pivot Table and Chart sub menu
• Follow the Wizard steps
• You will get the required tables

Frequency distribution of students by caste (one-way table)

Count of SNO
CASTE          Total
BC             7
OC             5
SC             4
ST             4
Grand Total    20

Frequency distribution of students by Caste and Gender (two-way table)

Count of SNO   GENDER
CASTE          B      G      Grand Total
BC             3      4      7
OC             4      1      5
SC             2      2      4
ST             2      2      4
Grand Total    11     9      20

Can we do this with hand calculations if there are thousands of cases? Not impossible but difficult to do!
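The one-way count that the Pivot Table produces can be checked with a short Python sketch. The CASTE values below are the twenty entries from the Word table. Using `collections.Counter` is a stand-in for the Pivot Table and is not part of the Excel procedure.

```python
from collections import Counter

# CASTE column of the 20 students, as listed in the Word table.
caste = "SC SC ST OC OC OC BC BC BC OC OC BC BC SC ST BC ST ST BC SC".split()

# Count how many students fall in each category.
counts = Counter(caste)

# Print the one-way table in the same layout as the Pivot Table output.
for category in sorted(counts):
    print(f"{category:<12}{counts[category]}")
print(f"{'Grand Total':<12}{sum(counts.values())}")
```

With thousands of cases the code stays the same, which is the point the slide makes about hand calculation.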

Soft Skill

Can we make a Frequency table with given class intervals?
CERTAINLY! USE STATISTICAL FUNCTIONS

Built-in Functions In Excel: ENGINEERING FUNCTIONS

Built-in Functions In Excel: STATISTICAL FUNCTIONS

ACQUIRE SKILL BY DOING… DEMO FOLLOWS…

Making a Frequency Table

Body length (cm) of 120 fish

16.2 17.8 10.3 14.3 14.9 15.1 13.8 14.5 16.4 13.6
17.4 17.9 14.0 11.4 16.5 13.6 12.0 14.2 16.4 14.1
14.4 15.0 15.2 16.2 17.8 18.3 12.1 13.4 18.6 14.0
18.9 12.8 16.2 15.3 13.1 18.9 12.0 15.3 14.7 16.0
18.6 15.8 16.0 14.3 13.4 16.4 14.6 15.1 13.6 12.7
18.6 15.7 15.9 17.1 16.2 13.4 14.3 18.1 14.4 15.8
12.7 13.3 15.2 15.6 18.2 14.5 15.5 14.4 14.1 15.8
13.7 19.8 16.2 15.0 17.3 12.6 13.9 18.5 13.5 13.8
15.6 12.5 15.3 13.9 18.8 14.7 13.2 16.7 17.2 16.5
16.2 18.5 20.7 15.0 16.2 17.2 15.0 19.3 15.7 17.8
16.3 17.3 13.5 15.7 17.2 17.6 13.9 16.1 13.9 20.4
14.6 18.7 16.8 15.9 16.4 17.3 13.9 17.1 14.4 14.2

Prepare a frequency table using Excel

We use the Paste function ‘FREQUENCY’

min     max     range   interval
10.7    20.6    9.9     2

lower limit   upper limit   upper bound (BIN)   freq
10            12.0          11.9                2
12            14.0          13.9                26
14            16.0          15.9                43
16            18.0          17.9                31
18            20.0          19.9                16
20            22.0          21.9                2

Class      freq
10 - 12    2
12 - 14    26
14 - 16    43
16 - 18    31
18 - 20    16
20 - 22    2
Total      120

Learn more by ‘Do it yourself’
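The counting rule behind FREQUENCY can be sketched in Python: a value goes into the first bin whose upper bound it does not exceed, and one extra slot collects anything above the last bound. The upper bounds are the ones on the slide (11.9 to 21.9). The eight sample lengths are hypothetical and chosen only to show the boundary behaviour, and the function below is an illustration rather than Excel's own implementation.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count values per bin, assuming `bins` is sorted in ascending order.

    A value v is counted in bin i when bins[i-1] < v <= bins[i].
    The final slot counts values greater than the last upper bound.
    """
    counts = [0] * (len(bins) + 1)
    for value in data:
        counts[bisect_left(bins, value)] += 1
    return counts


# Upper bounds (BIN) from the slide.
bins = [11.9, 13.9, 15.9, 17.9, 19.9, 21.9]

# Hypothetical sample, not the fish data.
sample = [10.3, 12.0, 13.9, 14.0, 15.9, 16.2, 18.5, 20.7]

result = frequency(sample, bins)
print(result)
```

Choosing upper bounds such as 13.9 rather than 14 is what keeps a value like 14.0 out of the 12 - 14 class.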

You can also construct a Bar Chart

Class      freq
10 - 12    2
12 - 14    26
14 - 16    43
16 - 18    31
18 - 20    16
20 - 22    2
TOTAL      120

ADVANCED FEATURES

Data Analysis Pak


Body Mass Index of Tribal Groups

The t-test: Is the Average BMI Same for the two groups?

Sugali Yanadi
3 22.99 20.43 20.9 18.43 17.5 22.6 20.3 23.7 22.63 17.55 18.08 18.12 21 25.23 .49 19. 20.51 21.77 23.4 18.63 18.7 20.2 22.

t-test output

t-Test: Two-Sample Assuming Equal Variances

                               Sugali      Yanadi
Mean                           21.73833    19.36
Variance                       4.319215    1.898222
Observations                   12          10
Pooled Variance                3.229768
Hypothesized Mean Difference   0
df                             20
t Stat                         3.090767
P(T<=t) one-tail               0.002882
t Critical one-tail            1.724718
P(T<=t) two-tail               0.005764
t Critical two-tail            2.085962
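The pooled variance and the t statistic in this output can be reproduced from the summary figures alone. The sketch below uses the means (21.73833 and 19.36), variances (4.319215 and 1.898222) and group sizes (12 and 10) reported for Sugali and Yanadi. The function name is an assumption, and the p-values and critical values are left out because they need a t-distribution table or a statistics library.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic assuming equal variances, from summary figures."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the two means.
    std_err = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return pooled, df, (mean1 - mean2) / std_err


pooled, df, t_stat = pooled_t(21.73833, 4.319215, 12, 19.36, 1.898222, 10)
print(f"Pooled Variance {pooled:.6f}, df {df}, t Stat {t_stat:.4f}")
```

Since the t statistic is larger than the two-tail critical value of 2.085962 shown in the output, the difference between the group means is significant at the 5% level.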

p-p Plot

WIDE RANGE OF APPLICATIONS

• Control charts
• Forecasting
• Curve fitting
• Solver for optimization
• College Admissions
• Evaluation of test scores & ranking
…and many more!

The best way of learning Excel is to work with Excel

Statistics Made Simple: Do it yourself on PC
By K.V.S. Sarma
Prentice Hall India

Thank you