Suyash Upadhyay: 06221201717

MODULE 1
INTRODUCTION TO SPSS
STEPS:
1. Click Analyze, then Descriptive Statistics, then Frequencies.
2. Select the variable gender by clicking on it.
3. Once gender is highlighted, move it into the Variable(s) box by clicking the arrow, then click OK.
4. Next, obtain the measures of central tendency for the interval variables.

1.1 WHAT IS THE MEAN/MEDIAN?


Age of patients
No. of sessions
Satisfaction rating

1.2 WHAT IS THE RANGE OF VALUES FROM LOWEST TO HIGHEST?

We can obtain this information by running frequencies for age, sessions and satisfaction.
Running frequencies for the measures of central tendency
1. From the top menu, click Analyze, then Descriptive Statistics, then Frequencies.
2. Double-click the variables age, sessions and satisfaction to move
them into the Variable(s) box.
3. Click Statistics and tick the boxes next to Mean, Median, Mode, Minimum and Maximum,
then click Continue.
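
The statistics requested in the steps above can also be reproduced outside SPSS. The following is a minimal Python sketch using only the standard library; the list `sessions` holds hypothetical values, not the data set used in this manual.

```python
import statistics

# Hypothetical number of sessions for ten patients (not the manual's data).
sessions = [5, 7, 8, 8, 6, 10, 7, 8, 9, 4]

summary = {
    "mean": statistics.mean(sessions),
    "median": statistics.median(sessions),
    "mode": statistics.mode(sessions),
    "minimum": min(sessions),
    "maximum": max(sessions),
}
# The range is the distance from the lowest to the highest value.
summary["range"] = summary["maximum"] - summary["minimum"]

for name, value in summary.items():
    print(f"{name}: {value}")
```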


1.3 BAR CHARTS


Bar charts present a graphical display of categorised data, for example comparing the mean number
of sessions provided by each counsellor.
1. From the top menu, click Graphs, then Chart Builder. In the gallery of charts, Bar should be
highlighted, showing a range of bar charts. Click on the first bar chart and drag it into
the preview area.
2. Drag counsellor onto the x axis and sessions onto the y axis.
3. Click OK; this will produce the bar chart in the output window.

1.4 HISTOGRAMS
These are similar to bar charts but are designed to represent data along a continuum. Age of
patients is a good example.
For age:
1. From the top menu, click Graphs, then Chart Builder, and select Histogram from the gallery of charts.
2. Drag the first histogram into the preview area, then drag age from the variable list onto the x axis.
3. Click OK; this will produce the histogram.


1.5 EDITING A CHART


To edit a chart, double-click it with the left mouse button; this opens it in the Chart Editor, where right-clicking an element shows its editing options.


1.6 BOX PLOTS


To produce box plots for age and gender
1. From the top menu, click Graphs, then Chart Builder.
2. Select Boxplot from the gallery of charts to reveal the range of box plots
available.
3. Drag the first box plot into the preview area.
4. From the variables list, drag age onto the y axis and gender onto the x axis.
5. Click OK; this will produce two box plots showing the distribution of age in the sample.
These box plots illustrate the spread of the data.
1. The shaded box contains the middle 50% of values.
2. The line inside the box depicts the median value.
3. The T bar lines above and below the box reach to the highest and lowest values
(excluding any outliers, which are plotted as separate points).
So, from these box plots comparing the ages of our males and females, we can
see that the median age of females is higher and that their overall age range is
also higher. The spread of ages, indicated by the size of the shaded boxes and
the length of the T bars, is roughly similar for both groups.
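
The quantities a box plot displays can be computed directly. The sketch below is an illustration with hypothetical ages; it uses the "inclusive" quartile method of Python's `statistics.quantiles`, which is an assumption and may give slightly different quartiles from the ones SPSS draws.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

# Hypothetical ages by gender (not the manual's data).
ages = {
    "male": [21, 25, 28, 30, 34, 38, 45],
    "female": [24, 29, 33, 36, 40, 44, 52],
}

for gender, values in ages.items():
    low, q1, median, q3, high = five_number_summary(values)
    # The box spans Q1 to Q3, i.e. the middle 50% of values.
    print(gender, "box:", (q1, q3), "median:", median, "range:", (low, high))
```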


1.7 CONTINUOUS VARIABLES


Procedure to obtain descriptive statistics for continuous variables
1. From the menu, click Analyze.
2. Click Descriptive Statistics, then Descriptives.
3. Choose and highlight the continuous variables you want to examine (sessions and
satisfaction).
4. Move these into the Variable(s) box.
5. Click the Options button.
6. Tick Mean, Std. deviation, Minimum and Maximum.
7. Tick Skewness and Kurtosis.
8. Click Continue and then OK.


1.8 INTERPRETATION OF OUTPUT FROM DESCRIPTIVE


The number of sessions for the 30 respondents ranges from the minimum to the maximum value
shown in the output. The mean score is 7.37 and the SD is 2.341. Descriptives also provides
some information concerning the skewness and kurtosis of the continuous variables.
The skewness value provides an indication of the symmetry of the distribution.
Kurtosis provides information about the peakedness of the distribution. If the distribution is
perfectly normal, you would obtain a skewness and kurtosis value of zero.
Positive skewness values indicate that scores are clustered at the low (left) end, whereas negative
skewness indicates a clustering of scores at the high end. A kurtosis value below 0 (here -0.247)
indicates a distribution that is relatively flat.
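
To make the sign conventions concrete, the sketch below computes skewness and excess kurtosis with the bias-adjusted sample formulas used by many statistical packages; that these match the SPSS output exactly is an assumption. The data are hypothetical.

```python
from statistics import mean, stdev

def skewness(values):
    """Bias-adjusted sample skewness (zero for a symmetric sample)."""
    n = len(values)
    m, s = mean(values), stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

def kurtosis(values):
    """Bias-adjusted sample excess kurtosis (negative for flat distributions)."""
    n = len(values)
    m, s = mean(values), stdev(values)
    fourth = sum(((x - m) / s) ** 4 for x in values)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

# Hypothetical scores clustered at the low end with one large value.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]
print("skewness:", skewness(right_skewed))   # positive: cluster at the low end
print("kurtosis:", kurtosis(right_skewed))
```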


MODULE 2
MANAGE DATA IN SPSS

2.1 FINDING OUT THE CASE SUMMARY


Case summaries are used to understand the nature of the data.
2.1.1 ON THE BASIS OF GENDER
1. Go to the Analyze tab, select Reports and choose Case Summaries.
2. A dialogue box named Summarize Cases will appear. Enter "marks obtained in final year"
in the Variables box and "what's your gender" in the Grouping Variable(s)
box. (Fig. 1)
3. Then go to Statistics, select Mean under Cell Statistics and press Continue. (Fig. 2)


In the output viewer, the case summaries appear, containing the marks obtained in the
final exam on the basis of gender, with the mean. (Fig. 3 and 4)


2.1.2 ON THE BASIS OF CASTE

1. Again go to the Analyze tab, select Reports and choose Case Summaries.
2. A dialogue box named Summarize Cases will appear. Enter
"marks obtained in final year" in the Variables box and "in which caste do you
belong" in the Grouping Variable(s) box.
3. Then go to Statistics, select Mean under Cell Statistics and press Continue. (Fig. 5)


In the output viewer, the case summaries appear, containing the marks obtained in the
final exam on the basis of caste, with the mean. (Fig. 6)
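
The grouped mean that Case Summaries reports can be sketched in a few lines of Python. The records and group labels below are hypothetical; the same function works for any grouping variable.

```python
from collections import defaultdict
from statistics import mean

def group_means(rows):
    """rows: iterable of (group, marks). Return {group: mean marks}."""
    groups = defaultdict(list)
    for group, marks in rows:
        groups[group].append(marks)
    return {group: mean(values) for group, values in groups.items()}

# Hypothetical records: (grouping value, marks obtained in final exam).
records = [("female", 62), ("male", 55), ("female", 70), ("male", 48), ("female", 58)]
print(group_means(records))
```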


2.2 COMPUTING NEW VARIABLE


1. Go to the Transform tab and select Compute Variable from the drop-down menu.
2. A dialogue box named Compute Variable will appear.
3. Type "midterm" in the Target Variable box. Choose Mean from the list of functions
and special variables so that MEAN() appears in the numeric expression, then select the
midterm variables to be averaged inside the brackets. Press "OK".

4. After pressing OK, a new column named "midterm" appears in the Data Editor.
5. The midterm column holds, for every individual, the mean of the marks scored in the
five midterms.
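
The computed variable is simply a row-wise mean. A minimal sketch, with hypothetical student names and marks:

```python
from statistics import mean

# Hypothetical marks (out of 10) in five midterms for three students.
marks = {
    "student_1": [6, 7, 5, 8, 9],
    "student_2": [4, 5, 6, 5, 5],
    "student_3": [9, 8, 10, 9, 9],
}

# One new value per student: the mean of that student's five midterm marks.
midterm = {student: mean(scores) for student, scores in marks.items()}
print(midterm)
```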


MODULE 3
CODING AND RECODING IN SPSS

3.1 RECODING INTO DIFFERENT VARIABLES: OLD AND NEW VALUE


1. Go to the Transform tab and choose Recode into Different Variables
from the drop-down menu.
2. A dialogue box named "Recode into Different Variables" will appear.
3. Enter "GRADE" as the name of the output variable and as its label. Also choose "marks in final
exam" as the numeric variable.

4. Select the "Old and New Values" option.

5. A dialogue box named "Recode into Different Variables: Old and New Values" appears.
Enter the ranges: 60 to 75 as grade 'A', 50 to 59 as grade 'B' and 0 to 49 as grade 'C',
then press Continue.


6. Once OK is pressed, a new column named "GRADE" appears in the Data Editor,
showing grade A, B or C according to the marks in the final exam.
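
The recoding rule from step 5 can be written as a small function. This is a sketch with hypothetical whole-number marks; returning `None` for marks outside every range is an assumption made here for illustration.

```python
def recode_grade(marks):
    """Map final-exam marks to a grade using the ranges given in step 5."""
    if 60 <= marks <= 75:
        return "A"
    if 50 <= marks <= 59:
        return "B"
    if 0 <= marks <= 49:
        return "C"
    return None  # outside every listed range: no grade assigned

final_marks = [72, 55, 38, 60, 49]  # hypothetical marks
grades = [recode_grade(m) for m in final_marks]
print(grades)
```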


3.2 RECODING INTO SAME VARIABLE: OLD AND NEW VALUE


1. Go to the Transform tab and choose Recode into Same Variables from
the drop-down menu.
2. A dialogue box named "Recode into Same Variables" will appear.
3. Select midterm as the numeric variable.
4. Select the "Old and New Values" option.
5. A dialogue box named "Recode into Same Variables: Old and New Values" appears. Enter
the ranges: 0 to 5 as '3', 5.1 to 6 as '2' and 6.1 to 10 as '1'; then press Continue.


6. Once OK is pressed, the column named "midterm" reappears in the Data Editor,
showing the new values.


MODULE - 4
SELECTING, SORTING AND ANALYSING THE DATA IN SPSS

4.1 SELECT CASES

1. Go to the Data tab and choose "Select Cases" from the drop-down menu.

2. A dialogue box named Select Cases will appear. Select 'If condition is satisfied'.
3. Click "If", type Gender = 1 in the expression box and then press
Continue.


4. In the Data Editor, the data appear with some changes. Female students remain unmarked
since they are the selected cases.


4.2 CASE SUMMARIES


1. Go to the Analyze tab, select Reports, then Case Summaries from the drop-down menu.

2. The Summarize Cases dialogue box appears. Select "marks obtained in final exam" in the Variables
box and "what's your gender" in the Grouping Variable(s) box, and limit cases to the first 120.


3. Go to "Statistics"; the summary statistics dialogue box appears. Choose
"Mean" under Cell Statistics and click Continue.

4. The output appears in the output viewer, showing the marks obtained in the final exam by
females.


4.3 SORT CASES


1. Go to the Data tab and select Sort Cases from the drop-down menu.
2. Select the variable by which to sort. Here we select "what's your gender".
Select Ascending as the sort order and press "OK".


3. In the Data Editor, the data are arranged on the basis of gender. Since females were assigned
the value 1, they appear before males, who were assigned the value 2.
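
Selecting and sorting cases by a coded variable can be sketched as follows. The cases are hypothetical; the coding 1 = female, 2 = male follows the text.

```python
# Hypothetical cases; gender is coded 1 = female, 2 = male, as in the text.
cases = [
    {"id": 1, "gender": 2, "marks": 55},
    {"id": 2, "gender": 1, "marks": 62},
    {"id": 3, "gender": 2, "marks": 48},
    {"id": 4, "gender": 1, "marks": 70},
]

# Select Cases: keep only the cases that satisfy Gender = 1.
selected = [case for case in cases if case["gender"] == 1]

# Sort Cases: ascending order on the gender code puts 1 before 2.
sorted_cases = sorted(cases, key=lambda case: case["gender"])

print([case["id"] for case in selected])
print([case["gender"] for case in sorted_cases])
```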


MODULE-5
FINDING OUT THE MISSING VALUES AND RECODING THE
SAME VARIABLE AFTER FILLING THE MISSING VALUES
ACCORDING TO GROUP: - SPLITTING FILE

5.1 MISSING VALUES

1. Go to the Analyze tab, select Descriptive Statistics and then Frequencies from the drop-down menu.

2. The Frequencies dialogue box appears. Choose "4-year resale value" in the Variable(s) box
and then go to 'Statistics'.
3. Choose Mean from the options in the Frequencies: Statistics dialogue box and press
Continue.
4. Press "OK".


5. In the output viewer, the 4-year resale value appears with its mean.

5.2 RECODING INTO SAME VARIABLE: OLD AND NEW VALUES

1. Go to the Transform tab and choose Recode into Same Variables from
the drop-down menu.
2. A dialogue box named "Recode into Same Variables" will appear. Select 'resale value' as
the numeric variable.


3. Select the "Old and New Values" option.

4. Select "System-missing" in the Old Value box and enter '32.4' in the New Value box, as it is the mean.
Then press Continue.

5. In the Data Editor, all the missing values are then replaced by a common number,
i.e. 32.4, as it is the mean of all the values combined.
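
Replacing missing values with the mean of the valid values can be sketched like this. The resale values are hypothetical and `None` stands in for a system-missing value.

```python
from statistics import mean

def fill_missing_with_mean(values):
    """Replace None entries with the mean of the non-missing entries."""
    valid = [v for v in values if v is not None]
    fill_value = mean(valid)
    return [fill_value if v is None else v for v in values]

resale = [30.0, None, 35.0, 31.0, None]  # hypothetical resale values
print(fill_missing_with_mean(resale))
```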

5.3 SPLITTING FILE


1. Go to data tab, select split file from the drop down menu.


2. The Split File dialogue box appears. Select manufacturer in the "Groups Based on"
box and press "OK".

3. In the output viewer, the results appear split by manufacturer, each with its
mean value.


5.4 RECODING INTO SAME VARIABLE: OLD AND NEW VALUES
(SELECTING THE PARTICULAR BRAND NAME AFTER SPLITTING THE FILE)

For example, the missing values for BMW are replaced by the BMW mean value.

1. Go to the Transform tab and choose Recode into Same Variables from
the drop-down menu.
2. A dialogue box named "Recode into Same Variables" will appear. Select 'resale value' as
the numeric variable.

3. Select the "Old and New Values" option.

4. Select "System-missing" in the Old Value box and enter '32.4' in the New Value box, as it is the mean.
Then press Continue.


5. Press "OK".
6. In the Data Editor, all the missing values for BMW are then replaced by a common number,
i.e. 32.4, as it is the mean of all the BMW values combined.
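
Filling missing values group by group, as done here after splitting the file, can be sketched as follows. The manufacturers' values are hypothetical, and a group with no valid values at all would need separate handling.

```python
from collections import defaultdict
from statistics import mean

def fill_missing_by_group(rows):
    """rows: list of (group, value or None). Fill each None with its group's mean."""
    valid = defaultdict(list)
    for group, value in rows:
        if value is not None:
            valid[group].append(value)
    # Assumes every group has at least one non-missing value.
    group_mean = {group: mean(values) for group, values in valid.items()}
    return [(g, group_mean[g] if v is None else v) for g, v in rows]

# Hypothetical (manufacturer, resale value) pairs.
cars = [("BMW", 30.0), ("BMW", None), ("BMW", 34.0), ("Audi", 20.0), ("Audi", None)]
print(fill_missing_by_group(cars))
```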


5.5 MEAN VALUE OF BMW, ONCE MISSING VALUE IS REMOVED


1. Once the missing value of BMW is replaced by 12.96, go to the Transform tab and choose
Recode into Same Variables from the drop-down menu.
2. A dialogue box named "Recode into Same Variables" will appear.

3. Choose the 'If' option from the dialogue box; another dialogue box named "Recode
into Same Variables: If Cases" appears. Select the option "Include if case satisfies condition",
type manufacturer = "BMW" in the blank area and press "Continue".


4. In the output viewer, the mean, frequency, percentage and cumulative percentage have
changed, as only the manufacturer BMW is shown, with no missing values.


MODULE-6
DESCRIPTIVE STATISTICS-DEFINING, KURTOSIS
AND SKEWNESS
6.1 FINDING OUT NORMAL DISTRIBUTION ACCORDING TO A CASE
STUDY ON HYGIENE CONDITION OF A SEMINAR
1. Go to the Analyze tab and choose the "Explore" option under Descriptive Statistics.

2. A new dialog box named "Explore" will appear. Enter the hygiene condition of each of days 1, 2 and
3 of the national seminar under "Dependent List" and Gender under "Factor List".


3. Then select Plots from the dialog box. A new dialog box will appear; choose
"Histogram" from the options and press Continue.

4. In the output viewer, both 'case summary' and 'descriptives' appear with the "Histogram".

5. To check the distribution curve, select the histogram, go to 'Elements' in the Chart Editor
and select Show Distribution Curve.


6. The distribution curve is shown on the histogram.


6.2 EXAMINING OUTLIERS WITH THE HELP OF BOXPLOT


1. The hygiene condition of each day of the national seminar is shown by a separate boxplot.
For example, the boxplot for the hygiene condition of day 1 shows
that one value has been entered wrongly, as it falls out of range.

2. Select the boxplot, go to Edit and choose "Go to Case".


3. Once the wrongly entered value has been corrected, the frequency distribution for "FEMALE" is evenly distributed.

6.3 CHECKING NORMALITY AND HOMOGENEITY


1. To check the homogeneity, one has to select the "BOXPLOT".


2. Go to the Analyze tab and choose Explore under Descriptive Statistics.
Choose Plots from the options; a dialog box named "Plots" will appear.
Select 'Normality plots with tests' and 'Untransformed', then press Continue.

3. Once "OK" is pressed, Descriptives appears in the output viewer with changed frequencies.
Now the values of both skewness and kurtosis lie between -1 and +1.


MODULE - 7
CORRECTING DATA PROBLEMS
There are three methods to correct data problems
7.1 LOG TRANSFORMATION - IT REDUCES POSITIVE SKEWNESS
Log10(X). Example: LOG(DAY 2 + 1)
1. Go to "Transform" and choose "Compute Variable" from the drop-down menu.

2. The Compute Variable dialog box will appear. Type "Log" in the Target Variable
box and then select Type & Label. In the dialog box that appears, again write "Log" as the
label, choose Numeric as the type and press Continue.


3. For the numeric expression, choose "Lg10" from the list of 'Functions and
Special Variables', then select "Hygiene condition of day 2" and add +1 inside the brackets.
Press "OK".

4. Then go to Analyze and choose "Explore" from the drop-down menu. In the dialog box named
"Explore", place 'Log' under Dependent List and Gender under Factor List. Select Histogram
from the Plots option and then press 'OK'.


5. A new column named "Log" is added to the Data Editor.

6. In the output viewer, the histogram of Log is shown.
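
The transformation computed in step 3 is log10(x + 1). The sketch below applies it to hypothetical scores; adding 1 before taking the logarithm keeps a score of zero defined, since log10(0) does not exist.

```python
import math

def log_transform(scores):
    """Apply log10(score + 1) to every score."""
    return [math.log10(score + 1) for score in scores]

scores = [0, 1, 2, 3, 9, 99]  # hypothetical, positively skewed scores
transformed = log_transform(scores)

# Large scores are pulled in far more than small ones, reducing positive skew.
for original, new in zip(scores, transformed):
    print(original, "->", round(new, 3))
```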


7.2 RECIPROCAL TRANSFORMATION


Dividing one by each score also reduces the impact of large scores. This transformation
reverses the order of the scores.
1. Go to the Transform tab and choose "Compute Variable" from the drop-down menu. In the dialog
box, type "reci" (for reciprocal) in Target Variable. In the numeric expression, type "1" followed
by the division sign and select "Hygiene condition for Day 2" from the list; press "OK".

2. Then go to Analyze and choose "Explore" from the drop-down menu. In the dialog box named
"Explore", place 'reci' under Dependent List and Gender under Factor List. Select Histogram from
the Plots option and then press 'OK'.


3. In the output viewer, the histogram of 'reci' is shown.
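
A short sketch shows why the reciprocal reverses the scores: the largest original score becomes the smallest transformed one. The scores are hypothetical and must be non-zero, since 1/0 is undefined.

```python
scores = [1, 2, 4, 10]  # hypothetical scores in ascending order, all non-zero
reciprocals = [1 / score for score in scores]

print(reciprocals)
# The transformed values run in descending order: the ranking is reversed.
print(reciprocals == sorted(reciprocals, reverse=True))
```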

7.3 SQUARE ROOT


It reduces positive skewness and is useful in stabilizing variance.

1. Go to the Transform tab and choose Compute Variable from the drop-down menu.
In the dialog box, type "SQRT" (for square root) in Target Variable. For the
numeric expression, choose "SQRT" from the list of 'Functions and Special Variables'. Then
select "Hygiene condition of day 2" inside the brackets. Press "OK".


2. In the Data Editor, a new column named "SQRT" is shown.

3. Go to Explore and select SQRT as the dependent variable.


4. In the output viewer, the histogram of 'SQRT' is shown with other graphs.


MODULE 8
HYPOTHESIS TESTING FOR MEANS- PARAMETRIC TEST
8.1 ONE SAMPLE T TEST

To find out whether the mean height of the participants is greater or smaller than the average
mean (a known mean or table mean).
Research question - does this diet affect the average height of children?
Null hypothesis - there is no significant difference between the average mean and the sample mean
Alternate hypothesis - there is a significant difference between the sample mean and the average mean

Steps

1. Go to Analyze, click Compare Means, then select One-Sample T Test.


2. Move the variable height into the Test Variable(s) box, enter the known mean in the Test Value box, click OK and the output will be displayed.

Interpretation - The calculated value (1.057) is less than the table value (2.262), and the p value is
0.318, which is larger than .05.
Finally, the 95% confidence interval has a negative lower limit and a positive upper limit, so
it covers zero; hence the mean of our sample, i.e. 65.8, is not statistically different from the national
average.
Hence the null hypothesis is not rejected.
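
The calculated value in the interpretation is the t statistic, t = (sample mean - known mean) / (s / sqrt(n)). The sketch below computes it for hypothetical heights and a hypothetical known mean; only the table value 2.262 (9 degrees of freedom, two-tailed, 5% level) is taken from the text.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, known_mean):
    """t statistic for testing the sample mean against a known mean."""
    standard_error = stdev(sample) / math.sqrt(len(sample))
    return (mean(sample) - known_mean) / standard_error

# Hypothetical heights of ten children and a hypothetical known mean.
heights = [64, 66, 63, 68, 70, 65, 67, 62, 69, 64]
KNOWN_MEAN = 65
T_CRITICAL = 2.262  # table value quoted in the text (df = 9)

t_value = one_sample_t(heights, KNOWN_MEAN)
print("t =", round(t_value, 3))
print("reject null hypothesis:", abs(t_value) > T_CRITICAL)
```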


8.2 TWO SAMPLE (PAIRED) T TEST

Research question - does the diet have an effect on calories, measured before and after 6 months of diet?
Null hypothesis - There is no difference in the calories before and after 6 months of diet
Alternate hypothesis - There is a difference in the calories before and after 6 months of diet

Steps
1. Go to Analyze, then Compare Means and then Paired-Samples T Test.


2. Move the before and after calorie variables into the Paired Variables box, click OK and the output will be displayed.

Interpretation

The calculated value, 17.418, is greater than the table value, 2.262; hence the difference is significant.

The p value, .000, is less than the significance level of 0.05.

The 95% confidence interval does not include zero, as the upper and lower values are both negative;
thus the difference is significant. Hence the null hypothesis is rejected.
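
A paired test works on the difference within each pair: t is the mean difference divided by its standard error. The sketch below uses hypothetical before and after values; computing the difference as before minus after is an assumption about the pairing order.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """t statistic of the paired differences (before minus after)."""
    differences = [b - a for b, a in zip(before, after)]
    standard_error = stdev(differences) / math.sqrt(len(differences))
    return mean(differences) / standard_error

# Hypothetical measurements for five participants.
before = [10, 12, 14, 11, 13]
after = [12, 15, 16, 14, 15]

t_value = paired_t(before, after)
# A negative t means the "after" values are larger on average.
print("t =", round(t_value, 3))
```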


MODULE 9
ANOVA TESTING FOR MEANS
9.1 COMPARING MEAN OF 3 POPULATIONS USING ONE WAY ANOVA
1. Selecting the command

2. Selecting Variables


3. Post Hoc Multiple comparison

OUTPUT


REPORTING THE RESULT


There was a significant effect of promotion on the level of sales.
The post hoc comparisons that were significant at p < 0.05 were high versus medium promotion and
medium versus low promotion.
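
The F statistic in the ANOVA output is the ratio of between-group to within-group mean squares. The sketch below computes it for three hypothetical promotion groups; the sales figures are invented for illustration, and post hoc comparisons are not covered.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the F statistic for a list of groups (each a list of numbers)."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical sales under low, medium and high promotion.
low, medium, high = [10, 12, 11, 13], [15, 17, 16, 18], [22, 24, 23, 25]
print("F =", round(one_way_anova_f([low, medium, high]), 3))
```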


Module 10
CHI SQUARE
10.1 TEST FOR INDEPENDENCE OF TWO VARIABLES

It is used to explore the relationship between two categorical variables.

Exercise - is there any relationship between gender and counsellor?
Assumptions - the lowest expected frequency in any cell should be at least five.
For a 2 x 2 table, the expected frequency should be at least ten.

Steps
1. From the top menu, click on analyze, then descriptive statistics and then crosstabs.


2. Move gender into the Row(s) box and counsellor into the Column(s) box.

3. Click the Statistics button.

4. Tick Chi-square and click Continue.

5. Click the Cells button.

6. In the Counts box, tick Observed and Expected.
7. In the Percentages section, tick the Row, Column and Total boxes.


8. Click on continue and then OK.

Interpretation: There is a relationship between counsellor and gender, so the null hypothesis
(that there is no relationship between counsellor and gender) is rejected.
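
The expected counts ticked in step 6 and the chi-square statistic can be reproduced by hand: each expected count is (row total x column total) / grand total, and the statistic sums (observed - expected)^2 / expected over all cells. The table below is hypothetical, and the sketch computes the Pearson chi-square without any continuity correction.

```python
def chi_square(observed):
    """Return (statistic, expected counts) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return statistic, expected

# Hypothetical counts: rows are genders, columns are three counsellors.
table = [[12, 18, 10], [20, 9, 11]]
statistic, expected = chi_square(table)
print("chi-square =", round(statistic, 3))
# Assumption check from the text: lowest expected frequency should be at least 5.
print("lowest expected count:", min(min(row) for row in expected))
```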
