# Chapter One

Analysing Simple Questionnaires: SPSS Basics
Introduction
Aims of the first three chapters
This opening set of three chapters has a number of purposes. Most directly, they teach
you how to analyse simple, data-gathering questionnaires. But this basic goal runs in
parallel with some other goals.
• the chapters introduce you to issues connected with how you structure a dataset
for computer analysis – how you organise it, and how you prepare the data for
input. (This is developed in Chapter 2.)
• the chapters cover a basic set of statistical techniques for describing data
concisely, and in ways that make patterns within the data more salient. These
techniques include the use of descriptive statistics such as means, standard
deviations, cross-tabulations etc.
• the chapters also cover some of the dataset basics, such as how to label what you
do effectively, and how to use the power of SPSS to create new variables, or to
recode the values you initially input
• the chapters show you how to display data visually, using a number of different
formats, such as bar charts, and pie diagrams
• the chapters show you how to save the computer output that you generate, and use
this output in other documents
The Plan of the Three Chapters
The next three chapters, starting with this one, are organised to progressively extend
the techniques you can use to analyse questionnaires.
In this chapter, Chapter One:
• You are introduced to a “taster” questionnaire, on lateralisation, i.e. a
preference for right or left, and shown how to do some simple analyses. For
this section you will also have to gather some data yourself.
• The main questionnaire dataset for this chapter is described to you. The
questionnaire which is used concerns a survey that was done of the British
ELT profession – to ask people whether they thought it would be worth
establishing a British Institute of English Language Teaching (BIELT) as a
professional organisation for the field. The questionnaire contains over 50
items, and there are responses from nearly 1200 people.
In Chapter Two:
• You are taken through a reduced version of the dataset, with a subset of the total
number of variables, and only 100 cases (i.e. responses from 100 people). This
reduced dataset, though, is enough to teach you how to use the various statistical
and graphing procedures. There are tasks for you to do at all stages.

In Chapter Three:
• You are given the entire dataset to analyse, with a series of progressively more
• You revisit the lateralisation questionnaire, with data you should have gathered by
then, and compare your results to those obtained in a sample already collected.
Presentation Conventions
This chapter (and others in the course) follows a set of consistent presentation
conventions. The chapters contain four different sorts of information, and each of
these is signalled in some way. The conventions are:

plain text: where there is simply text (not boxed, or caused to stand out in any
way) you are dealing with simple exposition. These sections are the central
material in the course. They are to be read, thought about, and absorbed.

The task is located at an appropriate point in the chapter and you should attempt
to complete the task there and then, by accessing the computer, using SPSS, and

feedback: most tasks will be followed by feedback. This feedback is
enclosed in a box bordered by a single line and the text is in Univers font
(as is the rest of this bullet point). Feedback sections usually contain the
output the task was intended to produce, together with a commentary on
the output. Note that the tasks in the first chapter are unusual in that they
are not accompanied by feedback. This is because they are more
straightforward in nature, and are not output-focussed.

reflection: from time to time you will encounter blocks of text, headed Reflection,
always italicised, and enclosed within a double-line border. These sections are
feedback. They ask you to stand back and reflect upon what you have done.
Actual tasks require you to do specific things (and, with luck, find out reasonably
interesting things). But the techniques which the tasks push you to use are general,
and can be applied, by you, to other datasets, and to answer other questions. The
reflection sections push you to appreciate the generality of the techniques you
have learned, so that you can transfer them to other situations.

Building and Inputting a Questionnaire Dataset
The first dataset will only be used briefly. It is the least serious in the entire course,
and is only intended to introduce you to working with data. Consider, first, the
following set of questions:
Which side do you prefer? Tick the side that applies to you.

                                                                Left   No Pref.   Right
1. With which hand do you draw?                                 ___      ___       ___
2. Which hand would you use to throw a ball to hit a target?    ___      ___       ___
3. In which hand would you use an eraser on paper?              ___      ___       ___
4. With which foot would you kick a ball to hit a target?       ___      ___       ___
5. Which hand removes the top card when you are dealing
   from a pack?                                                 ___      ___       ___
6. If you wanted to pick up a small stone with your toes,
   which foot would you use?                                    ___      ___       ___
7. If you had to step up onto a chair, which foot would you
   place on the chair first?                                    ___      ___       ___
8. Which eye would you use to look through a telescope?         ___      ___       ___
9. Which is your dominant eye? (Hint: point at something,
   then close each eye in turn. You should find that your
   pointing finger only lines up with one eye.)                 ___      ___       ___
10. If you wanted to listen to a conversation behind a closed
    door, which ear would you place against the door?           ___      ___       ___
11. Into which ear would you place the earphone of a small
    radio?                                                      ___      ___       ___

These questions are all concerned with the issue of which “side” of your body you
prefer (or, to use the more specialist term, “lateralisation”). Notice here that there are
four questions on “handedness”, three questions on “footedness”, and two each on
“eyedness” and “earedness”. In other words, the questionnaire implies that it is not
entirely safe to have only one item in each of these areas, and that lateralisation may
not be simple, i.e. the fact that you have a right-hand preference does not guarantee
that you have a right-foot preference, and so on. Lateralisation, in other words, is
thought to be a complex “more or less” issue rather than simply “all or none”.
Imagine handling data generated by this questionnaire. The first decision that you
have to make is how to give numeric values to the answers. In that respect, let’s
imagine that you choose to represent left (hand, foot, eye, ear) by ‘1’, no preference
by ‘2’, and right by ‘3’. In effect, what you are doing here is coding the questionnaire
responses, so that an easy-to-handle numerical code replaces the original answer.
This numerical code can then be accessed more easily by a statistical program.
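The coding step just described can be sketched in a few lines of Python. This is only an illustration of the idea, not part of SPSS: the names RESPONSE_CODES and code_response are invented for this sketch, and the codes are the ones chosen above (left = 1, no preference = 2, right = 3).

```python
# Hypothetical helper names; the codes follow the scheme chosen in the text.
RESPONSE_CODES = {"left": 1, "no preference": 2, "right": 3}


def code_response(answer):
    """Return the numeric code for a ticked answer (case and spacing ignored)."""
    return RESPONSE_CODES[answer.strip().lower()]


answers = ["Right", "Right", "Left", "No preference"]
coded = [code_response(a) for a in answers]
print(coded)  # [3, 3, 1, 2]
```

Keeping the scheme in one place like this is a small-scale version of the note-keeping recommended in this chapter: the mapping is written down once, so nobody has to remember later what ‘1’ meant.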
This leads to the next question – how to represent or arrange this data?
In effect, the standard manner of doing so is to imagine a matrix, in which:

• each row represents a person
• each column represents one of the questionnaire items (or variables, as we shall term them).

You could arrange this data, on paper, as follows (where, for ease of exposition, we are working with the answers from the first six people only):

ID   Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8  Q9  Q10  Q11
a    3   3   3   3   3   3   3   1   3   3    3
b    3   3   3   3   3   3   3   3   3   3    3
c    3   3   3   3   3   3   1   9   3   2    1
d    1   1   1   1   1   1   1   1   1   1    3
e    3   3   3   3   3   3   3   3   3   3    3
f    3   3   3   1   1   3   1   1   3   3    1

Notice that the first row and the first column in this matrix have been taken up with non-data elements. The first row shows an identifier (rather minimally in this case) for each of the variables or measures, while the first column shows an identifier (also minimal) for each person who has filled in the questionnaire. The data then appears in the remaining space in this matrix (which basically resembles a spreadsheet).

When you are inputting data, you often cannot imagine forgetting what it all means. It is amazing, though, how the passage of relatively brief periods of time erases this memory. So, it is very helpful to put in an identifier which will enable you to come back to this data, see it in the first column, and remember what on earth it means! Note that, regarding the identifiers in the first column:
a) it’s good practice to put an identifier in the first column. The real purpose is to have a unique identifier.
b) you don’t have to use single letters in this way. You may prefer not to use just letters (which presuppose that you have already got some sort of codebook telling you what or who the letters refer to), and instead use names or more complicated codes, perhaps because they convey something more real about the source of the data (and also cover more than 26 possibilities!), or simply to introduce a little variety.

Remember that, even beyond this first column identifier, it’s worth creating a completely separate word-processed file in which you jot notes on the dataset (a separate file for every dataset you have). This “codebook” file could then contain the key information, showing which identifier code matches up with which real person. It functions as a diary of all the things you have done, and why, precisely in order to help you remember later what decisions you have made.

Finally, we get to look briefly at the data itself. You can see that it captures the way that most of the people have a “right” laterality preference (since most responses are coded ‘3’), while there is one left-lateralised person (person d) and one rather ambidextrous person (person f). The data also suggests that in most cases (but not person e) there are slight intrusions of the “other” laterality.
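To make the row-and-column idea concrete, here is a minimal Python sketch that holds the six-person matrix in memory and counts each person's codes. It is an illustration only: the names ROWS and tally are invented for this sketch, and the values are those of the six respondents shown in the matrix for this section.

```python
from collections import Counter

# One entry per person (row); each list holds that person's codes for Q1..Q11.
# Codes: 1 = left, 2 = no preference, 3 = right, 9 = no answer given.
ROWS = {
    "a": [3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3],
    "b": [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    "c": [3, 3, 3, 3, 3, 3, 1, 9, 3, 2, 1],
    "d": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3],
    "e": [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    "f": [3, 3, 3, 1, 1, 3, 1, 1, 3, 3, 1],
}


def tally(responses):
    """Count how often each code occurs in one person's row."""
    return Counter(responses)


for person, responses in ROWS.items():
    counts = tally(responses)
    print(person, "left:", counts[1], "no pref:", counts[2], "right:", counts[3])
```

The identifier in the first column becomes the dictionary key here, which is exactly the role it plays in the matrix: it lets you find a person's row again later.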

Last of all, notice a couple of ‘different’ numbers. The first of these is in the third data line, for Question 8. This cell in the data matrix is coded ‘9’. The intention here is to designate person “c” as having failed to respond. What I am imagining happened here is that this respondent, for whatever reason, chose not to answer this question (about “the eye you use to look through a telescope”). So, rather than leaving a blank, a coding value was chosen which could not naturally occur, and this is used to signal that there is a missing value. The computer will later be instructed that ‘9’ represents a missing value, and so will “know” to ignore it in all calculations.

Another cell contains the coding ‘2’. This is meant to represent a case where someone has replied ‘No preference’, which is taken to mean someone can use either right or left. In other words, we are not dealing with a completely missing value (where no response is given), but with a slightly different response, which doesn’t fit into the expected categories. This may be interesting, and by using a coding like ‘2’, we can check whether there are patterns in the answers where people say that they have no laterality preference.

There is one more thing that we need to consider at the outset. We haven’t considered why this questionnaire is being used (and in all honesty nor will we, seriously). But taking a fairly superficial approach, let’s say we want to gather data on lateralisation because we have some hypotheses about the distribution of lateralisation, i.e. we think that there may be some aspects of lateralisation which are different for certain categories of people. Here are a few suggestions:
• sex
• age
• nationality
• familial lateralisation
Now the point here is not that it is seriously being proposed that these factors are crucial in an investigation of lateralisation. They are simply convenient for exposition. So, if you accept this dubious premise, let’s see how we would plug these factors into our questionnaire.

Sex
This is relatively easy. Let’s imagine that we will code one sex ‘1’ and the other ‘2’. (I leave it to you to decide which order you prefer.)
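The missing-value convention described earlier in this section (coding an unanswered item as ‘9’ so that it can be ignored) can be sketched as follows. This is only an illustration of the principle in Python, not what SPSS does internally; the names MISSING, valid_responses and proportion_right are invented for the sketch.

```python
MISSING = 9  # the code chosen in the text for "no answer given"
RIGHT = 3


def valid_responses(responses, missing=MISSING):
    """Drop cells that carry the missing-value code."""
    return [r for r in responses if r != missing]


def proportion_right(responses):
    """Share of answered items coded 'right'; None if nothing was answered."""
    kept = valid_responses(responses)
    if not kept:
        return None
    return sum(1 for r in kept if r == RIGHT) / len(kept)


person_c = [3, 3, 3, 3, 3, 3, 1, 9, 3, 2, 1]
print(len(valid_responses(person_c)))  # 10
print(proportion_right(person_c))      # 0.7
```

Note that the ‘2’ (no preference) stays in the calculation as a real answer, whereas the ‘9’ is excluded altogether: person c is scored out of ten items, not eleven.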

Age
This is a little more difficult. Of course, you could input data on the age, in years, of each subject. But for present discussion, let’s imagine that this would be inefficient, in the sense that we might get very few people with any particular age. It would be too detailed, and essentially meaningless. Typically, in these circumstances, the “raw” data will be recoded, and what will be entered into the datafile will be a number referring to a range of ages. So, we might try:

Under 20    1
20-29       2
30-39       3
40-49       4
50+         5

This system will allow us to recode actual ages within these five categories, and each of the categories is therefore more likely to have a sizable number of “cases”.

Nationality
Clearly one cannot have a code for every country. So we would need to rationalise in some way. For example, we might try:

North America                1
Central and South America    2
Europe                       3
Africa                       4
Middle East                  5
Asia                         6
Australasia                  7

Or alternatively any of these could be broken down more finely, or, if thought to be more appropriate, some of these categories could be conflated.

Familial Handedness
If there is a genetic component in lateralisation, it may be interesting to see whether, even if someone is right-handed, there are other members of someone’s immediate family who are left-handed. The next question is then to ask how far the family “net” goes. For our purposes, we will simply add the question:

What proportion of your parents, grandparents, brothers or sisters are left handed?

And note: this question will generate a simple number as an answer, to two decimal places, e.g. 0.07 if only one in fourteen is left-handed, 1.00 if 100% of your relatives are left-handed.
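The age recoding and the family-handedness proportion can both be sketched in Python. This is a minimal illustration under the assumptions of this section: the five age bands are the ones suggested above, and the function names age_band and family_proportion are invented for the sketch.

```python
def age_band(age_in_years):
    """Recode a raw age into the five bands suggested in the text."""
    if age_in_years < 20:
        return 1  # Under 20
    if age_in_years < 30:
        return 2  # 20-29
    if age_in_years < 40:
        return 3  # 30-39
    if age_in_years < 50:
        return 4  # 40-49
    return 5      # 50+


def family_proportion(left_handers, relatives_counted):
    """Proportion of left-handed relatives, to two decimal places."""
    return round(left_handers / relatives_counted, 2)


print(age_band(19), age_band(20), age_band(49), age_band(50))  # 1 2 4 5
print(family_proportion(1, 14))  # 0.07
```

Checking the values at the edges of each band (19 and 20, 49 and 50) is a useful habit whenever you recode, since that is where a coding scheme most often goes wrong.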

Assuming that we have this information, we now need to think about how to incorporate it within our dataset. A fairly general response would be as follows:

ID   Sex  Age  Nationality  Fam. H’ness   Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8  Q9  Q10  Q11
a    1    2    2            0.10          3   3   3   3   3   3   3   1   3   3    3
b    1    2    4            0.00          3   3   3   3   3   3   3   3   3   3    3
c    2    3    3            0.15          3   3   3   3   3   3   1   9   3   2    1
d    2    2    6            0.08          1   1   1   1   1   1   1   1   1   1    3
e    1    1    6            0.32          3   3   3   3   3   3   3   3   3   3    3
f    2    3    4            0.18          3   3   3   1   1   3   1   1   3   3    1

Bear in mind, again, that this dataset, in real life, would be likely to contain far far more responses than simply six. Once again, six is a manageable number to display, which is its only good feature!

Here we see that four new columns have been inserted immediately after the ID column and before the scores for each questionnaire item rating. Two points are worth making about this:
• placing what might be called “organising variables” in columns at this point in the line makes good sense in that they are extremely visible
• now that these variables are coded, they can be used as the organising basis for a number of useful analyses designed to bring out patterns in the data. That they are placed fairly prominently on the line helps in this process.

As a result of organising the data in this way (and imagining that you had access to a dataset of (say) 200 cases), you could now:
• explore whether men and women differ in their laterality preferences
• explore whether there is any sort of difference in laterality preferences as the figures relate to older people (perhaps right-handers who are older were pressured into losing their natural left-handedness early in life)
• explore whether laterality preferences are related to (fairly grossly defined) national origin (e.g. is the proportion of left handers constant across populations?)
• explore (more complicatedly) whether men and women of different ages differ in laterality preferences (perhaps men or women who are older were pressured differently to lose their left-handedness)
• explore whether people who have left-handers in their family show different lateralisation patterns to those who do not.
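The first of these explorations (comparing the two sex codes) could be sketched as below. This is a hedged illustration in Python rather than an SPSS procedure: the names CASES, proportion_right and mean_by_group are invented here, the sex codes and item responses are those of the six cases in this section, and six cases are of course far too few to support any real conclusion.

```python
from statistics import mean

RIGHT, MISSING = 3, 9

# person: (sex code, responses to Q1..Q11), taken from the six-case matrix.
CASES = {
    "a": (1, [3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3]),
    "b": (1, [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]),
    "c": (2, [3, 3, 3, 3, 3, 3, 1, 9, 3, 2, 1]),
    "d": (2, [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3]),
    "e": (1, [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]),
    "f": (2, [3, 3, 3, 1, 1, 3, 1, 1, 3, 3, 1]),
}


def proportion_right(responses):
    """Share of answered items coded 'right' (missing codes are skipped)."""
    answered = [r for r in responses if r != MISSING]
    return sum(1 for r in answered if r == RIGHT) / len(answered)


def mean_by_group(cases):
    """Average the per-person proportions within each organising-variable code."""
    groups = {}
    for group_code, responses in cases.values():
        groups.setdefault(group_code, []).append(proportion_right(responses))
    return {code: mean(values) for code, values in sorted(groups.items())}


for sex_code, average in mean_by_group(CASES).items():
    print(f"sex code {sex_code}: mean proportion of right responses = {average:.2f}")
```

The same function would work unchanged with the age band or the nationality code in place of the sex code, which is the practical benefit of coding the organising variables as simple numbers.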

The full form of the questionnaire is given below. Although presented as a task, the material isn’t shaded, because it is assumed that when you print out this questionnaire, you won’t want to give people a shaded version!

Task 1.1: Gathering Lateralisation Data

Sex: _____________   Age: _______________   Nationality: _____________

Family Handedness: _____________
(Think of the ten closest members of your family, i.e. no-one related only by marriage, e.g. in the order: brothers and sisters, parents, children, grandparents, cousins. Then count the number of left-handers in this group, and give this number out of ten. If you cannot think of ten such (blood) relatives, then count as many as you can, and give the result, e.g. 2/7, indicating two left-handers out of seven relatives.)

Which side do you prefer? Tick the side that applies to you.

                                                                Left   No Pref.   Right
1. With which hand do you draw?                                 ___      ___       ___
2. Which hand would you use to throw a ball to hit a target?    ___      ___       ___
3. In which hand would you use an eraser on paper?              ___      ___       ___
4. With which foot would you kick a ball to hit a target?       ___      ___       ___
5. Which hand removes the top card when you are dealing
   from a pack?                                                 ___      ___       ___
6. If you wanted to pick up a small stone with your toes,
   which foot would you use?                                    ___      ___       ___
7. If you had to step up onto a chair, which foot would you
   place on the chair first?                                    ___      ___       ___
8. Which eye would you use to look through a telescope?         ___      ___       ___
9. Which is your dominant eye? (Hint: point at something,
   then close each eye in turn. You should find that your
   pointing finger only lines up with one eye.)                 ___      ___       ___
10. If you wanted to listen to a conversation behind a closed
    door, which ear would you place against the door?           ___      ___       ___
11. Into which ear would you place the earphone of a small
    radio?                                                      ___      ___       ___

Print this questionnaire in multiple copies and try to get it completed by as many people as you can: certainly thirty responses, or even more, if possible. It won’t take anyone longer than five minutes to complete the questionnaire, so it shouldn’t be difficult to get this number. Because of the information that the questionnaire collects, it would be sensible to try to get some variety regarding:
• Sex
• Age
• Nationality
In addition, although it might be easier, avoid getting lots of responses from the same family, since this will bias the familial handedness results by giving you repeated versions of the same thing.

As a suggestion, you should try to get this data collected in the next two to three weeks. We will return to the questionnaire at the end of Chapter Three, so by the time you reach that point, it would be good if you had completed collecting your data. As you work through Chapters One to Three and acquire relevant SPSS skills, you should also think about developing a data file so that you can input this data to SPSS. Guidance is provided in the following pages which will enable you to set up this data file.

Reflection
In one sense, all we have done so far is very simple. Yet it is a very powerful simplicity which contains a number of general lessons:
• structuring a dataset into rows (cases, or subjects, or people) and measures (or variables) is central to any later statistical work. It prepares the ground immeasurably.
• labelling the rows and columns is very helpful (especially after the passage of any length of time).
• coding non-numeric data needs a little planning (but probably not that much, in most cases). Statistical programs have difficulty analysing data:
  - which contains a lot of separate values, e.g. age in the present case
  - which is expressed as text
  For the first of these, the way to deal with the flood of data is to impose order on it with a reduced set of categories. For the second, e.g. the answers “left” and “right”, you need to devise a word-to-number transfer system (and here we’ve used ‘1’ (left), ‘2’ (no preference), and ‘3’ (right)). (Sometimes, of course, you’ll have more labels than just two, e.g. nationality. In such cases, use any “natural” coding system that seems relevant (e.g. continents for nationality), and/or try to work in the range of fewer than ten categories, because more than ten will make the data unwieldy.) We will return to this, and extend it, in Task 1.5.
• having separate measures (in this case of the different aspects of lateralisation) is what will allow the “fineness” of the statistical explorations which will come later.
• thinking of relevant “organising” measures (or variables) will be very helpful later to structure an investigation. At the outset, always try to think of as many of these as you can. Remember: you can’t analyse data you don’t collect, but you don’t have to analyse all the data you do collect. In other words, think carefully about organising variables at the stage where you are planning data collection – if you don’t, you’ll kick yourself later when you do think of something it would have been really useful to have asked. Remember: the consequence of your decision now will be played out in the analyses that are possible later.

Task 1.2
The datasets in this course are almost all given to you, on computer disc or CD-Rom or downloadable from the internet. This is convenient, but not entirely realistic, since when you are working with your own data, you will have to gather, prepare, and then input the data to the computer. The point of this first questionnaire on lateralisation is therefore to give you a little practice at inputting data yourself. (Later, datasets you are given on other topics will emphasise analysis much more than data input.) So, to get this data input process under way, follow these steps:

For SPSS 9 or lower
1. Start SPSS on your computer (i.e. Start, Programs, SPSS for Windows). This should open up the standard SPSS screen. Click Cancel if you are presented with an inset screen showing previous files which have been opened.
2. You should now be confronted with a spreadsheet-type screen, with variables indicated at the top, and then a blank screen consisting of rows and columns.
3. Go to the Data drop down menu, and choose Define Variable.
4. On the screen that comes up, type in ‘ID’ in the box for Variable Name (replacing the ‘Var0001’ which is probably being proposed).
5. There are four buttons in the lower part of the screen. Click on Type, and then, from the radio buttons available, choose String. (Most of the time, Numeric, the variable type that SPSS guesses by default, will be the right choice, but here, for ID, since typically you will type in letter-based identifiers, it is as well to specify String.)
6. Click Continue.
7. Back at the original Define Variable screen, click OK. Notice that the name at the head of the column changes to ID.
8. Repeat this process for each of the remaining variables, giving each names such as Sex, Q1, Q2. Note that, because you are now dealing with numeric data, you don’t have to do anything with the Type button for now, since you can accept the default choice that SPSS makes for you.
9. Each time, though (for all variables except ID), you should do exactly the same additional thing for each variable:
   - Click on the Missing Values button.
   - Click on the Discrete Missing Values radio button.
   - Type in ‘9’ in the left-hand box which becomes available.
   - Click Continue.
   - Click OK.

10. After you have finished defining and assigning a Missing Value code for the last variable (Q11), go to the File drop-down menu, and save your work. You will need to provide a name for the file you are saving. SPSS will automatically give this file the suffix ‘.sav’, indicating that the file is a specialised SPSS file, saved in a special format, and not available from other programs. In other words, you cannot use a word processor to edit this data.
11. Input the actual data values for all six people, and for all fifteen variables (i.e. besides ID, put in data for sex, age, nationality, family handedness and the eleven questionnaire responses).
12. Save the file.

For SPSS 10 or higher:
1. Start SPSS on your computer (i.e. Start, Programs, SPSS for Windows). This should open up the standard SPSS screen. Click Cancel if you are presented with an inset screen showing previous files which have been opened.
2. You should now be confronted with a spreadsheet-type screen, with two tabs towards the bottom left hand corner, labelled data view and variable view. You should be in the variable view screen. If you are not, click on the variable view tab to move to it. (Most of the time you will actually be working with SPSS, i.e. when you are actually analysing data, data view is important. But right at the beginning, when you have a new dataset, you need to tell SPSS how you have set up your data. For this, the variable view will be the one that is important. In other words, at the beginning you spend a bit of time in this view, and then you leave it when you do most of the analyses.) Note that in this view, the ROWS represent variables, and the columns represent aspects of each variable.
3. The top left hand box should be highlighted with a thick black line around it. (If another box is highlighted, click on the top left hand corner box with the mouse.) Type the name of the first variable, which in this case is “ID”. Press ENTER.
4. Click on the data view tab, and notice that the title ID appears in the first column. Note also that, in contrast to the variable view, with data view columns are variables or measures and rows represent people. So you have to make a mental adjustment about the orientation used by each of these two screens when you move between them. Then go back to variable view because we have not finished setting up this first variable.
5. Click in the cell immediately to the right of ID. You will see a grey button in the right hand part of this cell. Click it and a menu will pop up containing a number of radio buttons. You use these to indicate what kind of data goes into this column. In this case, choose string, as the ID will be a letter identifying each questionnaire respondent. For all the other variables or measures in this dataset, the value will be a number, and the appropriate choice will be numeric.
6. Click on the next cell to the right, below width. Click on the up and down arrows to increase the column width to 10. You can leave the rest of the cells in the ID row alone.
7. Still in variable view, click in the cell below ID. Type the name of the next variable, e.g. SEX, and then set the type as numeric, and width as 10. Decimals should be set as zero, as all the values for sex are whole numbers with no decimals. In the cell below missing, click on the grey button to open the dialogue box. Then click on the discrete missing values radio button and type in the number ‘9’. Click OK to close the dialogue box.

This means that where you have no information about the sex of a respondent, you enter the value 9, and SPSS will treat that value as missing.
8. Set up the remaining variables in the same way (age, nationality, family handedness, Q1, Q2, …, Q11). In all cases the missing value code is 9, and the decimal value should be set as 0, except for family handedness: this data goes to 2 decimal places, so change the decimals cell in this row to 2.

A major additional point: variable names in SPSS are limited to a length of eight characters, so (a) you can’t use a name longer than this, and (b) you need to think about how you “pack meaning” into the eight characters. Age is obviously easy, but you might need to do things like use the name Fam_Hand so that the “real” name of Family Handedness is transparent. Note, in passing, that the underscore character is used here to separate the two “words”. SPSS doesn’t accept spaces in variable names, and the underscore “fools” it, while preserving legibility. Although restricting names in the datasheet to eight characters is irritating, SPSS offers an excellent solution to this, which we will cover later in the chapter, around pp 21-22. (We will return to naming and labelling issues later in the chapter.)

9. After you have finished defining and assigning a Missing Value code for the last variable, go to the file drop down menu, and save your work. You will need to provide a name for the file you are saving. SPSS will automatically give the file the suffix “.sav”, indicating that the file is a specialised SPSS file, saved in a special format and not available from other programs. In other words, you cannot use a word processor to edit this data.
10. Now click on the data view tab and type the letters a-f in the left hand column, in the six cells below ID. Input the actual values for all 6 people and for all fifteen variables (i.e. besides ID, put in data for sex, age, nationality, family handedness and the responses to each of the eleven questions).
11. Save the file.

Reflection
You have done a set of specific things with the lateralisation dataset and obtained a set of results. But what you have learned has general applications. These are that:
• you have learned how to give names to variables
• you have learned how to select either string or numeric data types
• you have learned how to specify missing values
• you have learned how to input data
And, more broadly, but very, very importantly, you have started to learn how to structure a dataset. O.K. As you can see, you haven’t done anything with the data yet! But the data handling, defining, and inputting skills will underlie everything that you do with SPSS.

Task 1.3
This is really a repeat of Task 1.1, simply reminding you about the data gathering that you need to do.

Print out the actual questionnaire on lateralisation, make copies and give it to about thirty people, trying, as you do so, to obtain as much variety as you can for sex, age and nationality. When you have got this data, re-open the lateralisation data file, and add the new data to it. At the end of Chapter Three, you will be asked to do some work with this expanded data file. Most of the data analysis in this chapter will use a different questionnaire (the BIELT questionnaire) and associated data that will be given you so that you can be set known tasks and given feedback on them. But at this later point, when you are completing Chapter Three, you will also be asked to work with your own (lateralisation) data, so that you can undertake a slightly less structured exploration.

The Main Questionnaire for this chapter: The BIELT Survey
The next section gives you some background information about the BIELT questionnaire. It will help to contextualise the tasks which you will soon be doing. The data from this survey, which is based on almost 1200 cases, is useful as a device to help learn about the basics of questionnaire design. But it will make much more sense to you if you first work through some background information so that you are familiar with the issues involved.

In this section, I provide for you a brief description of a questionnaire survey which was conducted jointly by BATQI (the British Association of TESOL Qualifying Institutions) and the British Council into attitudes regarding the formation of a British Institute of English Language Teaching (henceforth, BIELT).

BATQI [1], set up in 1992, has been concerned with professionalism and qualifications within the English as a Foreign Language area of English Language Teaching. It has a scheme which accredits courses leading to qualifications for teachers of ESOL (English for Speakers of Other Languages). But the BATQI organisation, ever since its inception, has also wanted to explore whether the English Language Teaching profession would want to see established a general professional organisation, of the sort represented in other fields by, say, the British Psychological Society, or the Institute of Mechanical Engineers, or the British Medical Association. Such a new organisation would be intended to inject a greater level of professionalism into ELT, perhaps introduce a membership scheme with entry based on appropriate qualifications which would convey to outsiders what levels of expertise exist, and also perhaps act as a voice for the ELT profession. Discussions took place with the British Council in 1998, and they agreed to finance a survey to explore what those within the ELT profession thought of this idea.

The questionnaire was written by Charles Lowe, a member of the BATQI Committee at that time, and sought to obtain information about a whole range of relevant areas. Here are some sections of the questionnaire – to give you a flavour of how it worked (the full questionnaire can be found in Appendix One: note that you are only looking at excerpts here):

[1] This organisation still exists, but is now known as QUITE.

Excerpts from the BIELT Questionnaire

FOR EACH STATEMENT, PLEASE CIRCLE THE NUMBER WHICH REPRESENTS YOUR VIEW

                                                                 Strongly            Strongly
                                                                 Disagree               Agree
1.  A new body to represent the British ELT profession
    is necessary                                                    1   2   3   4   5   6
2.  Please add any further assumptions you would like included in the above list.
    a. ______________________________________________________________________________
    b. ______________________________________________________________________________

FOR THE PROFESSION, IT SHOULD:
3.  Act as the sovereign and constitutive body – overarching
    and ‘supra’                                                     1   2   3   4   5   6
4.  Establish an accepted framework of professional
    qualifications                                                  1   2   3   4   5   6
5.  Establish international equivalencies for qualifications        1   2   3   4   5   6
6.  Establish a Professional Code of Practice for the
    protection of the public                                        1   2   3   4   5   6
7.  Function as a lobbying and public relations force for
    British ELT                                                     1   2   3   4   5   6
8.  Form official links (eg DfEE, EU, TTA, QCA)                     1   2   3   4   5   6
9.  Other __________________________________________________________________________

FOR ENQUIRERS AND NEW ENTRANTS, IT SHOULD OFFER:
33. A source of information about the profession                    1   2   3   4   5   6
34. ‘Buyer beware’ advertisements (e.g. in the Guardian)            1   2   3   4   5   6

36. The baseline qualification for entry should be:
    a  Camb/RSA/Trinity Certificate (or similar)                    1   2   3   4   5   6
    b  Camb/RSA/Trinity Diploma (or similar)                        1   2   3   4   5   6

57. Role ___________________________________________
58. Gender    a M    b F
59. Age       a 20-25   b 26-30   c 31-35   d 36-40   e 41-45   f 46-50   g 51-55   h 56-65

We will look at the questionnaire in more detail later, so for now you only need to note a couple of points from these excerpts:
• some questions require pre-defined responses, i.e. 1, 3-8, 33-34, 36a, b, 58, 59 while others ask for more open information, i.e. 2, 9, 57

• most of the pre-defined questions use the 1-6 scale, but two (58, 59) use other categories, but ones which are not difficult to re-code numerically (as 1 and 2, and 1-8, respectively)
• some of the open items generate data which was impossible to code easily, and so were left out of the present dataset
• one of the open items (57: Role) was coded, since the responses, besides being useful and important, did not fall into limitless categories. We will discuss this more later.

Through the offices of the British Council it was possible to distribute this questionnaire world-wide, and on a very large scale. The responses to the questionnaire were then coded and input to the computer by staff at the Centre for British Teachers (CfBT), and the data was sent to the author for analysis. You now will have this dataset (or rather, a very large proportion of it), and the remainder of the chapter requires you to look at this data. Simultaneously you will see what the data reveals about the views of the British-oriented ELT profession regarding the formation of a professional organisation, and you will also learn how to analyse questionnaire derived datasets of this sort.

The Nature of the Questionnaire
It would probably be helpful if you located the complete questionnaire in the Appendix so that you can refer to it while you are reading this section. The excerpts shown pretty much cover the sorts of items contained in the entire questionnaire.

The point of sending people in the ELT profession a questionnaire was to establish a range of things:
• most importantly of all, did they think the establishment of a professional organisation was a worthwhile thing?
Assuming that the answer to this was positive, it was important to know therefore what views they had about a number of options regarding such an organisation, such as:
• What it should do:
    • for the profession (Questions 3-9)
    • for individuals (Questions 10-17)
    • for schools/universities (Questions 18-22)
    • for examination boards (Questions 23-25)
    • for freelancers (Questions 31-32)
    • for new entrants to the profession (Questions 33-34)
• Membership and qualifications issues (Questions 36-42)
• What its annual fee should be (Questions 44, 46)
• How it should come into being (Question 48)
In addition, it was felt important to gather information about respondents relating to their location, role, gender, age, experience, qualifications, area of work, and salary (Questions 50-64). These biographical/background issues were probed through fairly direct, factual questions. Some used simple choices (Male/Female for gender), and others used more elaborate sets of options (e.g. for qualifications). But the entire biographical section used fairly transparent methods, and simply depended on the honesty and comprehensiveness of the answers given.

In these materials (learning how to analyse questionnaires), I am treating this questionnaire as unproblematic in nature. By this I do not mean that the design of the questionnaire cannot be criticised, or that there weren’t areas omitted, or that there weren’t other choices that could have been made as to how information was obtained. What I mean is that the measuring methods used were straightforward. Questions were asked, and I am assuming the questions were framed clearly, and that respondents understood what was meant, and provided the relevant information in a direct manner. There may be times when these assumptions cannot be made about questionnaires, but this doesn’t fit in with the problem at hand, so it’s being ignored here. As a result, we can proceed directly to analysing the data from the questionnaire. The purpose of the pages which follow in this chapter, therefore, is simply to acquaint you with processing data from a relatively straightforward questionnaire with a reasonable number of questions, and a fairly large number of responses. The guiding questions are:
• what questions can be addressed with this information?
• how can the information be presented in numerical form?
• how can the information be displayed?

Preparing the Questionnaire
Assuming that by now you have looked over the questionnaire in the appendix, you will see what sort of information was returned with it. The next problem, for me as data analyst, was to get the information from the questionnaires into an SPSS format. It is useful in that regard to look at the following categories of data:
• the numeric items: many of the questions required six-step ratings, e.g. the sections on what a British Institute should do, how its membership systems should function, etc (i.e. most of the questions 1-42) were framed in terms of six-step rating scales, with the arrangement:

    Strongly Disagree   1   2   3   4   5   6   Strongly Agree

In this way, a gradation of views was possible, in which, interestingly, there was no mid-point. Putting these into the computer was easy. Once a variable was named (i.e. Data, Define Variable, and then the name was typed in with SPSS Version 9, or simply completing the appropriate row in Variable View in SPSS Version 10), it was a simple matter to input the appropriate number which had been circled or crossed by each respondent. Note that for these six steps, a missing value was defined as ‘9’. Simple, but for the coders at the CfBT, laborious and time-consuming, since there were a lot of data points per person, and then nearly 1200 respondents!

• the “multiple choice” items which form a scale: many of the items required choice from amongst a group of pre-defined categories. For example, the question on the annual individual membership fee gave a series of amounts, and required the choice of one. To cope with this, the following arrangement was used:

    a: £30    1
    b: £40    2
    c: £50    3
    d: £60    4
    e: £80    5
    f: £100   6
    Missing Value   9

The transformation of £30 to 1 and then the way the other values follow is reasonably obvious in this case. The same applies for a number of items on the last page of the questionnaire. The following codings were used:

Gender
    Male            1
    Female          2
    Missing Value   9

Age
    20-25   1        41-45   5
    26-30   2        46-50   6
    31-35   3        51-55   7
    36-40   4        56-65   8
    Missing Value   9

Experience (in years)
    Language Teaching       0-1  1    2-3  2    4-5   3    6-10  4    10+  5
    Teacher Training        0-1  1    2-5  2    6-10  3    10+   4
    Research                0-2  1    3-5  2    6+    3
    Management              0-1  1    2-5  2    6+    3
    Materials/Publishing    0-1  1    2-3  2    4-5   3    5+    4
    Exam Board              0-1  1    2-5  2    6+    3
    Missing Value 9 for all sub-categories of experience

Employed/Freelance
    Employed        1
    Freelance       2
    Both            3
    Missing Value   9

Salary
    £10-15k   1        £26-30k   4
    £16-20k   2        £31-35k   5
    £21-25k   3        £36k      6

Reflection
Notice, in all these cases, that the number of options available to respondents (and so the codings that are possible) reflect the importance the questionnaire writer attached to each of them and the number of sub-divisions he thought it would be meaningful to make. Hence the three categories for Experience: Exam Board, but the five categories for Experience: Language Teaching. In questionnaires you may use in the future, you will have to think about the appropriate numbers of coding categories when you need to deal with data of this sort.

• the complicated items: In addition to the above sorts of coding problems that the questionnaire contains, there are some other items which require decisions. We will take these one-by-one:

Reflection
Before we can really address the question of what to do here, we have to stand back a little and reflect upon what we are trying to achieve. The relevance of this section is general, and applies to pretty much any research-linked coding that one may do. The key issue is that the codes that we choose now will be the organising basis for analyses that we can do later. Think back to the salary codings from the previous page. In this case there were six codings of salary. Because we coded salary in this way, we can analyse all the other data in the questionnaire organised by salary. So, for example, the responses to the question:

    Q1: A new body to represent the British ELT profession is necessary
    Strongly Disagree   1   2   3   4   5   6   Strongly Agree

received the following average ratings (where the maximum possible rating was ‘6’), (organised by salary level):

    £10-15   5.23
    £16-20   5.21
    £21-25   4.44
    £26-30   5.00
    £31-35   4.60
    £36+     4.88
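A table of mean ratings by salary band is what a grouped mean produces once both variables are held as numeric codes. The following is only a minimal Python sketch of that idea, not the SPSS procedure used for the survey: the function name `mean_by_group` and the six cases are invented for illustration, and it assumes that ‘9’ marks a missing answer on both variables.

```python
from collections import defaultdict

MISSING = 9  # assumed missing-value code for both the rating and the salary item

# Salary codes as listed in the coding table for this questionnaire
SALARY_LABELS = {1: "£10-15k", 2: "£16-20k", 3: "£21-25k",
                 4: "£26-30k", 5: "£31-35k", 6: "£36k+"}


def mean_by_group(cases, group_var, rating_var, missing=MISSING):
    """Return {group code: mean rating}, skipping cases with a missing code."""
    totals = defaultdict(lambda: [0, 0])  # group code -> [sum of ratings, count]
    for case in cases:
        group, rating = case[group_var], case[rating_var]
        if group == missing or rating == missing:
            continue  # a '9' is not a real rating or salary band
        totals[group][0] += rating
        totals[group][1] += 1
    return {g: s / n for g, (s, n) in sorted(totals.items())}


# Invented cases for illustration only (these are not the BIELT data)
cases = [
    {"salary": 1, "q1": 6}, {"salary": 1, "q1": 5},
    {"salary": 2, "q1": 5}, {"salary": 2, "q1": 9},  # missing rating
    {"salary": 6, "q1": 4}, {"salary": 9, "q1": 6},  # missing salary
]

for code, mean in mean_by_group(cases, "salary", "q1").items():
    print(f"{SALARY_LABELS[code]}: {mean:.2f}")
```

The point of the sketch is that the salary code does the organising: without it there would be nothing to group by, and if the ‘9’s were left in they would be averaged as though they were ratings.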

In other words, because we had the salary codings, we could “ask” the computer to calculate the average ratings for Question 1, of the people who earned £10-15k, and the average rating for Q1 of people who earned £16-20k, and so on. The codings for salary, in other words, became the organising frame for reporting how these subgroups gave ratings for Question 1. When we look at the actual numbers here, we can see that all of these figures are fairly high (from a maximum of 6) but it is interesting that the highest two ratings are associated with those who receive the lowest salaries, and that the people with the highest salaries give clearly lower ratings about how much they think such a new body is necessary. The point here is that such a tabulation of information, in terms of mean scores, is only possible because the data was coded to capture these six levels of salary.

Now we can restate the problem of how we code the information about country and the other, codable, open questions. We need to code country in such a way that the coding will allow us, later, to do the analyses that we want to do. And we have to bear in mind that we may not, at the time of coding, be sure which analyses we will want to do! Or, to put this another way, we need to make a decision about fineness of coding. The larger the number of coding categories, the more detailed the potential analysis, but the greater the scope for confusion. The fewer coding categories we use, the “cruder” the analyses of the data that will be possible, but the easier it will be. What you do in any particular circumstance has, also, to be a case-by-case decision. Here are the decisions that were made in the present case:

• Country: Respondents actually provided the name of their specific country, e.g. U.K., the Czech Republic, China. This presents two problems. First, there are, basically, too many countries in the world for it to be feasible to give them each a separate coding number. Second, while it might be useful to be able to analyse all the Czech Republic respondents as a group, if we did have a specific coding for the Czech Republic, we might find that there are meaninglessly few respondents who fall into that category. To handle these issues realistically, therefore, the coding which was used was:

    U.K.                     1
    North America            2
    Australasia              3
    Rest of Europe           4
    Indian Sub-Continent     5
    Far East                 6
    Middle East              7
    Africa                   8
    Central/South America    9
    Missing Value           99

• Qualifications: Question 62 offers, as options, a number of qualifications that respondents might hold. The simple solution here is to code the different qualifications sequentially, as has been done in this case. But it needs to be realised that there is no straightforward scale at work here. Three of the qualifications (Cambridge Diploma, Trinity Diploma, and a PGCE) give a sufficient educational qualification, while the rest do not. Accordingly, a new variable, Educational Qualification, has been created, which is coded ‘1’ for anyone with one of these three qualifications, and ‘0’ for anyone who does not. In this way, it is possible to code, for the basic qualification variable, the “highest” value achieved by what may be multiple ticks from respondents while capturing separately whether there’s an actual educational qualification. Notice that this variable exists in the SPSS data file, but did not exist in the original questionnaire. The interesting point to remember is that the original data can be transformed, and operates as the starting point for further manipulation that is useful
• Role: This question presents yet another set of problems. Since it is a completely open question, respondents can answer in whatever way they choose. Even so, there were frequently occurring responses. So, after some scanning of the answers, the following (rather arbitrary) coding system was used for this question:

    Tutor/Teacher                          1        Consultant                11
    Lecturer, EAP                          2        Retired                   12
    FE College Lecturer                    3        ESP                       13
    University Teacher                     4        Employer                  14
    Coordinator/Organiser                  5        EAL/ESL                   15
    Middle level post of responsibility    6        Admin/Non-professional    16
    Senior level of responsibility         7        Inspector/Advisor         17
    Manager                                8        Examiner                  18
    Teacher Trainer                        9        Student                   19
    Author/Writer/Editor                  10        Missing Value             99

This may not be an ideal set of categorisations, but at a practical level, it served to capture almost all of the open-ended responses that were made.

Note that the preceding discussion of the questionnaire items has not covered all the items from the questionnaire – various of them have been omitted. This was because the item concerned was too open-ended or unstructured to enable effective coding, or because there was duplication, or because there were privacy issues involved. In any case, there’s enough data to be going along with!
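The two coding decisions for Country and Qualifications can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the procedure the coders followed: the region codes and the three qualifying qualifications come from the text above, but the country-to-region lookup, the function names, the treatment of unlisted countries as missing, and the qualification name “Cambridge Certificate” are all hypothetical.

```python
# Region codes as given in the Country coding table
REGION_CODES = {
    "U.K.": 1, "North America": 2, "Australasia": 3, "Rest of Europe": 4,
    "Indian Sub-Continent": 5, "Far East": 6, "Middle East": 7,
    "Africa": 8, "Central/South America": 9,
}
REGION_MISSING = 99

# Hypothetical lookup: the text does not list which country went to which region
COUNTRY_TO_REGION = {
    "U.K.": "U.K.",
    "Czech Republic": "Rest of Europe",
    "China": "Far East",
}

# The three qualifications the text says count as an educational qualification
SUFFICIENT = {"Cambridge Diploma", "Trinity Diploma", "PGCE"}


def code_country(answer):
    """Collapse a free-text country into a coarse region code.

    Blank or unlisted answers become the missing code (an assumption of
    this sketch, not a documented rule).
    """
    region = COUNTRY_TO_REGION.get(answer.strip()) if answer else None
    return REGION_CODES[region] if region else REGION_MISSING


def educational_qualification(ticked):
    """Derived variable: 1 if any ticked qualification is sufficient, else 0."""
    return 1 if SUFFICIENT & set(ticked) else 0


print(code_country("Czech Republic"))                               # region code
print(educational_qualification(["Cambridge Certificate", "PGCE"]))  # derived 0/1
```

The second function is the kind of variable that exists in the data file but not on the questionnaire: it is computed from what respondents ticked, and it can be recomputed if the definition of a sufficient qualification changes.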

Task 1.4
In a moment you will be doing some analyses with the small version of the BIELT questionnaire. Before that, though, there’s a small aspect of coding with SPSS that is well worth learning, and the way to learn it is to use the dataset you are already familiar with – that on lateralisation. First, start SPSS (if it’s not already open), and open the Lateralisation dataset. As before the data for the first six cases should look like this:

ID Sex Age Nat. Han. Fam. Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11
a b c d e f 1 1 2 2 1 2 2 2 3 2 1 3 2 4 3 6 6 4 0.32 0.10 0.08 0.00 0.15 0.18 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 2 1 2 1 1 1 2 1 2 1 1 1 2 1 1 1 1 2 2 1 2 2 1 9 2 1 2 1 1 1 2 1 1 1 1 3 2 1 1 1 1 2 1 1 2

What you need to do now is to make this datasheet less opaque, and as a result to make it more accessible.

For SPSS Version 9
• Open the file, if it isn’t already open
• Click on the Sex variable heading and then choose Data, Define Variable
• Check that the missing value is set to ‘9’. If it isn’t, modify it so that ‘9’ functions to signal missing values.
• Click on the Labels button, then
    (a) Type ‘Sex’ into the Variable Label box
    (b) Type ‘1’ into the Value box
    (c) Type ‘Male’ into the Value Label box
    (d) Click on the Add button (which has been dimmed ‘til this point). Notice that you are building up a set of labels in the large box at the bottom of the screen.
    (e) Repeat (b) and (c) with ‘2’ and ‘Female’
    (f) Click Add again
    (g) Click Continue
    (h) Click OK
• You are now back at the main data screen. Locate the Value Labels button towards the right end of the bar of icons almost at the top of the screen. While looking at the data for Sex, click the Value Labels icon. Interesting, eh? Click it again, and the data will go back to being in the form of 1’s and 2’s. Once you have added the value labels, clicking on this button allows you to toggle back and forth between the data numbers and labels which refer to them.

For SPSS Version 10 or higher:
• Click on FILE, OPEN, DATA then choose the file in the place that you saved it. Make sure you are in variable view
• In row 2, the row for sex, click the cell below the heading Label. Type in the word sex
• In the same row, click in the next cell to the right, in the column headed Values
    a) Click on the grey button and a dialogue box opens
    b) Type ‘1’ into the Value box
    c) Type ‘male’ into the Value Label box
    d) Click on the Add button (which has been dimmed up to this point). Notice that you are building up a set of labels in the bottom part of the dialogue box.
    e) Repeat (b) and (c) with ‘2’ and ‘Female’
    f) Click Add again
    g) Click OK
• You are now back at the main data screen. Click on the Data View tab. Now click on the View drop down menu and click on Value Labels if it hasn’t already got a tick next to it. Look at what happens to the data for sex. Interesting, eh? Click on VIEW, VALUE LABELS again. Once you have added the value labels, clicking on VIEW, VALUE LABELS allows you to toggle back and forth between the data numbers and labels which refer to them.
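What the Value Labels toggle does can be pictured as a lookup applied at display time: the stored data stay as numbers, and the labels are only a view of them. The Python sketch below illustrates that idea and says nothing about how SPSS implements it. The function name `show` and the short list of values are invented, and leaving an unlabelled code (such as the missing value ‘9’) as a bare number is a choice made for the sketch.

```python
SEX_LABELS = {1: "Male", 2: "Female"}  # the value labels entered in the steps above


def show(values, labels, use_labels):
    """Return the column as it would be displayed, with labels on or off.

    The stored values are never changed; codes with no label are shown as-is.
    """
    if not use_labels:
        return list(values)
    return [labels.get(v, v) for v in values]


sex = [1, 1, 2, 2, 1, 9]  # illustrative values; 9 stands for a missing answer

print(show(sex, SEX_LABELS, use_labels=True))   # labels view
print(show(sex, SEX_LABELS, use_labels=False))  # numbers view
print(sex)                                      # the underlying data are unchanged
```

Because the labels are a view rather than a replacement, switching them on and off never alters the numbers that later analyses will use.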

Reflection
Once again I have to say that you have learned a powerful technique. It’s tedious to go through the Data, Define Variable sequence, or working in Variable View, as appropriate, for each variable, and then attend to Missing Values and Value Labels. (Of course, SPSS will still work if you don’t do this, but the output you generate will be much less clear.) But it’s very important. With Missing Values this importance is obvious. With Value Labels it’s more subtle. But here are two crucial reasons. First, it’s easy to forget! With Sex there are only two values, and they are fairly obvious, so the only danger is getting them the wrong way round. But with a long and arbitrary list like that for Role (covered on P20, above) it’s difficult to keep things in your head and the capacity to switch quickly between labels and data is really useful. Second, the labels you put in for each variable and value are what appear in the SPSS output. This circumvents the 8-letter limit on variable names, for a variable, and is immensely more communicative for the different values. It makes the difference between clarity and incomprehensible output! So when you are inputting the original data, it’s well worth the tedium of adding Label information.

Task 1.5
Put in variable names and value labels for the rest of the variables on the Lateralisation dataset. (Referring back to pp 5-6 might be helpful for this.) Then use the relevant icon from the bar of icons (or VIEW, VALUE LABELS) to see the effect of what you’ve done.

Hints:
1. You will need to do this task for every variable except Familial Handedness.
2. The overriding purpose in this task is to input data which will help you, and help readers of that output, with the meaningfulness of output, so it's worth putting up with the tedium at this stage. You are no longer, at this stage, restricted to eight characters. Note also you can use proper “spaces” at this stage! With the Variable Label you can be a little more descriptive, but even so, try to be concise, but with Value Labels, don’t be too verbose!
3. Note that with SPSS 10, you could, for example, fill in the values for Question One, and then copy this material and paste it into other variables which have exactly the same labels.
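The copy-and-paste shortcut for value labels amounts to defining one set of labels and reusing it for every variable that shares the same scale. Here is a minimal Python sketch of that bookkeeping. The codebook layout and the function name `copy_value_labels` are invented for illustration, and the labels “Right” and “Left” are placeholders, since the wording of the lateralisation items is not given in this section.

```python
def copy_value_labels(codebook, source, targets):
    """Give each target variable its own copy of the source's value labels."""
    for name in targets:
        entry = codebook.setdefault(name, {})
        entry["values"] = dict(codebook[source]["values"])  # independent copy
        entry["missing"] = codebook[source]["missing"]
    return codebook


# Placeholder labels for Question One; 9 is the missing-value code
codebook = {"q1": {"values": {1: "Right", 2: "Left"}, "missing": 9}}

copy_value_labels(codebook, "q1", ["q2", "q3"])
print(codebook["q3"])
```

Each variable gets its own copy, so correcting a label on one question later does not silently change the others.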

