• Quantitative analysis techniques such as tables, graphs and statistics help us to explore, present, describe and examine relationships and trends within our data.
• Quantitative data refer to all such primary and secondary data and can range from simple counts, such as the frequency of occurrences, to more complex data such as test scores, prices or rental costs.
• To be useful, these data need to be analysed and interpreted. Quantitative analysis techniques assist you in this process.

Preparing, entering and checking data
• number of cases of data, that is, the sample size;
• type or types of data (scale of measurement);
• data layout and format required by the analysis software;
• impact of data coding on subsequent analyses (for different types of data);
• process of entering (or inputting) data;
• need to weight cases;
• process of checking the data for errors.

Types of data
• Categorical data refer to data whose values cannot be measured numerically but can either be classified into sets (categories) according to the characteristics that identify or describe the variable, or be placed in rank order.
• Numerical data are those whose values are measured or counted numerically as quantities.

Categorical data
• Categorical data can be further subdivided into descriptive and ranked.
• Descriptive (or nominal) data are those for which it is impossible to define the category numerically or to rank it.
• Ranked (or ordinal) data are a more precise form of categorical data, in which the categories can be placed in rank order.

Numerical data
• Interval data allow us to state the difference, or 'interval', between any two data values for a particular variable.
• Ratio data additionally allow us to calculate the relative difference, or ratio, between any two data values for a variable.
• Continuous data are those whose values can theoretically take any value, provided that researchers can measure them accurately enough.
• Discrete data can be measured precisely; each value is a distinct unit, such as a count of occurrences.
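As an illustrative sketch (not from the source text), the four data types above can be represented in a pandas data matrix; the column names and values here are invented examples:

```python
import pandas as pd

df = pd.DataFrame({
    "sector": ["retail", "finance", "retail"],    # descriptive (nominal) data
    "satisfaction": ["low", "high", "medium"],    # ranked (ordinal) data
    "employees": [12, 340, 57],                   # discrete numerical data (counts)
    "rent_per_m2": [18.50, 42.75, 21.00],         # continuous numerical data
})

# Declaring an explicit category order turns a text column into ranked data,
# so comparisons such as "low" < "high" become meaningful.
df["satisfaction"] = pd.Categorical(
    df["satisfaction"], categories=["low", "medium", "high"], ordered=True
)

print(df.dtypes)
```

The key design point is that nominal and ordinal columns differ only in whether an ordering is declared; the analysis software treats them differently once it knows.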
Data layout
• Some primary data collection methods automatically enter and save data to a computer file at the time of collection, normally using predefined codes.
• These data can subsequently be exported in a range of formats to ensure they are compatible with different analysis software, for example from Google Sheets or SurveyMonkey.
• For other data collection methods, you will have to prepare and enter your data for computer analysis.
• Virtually all analysis software will accept data entered in table format, known as a data matrix.
• The multiple-response method of coding uses the same number of variables as the maximum number of different responses from any one case.
• The multiple-dichotomy method of coding uses a separate variable for each different answer.

Coding
• Actual numbers are often used as codes for numerical data, even though this level of precision may not be required.
• Once your data are recorded in a matrix, analysis software can be used to group or combine data to form additional variables with less detailed categories. This process is referred to as re-coding.
• Existing coding schemes can be used for many variables, e.g. industrial classification, occupation, social class and socioeconomic classification.
• Coding at data collection usually occurs when there is a limited range of well-established categories into which the data can be placed.
• Coding after data collection is necessary when you are unclear as to the likely responses, or when there is a large number of possible responses in the coding scheme.

Coding for missing data
• Statistical analysis software often reserves a special code for missing data.
• Four main reasons for missing data are identified by De Vaus (2014):
– The data were not required from the respondent, perhaps because of a skip generated by a filter question in a survey.
– The respondent refused to answer the question (a non-response).
– The respondent did not know the answer or did not have an opinion. Sometimes this is treated as implying an answer; on other occasions it is treated as missing data.
– The respondent may have missed a question by mistake, or the respondent's answer may be unclear.
– It may be that leaving part of a question in a survey blank implies an answer; in such cases the data are not classified as missing.

Entering and saving data
• If software is used to collect the data, or secondary data already exist in electronic form, there is no need to enter and save the files manually.
• Other data, however, must be entered into the computer and saved.
• Although some data analysis software contains algorithms that check the data for obvious errors as they are entered, researchers must still take considerable care to ensure that the data are entered correctly, and save the file regularly.
• More sophisticated analysis software allows researchers to attach individual labels to each variable and to the codes associated with it.

Checking for errors
• There will be errors no matter how carefully researchers code and subsequently enter data.
• The main methods to check data for errors are as follows:
– Look for illegitimate codes.
– Look for illogical relationships.
– Check that rules in filter questions are followed.
• For each possible error, researchers need to discover whether it occurred at coding or at data entry, and then correct it.

Weighting cases
To weight the cases:
1. Calculate the percentage of the population responding for each stratum of the stratified random sample.
2. Establish which stratum had the highest percentage of the population responding.
3. Calculate the weight for each stratum using the following formula:
Weight = highest proportion of the population responding for any stratum ÷ proportion of the population responding in the stratum for which the weight is being calculated
4. Apply the appropriate weight to each case.
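The four weighting steps above can be sketched in a few lines of Python; the strata names and the population and response figures below are invented for illustration:

```python
# Invented example: a stratified random sample with two strata.
population = {"managers": 200, "staff": 800}   # population size per stratum
responses  = {"managers": 80,  "staff": 160}   # responses received per stratum

# Steps 1-2: proportion of the population responding in each stratum,
# and the highest such proportion across all strata.
responding = {s: responses[s] / population[s] for s in population}
highest = max(responding.values())

# Step 3: weight = highest proportion responding / stratum's own proportion.
weights = {s: highest / responding[s] for s in responding}

# Step 4: every case in a stratum is then multiplied by that stratum's weight.
print(weights)  # managers respond at 0.40, staff at 0.20, so staff cases count double
```

The stratum with the best response rate always gets a weight of 1; under-responding strata are weighted up so they are not under-represented in the analysis.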
Exploring and presenting data
• The Exploratory Data Analysis (EDA) approach is useful in these initial stages. It gives researchers the flexibility to introduce previously unplanned analyses in response to new findings, and it emphasizes the use of graphs to explore and understand the data.
• Visual displays illustrate one or more relationships among numbers:
– charts;
– bar graphs or bar charts;
– pie charts.
• Once the individual variables have been explored, researchers begin to compare variables and look for interdependences between them:
– comparing intersections between the data values for two or more variables;
– comparing cumulative totals for data values and variables;
– looking for interdependences between cases for variables.

Exploring and presenting individual variables
• To show specific amounts by using a table.
• To show the highest and lowest values by using a bar chart (bar graph), histogram or pictogram.
• To show a trend by using a line graph.
• To show proportions or percentages by using a pie chart.
• To show the distribution of values by plotting them, or by using a frequency polygon, a histogram or a box plot, noting the shape (e.g. kurtosis) of the distribution.

Comparing variables
• To show interdependence and specific amounts by using a contingency table or cross-tabulation.
• To compare the highest and lowest values by using a multiple bar graph or compound bar graph.
• To compare proportions or percentages by using a percentage component bar graph.
• To compare trends so that the intersections are clear, by using a multiple line graph.
• To compare cumulative totals by using a stacked bar graph.
• To compare proportions and cumulative totals by using comparative proportional pie charts.
• To compare the distribution of values, for example by using box plots.
• To show the interdependence between cases for variables by using a scatter plot.

Describing data using statistics
• Descriptive statistics enable researchers to describe variables numerically.
• Statistics to describe a variable focus on two aspects:
– the central tendency;
– the dispersion.

Describing the central tendency
• When describing data for both samples and populations quantitatively, it is usual to provide some general impression of values that could be seen as common, middling or average. These are termed measures of central tendency.
• To represent the value that occurs most frequently:
– The mode is the value that occurs most frequently.
– For descriptive data, the mode is the only measure of central tendency that can be interpreted sensibly.
– Data are grouped into suitable categories and the most frequently occurring, or modal, group is quoted.
• To represent the middle value:
– The median is found by ranking all the values in ascending order and locating the mid-point of the distribution.
– For variables that have an even number of data values, the median will occur halfway between the two middle data values.
• To include all data values:
– The mean includes all data values in its calculation. However, it is usually only possible to calculate a meaningful mean using numerical data.

Describing the dispersion
• Two of the most frequently used ways of describing the dispersion are:
– the difference within the middle 50 per cent of values (the inter-quartile range);
– the extent to which values differ from the mean (the standard deviation).
• Although these dispersion measures are suitable only for numerical data, most statistical analysis software will also calculate them for categorical data if numerical codes are used.
• To state the difference between values:
– The range is the difference between the lowest and the highest values.
– The median divides the range into two.
– The range can be further divided into four equal sections called quartiles.
– The lower quartile is the value below which a quarter of the data values fall; the upper quartile is the value above which a quarter of the data values fall.
– The remaining half of the data values fall between the lower and upper quartiles. The difference between the upper and lower quartiles is the inter-quartile range.
– The range can also be divided into percentiles (100 sections) or deciles (10 sections).
• To describe and compare the extent to which values differ from the mean:
– The standard deviation is used to describe the extent of spread of numerical data.
– The coefficient of variation is calculated by dividing the standard deviation by the mean and multiplying the answer by 100; the values of this statistic can then be compared across variables.
– Index numbers compare each data value against a base value that is normally given the value of 100, differences being calculated relative to this value.
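The measures of central tendency and dispersion above can be computed with Python's standard `statistics` module; the data values below are invented for illustration:

```python
import statistics

values = [4, 7, 7, 9, 12, 15, 21]  # invented numerical data, already in ascending order

mode = statistics.mode(values)       # the value that occurs most frequently
median = statistics.median(values)   # the mid-point of the ranked values
mean = statistics.mean(values)       # includes all data values in its calculation

value_range = max(values) - min(values)          # difference between lowest and highest
q1, q2, q3 = statistics.quantiles(values, n=4)   # the three quartile cut points
iqr = q3 - q1                                    # inter-quartile range: middle 50% of values

sd = statistics.stdev(values)        # standard deviation (spread around the mean)
cv = sd / mean * 100                 # coefficient of variation, comparable across variables
```

Note that `q2` equals the median, since the median divides the range into two.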
• An index number is calculated as: index number = (data value ÷ base value) × 100.

Examining relationships, differences and trends using statistics
• In statistical analysis, the relationship between one variable and another can be examined by testing the likelihood of the relationship (or one more extreme) occurring by chance alone, if there really was no difference in the population from which the sample was drawn.
• This process is known as significance or hypothesis testing.
• The data that have been collected are compared with what researchers would theoretically expect to happen.

Testing for normality
• Histograms, box plots and frequency polygons can be used to assess visually whether the data values for a particular numerical variable are clustered around the mean in a symmetrical pattern, and so normally distributed.
• For normally distributed data, the values of the mean, median and mode are also likely to be the same.
• Statistics can be used to establish whether the distribution as a whole for a variable differs significantly from a comparable normal distribution.
• This can be done in statistical software such as IBM SPSS Statistics using the Kolmogorov–Smirnov test and the Shapiro–Wilk test.

Testing for significant relationships and differences
• Testing the probability of a pattern or hypothesis, such as a relationship between variables, occurring by chance alone is known as significance testing.
• With most statistical analysis software, significance testing reports a test statistic, the degrees of freedom (df) and, based on these, the probability (p-value) of your test result, or one more extreme, occurring by chance alone.

Type I and Type II errors
• Inevitably, errors can occur when making inferences from samples. Statisticians refer to these as Type I and Type II errors.
• Type I errors might involve researchers concluding that two variables are related when they are not, or incorrectly concluding that a sample statistic exceeds the value that would be expected by chance alone.
• The term 'statistical significance' refers to the probability of making a Type I error.
• A Type II error involves the opposite occurring: researchers conclude that two variables are not related when they are, or that a sample statistic does not exceed the value that would be expected by chance alone.

Testing for significant relationships and differences
• To test whether two variables are independent or associated, use the chi-square test or phi.
• To test whether two groups are different:
– Ranked data can be tested with the Kolmogorov–Smirnov test.
– Numerical data can be tested with the independent groups t-test or the paired t-test.
• To test whether three or more groups are different, use one-way analysis of variance (one-way ANOVA).

Assessing the strength of relationship
• To assess the strength of relationship between pairs of variables:
– A correlation coefficient enables you to quantify the strength of the linear relationship between two ranked or numerical variables.
– If both variables contain numerical data, Pearson's product moment correlation coefficient (PMCC) can be used to assess the strength of the relationship.
– The two coefficients used most widely for ranked data in business and management research are Spearman's rank correlation coefficient (Spearman's ρ, the Greek letter rho) and Kendall's rank correlation coefficient (Kendall's τ, the Greek letter tau).
• To assess the strength of a cause-and-effect relationship between dependent and independent variables:
– The coefficient of determination enables researchers to assess the strength of relationship between a numerical dependent variable and one numerical independent variable.
– The coefficient of multiple determination enables researchers to assess the strength of relationship between a numerical dependent variable and two or more independent variables.
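As an illustrative sketch (not from the source text), the association, difference and correlation tests above can be run with `scipy.stats`; all data values below are invented:

```python
import numpy as np
from scipy import stats

# Chi-square test of association between two categorical variables,
# laid out as a 2x2 contingency table of counts.
observed = np.array([[30, 10],
                     [20, 40]])
chi2, chi2_p, dof, expected = stats.chi2_contingency(observed)

# Independent groups t-test: are two groups' numerical data different?
group_a = [12.1, 13.4, 11.8, 14.0, 12.6]
group_b = [10.2, 11.1, 10.8, 11.5, 10.9]
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Pearson's product moment correlation coefficient (PMCC)
# for two numerical variables.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r, r_p = stats.pearsonr(x, y)
```

In each case the software reports the test statistic, the degrees of freedom (here, `dof` for the chi-square test) and the p-value, matching the description of significance testing above.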
• To predict the value of a variable from one or more other variables:
– Regression analysis can be used to predict the values of a dependent variable, given the values of one or more independent variables, by calculating a regression equation.

Examining trends
• A line graph can be drawn to obtain a visual representation of a trend.
• Three of the more common uses of such analyses are:
– to explore the trend or relative change for a single variable over time;
– to compare trends or the relative change for variables measured in different units or of different magnitudes;
– to determine the long-term trend and forecast future values for a variable.

Analyzing the Data Part 1: Analyzing Qualitative Data
• Qualitative researchers need to make sense of the subjective and socially constructed meanings expressed by those who take part in research about the phenomenon being studied.
• Since meanings in qualitative research depend on social interaction, qualitative data are likely to be more varied, elastic and complex than quantitative data.
• The quality of qualitative research depends on the interaction between data collection and data analysis, which allows meanings to be explored and clarified.

Deciding on your approach to analysis
• Using a deductive approach:
– Theoretical propositions are used as a means to devise a framework to help researchers organise and direct the data analysis.
– To devise a theoretical or descriptive framework, researchers need to identify the main variables, components, themes and issues in the research project and the predicted or presumed relationships between them.
• Using an inductive approach:
– The alternative to a deductive approach is to start collecting data and then explore them to see which themes or issues to follow up and concentrate on.
– An inductive approach may be a difficult strategy to follow, and may not lead to success for an inexperienced researcher.
The interactive nature of the process
• Data collection, data analysis and the development and verification of propositions are very much an interrelated and interactive set of processes in qualitative research.
• Analysis is undertaken during the collection of data as well as after it, and helps to shape the direction of data collection.
• The interactive nature of data collection and analysis allows researchers to recognize important themes, patterns and relationships as they collect data.

Preparing the data for analysis
• Transcribing qualitative data: interviews are often audio-recorded and subsequently transcribed.
• Using electronic textual data, including scanned documents: for some forms of textual data, the data may already be in electronic format.
• Both need time to organize; however, transcribing qualitative data takes longer.

Aids to help the analysis
• Ways of recording information and developing reflective ideas to supplement written-up notes or transcripts and categorised data include:
– interim or progress summaries;
– transcript summaries;
– document summaries;
– self-memos;
– a research notebook;
– a reflective diary or journal.
• Interim or progress summaries record the progress of the research to date, the results of interviews or observations, and the findings from secondary data.
• A transcript summary compresses long statements into briefer ones in which the main sense of what has been said or observed is rephrased in a few words.
• A document summary lists a document's key points for the research and describes the document's purpose, how it relates to the researcher's work and why it is significant.
• Self-memos allow researchers to record ideas that occur to them about any aspect of their research, as they think of them.
• The purpose of a research notebook is similar to that of self-memos.
• A reflective diary or journal is devoted to reflections about the experiences of undertaking the research, what researchers have learnt from these experiences, how they will seek to apply this learning as the research progresses, and what they will need to do to develop their competence further.

Thematic Analysis
• Thematic Analysis can be referred to as a 'foundational method for qualitative analysis'.
• Thematic Analysis can be used to help researchers:
1. comprehend often large and disparate amounts of qualitative data;
2. integrate related data drawn from different transcripts and notes;
3. identify key themes or patterns from a data set for further exploration;
4. produce a thematic description of these data; and/or
5. develop and test explanations and theories based on apparent thematic patterns or relationships;
6. draw and verify conclusions.

Procedure of Thematic Analysis
• Becoming familiar with the data: familiarization with the data involves a process of immersion that continues throughout the research project.
• Coding the data: coding is used to categorize data with similar meanings. It involves labelling each unit of data within a data item with a code that symbolizes or summarizes that extract's meaning.
• Searching for themes and recognizing relationships: searching for themes involves researchers making judgments about their data while remaining immersed in them.
• Refining themes and testing propositions: the themes that researchers devise need to form a coherent set, so that they provide a well-structured analytical framework to pursue the analysis.
• Evaluating the analysis: Thematic Analysis offers a systematic approach to qualitative data analysis that is accessible and flexible.
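The coding and theme-searching steps above can be illustrated with a toy code-and-retrieve sketch; the extracts, code labels and theme grouping below are all invented:

```python
from collections import defaultdict

# Coding the data: each unit of data (an extract) is labelled with one or
# more codes that summarize its meaning. All examples here are invented.
coded_extracts = [
    ("Staff felt unheard in meetings", ["voice", "management style"]),
    ("Training budget was cut last year", ["resources"]),
    ("Managers rarely ask for feedback", ["voice", "management style"]),
]

# Retrieval: integrate related extracts (possibly from different
# transcripts and notes) by collecting them under each code.
by_code = defaultdict(list)
for extract, codes in coded_extracts:
    for code in codes:
        by_code[code].append(extract)

# Searching for themes: codes with related meanings are grouped under a
# theme, pulling together every extract attached to any of its codes.
themes = {"employee voice": ["voice", "management style"]}
theme_extracts = {theme: [e for code in codes for e in by_code[code]]
                  for theme, codes in themes.items()}
```

This is only a mechanical sketch of code-and-retrieve; the analytical judgments about which codes and themes to devise remain the researcher's, as the procedure above stresses.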
Template Analysis
• Template Analysis is a type of Thematic Analysis, with a few key differences.
• In Template Analysis, a researcher codes only a proportion of the data items before developing an initial list of codes and themes, known as a coding template.
• The coding template is a hierarchical list of codes and themes, which is used as the central analytical tool in Template Analysis.

Procedure of Template Analysis
• The initial procedure of Template Analysis reflects that of Thematic Analysis: familiarizing with the data is the same.
• The initial transcript or transcripts are coded.
• Developing an initial coding template is an exploratory process involving the arrangement and rearrangement of the codes researchers have used, until they devise themes that appear to represent key ideas and relationships in the data.
• As data collection proceeds, the template will be subject to modification, and may continue to be revised until all of the data collected have been coded and analysed carefully.
• Evaluation: Template Analysis adopts a higher level of structure earlier on than Thematic Analysis, through the development of an initial coding template.

Explanation Building and Testing
• Analytic Induction uses an incremental approach to build and test an explanation or theory. It seeks to develop and test an explanation by intensively examining the phenomenon being explored through the successive selection of purposive cases.
• Deductive Explanation Building involves an incremental attempt to build an explanation by testing and refining a predetermined theoretical proposition.
• Pattern Matching involves predicting a pattern of outcomes, based on theoretical propositions, to explain what researchers expect to find from analyzing their data.

Grounded Theory Method
• Grounded Theory Method is part of a wider methodological approach.
• Grounded Theory is an emergent and systematic research strategy. It avoids using a priori codes derived from existing theory and commences inductively, by developing codes from the data.
• The development of an emergent idea or theory from these data informs the direction of a Grounded Theory study.

Narrative Analysis
• Narrative Analysis is a collection of analytical approaches for analysing different aspects of narrative. These may be combined in practice, depending on the research question and purpose, and the nature of the data.
• Thematic Narrative Analysis focuses on 'what' the narrative is about rather than 'how' it is constructed. It can be used to analyse an individual narrative or multiple, related narratives.
• Structural Narrative Analysis analyses the way in which a narrative is constructed.

Discourse Analysis
• In Discourse Analysis, the emphasis is not on studying the way in which language is used for its own sake.
• In this more specific sense, 'discourse' describes how language is used to shape the meaning-making process and to construct social reality.
• A discourse is therefore not seen as neutrally reflecting social practice or relations, but as constructing these.
• Discourse Analysis explores how discourses construct or constitute social reality and social relations through creating meanings and perceptions.

Content Analysis and quantifying qualitative data
• Content Analysis is an analytical technique that codes and categorises qualitative data in order to analyse them quantitatively.
• Content Analysis has a long history that illustrates its use as an approach spanning qualitative and quantitative methods.
• 'Content analysis is a research technique for the objective, systematic and quantitative description of the manifest content of communication.'

Data Display and Analysis
• The process of analysis consists of three concurrent sub-processes:
– data condensation: summarising and simplifying the data collected and/or selectively focusing on some parts of these data;
– data display: organising and assembling the data into summary diagrammatic or visual displays;
– drawing and verifying conclusions through the use of data displays.

Using CAQDAS
• CAQDAS (Computer Assisted Qualitative Data Analysis Software, sometimes abbreviated to QDAS) refers to programs containing a range of tools to facilitate the analysis of qualitative data.
• When used systematically, it can aid continuity and increase both transparency and methodological rigour.

Functions of CAQDAS programs
• Structure of work
• Closeness to data and interactivity
• Explore the data
• Code and retrieve
• Project management and data organisation
• Searching and interrogating
• Writing memos, comments, notes, etc.
• Output

References
• Research Methods for Business Students (7th edition) by Mark Saunders, Philip Lewis and Adrian Thornhill (Chapters 12 and 13).