
Week 10: Evaluation and Integration
Quantitative Data Analytics

Dr Alison McFarland
alison.mcfarland@kcl.ac.uk
Week 10: Weekly Overview

Weekly reading
Coughlan et al. (2007)
Bell et al – refresh from Term 1
Two optional Spector papers (on control variables & CMV)

Today’s tutorial – group project evaluation


Attend! 
Complete the activity

Quiz open 10am Friday (today) to 10am Monday.

2 KING’S BUSINESS SCHOOL | kcl.ac.uk/business


Week 9 Quiz Feedback



Session Overview

• Part 1: Evaluating QDA


• Part 2: Going Beyond the Data
• Part 3: Integrating the Topics & Final Assessment
Learning Outcomes

• By the end of the session you will:


• Understand how to approach evaluation in quantitative data analytics better
• Feel more confident about “going beyond the data” when writing up findings
• Be able to draw together the different elements of the module
• Understand what is coming up in the final assessment
Part 1: Evaluating Data Analytics
There is no such thing as a perfect piece of analytics work...
• No sample is entirely representative
• No measure is perfect
• We can never account for all predictors of a DV
• We can never ‘prove’ causality
• A section of ‘study limitations’ will be demanded by all journals for published
academic research

Establish the merits and the weaknesses of research methodologies, to evaluate the strengths and limits of the knowledge produced
Goals of Evaluation

• Critical evaluation is not simply being negative about the project - it is a constructive and reflective process

• Which features of the study are its main strengths and which are its main
weaknesses?
• How does each influence the persuasiveness of the findings?
• Implications for generalisation of the findings to practice or knowledge?
• How could the study have been done better?
Evaluation Checklist – adapted from Term 1

Research Design – Can we infer causality?
Measures – Validity / reliability?
Sample – Representative of population? Large enough?
Statistical tests conducted appropriately – Assumptions checked? Control variables?
Research cycle – From real world/theory – to hypothesis – to clear finding?
Your QDA Group Project

• You will be asked MCQs about the QDA group project



Causality and Research Design (1)

• Are the researchers using the most appropriate research design?
• Cross-sectional research designs (“snapshot” design)
• Both IV and DV measured once and/or at the same time
• Does not tell us about causal direction of relationships
• Most commonly via surveys
• Potential for measurement bias if using the same type of measures, e.g. two variables both measured by Likert scales
CMV and your group project…



Causality and Research Design (2)

Longitudinal designs
• IV and DV measured at different time-points
• Ideally 3 time-points
• Does a change in IV precede a change in DV?
[Diagram: IV and DV measured across Time 1, Time 2 and Time 3]

(Quasi) Experimental designs
• Use of treatment groups and control groups
• Intervention for treatment group but not for control group
• The difference between groups is assessed
[Diagram: Group 1 (Control) and Group 2 (Treatment); baseline measure, treatment phase, outcome measure; difference between groups?]
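As a rough illustration of "the difference between groups is assessed", the sketch below compares the average baseline-to-outcome change in a treatment group with the change in a control group. The scores are invented for illustration only, and the helper name `mean_change` is hypothetical; a real analysis would also test whether the difference is statistically significant.

```python
from statistics import mean

# Hypothetical DV scores (illustrative numbers only, not data from the module).
control_baseline = [3.1, 2.9, 3.4, 3.0, 3.2]
control_outcome = [3.2, 3.0, 3.3, 3.1, 3.3]
treatment_baseline = [3.0, 3.2, 2.8, 3.1, 3.3]
treatment_outcome = [3.8, 3.9, 3.5, 3.7, 4.0]


def mean_change(baseline, outcome):
    """Average within-person change from baseline to outcome."""
    return mean(after - before for before, after in zip(baseline, outcome))


control_change = mean_change(control_baseline, control_outcome)
treatment_change = mean_change(treatment_baseline, treatment_outcome)

# Extra change in the treatment group over and above the control group.
difference = treatment_change - control_change

print(f"Control change:   {control_change:.2f}")
print(f"Treatment change: {treatment_change:.2f}")
print(f"Difference:       {difference:.2f}")
```

Subtracting the control group's change is what separates the intervention from anything that would have happened anyway (e.g. time passing, repeated measurement).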
Requirements for causal inference
1. Temporal sequencing: i.e. if X causes Y, X must come before Y.
2. Clear causal mechanism: something that creates and explains the connection between IV and DV. Either theoretical, and/or via path models.
3. Eliminating all other explanations
Ideal: random selection of participants, control of EVERYTHING (i.e. experiments)
Reality: regression model which controls statistically



Validity & Reliability of Measures
• How ‘good’ are the measures?
• Have the researchers used ‘pre-existing measures’ or developed their own?
• Have they presented statistical information about the validity and reliability of
measures for their current sample?
• E.g. Cronbach’s alpha – internal consistency; factor analysis – convergent/discriminant validity
• But remember, those are not the only sources of reliability and validity!

• When you look at the measures and their example items, are there any obvious flaws, e.g. confusing, overly long, double meaning, very short/very long response scale?
• There is lots of literature on this topic – good way to demonstrate extra reading
in your exam!
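For reference, the alpha statistic for internal consistency can be computed by hand. The sketch below uses the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores); the responses are hypothetical and the function name `cronbach_alpha` is just an illustrative choice (SPSS reports the same statistic in its reliability output).

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of columns, one list of respondent scores per scale item."""
    k = len(items)
    sum_item_variances = sum(variance(column) for column in items)
    # Total scale score per respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))


# Hypothetical responses: 3 Likert items answered by 6 respondents.
item_1 = [4, 5, 3, 2, 4, 5]
item_2 = [4, 4, 3, 2, 5, 5]
item_3 = [5, 5, 2, 3, 4, 4]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(f"Cronbach's alpha = {alpha:.2f}")
```

A high alpha only says the items hang together in this sample; it says nothing about whether they measure the intended construct.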
Non-Random Sampling

• In practice it is very difficult, if not impossible, to achieve a truly random sample
• Have the researchers made attempts to achieve an
appropriate sample?
• Who is the sample?
• Who is the (intended vs legitimate) population? Is the
sample representative of intended population? Or, which
population are the findings likely to generalise to?
• Response rates? Missing data?
• What are the limitations regarding the generalisation of
findings?
Sample Size & Statistical Power

• Have the researchers collected data with an appropriate sample size?
• Larger samples are good because:
• They better represent the population
• Statistical tests have greater power

• Statistical power is the probability that an inferential test correctly rejects the null hypothesis when it is false (i.e. detects an effect that really exists)
• Sample size ‘rules of thumb’ exist for power – see
textbook chapters for specific tests
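To see why larger samples give greater power, the sketch below approximates the power of a two-group comparison of means for a given standardised effect size. It uses a normal approximation rather than the exact t-distribution, and the effect size (d = 0.5) and sample sizes are assumptions chosen for illustration, not rules of thumb from the textbook.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size_d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Normal approximation; ignores the negligible chance of a significant
    result in the wrong direction.
    """
    z = NormalDist()
    critical_value = z.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size_d * sqrt(n_per_group / 2)
    return z.cdf(noncentrality - critical_value)


for n in (20, 64, 200):
    print(f"n per group = {n:>3}: power is roughly {approx_power(0.5, n):.2f}")
```

With these assumptions, power rises steeply with sample size: a study with 20 per group would miss a medium-sized effect more often than it finds it.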
Test Assumptions

• Have the test assumptions been met?

• Each inferential test has a set of assumptions that must be met in order for the
test statistics to be valid
• These vary from one test to another - Andy Field textbook is very good on assumptions

• Example: you code everybody’s favourite colour (red = 1, green = 2, blue = 3).
You calculate a mean value, which is 2.13. This value is meaningless, because you
haven’t met the core assumption for mean calculations – that the data is
continuous (or at least has an order!).
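The favourite-colour example can be checked directly. In the sketch below (hypothetical responses), renumbering the colour codes changes the "mean" even though the data are identical, which is why the mean is meaningless for nominal data; the mode is unaffected by the coding.

```python
from statistics import mean, mode

# Hypothetical favourite-colour responses.
answers = ["red", "blue", "green", "blue", "blue", "red", "green", "blue"]

coding_a = {"red": 1, "green": 2, "blue": 3}
coding_b = {"blue": 1, "red": 2, "green": 3}  # same data, labels renumbered

mean_a = mean(coding_a[colour] for colour in answers)
mean_b = mean(coding_b[colour] for colour in answers)
most_common = mode(answers)

print(f"Mean under coding A: {mean_a}")
print(f"Mean under coding B: {mean_b}")
print(f"Mode (same under any coding): {most_common}")
```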
Control Variables

• Have the researchers used appropriate control variables?
• Control variables remove the influence of potentially confounding factors from study findings
• Including a variable as a predictor in a multiple regression “controls for” its influence on both IV and DV
• e.g. “having controlled for the effects of working hours, gender was not found to influence salary”
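The working-hours example can be illustrated with simulated data. In the sketch below, salary is generated to depend only on working hours, while hours differ between two groups coded 0/1; all numbers are invented assumptions. The "controlled" coefficient is obtained by residualising both gender and salary on hours, which gives the same gender coefficient as a multiple regression with both predictors.

```python
import random
from statistics import mean


def simple_slope(x, y):
    """OLS slope and intercept for y regressed on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(x, y):
    """What is left of y after removing the part predicted by x."""
    slope, intercept = simple_slope(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


rng = random.Random(42)  # fixed seed so the illustration is repeatable
n = 2000
gender = [rng.randint(0, 1) for _ in range(n)]           # arbitrary 0/1 coding
hours = [30 + 10 * g + rng.gauss(0, 5) for g in gender]  # hours differ by group
salary = [1000 * h + rng.gauss(0, 5000) for h in hours]  # salary depends on hours only

# Gender "effect" on salary without controlling for hours.
raw_effect, _ = simple_slope(gender, salary)

# Gender effect after controlling for hours (regress residuals on residuals).
controlled_effect, _ = simple_slope(residuals(hours, gender), residuals(hours, salary))

print(f"Without control: {raw_effect:,.0f}")
print(f"Controlling for hours: {controlled_effect:,.0f}")
```

In this simulation the apparent gender gap is produced entirely by working hours, so it shrinks towards zero once hours are controlled. Real data are rarely this clean, and a control can only remove the influence of variables that were actually measured.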
Paradigm critiques

“The Qualitative Criticism”
• Belief that there are deep problems with the underlying epistemological and ontological approach of quantitative research
1. Social world not the same as natural world
2. Measurement process artificial and spurious
3. The meaning of events to individuals is ignored

“The Big Data Criticism”
• Overly focused on theory and explanation
• Social science approach open to sample / context idiosyncrasies
• Small and static research designs
• Failure to capture “prediction” and offer practical solutions
Evaluation is like detective work…

• What are the clues in what I am being told?
• What is unclear or ambiguous?
• What am I not being told, that I
need to know?

• (Inferential) statistics are based on a set of probabilities


• How much CONFIDENCE should I have in the results?
• What is the LIKELIHOOD that caveats are needed?
Part 1 Summary

• No analytics project is perfect – there are always going to be trade-offs made in order to complete the research.
• When evaluating, we should think about:
• Research design, and its implications, particularly for causality
• Whether the measures are reliable and valid
• Whether the sample has been appropriately selected, and whether it’s large enough
• Whether test assumptions have been met
• Whether control variables have been used
• We could also consider the wider limitations of the quantitative paradigm



Part 2: Going Beyond the Data
Generating Narrative

• ‘So what?’ of the findings – zooming out from the numbers, what do they all mean?
• How do the statistical findings contribute to the
analytics aims/RQ?
• Are there any additional statistics that might
contribute (e.g. stats from extra variables;
environmental factors)?
• Degree of confidence in the findings (evaluation)
• Recommendations to the organisation / stakeholder
Narrative is based on two things…

Deep Understanding
• Do you understand the scenario?
• Do you understand the aims/purpose?
• Do you understand how the project was conducted?
• Do you understand all of the statistics?
• Do you understand what best practice looks like?

Communication
• Do you know your audience and what they will understand?
• Norms of communication
• Should you use visual communication?
• Can you tell the story and “bring meaning to the surface”?
Six Thinking Hats – applied to QDA

[Diagram: six hats, one per element of the answer]
• How your answer is structured
• Alternatives & trade-offs
• Key findings of project
• Contributions from project
• Big picture messages
• Critical evaluation



Part 3: Integration & Assessment
Topics

• 5QQM245 Introduction to Organisational Research Methods (foundational)
• Week 1: Introduction to Data Analytics & Chi-square
• Week 2: Mechanics & Fundamentals
• Week 3: t-tests & ANOVA
• Week 4: Introduction to Regression
• Week 5: Multiple Regression
• Week 6: Logistic Regression
• Week 7: Factor Analysis & Reliability
• Week 8: Designing an Analytics Project
• Week 9: Big Data & Machine Learning
• Week 10: Evaluating Data Analytics
Themes



Your Continuous Assessment (20% of final grade)

• (10%) weekly KEATS tests – all 10 quizzes count
• (10%) tutorial participation grade – 9 out of 10 tutorials count (your ‘worst’ score is deducted)

• You should receive your final grades on this element sometime around mid-April
Your Exam (80% of final grade)

Thursday 4th May 0900 – 1200; open-book; 100 marks total

Your Project (50 marks)
• You will be asked questions about your analytics project
• These should reflect individual work and not be identical to your team mates’ answers (though some similarity will be unavoidable)

Analytics Scenario (50 marks)
• You will be asked to report and critique the findings of an analytics project you have not seen before
• This will include an SPSS output

See the Keats page, Assessment section, for a past paper


What kinds of questions could be asked about my group project?
• Describe aspects of the project procedure
• “Give a rationale for decisions made…”
• Why and how did you select your hypotheses?
• Describe your main findings
• What was the main strength / weakness of your project?
• What could have been improved?
• What factors constrained your project?
About the open book element

• Instruction: “An A4-size folder (or similar) containing written or printed notes and materials, no books”
• Don’t rely on your notes! They are primarily for when you get stuck.
• Make summaries, “cheat sheets”, etc.



Examples (1): What’s the story?

• A bank is interested in identifying factors that predict customers defaulting on their mortgages
• They collate data on all of their mortgage customers over the past 5yrs
• Random samples of 200 cases who defaulted and 200 cases who did not
• Following a series of meetings with the Bank’s Mortgage Team, a list of likely antecedents of defaulting was identified
• A logistic regression found a pseudo R-sq of .14.
• Age of customer (in yrs; Exp(b) = 1.040) and prior history of loan defaulting (no=0; yes=1; Exp(b) = 1.219) had the two highest Wald statistics and were significant (p<.01)
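When telling the story of a logistic regression, Exp(b) values are easier to communicate as percentage changes in the odds. The sketch below does this for the two Exp(b) values quoted in the example; the helper name `percent_change_in_odds` and the ten-year comparison are illustrative choices, and the calculation assumes the usual interpretation of Exp(b) as an odds ratio per one-unit increase in the predictor.

```python
# Exp(b) values taken from the bank example.
exp_b_age = 1.040            # odds ratio per additional year of age
exp_b_prior_default = 1.219  # odds ratio for prior default (yes vs no)


def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds for a one-unit increase in the predictor."""
    return (odds_ratio - 1) * 100


# Odds ratios multiply, so a ten-year age gap compounds the per-year ratio.
ten_year_odds_ratio = exp_b_age ** 10

print(f"Each extra year of age: {percent_change_in_odds(exp_b_age):.1f}% higher odds of default")
print(f"Prior default: {percent_change_in_odds(exp_b_prior_default):.1f}% higher odds of default")
print(f"Ten years older: odds multiplied by {ten_year_odds_ratio:.2f}")
```

These are changes in odds, not in probability, and they sit alongside a pseudo R-sq of .14, so most of the variation in defaulting remains unexplained by the model.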
Examples (2): What’s the story?

• An analyst working for the British Dental Association is interested in the salary
of dentists; they want to see whether paying dentists more would lead to greater
profit for the dental practices they work for
• They download a dataset collected in 2015 by a recruitment agency about careers
of dental practitioners in London
• This included data on average salary for all workers in the dental practice and average client satisfaction; clients were texted after a visit with: “Good appointment?” (and clicked on emojis for sad [1], non [2] & happy [3])
• Salary & client satisfaction are correlated: r = .23; p = .015
• Multiple regression with DV of client satisfaction and controls for age and gender; std. beta for salary predicting client satisfaction = .15; p = .049
What Should You Do Now?

• Catch up on core reading, consider wider reading, and expand your lecture / tutorial notes
• Check out the marking criteria
• Look at the past paper and the feedback that was given to students based on it
• Attend the online revision session, Thursday 27th April, 2pm
• Don’t panic!
Any questions?
Drop in NOW! Bush House N2.18

alison.mcfarland@kcl.ac.uk

Please fill out the module evaluation – it closes today!
