Plenary

Introduction

Andrew Rogers
andrew.rogers@bristol.ac.uk

Introductions

• Dr Roberta Bernadi
• Unit director
• Lecturer for weeks 6-10 (AI for business)

• Dr Andrew Rogers
• Lecturer for weeks 1-5 (Data analytics for business)

A few words about the assignment

• Assignment due date: Wednesday December 14th, 13:00 GMT

• (FYI the AI part (weeks 6-10) assessment due date is 9th January 2023)

Predictive analytics for business

• Breadth vs. depth
• Introduction to several techniques
• You may choose to “deep-dive” certain aspects for your assignment.
• “Passing the course” vs. “Building towards your career”
• Practical approach with some theoretical aspects, which help you to understand what is going on.

Topics for weeks 1 - 5

• Session 1: Linear regression (spread over two weeks)
• Generating predictive models where we try to understand how a set of independent variables predicts a dependent variable that we are interested in (e.g. how does the price of a product, or advertising spend, predict volume sales?)
• Session 2: Logistic regression
• Generating a predictive model where a linear regression approach is not suitable (e.g. predicting group membership, such as credit risk)
• Session 3: Exploratory Factor Analysis (EFA), Confirmatory Factor Analysis (CFA)*, Structural Equation Modelling (SEM)*
• Reducing a large data set into themes based on how the individual variables “link up” or “move in similar ways”, then using this reduced data set to predict a dependent variable.
• Session 4: Cluster analysis and discriminant analysis
• How do, for example, consumers form statistically similar groups based on their attitudes to life, their behaviour etc.? How can we predict other consumers’ group memberships based on some of these traits?

* Brief introduction
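
As a rough illustration of the Session 1 idea, the sketch below fits a one-variable linear regression by ordinary least squares. It is written in plain Python purely for illustration (the course itself uses SPSS, Excel and R), and the price and sales figures and the function name `fit_simple_ols` are invented for this example.

```python
from statistics import mean

# Hypothetical data, invented for illustration:
# unit price (independent variable) and weekly volume sales (dependent variable).
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [98.0, 91.0, 79.0, 72.0, 60.0]


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_simple_ols(prices, sales)
print(f"sales = {intercept:.2f} + ({slope:.2f}) * price")

# Use the fitted line to predict sales at a price that was not observed.
predicted_sales = intercept + slope * 3.5
print(f"predicted sales at price 3.5: {predicted_sales:.2f}")
```

With these made-up numbers the slope is negative, i.e. higher prices go with lower predicted sales. Multiple regression extends the same idea to several independent variables.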
Software exposure for weeks 1 - 5

• Week 1: SPSS/Excel

• Week 2: SPSS/Excel/RStudio

• Week 3: SPSS/Excel/RStudio

• Week 4: SPSS/AMOS (optional self-study)

• Week 5: SPSS/Excel

Download SPSS

• IBM SPSS (Statistical Package for the Social Sciences) is a statistical package widely used in industry.
• Current students can download it for free from the UoB software site.
• Ensure you are signed into MyBristol (single sign-on).
• Use the search facility or click the link below to get to the software site.
https://e5.onthehub.com/WebStore/Welcome.aspx?vsro=8&ws=0fbbbbd6-0834-e211-aed3-f04da23e67f6

• Click on Start shopping
• Click on IBM SPSS.
• Click on SPSS Statistics 28
• Click Add to cart (free)
• Choose whether you wish to have a Mac or Windows licence
• Make a note of the licence key (a very long string; I recommend copying and pasting it, otherwise you will need to type it in)
• Install the software. At some stage you will need the licence key mentioned above.
Download AMOS

• AMOS is a Structural Equation Modelling package that is popular in its field
• It is a graphical package which allows you to build a pictorial structure for your proposed model
• We will use it to look at Confirmatory Factor Analysis and also regression models
• It is an SPSS “add-on”, and files can be used interchangeably in AMOS and SPSS
• It can be downloaded through the UoB website in the same way as SPSS (link below)

https://e5.onthehub.com/WebStore/Welcome.aspx?vsro=8&ws=0fbbbbd6-0834-e211-aed3-f04da23e67f6
Enable Excel’s Solver

• Solver is an add-in which comes as standard with Excel and uses optimisation algorithms to solve mathematical problems
• We will use Solver as an alternative to regression modelling
• Ensure you enable Solver in Excel through the following steps. The option will then appear in the “Data” tab in Excel, on the right-hand side.
• Choose File/Options/Add-ins/Manage Excel Add-ins/
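
To give a feel for what estimating regression parameters with an optimiser involves, the sketch below minimises the sum of squared errors numerically instead of using the regression formula. It is a plain-Python illustration under stated assumptions: the data are invented, the function names are hypothetical, and simple gradient descent stands in for whatever algorithm Solver applies internally.

```python
# Hypothetical data, invented for illustration:
# advertising spend (x) and volume sales (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [12.0, 15.0, 21.0, 24.0, 28.0]


def sse(intercept, slope):
    """Sum of squared errors: the objective an optimiser would be asked to minimise."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(spend, sales))


def minimise_sse(learning_rate=0.04, iterations=20000):
    """Adjust the two parameters step by step to reduce the mean squared error."""
    intercept, slope = 0.0, 0.0  # starting guesses, like initial cell values
    n = len(spend)
    for _ in range(iterations):
        residuals = [y - (intercept + slope * x) for x, y in zip(spend, sales)]
        # Gradient of the mean squared error with respect to each parameter.
        grad_intercept = -2.0 * sum(residuals) / n
        grad_slope = -2.0 * sum(r * x for r, x in zip(residuals, spend)) / n
        intercept -= learning_rate * grad_intercept
        slope -= learning_rate * grad_slope
    return intercept, slope


intercept, slope = minimise_sse()
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, SSE = {sse(intercept, slope):.3f}")
```

For a straight-line model the numerical answer should agree with ordinary least squares. The appeal of an optimiser is that the same approach still works when the functional form is more complex and no simple formula exists.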
Download R and RStudio

• R is an incredibly(!!!!) powerful and diverse (mainly statistical) package.
• It is open source and free to download.
• It is growing in popularity, and many analytical jobs now look for some experience in R.
• RStudio is a “wrap-around” for R. It offers little extra functionality over R itself, though I feel it makes the software easier to navigate.
• You may use R or RStudio for this course (though the examples will be taught in RStudio). The programming code is identical.
• I will demonstrate R in the final session (Bayesian regression), where it will be the main focus.
• It has a much steeper learning curve than SPSS, though for those of you who envisage a career in business/marketing analytics, it may be a useful addition to your skill set.
• Follow the steps to download R from https://www.r-project.org/
• Download RStudio from https://rstudio.com/

SPSS/AMOS can be found in the start menu

Introduction to SPSS

• The tutor will demonstrate the following live:
• Loading the software
• Structure of the main window view (data/variable)
• Output Window
• Opening .sav files
• Importing files (e.g. .CSV files)
• Syntax editing window

Tutor will demonstrate examples live


Variable structure within SPSS

• Variable structure
• Name: Short text of limited length (no spaces)
• Type: The nature of the data
• Label: A more flexible text description of the variable
• Values: Used if values are restricted to specific options
• Measure:
• Scale: ratio variable. Obeys arithmetic principles (addition, subtraction, multiplication etc. are meaningful), e.g. in a race Mary finished in 54 seconds and John in 63 seconds. Hence Mary was 63 - 54 = 9 seconds faster, or John took about 17% longer to finish ((63 - 54) / 54 * 100).
• Ordinal: Only contains an order relationship (e.g. Mary finishes 1st and John 2nd, but there is no way of assessing the distance between them)
• Nominal: No mathematical comparison; the values are simply codes (e.g. 1 = female, 2 = male)
• Changing variable structure options
Tutor will demonstrate examples live
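
The race example can be worked through in a few lines. The sketch below uses plain Python purely for illustration (SPSS is the tool used in the session) and the variable names are assumptions. It shows which calculations are meaningful at the scale level and what is left once only the ordinal ranking is kept.

```python
# Scale (ratio) data: finishing times in seconds, taken from the slide example.
times = {"Mary": 54, "John": 63}

# Differences and ratios are meaningful for scale data.
difference = times["John"] - times["Mary"]           # seconds between the two runners
percent_longer = difference / times["Mary"] * 100    # how much longer John took, in %

print(f"Mary was {difference} seconds faster")
print(f"John took {percent_longer:.1f}% longer")

# Ordinal data: keep only the finishing order and discard the times.
ranks = {
    name: position
    for position, name in enumerate(sorted(times, key=times.get), start=1)
}
print(ranks)

# The ranks say who finished first, but the 9-second gap cannot be recovered from them.
```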
Using the Table option

• Table using two ordinal variables


• Count
• Row %
• Column %
• Table %
Tutor will demonstrate examples live
• Table using one ordinal and one scale variable
• Mean
• Count
Tutor will demonstrate examples live

• Tables allow you to temporarily change the variable Measure option


• Analyse Age as a frequency
• Analyse Likert as an average score
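
The four table statistics differ only in the total each cell count is divided by. The sketch below makes that explicit, and also shows a Likert score summarised as an average. It is a plain-Python illustration only (the session itself uses SPSS tables); the survey responses, age bands and the helper name `cell_summary` are invented for this example.

```python
from collections import Counter, defaultdict

# Hypothetical survey responses, invented for illustration:
# (age band, Likert score from 1 to 5).
responses = [
    ("18-34", 4), ("18-34", 5), ("18-34", 4),
    ("35-54", 3), ("35-54", 4),
    ("55+", 2), ("55+", 3), ("55+", 3),
]

counts = Counter(responses)                          # count per (row, column) cell
row_totals = Counter(age for age, _ in responses)    # total per age band (row)
col_totals = Counter(score for _, score in responses)  # total per score (column)
total = len(responses)                               # total for the whole table


def cell_summary(age, score):
    """Count, Row %, Column % and Table % for one cell that occurs in the data."""
    n = counts[(age, score)]
    return {
        "count": n,
        "row_pct": 100 * n / row_totals[age],
        "col_pct": 100 * n / col_totals[score],
        "table_pct": 100 * n / total,
    }


print(cell_summary("18-34", 4))

# Treating the Likert score as a scale variable instead: mean score per age band.
scores_by_age = defaultdict(list)
for age, score in responses:
    scores_by_age[age].append(score)
mean_by_age = {age: sum(s) / len(s) for age, s in scores_by_age.items()}
print(mean_by_age)
```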
Tutorial structure (Week 1 – 5)

• The tutorial time should be time you dedicate to working on this unit.
• There is some flexibility
• Recommended: Replicate asynchronous work from the lectures
• If you are completely happy with the lecture content, you may skip this step, though it is recommended that you try to replicate the lecture results yourself.
• Mandatory: Work through the problems in the tutorial Word document.
• These get progressively more difficult. Try to work through as many as you can in the tutorial. You may continue in your own time. Use Office Hours if you have questions.
• Optional: Work on your assignment by exploring the techniques

Assignment bases (suggestions only!)

• Multiple regression models


• Use of more complex functional forms
• Offset models for multiple product/brand
• Dummy variables
• Solver based estimation of parameters
• Compare different solutions/ways of estimating parameters
• Multinomial logistic regression models
• More complex functional form (e.g. use of interaction terms/offsets)
• Exploratory factor analysis
• Using the resulting models in a predictive model (linear/logistic)
• Comparison with CFA/SEM model
• Cluster analysis
• Discriminant model predicting clusters
• Comparison with logistic regression
• Simulations

Suggestion for the content structure

• What is the overriding business/academic/research issue you are addressing?


• What data are you using to achieve this?
• Brief description/summary stats
• Identification of possible dependent/independent variables
• How may they potentially answer the business issue
• What model/analysis will you be using?
• Overview of the model. Any comparative models you may use
• Build the analysis/model
• Report model diagnostics (stats)
• Interpret the model to meet the business need (e.g. to a lay audience)
• Limitations/future developments to improve the model
• Does it lend itself to a machine learning approach?
• Advantages and disadvantages, whether it does or not

Q&A
