
Doing Bayesian Data Analysis with JASP


Darrell A. Worthy
Texas A&M University
JASP
• As we have discussed, Bayesian inference offers several advantages
that remain unavailable to researchers who continue to rely on
traditional methods.
• Researchers can update knowledge as the data come in, quantify
evidence for the alternative and null hypotheses, and monitor
evidence until there is sufficiently compelling evidence or available
resources have been exhausted.
• Jeffreys envisioned an evolutionary theory-building process.
• As a method for drawing scientific conclusions from data, Bayesian
inference seems more appropriate than classical inference.
JASP
• You may now be somewhat convinced that there is something to this
Bayesian data analysis stuff.
• Some researchers may see the utility of adopting Bayesian methods,
but have more practical concerns.
• These researchers may feel it is best to use an inclusive approach
where both classical and Bayesian results are reported.
• Both hardcore Bayesian enthusiasts and people on the fence have to
overcome the current difficulty in transitioning from Bayesian theory
into Bayesian practice.
JASP
• For many researchers it is difficult to obtain Bayesian answers to common
statistical questions involving correlations, t tests, ANOVA, and other
standard analyses.
• Until recently these tests were unavailable in any software, let alone user-
friendly software.
• In the absence of easy-to-use software few researchers feel enticed to learn
about Bayesian inference and few teachers feel enticed to teach it to their
students.
• What’s needed for Bayesian analysis to supplant the problematic and kludgy
frequentist methods is software that makes it easy to run and report the
results of Bayesian tests.
JASP
• To narrow the gap between Bayesian theory and practice, the Amsterdam
group, partially funded by the European Research Council (the “Bayes or Bust”
grant), has developed JASP.
• JASP is a free, open source statistical package with a GUI similar to SPSS.
• JASP was originally conceived as allowing only Bayesian tests, but it now
offers classical tests as well.
• The Bayesian tests utilize the BayesFactor package written by Rouder and
Morey for R.
• I will use a tutorial on JASP by Wagenmakers and colleagues to present
some examples using JASP that you can follow along with on your laptops.
JASP philosophy
• JASP focuses on the statistical methods that researchers use most
often.
• Add-on modules can be used to implement more sophisticated and
specialized analyses.
• JASP also outputs tables in APA format that can be pasted directly into
Microsoft Word or LibreOffice.
• JASP uses progressive disclosure, which means that it starts with
minimal output and you check boxes for additional statistics.
JASP philosophy
• JASP also allows users to save their data in a .jasp file on the Open
Science Framework (OSF).
• This allows reviewers and readers to have easy access to the data
themselves.
• It also facilitates collaboration as colleagues can share data through the
OSF.
• One goal of JASP is to make the benefits of Bayesian data analysis more
widely available than they are now.
• A secondary goal is to reduce our field’s dependence on expensive
statistical software programs like SPSS.
Correlation example
• For the first example we return to the height advantage of US
presidents.
• We’re interested in the Pearson correlation r between the
proportion of the popular vote and the height ratio (the president’s
height divided by that of his or her competitor).
• In other words, we want to examine the evidence that the data
provide for the hypothesis that taller presidential candidates attract
more votes.
Correlation example
• The sample correlation r was .39.
• This was significantly different from zero (p = .007, 95% CI [.116, .613]).
• Under a default uniform prior, the Bayes factor was 6.33 for a two-sided test and
12.61 for a one-sided test.
• Now we will replicate these analyses in JASP.
• You can follow along on your laptop if you like.
Correlation example
• You should have already downloaded and installed the latest version
of JASP.
• This tutorial uses JASP 0.8.1.1, released in March 2017.
• I use a PC, but JASP should work the same way on a Mac (I hope).
• You should see a blank interface the first time you open JASP
Correlation example
• Click on the File tab and it should default to Open
• Click on Computer and you can search for a file to open from your
computer.
Correlation example
• I could click on the BayesianCourse folder, but I will click Browse, since that
is the general method for finding any data file on your computer.
• When you click on Browse it will open a Windows file browser that
lets you find any file.
• In this case it opens to the last folder I saved a data file to, which
contains the Presidents.csv file.
Correlation example
• Click on Presidents and select open.
• JASP should open the data file in its left window.
• Column 1 is just a subject identifier
• Columns 2 and 3 contain the variables we’re interested in.
Correlation example
• Because JASP uses R packages such as BayesFactor, it requires files to
be saved in .csv (comma-separated values) format.
• This is simple: any Excel file can be saved as a .csv file.
• The only requirement is that the first row contains labels (variable
names) for the data columns.
• In R these labels can be called directly in analyses with functions such
as lm, glm, lmer, and glmer, as in the small example below.
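• A minimal hypothetical illustration, assuming Presidents.csv has columns named
HeightRatio and PopularVote after import (those names are assumptions):

  # Fit an ordinary regression using the header labels from the .csv file.
  presidents <- read.csv("Presidents.csv")
  lm(PopularVote ~ HeightRatio, data = presidents)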
Correlation example
• When you open the data, JASP takes its best guess as to whether each
variable is continuous, ordinal, or categorical (nominal).
• The ruler icons next to each of our variables mean JASP thinks these
are continuously valued data.
• If this were incorrect then we could click on the ruler sign and change
it to ordinal or nominal.
Correlation example
• Click on Regression and then Bayesian Correlation Pairs.
• This should cover up much of the Data window and show an Analysis
and Results window
Correlation example
• Click on Height Ratio and then click the right arrow, and repeat for
Popular Vote.
• The Results window will then show minimal output: the sample r and the BF
for the alternative hypothesis (a rough R equivalent is sketched below).
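• A rough R equivalent using the correlationBF function in recent versions of the
BayesFactor package (the package JASP builds on); the data frame and column
names are assumptions about how Presidents.csv imports, and the rscale value is
my guess at a width that mimics the default uniform prior.

  library(BayesFactor)

  presidents <- read.csv("Presidents.csv")

  # Two-sided test; rscale sets the width of the stretched-beta prior on rho.
  # A width of 1 is intended to mimic the uniform prior mentioned earlier,
  # but treat that correspondence as an assumption to verify.
  bf_two_sided <- correlationBF(y = presidents$PopularVote,
                                x = presidents$HeightRatio,
                                rscale = 1)
  bf_two_sided   # should land near the BF10 of 6.33 quoted earlier

  # One-sided version (positive correlations only), for comparison with the
  # one-tailed JASP analysis on the next slide.
  correlationBF(y = presidents$PopularVote, x = presidents$HeightRatio,
                rscale = 1, nullInterval = c(0, 1))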
Correlation example
• JASP allows us to specify one-tailed hypotheses.
• In the Analysis window under Hypothesis check Correlated positively
• The Results window will immediately update with the one-tailed BF
Correlation example
• Check Correlated again under Hypothesis to return to the two-sided test.
• Now in the Analysis window under Plots check Scatterplot.
• A scatter plot with the OLS regression line should appear in the Results window.
• In the Results window, clicking on the scatterplot lets you copy the plot and
paste it wherever you want.
Correlation example
• In the Analysis window check Prior and Posterior and Additional info.
• The resulting plot in the Results window shows the uniform prior as the dashed line.
• The posterior distribution of r is the solid line.
• It also shows the 95% credible interval (HDI).
Correlation example
• To visualize the odds they have been transformed into probabilities
and shown as proportion wheels.
• To transform odds into probabilities: odds/(odds + 1), as in the short
snippet below.
• 6.33/7.33 = .86, so the red area, which represents support for the
alternative hypothesis, covers 86% of the wheel.
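• The same conversion in R, using the Bayes factor quoted above:

  # Convert a Bayes factor into posterior model probabilities, assuming the
  # two hypotheses were equally likely a priori.
  bf10 <- 6.33
  p_h1 <- bf10 / (bf10 + 1)   # about .86, the red area of the wheel
  p_h0 <- 1 - p_h1            # about .14, the white area
  c(p_h1, p_h0)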
Correlation example
• This figure plots examples of proportion wheels.
• A BF of 3 indicates substantial support for the
alternative hypothesis, yet the null hypothesis still
represents 25% of the circle’s area.
• Suppose you closed your eyes and threw a dart at
the BF=3 wheel.
• Your surprise at landing a dart in the white area
provides a good intuition of what the Bayes factor
conveys.
Correlation example
• Having the imaginary dart land in the white area would be somewhat
surprising, but not sufficiently surprising to warrant a strong claim
such as one that accompanies a published article.
• Yet many p values near the .05 boundary yield evidence that is
weaker than a BF of 3.
• When the competing models are equally likely a priori the probability
of making an error equals the size of the smaller area.
• Note that the Bayesian formulation refers to the probability of making
an error in this particular case, whereas in frequentist methods it is an
average across all possible data sets that could have been observed.
Correlation example
• Applied Bayesian data analysis is a relatively new field and there are no set
conventions for reporting results.
• My recommendation here would be to report r, the BF10, the 95% HDI, and
possibly the p value, if you must (Bayes factors should obviate that).
• Be sure to clearly state that you’re reporting the highest density (most credible)
interval; you may also want to report the 95% confidence interval.
• JASP labels it the 95% CI, which is easily confused with a confidence interval.
• I recommend calling it the HDI or HCI; be consistent with whatever you decide to
call the most credible interval.
• The scatterplot is, of course, important to include (recall Anscombe’s quartet).
• You may want to include the Prior and Posterior plot as well.
Bayesian t-test example
• Topolinski and Sparenberg (2012) presented data that provided
support for the hypothesis that clockwise movements induce
psychological states of temporal progression and an orientation
toward the future and novelty.
• Participants rolled kitchen towels either clockwise or counter-
clockwise.
• The data showed that clockwise rollers reported more openness to
experience than counter-clockwise rollers.
Bayesian t-test example
• Wagenmakers and colleagues attempted to conduct a pre-registered replication of this
study.
• They proposed to collect a minimum of 20 participants in each condition.
• They then monitored the BF and proposed to stop the experiment once a BF of ten or
1/10 was reached.
• This would indicate strong evidence (the four levels beyond a BF of 3 are substantial,
strong, very strong, and decisive; Wagenmakers and colleagues propose changing
substantial to moderate).
• If strong evidence was not obtained, the experiment would stop once 50 subjects
had been run in each condition.
• Such a detailed stopping plan is not strictly necessary from a Bayesian perspective,
but note that they used a higher BF threshold to warrant stopping early.
Bayesian t-test example
• In JASP click on File then Browse, and open KitchenRolls.csv
• JASP should open up a new window with the KitchenRolls data
Bayesian t-test example
• In the KitchenRolls file the dependent variable of interest is “mean
NEO”, which contains the openness to experience scores.
• The “Rotation” variable is for group membership which has entries
either “counter” or “clock”.
• Click on the T-Tests tab, then Bayesian Independent Samples T-test.
Bayesian t-test example
• Select mean NEO as the dependent variable.
• Select Rotation as the grouping variable.
• Check the Descriptives box.
• We can see that the mean openness scores are actually higher in the
counter-clockwise group.
Bayesian t-test example
• Now check Prior and Posterior and Additional Info.
• In the Results window a plot will appear showing the prior and
posterior distributions for effect size d.
• The BF01 quantifies the evidence for the null hypothesis, which here is
3.71 (an R sketch of the same test follows).
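• A minimal R sketch of the same test with the BayesFactor package; the column
names mean.NEO and Rotation are assumptions about how KitchenRolls.csv imports.

  library(BayesFactor)

  kitchen <- read.csv("KitchenRolls.csv")
  kitchen$Rotation <- factor(kitchen$Rotation)   # "clock" vs. "counter"

  # rscale = 0.707 is JASP's default Cauchy prior width
  bf10 <- ttestBF(formula = mean.NEO ~ Rotation, data = kitchen,
                  rscale = 0.707)
  1 / extractBF(bf10)$bf   # BF01, evidence for the null; about 3.71 here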
Bayesian t-test example
• The 95% HDI [-.508, .226] would not be contained in a ROPE between
-.1 and .1.
• From Kruschke’s model estimation approach we could not say that the
data support the null hypothesis.
• The BF01=3.71, however, indicates substantial support for the null.
Bayesian t-test example
• In the preregistered study Wagenmakers and colleagues specifically
proposed conducting a one-tailed test of the hypothesis that
openness to experience would be greater for the clockwise group.
• In the Analysis window under Hypothesis check Group1 > Group2.
• This prior expectation, combined with data that show the opposite
pattern, increases the evidence for the null (a one-sided R sketch follows).
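• A one-sided sketch in R (same assumptions as before); with the formula
interface the sign of the effect depends on the factor-level ordering, so the
direction of the interval is something to verify against your own output.

  bf_one_sided <- ttestBF(formula = mean.NEO ~ Rotation, data = kitchen,
                          rscale = 0.707, nullInterval = c(0, Inf))
  bf_one_sided   # the first numerator model is the directed hypothesis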
Bayesian t-test example
• We have not discussed posterior predictive checks or checks of how
the specification of the prior affected the results.
• The t test uses a Cauchy prior, which is a distribution similar to the t
distribution.
• The Cauchy prior width r is set to .707 by default in JASP; Jeffreys
recommended a width of 1.
• As the width r increases, the evidence in favor of the null hypothesis
tends to increase.
• For now it’s best to stick to the default value JASP uses.
Bayesian t-test example
• In the Analysis window check Bayes factor robustness check.
• This plots the evidence for either hypothesis (in this case the null)
across a range of Cauchy prior width r values (sketched in R below).
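• A rough robustness-check sketch in R (the particular widths are arbitrary choices):

  widths <- c(0.5, 0.707, 1, 1.414)
  bf01_by_width <- sapply(widths, function(w) {
    1 / extractBF(ttestBF(formula = mean.NEO ~ Rotation,
                          data = kitchen, rscale = w))$bf
  })
  round(data.frame(width = widths, BF01 = bf01_by_width), 3)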
Bayesian t-test example
• In the Analysis window under Plots check Sequential analysis and Robustness
check.
• Check BF01 so the plot shows evidence favoring the null.
• The plot shows the BF as the data came in; a sequential-analysis sketch in R
follows below.
• Note that the BF was close to 10 after about 55 participants had been run.
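• A sequential-analysis sketch in R; it assumes the rows of the data file are
ordered by time of data collection, which is an assumption worth checking.

  ns <- seq(20, nrow(kitchen), by = 5)
  # keep only sample sizes at which both groups have at least two observations
  ns <- ns[sapply(ns, function(n) all(table(kitchen$Rotation[1:n]) >= 2))]
  bf01_path <- sapply(ns, function(n) {
    1 / extractBF(ttestBF(formula = mean.NEO ~ Rotation,
                          data = kitchen[1:n, ]))$bf
  })
  plot(ns, bf01_path, type = "b", xlab = "Sample size", ylab = "BF01")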
Bayesian t-test example
• Now check Descriptives plots under Plots
• This one might show up under Descriptives in the Results window
• The plot will show the means with error bars representing 95% HDIs
Bayesian t-test example
• The categories of evidential strength for Bayes
factors were inspired by Jeffreys (1961).
• Wagenmakers and colleagues have adjusted
these somewhat.
• A Bayes factor of 3 indicates moderate rather
than substantial support for H1.
• These labels facilitate scientific communication,
but are really only approximate descriptions of
different standards of evidence.
• BFs will vary slightly for each MCMC run.
One-Way ANOVA example
• An experiment conducted in the 1970s suggests that pain threshold
depends on hair color.
• A pain tolerance test was administered to four groups according to
hair color: light blond, dark blond, light brunette and dark brunette.
• Open the PainThresholds.csv data file in JASP.
• This should open the data set in a new window.
One-Way ANOVA example
• Click ANOVA and select Bayesian ANOVA
• Go to Descriptives Plots, select Hair Color and click on the arrow for
Horizontal axis.
• Under Display check the 95% credible interval box.
• In the Results window you should see a plot of the data.
One-Way ANOVA example
• A classical one-way ANOVA yields a p value of .004.
• JASP uses the methodology proposed by Rouder et al. (2012).
• Cauchy priors are placed on a multivariate effect size, which is
defined in terms of distance from the grand mean.
• Now drag Pain Tolerance to the Dependent Variable list and Hair Color
to Fixed Factors.
• We get a BF of 11.97 for the alternative hypothesis (see the R sketch below).
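• A minimal sketch of the same analysis with anovaBF in R; the column names
PainTolerance and HairColor are assumptions about how PainThresholds.csv imports.

  library(BayesFactor)

  pain <- read.csv("PainThresholds.csv")
  pain$HairColor <- factor(pain$HairColor)   # anovaBF requires factors

  bf <- anovaBF(PainTolerance ~ HairColor, data = pain)
  bf   # BF10 for the Hair Color model against the grand-mean-only null (~12)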
One-Way ANOVA example
• The first column lists the models under consideration.
• The Null Model contains only the grand mean, and the Hair Color model
adds the effect of hair color.
• The BF10 column shows the Bayes factor for each model against the
null.
• The error % column is similar to a coefficient of variation: smaller
values are better, and numerical accuracy matters more when BFs are small.
One-Way ANOVA example
• Column P(M) indicates the prior model probabilities.
• Column P(M|data) indicates the posterior model probabilities after seeing the data.
• Column BFM indicates the degree to which the data have changed the
prior model odds, i.e., the ratio of posterior to prior model odds.
• With equal prior probabilities the prior odds are 1, so here
BFM ≈ .923/.077 ≈ 12, essentially the same as the BF10 of 11.97.
One-Way ANOVA example
• JASP currently does not produce 95% HDIs for an overall effect size.
• Neither JASP nor the Bayesian framework currently have analogues of the
post-hoc tests typically applied to ANOVAs to follow up a significant effect.
• Correcting for multiple comparisons may not be necessary from a
Bayesian perspective, as controlling the long-run false alarm rate is not part of the design.
• I recommend reporting the Bayes factor, partial eta squared (from regular
ANOVA), and possibly the p and F values (the BF should obviate these).
• Post-hoc comparisons could be done traditionally or with Bayesian t tests.
• Currently there are no Bayesian corrections for multiple comparisons, and
there is a need to examine whether corrections are necessary.
Two-Way ANOVA example
• The next data set concerns the heights in inches of 235 singers in the
New York Choral Society in 1979.
• The singers’ voices were classified as soprano, alto, tenor, or bass and
recoded as very high, high, low, and very low.
• We are interested in predicting height of the singer from gender and
pitch.
• Go to File then Open and open the Singers.csv file.
• First we will do some traditional analyses and then we’ll do Bayesian
ANOVA.
Two-Way ANOVA example
• Click on ANOVA and then ANOVA
• Move height to the Dependent Variable
• Move Gender and Pitch to the Fixed Factors
Two-Way ANOVA example
• Click on Descriptive Plots in the Analysis window.
• Move Gender to Separate Lines
• Move Pitch to Horizontal Axis
• Go down to Display and select Error bars displaying 95% CI
Two-Way ANOVA example
• Now click on Additional Options.
• Under Marginal Means move Gender, Pitch and Gender * Pitch to the
right.
• This will show the marginal means and cell means we might need to report.
Two-Way ANOVA example
• Now check Descriptive statistics.
• This gives additional information about each group.
• Now check Estimates of effect size, and select all three choices.
• These will be added to the ANOVA table above in the Results window.
Two-Way ANOVA example
• These analyses would provide you with the F values, and p values if you
choose to report them (or be bold and leave them out).
• They give you effect sizes to report and descriptive statistics.
• Now click on Bayesian ANOVA and enter Height as the DV and Gender and
Pitch as IVs (an R version of this analysis appears below).
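• A rough R sketch of the same two-factor analysis; the column names Height,
Gender, and Pitch are assumptions about how Singers.csv imports.

  library(BayesFactor)

  singers <- read.csv("Singers.csv")
  singers$Gender <- factor(singers$Gender)
  singers$Pitch  <- factor(singers$Pitch)

  bf <- anovaBF(Height ~ Gender * Pitch, data = singers)
  bf   # BF10 for each candidate model against the grand-mean-only null

  # Evidence for the two-main-effects model over the model that also includes
  # the interaction (the ratio of about 9 discussed later). The indices assume
  # the default model ordering, so check the printed output first.
  bf[3] / bf[4]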
Two-Way ANOVA example
• The first column “Models” lists the five models under consideration.
• These models are specified similarly to separate regression models.
• Gender and Pitch models only contain those predictors.
• Gender + Pitch contains both, and an interaction term is added to the
final model.
Two-Way ANOVA example
• Now look at the BF10 column
• These BFs compare the evidence for each model against the null.
• This is analogous to examining the R2 value in each regression model.
• There is extreme evidence for all models except Pitch alone, for which
there is very strong evidence.
Two-Way ANOVA example
• The model that outperforms the null the most is the Gender + Pitch
model.
• The evidence against including the interaction term is the ratio of
their Bayes Factors: 8.192e+39/8.864e+38 = 9.24
• We could look at the powers 39 vs. 38 and conclude it’s a factor of
about 10.
Two-Way ANOVA example
• The P(M) column indicates the equal assignment of prior model
probability across the five models.
• Column P(M|data) indicates the posterior model probabilities.
• Almost all of the posterior mass is centered on the two main effects
models.
Two-Way ANOVA example
• Column BFM indicates the change from prior to posterior model odds.
• Only the Gender + Pitch model is supported by the data to the extent
that the data increased its model probability.
• These values are interpreted as odds, so values less than 1 indicate
that the data decreased support for that model.
Two-Way ANOVA example
• If we wished to test whether the interaction is “significant” in a Bayesian
framework (and avoid reporting F or p values), we showed how we can
divide the BF10 for the main-effects model by the BF10 for the model with the
interaction term.
• This yields a Bayes factor for the main-effects model versus the interaction
model, which we would report to support the conclusion that there is no
interaction effect.
• This Bayes factor can be obtained directly by marking the lower-order
variables as nuisance variables.
• This is similar to adding the interaction term in a second step of a
hierarchical regression.
Two-Way ANOVA example
• In the Analysis window click on Model and then check Is Nuisance for Gender
and Pitch.
• This should show a Bayes factor on the order of 10.
• Keep in mind that Bayes factors are based on estimations from MCMC
so they can change slightly across reanalyses.
Two-Way ANOVA example
• You can also redo the plot with 95% credible intervals instead of
confidence intervals.
• To report the results of a Bayesian ANOVA you can report the BF10 for
each main effect model (Pitch and Gender).
• To determine if there is a significant interaction you can report the
BF10 after designating everything but the interaction as nuisance.
• Currently JASP does not have Bayesian post-hoc tests, simple effects
tests, or a way to select cases (for example only altos and tenors).
Two-Way ANOVA example
• We could do comparisons manually, and in future releases these
functions should become available.
• If we wanted to test whether tenors differed from basses we could
open the Singers.csv file in Excel and create a column called TvsB.
• We would leave the cells blank for altos and sopranos, and enter
Tenor or Bass again for the tenors and basses (I’ve already done this).
Two-Way ANOVA example
• Once you save the .csv file JASP will automatically update the data in
its data window.
• Now you can run a Bayesian t test using the Grouping Variable TvsB
and the Dependent Variable Height; an R sketch of the same comparison
follows below.
• You can also get the Prior and posterior plots and the Descriptives
plot.
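• An R sketch of the same comparison without editing the .csv by hand; the
level labels used below are assumptions, so check how pitch is actually coded
in your copy of Singers.csv first.

  tb <- subset(singers, Pitch %in% c("Tenor", "Bass"))
  tb$Pitch <- droplevels(tb$Pitch)
  ttestBF(formula = Height ~ Pitch, data = tb)
  # Note: this is a two-sided BF; the 11.6 quoted on the next slide is for
  # the directed hypothesis that basses are taller than tenors.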
Two-Way ANOVA example
• If we are interested in a particular comparison between two groups,
Bayesian theorists do not currently discourage making that
comparison.
• Basses are about 1.5 inches taller than tenors on average and the BF
supporting the hypothesis that basses are taller than tenors is 11.6.
• We could go to ANOVA then ANOVA then under Post-Hoc tests click
Pitch and run a Tukey test.
Two-Way ANOVA example
• A Tukey test would give us a p value of .18 and we would fail to conclude that
tenors and basses differed significantly in height.
• A Bayesian t test suggests the opposite, that there is strong evidence that
tenors and basses differ in height.
• Which is correct?
• I would argue that if you are interested in whether voice pitch predicts height
among tenors and basses then the Bayesian conclusion is correct.
• Suppose you would win $500 for drawing five people in a row who are over six
feet tall from a group of tenors and basses, and you had information about their
pitch: would you pick all basses, or would you ignore the pitch information
because a post-hoc test between these groups was not significant?
Another Two-Way ANOVA example
• Now let’s look at another two-way ANOVA example; this time with a significant
interaction.
• The data are from Worthy, Otto, and Maddox (2012, JEP:LMC).
• One factor was working memory (WM) load – people either performed a decision-
making task while also performing a WM-demanding dual task, or they performed
under single-task conditions.
• The other factor was task type: in one task, choosing an option that led to smaller
immediate rewards but larger future rewards was optimal.
• In the other task, choosing the immediately rewarding option was optimal.
• Dual task people were predicted to do worse in the future-optimal task, but better in
the immediate-optimal task, because they could not learn the future effects of their
choices.
Another Two-Way ANOVA example
• Open up the WorthyJEPLMC2012Exp1.csv file in JASP
• Click on ANOVA then Bayesian ANOVA
• Enter Optimal Choices as the DV and WM and Task Type as fixed
factors.
Another Two-Way ANOVA example
• The BF10 is by far the largest for the full model in the bottom row.
• Task type appears to be a main effect, but the BF10 against the null suggests that
WM load alone is not.
• This BF is less than 1/3, which means the null receives moderate support from
the data.
Another Two-Way ANOVA example
• Now click on Descriptives Plots
• Enter WM as Separate Lines and Task Type on the Horizontal Axis
• Under Display check 95% credible interval.
• We can clearly see a crossover interaction.
Another Two-Way ANOVA example
• To test whether the interaction warrants inclusion we can divide the BF for the
full model by the BF for the main-effects model: 103430/258 ≈ 401.
• This suggests extreme evidence in favor of including the interaction.
• Click on Model and then mark WM and Task Type as nuisance variables; the
interaction Bayes factor can also be computed directly in R, as sketched below.
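• A sketch of the same comparison in R using lmBF, which yields the interaction
Bayes factor directly; the column names OptimalChoices, WM, and TaskType are
assumptions about how the .csv headers import.

  library(BayesFactor)

  dm <- read.csv("WorthyJEPLMC2012Exp1.csv")
  dm$WM       <- factor(dm$WM)
  dm$TaskType <- factor(dm$TaskType)

  full_model <- lmBF(OptimalChoices ~ WM + TaskType + WM:TaskType, data = dm)
  main_model <- lmBF(OptimalChoices ~ WM + TaskType, data = dm)
  full_model / main_model   # about 400 here: extreme support for the interaction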
Another Two-Way ANOVA example
• For comparison now click on ANOVA and then ANOVA.
• Add Proportion optimal as the DV and WM and Task Type as IVs.
• Under Additional Options select Estimates of effect size and check all.
• The ANOVA will lead to similar conclusions as examining the BF10
values from Bayesian ANOVA.
Another Two-Way ANOVA example
• However, note that the Bayesian ANOVA compares all possible
models and indicates which one received the most support from the
data.
• This indicated extreme support for the interaction, and that it was the
key aspect of the predictive model.
• This comparison is not provided by ANOVA or effect sizes.
Regression example
• The model-building and comparison approach of Bayesian analysis
using Bayes factors will also be seen when doing regression.
• We’ll next use another published study of mine as an example.
• Worthy, Byrne, and Fields (2014) examined how worry (PSWQ),
anxiety (BAI), and positive and negative mood (PANAS) predicted
decision-making behavior in a variant of the future-optimal task from
the previous example.
• It was predicted that worry may lead to a focus on immediate reward,
and thus high levels of worry would be negatively associated with task
performance.
Regression example
• In JASP open up the WorthyByrneFields2014Worry.csv data set
• Click on Regression and select Bayesian linear regression.
• Move Proportion Optimal to the Dependent Variable box and add anxiety, worry,
posmood, and negmood as covariates.
• JASP will test every possible model (an R version of this all-subsets search is
sketched below).
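• A rough R equivalent of the all-subsets search with regressionBF; the data frame
and column names (e.g. ProportionOptimal) are assumptions about how the .csv
headers import.

  library(BayesFactor)

  worry_data <- read.csv("WorthyByrneFields2014Worry.csv")

  bf <- regressionBF(ProportionOptimal ~ anxiety + worry + posmood + negmood,
                     data = worry_data)
  head(bf, n = 5)   # the best-supported of the 15 candidate models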
Regression example
• The model with worry alone appears to receive the most support.
• Anxiety + worry, worry + posmood, and worry + negmood receive
moderate support.
• If our goal is to ask which factors uniquely affect behavior, worry
seems to be the only one.
Regression example
• Now under Output select Effects.
• This gives a BF for including each predictor, averaged across all models
that have been tested.
• P(incl) is the sum of the prior probabilities of the models that include
that predictor (.063 × 8 ≈ .5).
Regression example
• Now under Model check Is Nuisance for everything but worry.
• This is similar to testing worry’s effect over and above the effects of the
other variables.
• Do the same for the other variables as well.
Regression example
• We know worry seems to be the only variable that predicts decision-
making behavior in this task.
• We could now use JASP’s more refined correlation tools to examine
the posterior distribution of r for the worry-performance relationship.
• Go to Regression then Bayesian Correlation Pairs and select worry and
Proportion Optimal.
Regression example
• Now select Scatterplot and Prior and Posterior and Additional info.
• There is strong evidence that r is different from zero.
• A reasonable ROPE between -.05 and .05 falls entirely outside the 95% HDI.
JASP
• JASP allows us to report BFs instead of p values.
• I recommend reporting BFs along with measures of effect size in place of F and p
values.
• If you want to be inclusive, or an editor makes you, then include both.
• One thing we cannot currently get from JASP is posterior distributions for
some parameters, such as regression coefficients.
• Kruschke provides some excellent high-level scripts that give much more
information about parameter estimates (while not giving the same information we
get from BFs).
• For a complete treatment of your data I recommend also estimating posterior
distributions for parameters using these scripts.
