vulnerable to all kinds of charges of post-hoc model tinkering to get the result you
want. We've had to specify lab best-practices about the order for pruning random effects –
kind of a guide to "tinkering until it works," which seems suboptimal. In sum, the models
are great, but the methods for fitting them don't seem to work that well.
Enter Bayesian methods. For several years, it's been possible to fit Bayesian regression
models using Stan, a powerful probabilistic programming language that interfaces with R.
Stan, building on BUGS before it, has put Bayesian regression within reach for someone who
knows how to write these models (and interpret the outputs). But in practice, when you
could fit an lmer in one line of code and five seconds, it seemed like a bit of a trial to hew
the model by hand out of solid Stan code (which looks a little like C: you have to declare
your variable types, etc.). We have done it sometimes, but typically only for models that
can't be fit with lme4 (e.g., an ordered logit model). So I still don't teach this set of
methods, or advise students to use them by default.
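To make the contrast concrete, the kind of one-liner I have in mind looks something like this (a sketch; d, rt, condition, and subject are hypothetical names used for illustration):

    library(lme4)
    # a one-line mixed-effects model: reaction time predicted by condition,
    # with by-subject random slopes for condition
    fit <- lmer(rt ~ condition + (condition | subject), data = d)

An equivalent hand-written Stan program would typically run to dozens of lines of declarations, parameters, and model code.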
brms?!? A worked example
In the last couple of years, the package brms has been in development. brms is essentially
a front-end to Stan, so that you can write R formulas just like with lme4 but fit them with
Bayesian inference.** This is a game-changer: all of a sudden we can use the same syntax
but fit the model we want to fit! Sure, it takes 2-3 minutes instead of 5 seconds, but the
output is clear and interpretable, and we don't have all the specification issues described
above. Let me demonstrate.
The dataset I'm working on is an unpublished set of data on kids' pragmatic inference
abilities. It's similar to many that I work with. We show children of varying ages a set of
images and ask them to choose the one that matches some description, then record if they
do so correctly. Typically some trials are control trials where all the child has to do is
recognize that the image matches the word, while others are inference trials where they
have to reason a little bit about the speaker's intentions to get the right answer. Here are
the data from this particular experiment:
I'm interested in quantifying the relationship between participant age and the probability
of success in pragmatic inference trials (vs. control trials, for example). My model
specification is:
(3) correct ~ condition * age + (condition | subject) + (condition | stimulus)
So I first fit this with lme4. Predictably, the full desired model doesn't converge, but here
are the fixed effect coefficients:
                   beta  stderr       z       p
intercept          0.50    0.19    2.65    0.01
condition          2.13    0.80    2.68    0.01
age                0.41    0.18    2.35    0.02
condition:age     -0.22    0.36   -0.61    0.54
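For concreteness, the call producing this table looks roughly like the following (a sketch; d is a hypothetical data frame with one row per trial):

    library(lme4)
    # maximal model (3): logistic mixed-effects regression with random slopes
    # for condition by both subject and stimulus
    fit_lmer <- glmer(correct ~ condition * age +
                        (condition | subject) + (condition | stimulus),
                      family = binomial, data = d)
    summary(fit_lmer)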
Now let's prune the random effects until the convergence warning goes away. In the
simplified version of the dataset that I'm using here, I can keep stimulus and subject
intercepts and still get convergence when there are no random slopes. But in the larger
dataset, the model won't converge unless I include just the random intercept by subject:
                   beta  stderr       z       p
intercept          0.50    0.21    2.37    0.02
condition          1.76    0.33    5.35    0.00
age                0.41    0.18    2.34    0.02
condition:age     -0.25    0.33   -0.77    0.44
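For reference, the pruned specifications I'm describing look something like this (a sketch, using the same hypothetical names as above):

    # random intercepts for both subject and stimulus: converges in the simplified dataset
    correct ~ condition * age + (1 | subject) + (1 | stimulus)

    # random intercept by subject only: what the larger dataset requires
    correct ~ condition * age + (1 | subject)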
Coefficient values are decently different (though, to be fair, the p-values are not changed
dramatically in this example). More importantly, a number of fairly trivial things affect
whether the model converges. For example, I can get one random slope in if I set the other
level of the condition variable to be the intercept, but with this parameterization it doesn't
converge with either slope. And in the full dataset, the model wouldn't converge at all if I
didn't center age. And of course I haven't tweaked the optimizer or messed with the
convergence settings for any of these variants. All of this means that there are a lot of
decisions about these models that I don't have a principled way to make – and critically,
they need to be made conditional on the data, because I can't tell a priori whether a model
will converge!
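These tweaks are the kind of thing you end up doing by hand, e.g. (a sketch; the factor level name and column names are hypothetical):

    # swap which level of condition is treated as the intercept
    d$condition <- relevel(d$condition, ref = "inference")

    # center age so that the intercept is evaluated at the mean age
    d$age_c <- d$age - mean(d$age)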
So now I switched to the Bayesian version using brms, just writing brm() with the model
specification I wanted in (3). I had to make a few tweaks: upping the number of iterations
(suggested by the warning messages in the output) and changing to a Bernoulli model rather
than a binomial one (for efficiency, again suggested by the error message), but otherwise
this was very straightforward. For simplicity I've adopted all the default prior choices, but I
could have gone more informative.
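The call looks something like this (a sketch with hypothetical names and an illustrative iteration count):

    library(brms)
    # same maximal specification as (3), fit with Bayesian inference;
    # bernoulli() because the data are one 0/1 outcome per trial,
    # iter upped from the default 2000 in response to the warnings
    fit_brm <- brm(correct ~ condition * age +
                     (condition | subject) + (condition | stimulus),
                   family = bernoulli(), data = d,
                   iter = 4000)
    summary(fit_brm)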
Here's the summary output for the fixed effects:
                 estimate     error  l-95% CI  u-95% CI
intercept            0.54      0.48     -0.50      1.69
condition            2.78      1.43      0.21      6.19
age                  0.45      0.20      0.08      0.85
condition:age       -0.14      0.45     -0.98      0.84
From this call, we get back coefficient estimates that are broadly similar to those from the
other models, along with 95% credible interval bounds. Notably, the condition effect is
larger (probably because the model can estimate a more extreme value on the logit scale
from sparse data), and the interaction term is smaller but has a larger error. Overall, the
coefficients look more like those from the first, non-convergent maximal model than the
second, converging one.
The big deal about this model is not that what comes out the other end of the procedure is
radically different. It's that it's not different. I got to fit the model I wanted, with a
maximal random effects structure, and the process was almost trivially easy. In addition,
and as a bonus, the CIs that get spit out are actually credible intervals that we can reason
about in a sensible way (as opposed to frequentist confidence intervals, which are quite
confusing if you think about them deeply enough).
Conclusion
Bayesian inference is a powerful and natural way of fitting statistical models to data. The
trouble is that, up until recently, you could easily find yourself in a situation where there
was a dead-obvious frequentist solution but off-the-shelf Bayesian tools wouldn't work or
would generate substantial complexity. That's no longer the case. The existence of tools
like BayesFactor and brms means that I'm going to suggest that people in my lab go
Bayesian by default in their data analytic practice.
----
Thanks to Roger Levy for pointing out that model (3) above could include an age | stimulus
slope to be truly maximal. I will follow this advice in the paper.
* Who would have thought that a paper about statistical models would be called "the cave
of shadows"?
** Rstanarm did this also, but it covered fewer model specifications and so wasn't as
helpful.
Posted by Michael Frank at 11:18 PM
Tuesday, January 16, 2018
MetaLab, an open resource for theoretical synthesis
using meta-analysis, now updated
(This post is jointly written by the MetaLab team, with contributions from Christina
Bergmann, Sho Tsuji, Alex Cristia, and me.)
[Figure: A typical “ages and stages” ordering. Meta-analysis helps us do better.]
Developmental psychologists often make statements of the form “babies do X at age Y.”
But these “ages and stages” tidbits sometimes misrepresent a complex and messy research
literature. In some cases, dozens of studies test children of different ages using different
tasks and then declare success or failure based on a binary p < .05 criterion. Often only a
handful of these studies – typically those published earliest or in the most prestigious
journals – are used in reviews, textbooks, or summaries for the broader public. In medicine
and other fields, it’s long been recognized that we can do better.
Meta-analysis (MA) is a toolkit of techniques for combining information across disparate
studies into a single framework so that evidence can be synthesized objectively. Each
study’s results are transformed into a standardized effect size (like Cohen’s d), which is
treated as a single data point in the meta-analysis. Each data point can be weighted to
reflect a given study’s precision (which typically depends on sample size). These weighted
data points are then combined into a meta-analytic regression to assess the evidential
value of a given literature. Follow-up analyses can also look at moderators – factors
influencing the overall effect – as well as issues like publication bias or p-hacking.*
Developmentalists will often enter participant age as a moderator, since meta-analysis
enables us to statistically assess how much effects for a specific ability increase as infants
and children develop.
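As a purely illustrative sketch, a moderated meta-analytic regression of this kind can be run in R with the metafor package (the column names yi, vi, and mean_age are hypothetical):

    library(metafor)
    # random-effects meta-analysis with mean participant age as a moderator;
    # yi = standardized effect sizes (e.g., Cohen's d), vi = their sampling variances
    ma_model <- rma(yi = yi, vi = vi, mods = ~ mean_age, data = ma_data)
    summary(ma_model)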