Multilevel Model Foundations

Multilevel Monopoly Foundation Readers: Potentially Useful Links

On this page you will find links to a number of YouTube videos. At the end you will find links to a handbook chapter, an abstract, and an article.

I welcome suggestions for additional links. Write to multilevelmonopoly AT gmail.com

Important note: Within each section, video links are listed in no particular order

Section 1: Stata and mixed models

This first section includes videos that BOTH use Stata and run multilevel models

# 1

WHO: Mike Crowson, Associate Professor of Educational Psychology, University of Oklahoma (OK, USA). He researches, among other areas, moral psychology and how social psychology applies to a range of educational problems including reactions to students with disabilities.
WHAT: Walks through Stata output from a series of two-level multilevel models. Students are nested within schools. Math achievement is the outcome. Either this dataset, or one very like it, is used extensively in Bryk & Raudenbush (1992) and Raudenbush & Bryk (2002). Includes links to dataset and do file.
LENGTH: 44 minutes
DATASET: Real. High School and Beyond dataset from 1982. Link to dataset included.
URL if needed:

https://www.youtube.com/watch?v=5sB49ZThDTo

# 2

WHO: Mike Crowson
WHAT: Multilevel logistic models for a BINARY outcome, owning a particular Justin Bieber album. Two-level models with students at level-1 and classrooms at level-2. Video uses the MENUs to generate commands.
LENGTH: 18 minutes
DATASET: Fictional
URL if needed:

https://www.youtube.com/watch?v=lKgTbjrEMPA&t=40s

# 3

WHO: Mike Crowson
WHAT: Multilevel logistic models for a BINARY outcome. Two-level models with students at level-1 and classrooms at level-2. Also uses an effects-coded predictor. Video explains the DO or PROGRAM file rather than menus.
LENGTH: 8 minutes
DATASET: Fictional, but includes link to dataset, and to do file.
URL if needed:

https://www.youtube.com/watch?v=BV2fXI6Hjaw

# 4

WHO: Mike Crowson
WHAT: LIST of six videos he has done about multilevel models in Stata. He also lists videos using R, SPSS, HLM and other programs.

LOTS of interesting material in the other Stata videos listed here.
URL if needed:

https://sites.google.com/view/statistics-for-the-real-world/contents/multilevel-modeling-and-panel-regression

# 5

WHO: Chuck Huber, Director of Statistical Outreach for Stata (TX, USA).
WHAT: Introduction to Stata multilevel models using the xtmixed command (renamed mixed as of Stata 13). Years are nested within 48 states, which in turn are nested within regions. SO this is actually a THREE-LEVEL model, if you are interested in that. Outcome is gross state productivity. Spends a lot of time explaining the variance components in the random effects portion of the model.
LENGTH: 10 minutes
DATASET: Real. Public Capital Productivity Data set.
URL for dataset: https://www.stata-press.com/data/r18/productivity.dta
To CALL dataset from within Stata:

webuse productivity, clear
Notes for data file lists two citations:
Baltagi, B. H., S. H. Song, and B. C. Jung. 2001. The unbalanced nested error component regression model. Journal of Econometrics 101: 357-381. https://doi.org/10.1016/S0304-4076(00)00089-0.
Munnell, A. H. 1990. Why has productivity growth declined? Productivity and public investment. New England Economic Review Jan./Feb.: 3-22.
URL if needed:

https://www.youtube.com/watch?v=KALxDwwqX1A
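
The variance components this video walks through can be turned into intraclass correlations with simple arithmetic. What follows is a minimal sketch in Python, not Stata output. The three variance values are hypothetical placeholders chosen to give round numbers; they are not estimates from productivity.dta.

```python
# Hypothetical variance components for a three-level random-intercept model
# (years within states within regions). These are NOT estimates from
# productivity.dta; substitute the values your own model reports.
var_region = 2.0    # level 3: between regions
var_state = 6.0     # level 2: between states within the same region
var_residual = 2.0  # level 1: year-to-year variation within a state

total = var_region + var_state + var_residual

# Share of the total variance located at each level.
share_region = var_region / total
share_state = var_state / total
share_residual = var_residual / total

# Implied correlation between two years observed in the same state
# (which also means the same region).
icc_same_state = (var_region + var_state) / total

# Implied correlation between two observations from different states
# that belong to the same region.
icc_same_region = var_region / total

print(f"share at region level:   {share_region:.2f}")
print(f"share at state level:    {share_state:.2f}")
print(f"share at residual level: {share_residual:.2f}")
print(f"ICC, same state:         {icc_same_state:.2f}")
print(f"ICC, same region only:   {icc_same_region:.2f}")
```

After fitting a random-intercept model in Stata, the estat icc postestimation command reports comparable intraclass correlations, so this arithmetic is mainly useful for seeing where those numbers come from.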

# 6

WHO: Chuck Huber
WHAT: Introduction to Stata multilevel models using the xtmixed command, part 2, focusing on the LONGITUDINAL variation in the outcome.
LENGTH: 9 minutes
DATASET: Productivity.dta, see above
URL if needed:

https://www.youtube.com/watch?v=rUWT_EWV6QI

Section 2: Mostly Multilevel Modeling (ideas and models), Mostly Without Stata

These links either explain the general ideas behind and reasons you want to use mixed models, or provide specific example results.

One (# 11) uses clustered data but does not use a mixed model.

Save for # 11, those running particular analyses use R. # 11 uses Stata.

In terms of disciplines, most examples are from social scientists, although one ecological biologist (# 12) discusses an example predicting penguin body mass.

If you are interested in public health research, # 15 and # 16 may be especially helpful.

# 7

WHO: Andy Field, Professor of Quantitative Methods, University of Sussex (UK).
WHAT: An overview of how to model hierarchically structured data. Discusses fixed and random effects. Provides many examples of different types of hierarchically organized data in the early part of the video. There is an associated tutorial about how to do this in R and at the back end of the video you will see lots of R code.
LENGTH: 61 minutes
DATASET: fictional (zombies nested within rehabilitation clinics).
URL if needed:

https://www.youtube.com/watch?v=51SMnDN0ye0

# 8

WHO: Ian Brunton-Smith, Professor of Criminology and Research Methods, Department of Sociology, University of Surrey (UK).
WHAT: Overview of mixed models with random intercepts. Gives examples where level-1 units are individuals and level-2 units are neighborhoods, as well as examples where level-1 units are repeated observations nested within individuals at level-2. Also mentions cross-classified model structures and multiple membership models. Part 1 of 3.
LENGTH: 9 minutes
DATASET: None
URL if needed:

https://www.youtube.com/watch?v=YLkXP3Edd80&list=PL-XAd1-IhZXZxcWfV0ErYVwPSvGzZ2Lup

# 9

WHO: Ian Brunton-Smith
WHAT: Reviews multilevel two level models with random intercepts. Part 2 of 3.
LENGTH: 9 minutes
DATASET: Some examples draw on data about fear of crime in England and Wales (level-1 = individuals, level-2 = areas). Resource page links to an earlier publication by the presenter (Brunton-Smith, I., & Sturgis, P. (2011). Do neighborhoods generate fear of crime? An empirical test using the British Crime Survey. Criminology, 49(2), 331-369. doi:10.1111/j.1745-9125.2011.00228.x). That study used three years of British Crime Survey data “covering 2002 to 2005.” In the slides here the data are sourced as “Crime Survey for England and Wales, 2013/14”
URL if needed:

https://www.youtube.com/watch?v=KmbtZNjvsNM&list=PL-XAd1-IhZXZxcWfV0ErYVwPSvGzZ2Lup&index=2
Supporting materials found here: https://www.ncrm.ac.uk/resources/online/all/?materials&id=20720

# 10

WHO: Ian Brunton-Smith
WHAT: Reviews multilevel two level models with random intercepts AND random slopes. Part 3 of 3.
LENGTH: 9 minutes
DATASET: see above
URL if needed:

https://www.youtube.com/watch?v=YCcLiIKMYL8

# 11

WHO: Sebastian Wai currently teaches economics at UNC Charlotte's Belk College of Business.
WHAT: Uses Stata. Provides an example of city-level yearly crime rates for two different years. Although he uses a single-level model, in effect years (level 1) are nested within cities (level 2). He illustrates fixed effects models to control for city-level differences. He compares this to a “within-estimator” model that group-mean centers both the crime outcome variable and the unemployment predictor variable by city, so city differences are discarded and only temporal variation remains. Discusses the difference between these two approaches. Also discusses how to xtset panel data, and then issue the Stata xtreg command with the fe (fixed effects) option. Also discusses differencing. Although multilevel commands (e.g., mixed) are not used, the material is helpful for thinking about cross-sectional panel design data.
LENGTH: 13 minutes
DATASET: Real, crime2.dta. Originally from Wooldridge's Introductory Econometrics: A Modern Approach.

To access, in Stata, issue the following command:

ssc install bcuse

After it installs issue this command:

bcuse crime2, clear

URL for video if needed:

https://www.youtube.com/watch?v=H95BHswbT3w
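
The “within-estimator” idea described in this entry, group-mean centering both outcome and predictor by city, can be sketched in a few lines. This is a minimal illustration in Python rather than the Stata code used in the video. The toy panel below is hypothetical and is not crime2.dta; it is built so that the within-city slope is 2 by construction, while differences between cities push the pooled slope in the opposite direction.

```python
from statistics import mean

# Hypothetical toy panel, NOT crime2.dta: (city, year, unemployment, crime rate).
# Crime is constructed as city_baseline + 2 * unemployment, so the true
# within-city slope is 2 by construction.
panel = [
    ("A", 1982, 4.0, 58.0),
    ("A", 1987, 6.0, 62.0),
    ("B", 1982, 8.0, 36.0),
    ("B", 1987, 9.0, 38.0),
    ("C", 1982, 12.0, 34.0),
    ("C", 1987, 15.0, 40.0),
]


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def center_within_group(groups, values):
    """Subtract each group's own mean from its values (group-mean centering)."""
    by_group = {}
    for g, v in zip(groups, values):
        by_group.setdefault(g, []).append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for g, v in zip(groups, values)]


cities = [row[0] for row in panel]
unem = [row[2] for row in panel]
crime = [row[3] for row in panel]

# Pooled slope: ignores cities, so it mixes between-city and within-city variation.
pooled_slope = ols_slope(unem, crime)

# Within slope: city means are removed first, so only change over time remains.
within_slope = ols_slope(
    center_within_group(cities, unem),
    center_within_group(cities, crime),
)

print(f"pooled slope: {pooled_slope:.3f}")
print(f"within slope: {within_slope:.3f}")
```

In this constructed example the two slopes have opposite signs, which is the kind of contrast between ignoring and removing city-level differences that the video discusses.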

# 12

WHO: Chloe Fouilloux is currently “a postdoctoral researcher based at the University of Wisconsin, Madison” (WI, USA). She has published research on tadpole behavior and male rocket frogs’ defense of space, among other things.
WHAT: Working in R, she examines several multilevel models using the glmm module to predict penguin body mass. She explores both species and island as random effects, landing ultimately on a model with fixed effects for species, fixed effects for sex, and random effects for islands. Various graphical data displays clarify key points.
DATASET: Real, maintained by a scholar working in the area
LENGTH: 12 minutes
URL if needed:

https://www.youtube.com/watch?v=smYZdlbE9m8

# 13

WHO: Violet Brown is a postdoctoral researcher at Carleton College (MN, USA) working in the Carleton Perception Lab. The lab does research focusing “on how humans understand spoken language.”
WHAT: Uses the R package lme4 to explain random intercept models and the reasons for using them. Illustrates with different fictional data sets. Explains and illustrates Simpson’s paradox. Spends considerable time giving examples of dependencies across observations. Part 1 of 2.
LENGTH: 21 minutes
DATASET: Various fictional
URL if needed:

https://www.youtube.com/watch?v=3OFXxh4yORU

# 14

WHO: Violet Brown
WHAT: Using the R package lme4, she illustrates a model with varying slopes along with varying intercepts. She spends several minutes toward the end of the video talking through setting up the lmer R code. Later, she introduces ideas of empirical Bayes adjustments, discussing these as shrinkage or partial pooling. Finally, toward the very end, she does mention the correlation between group level intercepts and group level slopes. This last point is another illustration of the problems surfacing in Chapter 8 (pp. 98-101). Part 2 of 2.
LENGTH: 26 minutes
DATASET: Fictional
URL if needed:

https://www.youtube.com/watch?v=_UmY-3brJJ0
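
The shrinkage or partial pooling idea mentioned in this entry can be illustrated with the standard formula for a random-intercept model: each group's estimate is pulled toward the grand mean, and groups with fewer observations are pulled harder. The sketch below is in Python, not the lmer code from the video. The function name and all numeric values are hypothetical, and the variance components are treated as known rather than estimated.

```python
def shrunken_mean(group_mean, n, grand_mean, tau2, sigma2):
    """Partial-pooling estimate of a group mean in a random-intercept model.

    tau2   = between-group variance (assumed known here)
    sigma2 = within-group (residual) variance (assumed known here)
    n      = number of observations in the group
    """
    # Weight on the group's own mean; it approaches 1 as n grows.
    reliability = tau2 / (tau2 + sigma2 / n)
    return grand_mean + reliability * (group_mean - grand_mean)


# Hypothetical values chosen for illustration only.
grand_mean = 50.0
tau2 = 25.0
sigma2 = 100.0

# Two groups with the same raw mean but very different sample sizes.
small_group = shrunken_mean(group_mean=70.0, n=4, grand_mean=grand_mean,
                            tau2=tau2, sigma2=sigma2)
large_group = shrunken_mean(group_mean=70.0, n=100, grand_mean=grand_mean,
                            tau2=tau2, sigma2=sigma2)

print(f"raw mean 70, n = 4:   shrunken estimate {small_group:.2f}")
print(f"raw mean 70, n = 100: shrunken estimate {large_group:.2f}")
```

Both groups start from the same raw mean, yet the small group ends up much closer to the grand mean than the large one.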

# 15

WHO: Julia Wrobel is an assistant professor of biostatistics in the Department of Biostatistics and Bioinformatics at Emory University (GA, USA). Her research analyzes and visualizes functional data.
WHAT: Introduces key ideas behind generalized estimating equations (GEE) or marginal models, and linear mixed models (LMM)/random effects models. Examples emphasize modeling longitudinal data. Discusses differences between GEE and LMM.
LENGTH: 10 minutes
DATASET: None
URL if needed:

https://www.youtube.com/watch?v=iijMU5T_1lY

# 16

WHO: Julia Wrobel
WHAT: Implements and interprets mixed models with a binary outcome (mixed effects logistic regression) using longitudinal data. Works in R using the glmer function within the lme4 package.
LENGTH: 25 minutes
DATASET: cbpp loaded in R; “this data set describes the serological incidence of CBPP in zebu cattle during a follow-up survey … in 15 commercial herds in the Boji district of Ethiopia.” Focus was “the within-herd spread of CBPP in newly infected herds.”

Second dataset, provided by the instructor, reports health status of women of childbearing age. With this second data set, the outcome is binary and the data are longitudinal with repeated observations for individuals. The model ALLOWS for a random slope for time. With this last model, where intercepts and time impacts can both vary across persons, she finds that the varying intercept and the varying slope for time correlate perfectly. She comments this “is a little crazy” and notes that “something might be a little bit off about this analysis.” Nice example of the discussion about “checking under the hood” with longitudinal models in Chapter 9, pp. 122-129.
URL if needed:

https://www.youtube.com/watch?v=EibODFjUtqs

Section 3: Author supplies a handbook chapter and Stata code for a binary example

# 17

WHAT: 2023 handbook chapter provides introductory comments about mixed models, and illustrates with an analysis of whether observed pedestrians in small commercial centers patronized a local business or not.

CITATION: Taylor, R. B. (2023). Mixed (A.K.A. Multilevel) Models. In E. R. Groff & C. P. Haberman (Eds.), Understanding crime and place: A Methods handbook (pp. 242-251). Philadelphia, PA: Temple University Press.
DATASET: Taylor, R. B. (2006). Impact of Neighborhood Structure, Crime, and Physical Deterioration on Residents and Business Personnel in Minneapolis-St.Paul, 1970-1982. Inter-university Consortium for Political and Social Research, University of Michigan [distributor], 2006-01-18. https://doi.org/10.3886/ICPSR02371.v1

Handbook chapter

Do / Output files

Section 4: An Abstract and an article

# 18

You can access two items with the links below.

First, an abstract to one of the most widely cited articles in criminology:

Sampson, R. J., Raudenbush, S. W., & Earls, F. (1997). Neighborhoods and violent crime: A multi-level study of collective efficacy. Science, 277, 918-924. doi: 10.1126/science.277.5328.918

DATASET: Search ICPSR at the University of Michigan for files that are part of the Project on Human Development in Chicago Neighborhoods (PHDCN)

https://www.icpsr.umich.edu/web/pages/

# 19

Second, an article looking at municipality cluster membership, based on violent crime rates, over time.

WHO: Lallen Johnson is an Associate Professor, in the School of Public Affairs at American University (DC, USA).

WHAT: Years (Level-1) are nested within municipalities at Level-2 focusing on the Philadelphia (PA)-Camden (NJ) metropolitan area. The focus is membership in spatial clusters classified according to violent crime rates.

OUTCOME: The outcome of interest is nominal: membership in a local cluster of relatively high violence, relatively low violence, or mixed violence locales within the broader metro region.

MODELING: Analysis uses LISA statistics to classify municipalities into spatial clusters, and a Stata add-on, gllamm, to predict cluster classification. So this is a multilevel multinomial model.

MORE SPECIFICALLY: "Multilevel/mixed effects multinomial logistic regression models were used with demographic predictors and violent crime cluster classifications as the outcome. Years were nested within municipalities. Random intercepts for each jurisdiction for each binary contrast were included. Municipality-specific random effects 'accommodate longitudinal dependence' in the outcome data over time and represent 'unobserved heterogeneity' across municipalities (Rabe-Hesketh & Skrondal, 2012b, p. 659). The effects for specific predictors are therefore conditional on the municipality-level random effects. Models were fitted using GLLAMM (Generalized Linear Latent and Mixed Models)."

PERHAPS OF INTEREST IF: You want to know more about models with a multinomial outcome and a longitudinal data structure.
