Mixed modelling is very useful, and easier than you think! Mixed modelling is now well established as a powerful approach to statistical data analysis. It is based on the recognition of random-effect terms in statistical models, leading to inferences and estimates that have much wider applicability and are more realistic than those otherwise obtained. Introduction to Mixed Modelling leads the reader into mixed modelling as a natural extension of two more familiar methods, regression analysis and analysis of variance. It provides practical guidance combined with a clear explanation of the underlying concepts.

Like the first edition, this new edition shows diverse applications of mixed models, provides guidance on the identification of random-effect terms, and explains how to obtain and interpret best linear unbiased predictors (BLUPs). It also introduces several important new topics, including the following:
- Use of the software SAS, in addition to GenStat and R
- Meta-analysis and the multiple testing problem
- The Bayesian interpretation of mixed models

Including numerous practical exercises with solutions, this book provides an ideal introduction to mixed modelling for final-year undergraduate students, postgraduate students and professional researchers. It will appeal to readers from a wide range of scientific disciplines including statistics, biology, bioinformatics, medicine, agriculture, engineering, economics, archaeology and geography.

Praise for the first edition: "One of the main strengths of the text is the bridge it provides between traditional analysis of variance and regression models and the more recently developed class of mixed models... Each chapter is well-motivated by at least one carefully chosen example... demonstrating the broad applicability of mixed models in many different disciplines... most readers will likely learn something new, and those previously unfamiliar with mixed models will obtain a solid foundation on this topic." - Kerrie Nelson, University of South Carolina, in The American Statistician, 2007
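The worked examples in the book use GenStat, R and SAS. For a flavour of the kind of analysis covered, here is a minimal illustrative sketch (not taken from the book) of fitting a mixed model with a single random-effect term in R using the lme4 package; the data frame dat and its columns y, x and group are hypothetical placeholder names.

# Minimal sketch, assuming a data frame 'dat' with columns y, x and group:
# a regression of y on x with a random intercept for each group, fitted by REML.
library(lme4)

fit <- lmer(y ~ x + (1 | group), data = dat, REML = TRUE)

summary(fit)   # fixed-effect estimates and variance components
fixef(fit)     # fixed-effect coefficients (intercept and slope for x)
ranef(fit)     # BLUPs ('shrunk estimates') of the group effects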
Preface xi
1 The need for more than one random-effect term when fitting a regression line 1 (51)
1.1 A data set with several observations of variable Y at each value of variable X 1 (1)
1.2 Simple regression analysis: Use of the software GenStat to perform the analysis 2 (7)
1.3 Regression analysis on the group means 9 (1)
1.4 A regression model with a term for the groups 10 (3)
1.5 Construction of the appropriate F test for the significance of the explanatory variable when groups are present 13 (1)
1.6 The decision to specify a model term as random: A mixed model 14 (2)
1.7 Comparison of the tests in a mixed model with a test of lack of fit 16 (1)
1.8 The use of Residual Maximum Likelihood (REML) to fit the mixed model 17 (4)
1.9 Equivalence of the different analyses when the number of observations per group is constant 21 (5)
1.10 Testing the assumptions of the analyses: Inspection of the residual values 26 (2)
1.11 Use of the software R to perform the analyses 28 (5)
1.12 Use of the software SAS to perform the analyses 33 (7)
1.13 Fitting a mixed model using GenStat's Graphical User Interface (GUI) 40 (6)
1.14 Summary 46 (1)
1.15 Exercises 47 (4)
References 51 (1)
2 The need for more than one random-effect term in a designed experiment 52 (35)
2.1 The split plot design: A design with more than one random-effect term 52 (2)
2.2 The analysis of variance of the split plot design: A random-effect term for the main plots 54 (8)
2.3 Consequences of failure to recognize the main plots when analysing the split plot design 62 (2)
2.4 The use of mixed modelling to analyse the split plot design 64 (2)
2.5 A more conservative alternative to the F and Wald statistics 66 (1)
2.6 Justification for regarding block effects as random 67 (1)
2.7 Testing the assumptions of the analyses: Inspection of the residual values 68 (3)
2.8 Use of R to perform the analyses 71 (6)
2.9 Use of SAS to perform the analyses 77 (4)
2.10 Summary 81 (1)
2.11 Exercises 82 (4)
References 86 (1)
3 Estimation of the variances of random-effect terms 87 (50)
3.1 The need to estimate variance components 87 (1)
3.2 A hierarchical random-effects model for a three-stage assay process 87 (4)
3.3 The relationship between variance components and stratum mean squares 91 (2)
3.4 Estimation of the variance components in the hierarchical random-effects model 93 (2)
3.5 Design of an optimum strategy for future sampling 95 (3)
3.6 Use of R to analyse the hierarchical three-stage assay process 98 (2)
3.7 Use of SAS to analyse the hierarchical three-stage assay process 100 (2)
3.8 Genetic variation: A crop field trial with an unbalanced design 102 (4)
3.9 Production of a balanced experimental design by 'padding' with missing values 106 (4)
3.10 Specification of a treatment term as a random-effect term: The use of mixed-model analysis to analyse an unbalanced data set 110 (2)
3.11 Comparison of a variance component estimate with its standard error 112 (1)
3.12 An alternative significance test for variance components 113 (3)
3.13 Comparison among significance tests for variance components 116 (1)
3.14 Inspection of the residual values 117 (1)
3.15 Heritability: The prediction of genetic advance under selection 117 (5)
3.16 Use of R to analyse the unbalanced field trial 122 (3)
3.17 Use of SAS to analyse the unbalanced field trial 125 (3)
3.18 Estimation of variance components in the regression analysis on grouped data 128 (2)
3.19 Estimation of variance components for block effects in the split-plot experimental design 130 (2)
3.20 Summary 132 (1)
3.21 Exercises 133 (3)
References 136 (1)
4 Interval estimates for fixed-effect terms in mixed models 137 (28)
4.1 The concept of an interval estimate 137 (1)
4.2 Standard errors for regression coefficients in a mixed-model analysis 138 (4)
4.3 Standard errors for differences between treatment means in the split-plot design 142 (2)
4.4 A significance test for the difference between treatment means 144 (3)
4.5 The least significant difference (LSD) between treatment means 147 (4)
4.6 Standard errors for treatment means in designed experiments: A difference in approach between analysis of variance and mixed-model analysis 151 (6)
4.7 Use of R to obtain SEs of means in a designed experiment 157 (2)
4.8 Use of SAS to obtain SEs of means in a designed experiment 159 (2)
4.9 Summary 161 (2)
4.10 Exercises 163 (1)
References 164 (1)
5 Estimation of random effects in mixed models: Best Linear Unbiased Predictors (BLUPs) 165 (27)
5.1 The difference between the estimates of fixed and random effects 165 (3)
5.2 The method for estimation of random effects: The best linear unbiased predictor (BLUP) or 'shrunk estimate' 168 (2)
5.3 The relationship between the shrinkage of BLUPs and regression towards the mean 170 (6)
5.4 Use of R for the estimation of fixed and random effects 176 (2)
5.5 Use of SAS for the estimation of random effects 178 (4)
5.6 The Bayesian interpretation of BLUPs: Justification of a random-effect term without invoking an underlying infinite population 182 (5)
5.7 Summary 187 (1)
5.8 Exercises 188 (3)
References 191 (1)
6 More advanced mixed models for more elaborate data sets 192 (25)
6.1 Features of the models introduced so far: A review 192 (1)
6.2 Further combinations of model features 192 (3)
6.3 The choice of model terms to be specified as random 195 (2)
6.4 Disagreement concerning the appropriate significance test when fixed- and random-effect terms interact: 'The great mixed-model muddle' 197 (7)
6.5 Arguments for specifying block effects as random 204 (5)
6.6 Examples of the choice of fixed- and random-effect specification of terms 209 (4)
6.7 Summary 213 (2)
6.8 Exercises 215 (1)
References 216 (1)
7 Three case studies 217 (78)
7.1 Further development of mixed modelling concepts through the analysis of specific data sets 217 (1)
7.2 A fixed-effects model with several variates and factors 218 (15)
7.3 Use of R to fit the fixed-effects model with several variates and factors 233 (4)
7.4 Use of SAS to fit the fixed-effects model with several variates and factors 237 (5)
7.5 A random coefficient regression model 242 (4)
7.6 Use of R to fit the random coefficients model 246 (1)
7.7 Use of SAS to fit the random coefficients model 247 (2)
7.8 A random-effects model with several factors 249 (17)
7.9 Use of R to fit the random-effects model with several factors 266 (8)
7.10 Use of SAS to fit the random-effects model with several factors 274 (8)
7.11 Summary 282 (1)
7.12 Exercises 282 (12)
References 294 (1)
8 Meta-analysis and the multiple testing problem 295 (55)
8.1 Meta-analysis: Combined analysis of a set of studies 295 (1)
8.2 Fixed-effect meta-analysis with estimation only of the main effect of treatment 296 (5)
8.3 Random-effects meta-analysis with estimation of study × treatment interaction effects 301 (2)
8.4 A random-effect interaction between two fixed-effect terms 303 (4)
8.5 Meta-analysis of individual-subject data using R 307 (5)
8.6 Meta-analysis of individual-subject data using SAS 312 (6)
8.7 Meta-analysis when only summary data are available 318 (8)
8.8 The multiple testing problem: Shrinkage of BLUPs as a defence against the Winner's Curse 326 (12)
8.9 Fitting of multiple models using R 338 (2)
8.10 Fitting of multiple models using SAS 340 (2)
8.11 Summary 342 (1)
8.12 Exercises 343 (5)
References 348 (2)
9 The use of mixed models for the analysis of unbalanced experimental designs 350 (29)
9.1 A balanced incomplete block design 350 (4)
9.2 Imbalance due to a missing block: Mixed-model analysis of the incomplete block design 354 (4)
9.3 Use of R to analyse the incomplete block design 358 (2)
9.4 Use of SAS to analyse the incomplete block design 360 (2)
9.5 Relaxation of the requirement for balance: Alpha designs 362 (6)
9.6 Approximate balance in two directions: The alphalpha design 368 (5)
9.7 Use of R to analyse the alphalpha design 373 (1)
9.8 Use of SAS to analyse the alphalpha design 374 (2)
9.9 Summary 376 (1)
9.10 Exercises 377 (1)
References 378 (1)
10 Beyond mixed modelling 379 (75)
10.1 Review of the uses of mixed models 379 (1)
10.2 The generalized linear mixed model (GLMM): Fitting a logistic (sigmoidal) curve to proportions of observations 380 (8)
10.3 Use of R to fit the logistic curve 388 (2)
10.4 Use of SAS to fit the logistic curve 390 (2)
10.5 Fitting a GLMM to a contingency table: Trouble-shooting when the mixed modelling process fails 392 (11)
10.6 The hierarchical generalized linear model (HGLM) 403 (7)
10.7 Use of R to fit a GLMM and a HGLM to a contingency table 410 (5)
10.8 Use of SAS to fit a GLMM to a contingency table 415 (3)
10.9 The role of the covariance matrix in the specification of a mixed model 418 (3)
10.10 A more general pattern in the covariance matrix: Analysis of pedigrees and genetic data 421 (10)
10.11 Estimation of parameters in the covariance matrix: Analysis of temporal and spatial variation 431 (10)
10.12 Use of R to model spatial variation 441 (3)
10.13 Use of SAS to model spatial variation 444 (3)
10.14 Summary 447 (1)
10.15 Exercises 447 (5)
References 452 (2)
11 Why is the criterion for fitting mixed models called Residual Maximum Likelihood? 454 (23)
11.1 Maximum likelihood and residual maximum likelihood 454 (1)
11.2 Estimation of the variance σ² from a single observation using the maximum-likelihood criterion 455 (1)
11.3 Estimation of σ² from more than one observation 455 (2)
11.4 The μ-effect axis as a dimension within the sample space 457 (3)
11.5 Simultaneous estimation of μ and σ² using the maximum-likelihood criterion 460 (2)
11.6 An alternative estimate of σ² using the REML criterion 462 (3)
11.7 Bayesian justification of the REML criterion 465 (1)
11.8 Extension to the general linear model: The fixed-effect axes as a sub-space of the sample space 465 (5)
11.9 Application of the REML criterion to the general linear model 470 (2)
11.10 Extension to models with more than one random-effect term 472 (1)
11.11 Summary 473 (1)
11.12 Exercises 474 (2)
References 476 (1)
Index 477