# Effect of sample size on performance

Fixed effects and their standard errors for LME models are computed based on maximum-likelihood estimates of the random-effects covariance matrix. We believe that maximal-LME fixed-effects inferences remain slightly anti-conservative because maximum-likelihood estimation tends to underestimate random-effects variances, which deflates fixed-effects standard errors. If this is the case, then Type I error rates in maximal LME analyses should approach the nominal level as the number of subjects and/or items increases and certainty in the estimate of the random-effects covariance matrix correspondingly increases. In a set of informal simulations, we found that this was indeed the case.
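The logic of this argument can be illustrated with a much simpler analogue than a full LME model. The sketch below is not the simulation code used for this appendix; it assumes a single layer of clusters whose effects are drawn from a normal distribution with a true mean of zero, estimates their variance by maximum likelihood (dividing by n rather than n - 1), and tests the mean with a Wald z-test based on that estimate. The function names `ml_variance` and `simulate`, the cluster counts, and the number of simulation runs are all illustrative choices.

```python
import random


def ml_variance(xs):
    """Maximum-likelihood variance estimate (divides by n, not n - 1)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def simulate(n_clusters, n_sims=2000, true_sd=1.0, seed=1):
    """Return (mean ML variance estimate, Type I error rate of a Wald z-test)."""
    rng = random.Random(seed)
    z_crit = 1.959964  # two-sided 5% critical value of the standard normal
    var_sum = 0.0
    rejections = 0
    for _ in range(n_sims):
        # Cluster-level effects with a true mean of zero, so the null is true.
        effects = [rng.gauss(0.0, true_sd) for _ in range(n_clusters)]
        var_ml = ml_variance(effects)
        var_sum += var_ml
        # Standard error of the mean built from the (downward-biased) ML variance.
        se = (var_ml / n_clusters) ** 0.5
        z = (sum(effects) / n_clusters) / se
        rejections += abs(z) > z_crit
    return var_sum / n_sims, rejections / n_sims


results = {n: simulate(n) for n in (6, 12, 24, 48, 96)}
for n, (mean_var, type1) in results.items():
    print(f"clusters={n:3d}  mean ML variance={mean_var:.3f}  Type I error={type1:.3f}")
```

In this toy setting the average variance estimate falls below the true value of 1 when there are few clusters, and the rejection rate under the null exceeds .05; both discrepancies shrink as the number of clusters grows. This is the same qualitative pattern we expect for maximal LME, although the toy case ignores crossed random effects and random slopes entirely.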

*Figure: Type I error as a function of the number of items in a within/within experiment.*

We see signs of this in the results presented in the main paper, where
Type I error rates are consistently higher for 12-item designs than
for 24-item designs. The current analysis extends this line of
reasoning. In the above figure, Type I error rates are plotted as a
function of the number of items. For maximal LME, anti-conservativity
disappears quickly as the number of items increases. This behavior
contrasts with that of min-\(F'\), which is conservative across the
board; \(F_{1}\)+\(F_{2}\), which is conservative for few items but switches to
being slightly anti-conservative as the number of items increases; and
random-intercepts LME, which becomes increasingly anti-conservative
with more items.

Of course, we do not always have the luxury of using
a large number of items and subjects in our analyses. This
anti-conservativity in the face of a limited number of clusters and
corresponding uncertainty in the random effects is in fact exactly the
kind of problem that the use of Bayesian inference and Markov-chain
Monte Carlo on the fixed effects is intended to address (Baayen, Davidson, & Bates, 2008; Gelman & Hill, 2006). Unfortunately,
these techniques are not yet available out of the box for
random-slopes models in any LME implementation we are aware of, but we
hope that they become readily available in the future and that they
eliminate the anti-conservativity observed in the present
simulations.^{1}

## References

Baayen, R. H., Davidson, D. J., and Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. *Journal of Memory and Language, 59*, 390-412.

Gelman, A. and Hill, J. (2006). *Data Analysis Using Regression and Multilevel/Hierarchical Models.* Cambridge: Cambridge University Press.

Plummer, M. (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. *Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003)*, March 20–22, Vienna, Austria. ISSN 1609-395X.

## Footnotes:

^{1} Random-slopes models can be implemented and run from
R on any platform using the software package JAGS
(Plummer, 2003), and we have done so in some of our
own analyses, but the process remains sufficiently time-consuming and
error-prone that we do not recommend the practice at this point.