This web page contains
materials for this book:
By clicking here you can download a zipped
file MLBOOK.ZIP (latest version March 19, 2007; size 1.1 MB)
containing the following files:
 WEXAMPLE20.PDF, a 35-page introduction (in a pdf file,
to be read by Acrobat Reader,
current version March 19, 2007) to MLwiN 2.0
that uses the data set MLBOOK1.DAT also used in Chapters 4, 5,
8, and 9 in this textbook, and the data set IS9412.DAT used in
Examples 14.1–14.3 of Chapter 14;
 WEXAMPLE.PDF, a 20-page introduction (in a pdf file,
to be read by Acrobat Reader,
current version July 2003) to MLwiN 1.10
that uses the data set MLBOOK1.DAT also used in Chapters 4, 5,
8, and 9 in this textbook, and the data set IS9412.DAT used in
Examples 14.1–14.3 of Chapter 14;
 MLBOOK1.DAT, a MLwiN macro that contains the data
and definition of variable names for most of the examples
in Chapters 4, 5, 8, 9, and 13,
and which is used in several of the following files:
 CH458.OBE, a macro that can be used for MLwiN
to produce most of the examples of Chapters 4, 5, and 8;
this macro uses the data set MLBOOK1.DAT;
to run it, please take care that you are acquainted
with the use of the macro facility in MLwiN,
and first use a text editor to have a look at this macro
to know approximately what it does;
 CH9.OBE, a macro that can be used for MLwiN
to produce most of the examples of Sections 9.4 and 9.5;
this macro can be run after running MLBOOK1.OBE and the same
advice applies as given for the latter macro;
this macro uses various
macros for
checking model assumptions that can be downloaded
separately;
 CH9_6.OBE, a macro that is like CH9.OBE but now for
section 9.6, and which also requires the
model checking macros.
 CH12_1.DAT, a MLwiN macro that contains the data
and definition of variable names for the examples
in Section 12.1, and that is used in the next file:
 CH12_1.OBE, a macro that can be used for MLwiN
to produce the examples of Section 12.1;
this macro uses the data set CH12_1.DAT;
 CH13.OBE, a MLwiN macro that is like CH9.OBE but now for
Chapter 13;
this macro uses the data set MLBOOK1.DAT;
 MLB_HLM.ZIP, itself a zipped file that contains data
and setups for producing the examples of Chapters 4 and 5 with
HLM version 4;
 IS9412.DAT, a MLwiN macro for input of the data set used in Examples 14.1–14.3;
 IS12TRANS.MAC, a MLwiN macro for transformation of these data
as required for Example 14.3;
 IS12.MAC, a MLwiN macro for calculating some results presented in Example 14.1;
 T14_3.txt, a text file explaining how to use MIXOR
to produce the results presented in Table 14.3 in Example 14.4;
 BEATE1.DAT, BEATE1.DEF, and BEATE2.DEF, a data file and two MIXOR
definition files used to produce Table 14.3.
The file MLBOOK.ZIP can be unzipped by using PKUNZIP or WINZIP.
Note that data sets used in this book can be downloaded in other formats
(including SAS, STATA, and SPSS)
from
the UCLA webpages on multilevel analysis.
(with kind gratitude to all the people who point out these errors
to me – not that finding out the errors makes me happy...)

General
Remarks:
 UCLA maintains a webpage with data sets in various formats for this book
(including SAS, STATA, and SPSS)
at
http://www.ats.ucla.edu/stat/examples/ma_snijders/default.htm .
 Other recent general books on multilevel analysis are
 D.R. Cox and P.J. Solomon, Components of Variance,
Chapman & Hall/CRC, 2002.
 Jan de Leeuw and Erik Meijer (eds.),
Handbook of Multilevel Analysis. Springer, 2008.
 Andrew Gelman and Jennifer Hill,
Data Analysis Using Regression and Multilevel/Hierarchical Models.
Cambridge University Press, 2007.
 J.J. Hox,
Multilevel Analysis, Techniques and
Applications, Lawrence Erlbaum Associates, 2nd ed., 2010.
 In German:
Wolfgang Langer,
Mehrebenenanalyse,
Wiesbaden: VS Verlag für Sozialwissenschaften, 2004.
 A.H. Leyland and H. Goldstein (eds.),
Multilevel Modelling of Health Statistics,
New York: Wiley, 2001.
 Judith D. Singer & John B. Willett,
Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence,
New York: Oxford University Press, 2003.
 A. Skrondal and S. Rabe-Hesketh,
Generalized latent variable modeling: Multilevel, longitudinal and structural equation models,
Chapman & Hall/CRC Press, 2004.
 G. Verbeke and G. Molenberghs,
Linear Mixed Models for Longitudinal Data,
New York: Springer, 2000.
 The second edition of Bryk and Raudenbush (1992) has appeared, now with reversed
author order:
Raudenbush, Stephen W., & Bryk, Anthony S.
Hierarchical Linear Models. Applications and Data Analysis Methods.
Newbury Park, CA: Sage, 2nd ed., 2002.
 The fourth edition of H. Goldstein, Multilevel Statistical Methods, appeared early 2011.
 Interesting is also the Glossary for Multilevel Analysis by Ana V. Diez Roux
in
Journal of Epidemiology and Community Health (2002), 56, 588–594,
also published online in the Epidemiological Bulletin,
part 1 in vol. 24 (2003) no. 3,
part 2 in vol. 24 (2003) no. 4, and
part 3 in vol. 25 (2004) no. 1.

Chapter 1
Remarks:
 In the prehistory of multilevel analysis, and of meta-analysis, the following paper
deserves to be mentioned:
Cochran, W. G. (1954), The combination of estimates from different experiments,
Biometrics, 10, 101–129.
It indicates an approach to meta-analysis
in which error variability and true variability between results
from different experiments are separated.

Chapter 3
Remarks:
 Formula (3.13) for the standard error of the intraclass correlation
coefficient was given already in R.A. Fisher, Statistical Methods
for Research Workers, London: Hafner Press.
In the 13th edition (1958), it is in Section 39, at the bottom
of p. 220. Good old Fisher.
 Formula (3.32) for the relation between within-group, between-group,
and total correlations was given by Robinson in his 1950 paper
on ecological correlations.
Corrections:
 On page 30:
line 4, "U_x on U_y" should be "U_y on U_x"
formula on line 6, U_xj and U_yj must be interchanged;
line 8, "regression coefficient of X on Y" should be
"regression coefficient of Y on X".
 On page 32, line 5 from below: "total regression" should be
"total correlation".

Chapter 4
Remarks:
 An example of a three-level random intercept model
with special interest for the intraclass correlation coefficients
at levels two and three is
Siddiqui, O., Hedeker, D., Flay, B.R., and Hu, F. (1996)
Intraclass correlation estimates in a school-based smoking
prevention study.
American Journal of Epidemiology, 144, 425–433.
Corrections:
 On page 61, line 19, "mean squared error" should be
"root mean squared error".

Chapter 5
Corrections:
 On page 68, the sentence on lines 5–6 from below should read:
Further, the intraclass correlation will be nil for high-SES children,
whereas for low-SES children it will be positive.
 On page 83, in the last two formula lines, delta_0jk must be delta_00k
and delta_1jk must be delta_10k.
 On page 85, the word "school" on line 5 and "schools" on line 7
should be "class" and "classes", respectively.
The argument is a bit unclear here, and can be clarified as follows.
In the population of classes, when we draw a random class in a random school,
the total slope variance is 0.0024+0.0019.
Therefore, the classes with 2.5% lowest and highest effect of pretest
have regression coefficients 0.146 ± 2 √(0.0024 + 0.0019).
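
The interval in the correction for page 85 can be evaluated numerically. The sketch below only plugs in the three numbers quoted above; the variable names are illustrative and not taken from the book.

```python
import math

mean_slope = 0.146                  # average pretest coefficient quoted above
slope_variances = (0.0024, 0.0019)  # the two slope variance components quoted above

# Standard deviation of the pretest slope for a random class in a random school.
total_sd = math.sqrt(sum(slope_variances))

lower = mean_slope - 2 * total_sd
upper = mean_slope + 2 * total_sd
print(f"{lower:.3f} to {upper:.3f}")  # prints "0.015 to 0.277"
```

So the classes with the 2.5% lowest and highest pretest effects have coefficients of roughly 0.015 and 0.277.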

Chapter 6
Remarks:
 The approach for halving the p-values explained in Section 6.2.1
is correct for the case with 1 degree of freedom (testing the random intercept)
but not for more degrees of freedom (testing random slopes).
Then the correct procedure is, given that the number of degrees of freedom is r,
to calculate the p-values for the chi-squared distributions for d.f. = r
and d.f. = r - 1, and average these two p-values.
See Stram and Lee (Biometrics, 1994) and Molenberghs and Verbeke
(The American Statistician, 2007).
 For testing fixed effects with small sample sizes,
it is better to use special versions of the
Wald t- and F-tests.
The Satterthwaite approximation to the number of degrees of freedom
performs quite well. See the following publications:
M.G. Kenward and J.H. Roger (1997),
Small sample inference for fixed effects from restricted maximum
likelihood.
Biometrics, 53, 983–997.
D.A. Elston (1998),
Estimation of the denominator degrees of freedom of F-distributions
for assessing Wald statistics for fixed-effect factors in unbalanced mixed models.
Biometrics, 41, 477–486.
Adapted versions of likelihood ratio tests
are presented by:
S.J. Welham and R. Thompson (1997),
Likelihood ratio tests for fixed model terms using
residual maximum likelihood.
Journal of the Royal Statistical Society, Series B,
59, 701–714.
Zucker, D.M., Lieberman, O., and Manor, O. (2000).
Improved small sample inference in the mixed linear model:
Bartlett correction and adjusted likelihood.
Journal of the Royal Statistical Society, Series B, 62, 827–838.
A comparison of various tests for small sample sizes in a simulation study
is given in:
Manor, O., and Zucker, D.M. (2004),
Small sample inference for the fixed effects in the mixed linear model.
Computational Statistics and Data Analysis, 46, 801–817.
 The tests for variance parameters mentioned in Section 6.3
with the reference Berkhof and Snijders (1998) now are documented in
Berkhof, Johannes, and Snijders, Tom A.B. (2001),
Variance Component Testing in Multilevel Models.
Journal of Educational and Behavioral Statistics, 26, 133–152.
This topic is treated also in
Geert Verbeke and Geert Molenberghs (2003),
The Use of Score Tests for Inference on Variance Components,
Biometrics, 59, 254–262.
 In Example 6.1 on p. 86–87, the variable U_1j which has
the random slope is the raw IQ variable IQ_ij.
This is not stated explicitly in the text.
 In point 3 on p. 92, the sentence "and if a variable has a random slope,
it should normally also have a fixed effect"
should be understood as referring to how you specify the model,
and not to the significance of the model terms.
In other words, it intends to say that if you include a random
slope in the model, you normally also should include the
corresponding fixed effect.
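
The averaging rule from the first remark of this chapter (p-values for d.f. = r and d.f. = r - 1, averaged) can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names and the example deviance difference are hypothetical, and the chi-squared tail probability is computed from the standard closed forms for integer degrees of freedom rather than taken from any particular multilevel program.

```python
import math

def chi2_sf(x, df):
    """Upper tail probability P(X > x) of a chi-squared variable with integer df >= 0."""
    if x < 0 or df < 0:
        raise ValueError("x and df must be non-negative")
    if df == 0:
        return 0.0  # chi-squared with 0 d.f. is a point mass at zero
    half = x / 2.0
    if df % 2 == 0:
        # Even d.f.: finite Poisson-type sum.
        return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))
    # Odd d.f.: normal tail plus a finite correction sum.
    m = (df - 1) // 2
    tail = math.erfc(math.sqrt(half))
    tail += math.exp(-half) * sum(
        half ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1)
    )
    return tail

def mixture_p_value(deviance_difference, r):
    """Average of the chi-squared p-values for d.f. = r and d.f. = r - 1."""
    return 0.5 * (chi2_sf(deviance_difference, r) + chi2_sf(deviance_difference, r - 1))

# Hypothetical example: a deviance difference of 6.5 with r = 2.
print(mixture_p_value(6.5, 2))
```

For r = 1 the second term is zero, so the rule reduces to halving the p-value, in line with Section 6.2.1.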
Corrections:
 On page 89, line 5 from bottom,
"three" should be "four".

Chapter 7
Remarks:
 In Section 7.2.1, it is implicitly assumed that the group means
of the X variables are uncorrelated with the
Z variables. If this assumption does not hold,
formula (7.8) needs an additional covariance term.
 The third term on the right hand side of formula (7.10)
on p. 108 (the term with mu prime T) must be multiplied by 2.

Chapter 8
Corrections:
 On p. 112, lines 6–7 from the bottom, the range of the level-one
variance should be even wider, between 21.19 and 54.47.
 On p. 113, formula (8.3), the number 2.223
should be 3.236 (both occurrences).
Figure 8.1 should be modified accordingly.

Chapter 9
Remarks:
 Two interesting articles on ignoring the multilevel structure,
or ignoring one level in a multilevel analysis,
are
Mirjam Moerbeek (2004), The Consequence of Ignoring a Level of Nesting
in Multilevel Analysis, Multivariate Behavioral Research, 39, 129–149;
Johannes Berkhof and Jarl K. Kampen (2004),
Asymptotic Effect of Misspecification in the Random Part of the
Multilevel Model, Journal of Educational and Behavioral Statistics,
29, 201–218.
 The test for variance parameters mentioned in Section 9.2.2
with the reference Berkhof and Snijders (1998) now is documented in
Berkhof, Johannes, and Snijders, Tom A.B. (2001),
Variance Component Testing in Multilevel Models.
Journal of Educational and Behavioral Statistics, 26, 133–152.
 (Addition to Section 9.7)
For non-normal distributions of the residuals,
a sandwich-type estimator for the standard errors of the fixed coefficient
estimates is given by
K.H. Yuan & P.M. Bentler, On normal theory based inference for multilevel models
with distributional violations. Psychometrika, 67, 539–562.
 Robustness of multilevel methods to non-normal distributions of the
random effects was studied by
C.J.M. Maas and J.J. Hox (2004),
'The influence of violations of assumptions on multilevel parameter estimates and their standard
errors',
Computational Statistics and Data Analysis, 46, 427–440.
Corrections:
 On p. 122, line 18,
( X_{ij} - X_{.j} Z_{j} )
should be
( X_{ij} - X_{.j} ) Z_{j} .

Chapter 10
Remarks:
 To obtain information about the program PINT
(newest version: 2.11, April 30, 2003),
used in Section 10.4, or to download it if you wish,
click here.
 A recent overview of power and sample size determination in
multilevel models, mentioning a number of computer programs, is
Snijders, T.A.B.,
Power and Sample Size in Multilevel Linear Models;
p. 1570–1573 in B.S. Everitt and D.C. Howell (eds.), Encyclopedia
of Statistics in Behavioral Science. Chichester (etc.):
Wiley, 2005 (Volume 3).
 In the book
Multilevel Modelling of Health Statistics,
Leyland, A. and Goldstein, H. (eds.) (2001), Wiley,
I wrote Chapter 11
on Sampling, focusing on sample size determination.
 There are several papers written by Steve Raudenbush, some in collaboration
with Xiaofeng (Steve) Liu, on power and sample sizes in cluster randomized trials,
multisite randomized trials, and repeated measures with polynomial effects.
References and the associated computer program OD (for Optimal Design),
with an extensive manual, are at
http://sitemaker.umich.edu/groupbased .
 For calculating sample sizes in intervention studies
with randomization between clusters you may consult
Hayes, R.J., and Bennett, S. (1999)
Simple sample size calculation for cluster-randomized trials.
International Journal of Epidemiology, 28, 319–326.
 A paper on calculating power in two-level and three-level longitudinal settings,
together with SAS macros, is available from
the School Success Profile website at UNC.
 The (large!) effects of intraclass correlations on the desired
sample sizes to obtain a given preassigned power are discussed in
Siddiqui, O., Hedeker, D., Flay, B.R., and Hu, F. (1996)
Intraclass correlation estimates in a school-based smoking
prevention study.
American Journal of Epidemiology, 144, 425–433.
 Another important paper about sample sizes in
multilevel analysis used for longitudinal data is the following:
Hedeker, D., Gibbons, R.D., and Waternaux, C. (1999)
Sample size estimation for longitudinal designs with attrition:
Comparing time-related contrasts between two groups.
Journal of Educational and Behavioral Statistics, 24, 70–93.
There is a link to the associated program RMASS2
at
a web page of Don Hedeker.
 An important design question in multilevel experimental studies is,
at which level the randomization should take place.
This is discussed in
Mirjam Moerbeek (2005).
Randomization of clusters versus randomization of persons within clusters:
Which is preferable? The American Statistician, 59, 72–78.
 A review paper on optimal designs for variance components models,
i.e., models with no fixed effects but only (nested or crossed)
random effects – such as the (in)famous empty model – is
A.I. Khuri (2000). 'Designs for variance components estimation:
Past and present', International Statistical Review,
68, 311–322.
However, he doesn't even refer to Donner.
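
The size of the intraclass correlation effect on sample sizes mentioned in the remarks above can be illustrated with the familiar design effect for two-stage samples with equal cluster sizes, 1 + (n - 1)ρ. The formula is not stated on this page and is used here as an assumption; the function names and the numbers in the example are hypothetical, and the sketch is no substitute for the programs listed above.

```python
def design_effect(cluster_size, icc):
    """Design effect 1 + (n - 1) * rho for equal cluster sizes (assumed formula)."""
    return 1.0 + (cluster_size - 1) * icc

def required_total_sample(n_simple_random, cluster_size, icc):
    """Total sample size giving roughly the precision of a simple random sample
    of size n_simple_random, when sampling whole clusters."""
    return n_simple_random * design_effect(cluster_size, icc)

# Hypothetical example: clusters of 21 units, a simple random sample of 200 as benchmark.
for icc in (0.0, 0.05, 0.25):
    print(icc, required_total_sample(200, 21, icc))
```

Even a modest intraclass correlation multiplies the required total sample size, because the multiplier grows with the cluster size.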

Chapter 11
Remarks:
 There are a variety of other specifications possible for designs
with crossed random coefficients.
An important specification possibility is a random interaction
of two crossed factors.
E.g., in Example 11.1 (Sustained primary school effects),
this would be a random interaction of the primary school – secondary school combination.
In such a design, there is a random main effect of primary school,
a random main effect of secondary school, as well as a random interaction effect
of primary school by secondary school.
One way of specifying such models is as follows.
Let the crossed factors be A and B. Define the factor C
as the set of categories defined by all combinations of values of A and of B.
(In the example: all primary school – secondary school combinations).
Then C is nested in A and also nested in B.
A three-level hierarchical model is obtained by using C and A as nested factors
(it is also possible to use C and B; the greatest stability will usually be obtained
by using C and A, if the letter A denotes the factor with more units than B).
Factor B can then be used as a factor which is crossed with A.
The structure thus obtained is the same as what is described
in the first paragraph of Section 11.2 (a crossed random effect in a three-level model,
where the extra factor is crossed with the level-three units).
This type of model is described in Chapter 12 of Raudenbush and Bryk (2002).
 Papers about a mixed-model (= multilevel) approach to Generalizability Theory:
Tony Vangeneugden, Annouschka Laenen, Helena Geys, Didier Renard, and Geert Molenberghs (2004).
Applying linear mixed models to estimate reliability in clinical trial
data with repeated measurements.
Controlled Clinical Trials 25, 13–30.
Tony Vangeneugden, Annouschka Laenen, Helena Geys, Didier Renard, and Geert Molenberghs (2005).
Applying Concepts of Generalizability Theory on Clinical Trial Data
to Investigate Sources of Variation and Their Impact on Reliability.
Biometrics 61, 295–304.
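
The construction of the combination factor C described in the first remark of this chapter can be sketched in a few lines. The identifiers and records below are hypothetical; the point is only that C receives one code per observed (A, B) combination, so that C is nested in both A and B while A and B remain crossed.

```python
def combination_factor(pairs):
    """Give each distinct (A, B) combination one integer code, in order of first appearance."""
    codes = {}
    return [codes.setdefault(pair, len(codes) + 1) for pair in pairs]

def is_nested(inner, outer):
    """True if every value of `inner` occurs together with exactly one value of `outer`."""
    seen = {}
    return all(seen.setdefault(i, o) == o for i, o in zip(inner, outer))

# Hypothetical pupil records: (primary school A, secondary school B).
pupils = [("P1", "S1"), ("P1", "S2"), ("P2", "S1"), ("P1", "S1"), ("P3", "S2")]
factor_a = [a for a, _ in pupils]
factor_b = [b for _, b in pupils]
factor_c = combination_factor(pupils)

print(factor_c)                       # [1, 2, 3, 1, 4]
print(is_nested(factor_c, factor_a))  # True: C is nested in A
print(is_nested(factor_c, factor_b))  # True: C is nested in B
print(is_nested(factor_a, factor_b))  # False: A and B are crossed
```

The codes in factor_c can then serve as the level-two identifier, with A as level three and B as the crossed factor.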

Chapter 12
Remarks:
 A very interesting and useful article is
G. Verbeke, B. Spiessens, and E. Lesaffre (2001).
'Conditional linear mixed models',
The American Statistician, 55, 25–34.
They propose to condition on sufficient statistics for
the cross-sectional effects, so that conclusions about
longitudinal effects can be made without interference by
assumptions about cross-sectional effects.
 The following recent book is a collection of papers
that can serve as a general introduction to using (mainly) multilevel models
for repeated measures:
D.S. Moskowitz and S.L. Hershberger (eds.),
Modeling intraindividual variability with repeated measures data,
Lawrence Erlbaum, 2002.
 The paper by Maas and Snijders (mentioned as submitted, 1999) was finally published in 2003;
the reference is
Maas, Cora J.M., and Snijders, Tom A.B.,
The multilevel approach to repeated measures for complete and incomplete data,
Quality and Quantity, 37 (2003), 71–89.
This article is about the link between the traditional MANOVA approach to
repeated measures and the multilevel approach.
 The use of multilevel analysis for repeated measures with
(in addition to time) a second within-subject variable is treated in
B. Kato, H. Hoijtink, C. Verdellen, M. Hagenaars, A. Van Minnen, and G. Keijsers,
Application of Multilevel Models to Structured Repeated Measurements,
Quality & Quantity 39 (2005), 711–732.
 Another introduction to the use of multilevel analysis for
repeated measures
is
H. Quene and H. van den Berg (2004).
'On multilevel modeling of data from repeated measures designs:
a tutorial',
Speech Communication 43, 103–121.
 The use of splines for modeling nonlinear effects in
multilevel models was proposed and discussed also by
Pan, H., and Goldstein, H. (1998).
Multilevel repeated measures growth modelling using extended spline functions.
Statistics in Medicine, 17, 2755–2770.
 To investigate in a repeated measures study whether missingness
of data (dropout, etc.) is at random, the pattern-mixture
approach (proposed by Little in 1993)
is very useful. It is based on indicating missingness
patterns by dummy variables, and using these as explanatory
variables.
The use of this approach in multilevel analysis
is explained in
Hedeker, D., and Gibbons, R.D. (1997). Application of random-effects
pattern-mixture models for missing data in longitudinal studies.
Psychological Methods, 2, 64–78.
For the analysis of repeated dichotomous or ordinal data,
an illustration is given in
Hedeker, D., and Rose, J.S. (2000),
The natural history of smoking: a pattern-mixture random-effects
regression model.
In: Rose, J.S., Chassin, L., Presson, C.C., and Sherman, S.J. (Eds.),
Multivariate Applications in Substance Use Research, p. 79–112.
Mahwah, NJ: Lawrence Erlbaum Associates.
Corrections:
 With "length" I mean "height". An error in my use of English.
 In Example 12.6 on page 180, 2 lines below Table 12.6,
"Model 1 of Table 12.1" should be "Model 2 of Table 12.1".
 In example 12.7 on page 183, 11 lines below Table 12.7,
"gamma_01 + U_0i" should be "gamma_10 + U_1i".

Chapter 14
Remarks:
 A good recent overview of multilevel modeling for discrete outcome
variables is
A. Agresti, J.G. Booth, J.P. Hobert, and Brian Caffo (2000),
Random-effects modeling of categorical response data.
Sociological Methodology 2000, 27–80.
 Introductions and examples of multilevel logistic regression,
multinomial regression, and Poisson regression
can be found in the book
Multilevel Modelling of Health Statistics,
Leyland, A. and Goldstein, H. (eds.) (2001), Wiley.
 There are cases where the deviance calculated by the Laplace
approximation in HLM5 goes up instead of down when effects
are added in discrete multilevel models.
So the approximation is not always very good.
The remark on p. 220 that the deviance statistic produced in this way
can be used for chi-squared tests requires further investigation.
 For the calculation of sigma-squared_F in the definition
of the explained variance (pages 225–226 and 231–233),
just calculate the linear predictor as a new variable
in your data set and calculate the variance of this variable.
 There are a variety of ways of expressing effect sizes and
random intercept variability in multilevel logistic regression models.
In addition to what is mentioned in Section 14.3,
interesting proposals are the Median Odds Ratio for expressing
cluster heterogeneity = random intercept variability;
and the Interval Odds Ratio for expressing sizes of fixed effects.
See
K. Larsen, J.H. Petersen, E. Budtz-Jørgensen, and L. Endahl (2000), Interpreting
parameters in the logistic regression model with random
effects. Biometrics 56, 909–914.
K. Larsen and J. Merlo (2005), Appropriate Assessment of Neighborhood Effects
on Individual Health: Integrating Random and Fixed Effects in Multilevel
Logistic Regression. American Journal of Epidemiology, 161, 81–88.
 A model for multilevel ordinal data that allows non-proportional
odds for a subset of the explanatory variables is presented in
Hedeker, D. and Mermelstein, R.J. (1998),
A multilevel thresholds of change model for analysis
of stages of change data.
Multivariate Behavioral Research, 33, 427–455.
 Issues related to the scaling of coefficients and variances like those treated
in Section 14.3.5 were already discussed for single-level models in
Winship, C. and Mare, R.D. (1984), Regression models with ordinal variables.
American Sociological Review, 49, 512–525.
For multiple ordered categories in multilevel models, they are treated in
Fielding, A. (2004). Scaling for Residual Variance Components of Ordered Category
Responses in Generalised Linear Mixed Multilevel Models.
Quality and Quantity 38, 425–433.
A further treatment, with a proposal for how to put
the estimates on a common scale, is given by
Bauer, Daniel J. (2009).
A note on comparing the estimates of models for cluster-correlated
or longitudinal data with binary or ordinal outcomes.
Psychometrika 74, 97–105.
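
Two of the remarks above lend themselves to a short computation: the variance of the linear predictor (sigma-squared_F) and the Median Odds Ratio. The sketch below is illustrative only. The data, coefficients, and function names are hypothetical; the use of the population variance (dividing by N) is an assumption; and the MOR formula, exp(√(2τ²) · Φ⁻¹(0.75)), is how the measure is commonly defined following the Larsen papers cited above, not something stated on this page.

```python
import math
import statistics

def linear_predictor(rows, intercept, coefs):
    """Fixed-part linear predictor, computed as a new variable for every row."""
    return [intercept + sum(coefs[name] * row[name] for name in coefs) for row in rows]

def median_odds_ratio(tau2):
    """Median Odds Ratio for a random intercept variance tau2 (assumed formula)."""
    z75 = statistics.NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1)
    return math.exp(math.sqrt(2.0 * tau2) * z75)

# Hypothetical data set and fixed-effect estimates.
rows = [
    {"x1": 0.0, "x2": 1.0},
    {"x1": 1.0, "x2": 0.0},
    {"x1": 2.0, "x2": 1.0},
    {"x1": 3.0, "x2": 0.0},
]
intercept = -1.0
coefs = {"x1": 0.5, "x2": 0.25}

eta = linear_predictor(rows, intercept, coefs)
sigma2_F = statistics.pvariance(eta)  # variance of the linear predictor
print(sigma2_F)                       # 0.265625
print(median_odds_ratio(1.0))
```

With a realistic data set the choice between dividing by N and by N - 1 makes little difference to sigma-squared_F.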
Corrections:
 On p. 210 in Formula (14.7), there should be a bar over the
Y; this refers to the group average defined in (14.4).
 In Table 14.3 (page 222): in the list of random effects,
tau_04 should be tau_02;
and the deviance for Model 1 should be 1582.29.

Chapter 15
Remarks:

For Section 15.1:

A review of multilevel analysis possibilities in a variety of programs
is given at the Multilevel Modeling website at
http://www.cmm.bristol.ac.uk/learningtraining/multilevelmsoftware/index.shtml.
Programs included are aML, BMDP, EGRET, GENSTAT, HLM, LIMDEP, LISREL,
MIXREG, MLwiN, Mplus, R, SAS, SPSS, STATA, SYSTAT, WINBUGS.

A brief introduction to multilevel modeling with help on how to estimate multilevel models
using SPSS, Stata, and SAS is given at
the website of the Stat/Math center of Indiana University. It was written by Jeremy Albright.
 The website for MIXOR now is
http://tigger.uic.edu/~hedeker/mix.html .
 The website for HLM now is
http://ssicentral.com/hlm/ .
 For Section 15.2.1 (SAS):
An introduction to using SAS Proc Mixed for multilevel analysis,
specifically geared at longitudinal data, is
J. D. Singer, Fitting individual growth models using SAS PROC MIXED,
p. 135–170 in
D.S. Moskowitz and S.L. Hershberger (eds.).
Modeling intraindividual variability with repeated measures data
Lawrence Erlbaum, 2002.
An extensive introduction to mixed model / multilevel analysis in SAS is
G. Verbeke and G. Molenberghs (eds.),
Linear Mixed Models in Practice; A SAS-Oriented Approach,
New York: Springer, 1997.
 For Section 15.2.2 (SPSS):
Multilevel modeling in
SPSS is more extended from version 11.5 onward. See
 For Section 15.2.4 (Stata):
The xtmixed
command in Stata can fit many multilevel models for
continuous outcome variables.
A Stata program to fit very
general multilevel models also for discrete outcome variables is
gllamm (for Generalized Linear Latent And Mixed
Models); it may be downloaded from http://www.gllamm.org . The
author of the program is Sophia Rabe-Hesketh.
The program is
rather slow but very flexible. It includes methods for normal and
non-normal response distributions, with good possibilities for
latent variables. Extensive information is given on the webpage.
References:
S. Rabe-Hesketh, A. Skrondal and A. Pickles
(2004). Generalized multilevel structural equation modelling.
Psychometrika, 69, 167–190.
A. Skrondal and S.
Rabe-Hesketh (2004). Generalized latent variable modeling:
Multilevel, longitudinal and structural equation models. Chapman
& Hall/CRC Press.
S. Rabe-Hesketh and A. Skrondal (2008). Multilevel and Longitudinal Modeling using Stata, 2nd ed.
College Station, TX: Stata Press.
 The remark about BUGS is incorrect; BUGS does not require
balanced data, and can handle a large range of multilevel models.
The book by Gelman and Hill (see above and below)
gives a lot of examples.
 The statistical computer program R has a lot of methods
for multilevel analysis, in particular in the libraries
nlme and lme4. An extensive textbook is
Jose Pinheiro and Douglas Bates, Mixed-Effects Models in S and S-PLUS. Springer, 2000.
Some other helpful texts are
John Fox,
Linear Mixed Models. Appendix to `An R and S-PLUS Companion to Applied Regression'.
Douglas Bates, Examples from Multilevel Software Comparative Reviews.
Andrew Gelman and Jennifer Hill,
Data Analysis Using Regression and Multilevel/Hierarchical Models.
Cambridge University Press, 2007.
 Another program
for estimating random effects models for dichotomous outcome
variables is EGRET, see http://www.statcon.de/.