Center for Advancing
Longitudinal Drug Abuse Research

Frequently Asked Questions


Our FAQ is designed to address specific issues in longitudinal research and analysis. While the questions will address a broad range of issues, each will focus narrowly on a specific topic. Answers will draw on our experience in actually implementing longitudinal research and analysis applications, with references to more detailed literature where we find it relevant. We do not intend to provide comprehensive background information of the kind that is often well covered in textbooks, workshops, and lectures. Nor do our answers purport to be a complete review of relevant literature. Our answers may identify areas that still need further research or areas where considerable controversy exists, without giving solutions.

We welcome input and comments from you. To submit questions or to join our dialogue, please email David Huang at yhuang@ucla.edu.


The availability of alternative software for estimation of mixture models necessitates consideration of differences in the basic models utilized. One major difference between two of the primary proponents of the application of mixture models for characterizing trajectories over time (D. Nagin and B. Muthen) involves within-class variability. The basic model underlying Nagin's method (with estimation using SAS proc traj) assumes no variation in growth parameters within each class; thus, any individual deviation from the class mean trajectory is attributed to random error. Muthen’s model (with estimation using Mplus) allows for within-class variation in individual trajectories. (Note that with appropriate constraints on within-class variation, Mplus can estimate models similar to those of SAS proc traj.)
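To make the distinction concrete, here is a minimal Python sketch (not SAS or Mplus code) that generates error-free individual trajectories for a single latent class under each assumption. The function name, the linear growth form, and all numeric values are hypothetical choices made for illustration; only the intercept is allowed to vary, although a growth mixture model can also let slopes vary.

```python
import random

def simulate_class(n, mean_intercept, mean_slope, intercept_sd, times, rng):
    """Simulate error-free linear trajectories for one latent class.

    intercept_sd = 0 mimics the group-based assumption (no within-class
    variation in growth parameters); intercept_sd > 0 mimics a growth
    mixture model with a random intercept.
    """
    trajectories = []
    for _ in range(n):
        intercept = mean_intercept + rng.gauss(0.0, intercept_sd)
        trajectories.append([intercept + mean_slope * t for t in times])
    return trajectories

rng = random.Random(42)          # fixed seed so the sketch is reproducible
times = [0, 1, 2, 3]             # hypothetical measurement occasions

group_based = simulate_class(5, 2.0, 0.5, 0.0, times, rng)
growth_mixture = simulate_class(5, 2.0, 0.5, 1.0, times, rng)

# Every group-based subject follows the class mean trajectory exactly;
# growth-mixture subjects start at different levels around that mean.
print(group_based[0])
print([round(row[0], 2) for row in growth_mixture])
```

Under the group-based assumption, any spread of observed data around the class trajectory has to be absorbed by the error term (omitted here); under the growth mixture assumption, part of that spread is modeled as individual differences in growth parameters.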

There appear to be no clear-cut decision criteria for which model to choose. Researchers should consider the conceptual appropriateness of the models; see, e.g., Muthen (2004) or Nagin (2005) for detailed descriptions of the relevant statistical models. In addition, some application results may be helpful. Our experience suggests that estimation is sometimes easier (faster, more likely to converge) with simpler models; constraining growth factor variances and covariances to zero can produce a simpler model. An estimation advantage of allowing variation in these parameters (individual variation about the group mean) is that fewer trajectory groups may be required to specify a satisfactory model (Nagin & Tremblay, 2005). However, the layering of heterogeneity (both within and between trajectory groups) may make the more complex model particularly vulnerable to model specification errors (Bauer & Curran 2003). Muthen (2004) has suggested that using both approaches may be useful, for example with the simpler group-based trajectory approach used as a first step to identify cut points on the growth factors, followed by relaxing the variance/covariance constraints in a growth mixture model.

Both statistical and conceptual/theoretical criteria should be considered when deciding on the number of latent trajectory classes. From the statistical perspective, several model fit indices can be used. The most frequently employed indices include AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion), and the sample size-adjusted BIC (ABIC) (cf. Sclove 1987; Yang 1998). Additional tests have also been suggested, including the Lo-Mendell-Rubin likelihood ratio test (LMR LRT), the Bootstrap LRT (BLRT), and the multivariate skewness and kurtosis test (SK test) (Muthen 2003, 2004). The three indices and the LMR LRT and SK tests are available in Mplus, whereas only BIC and AIC are available in PROC TRAJ. A recommended strategy is to estimate a series of models with progressively greater numbers of trajectory classes and compare fit indices. In general, lower AIC, BIC or adjusted BIC absolute values indicate a better model. One would continue the estimation and index comparison until the point where a greater number of classes results in larger fit values (that is, until a minimum fit index value is found). When considering LMR LRT test statistic results, a small p-value suggests that the model with k classes is preferred over the model with k-1 classes. For the SK test, a large probability indicates that the model with k classes accurately reproduces higher order sample skewness and kurtosis.
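As a rough illustration of this comparison strategy, the following Python sketch computes AIC, BIC, and the sample size-adjusted BIC from each model's log-likelihood and picks the number of classes that minimizes each index. The log-likelihoods, parameter counts, and sample size are hypothetical values invented for the example, and the function name is ours. The formulas use the common "smaller is better" scaling, which may differ in sign or scale from what a given program prints.

```python
import math

def information_criteria(loglik, n_params, n_subjects):
    """Return AIC, BIC, and sample size-adjusted BIC for one fitted model."""
    deviance = -2.0 * loglik
    aic = deviance + 2.0 * n_params
    bic = deviance + n_params * math.log(n_subjects)
    # Sclove's adjustment replaces n with (n + 2) / 24 in the penalty term.
    abic = deviance + n_params * math.log((n_subjects + 2) / 24.0)
    return {"AIC": aic, "BIC": bic, "ABIC": abic}

# Hypothetical results: number of classes -> (log-likelihood, free parameters).
fits = {
    1: (-5230.4, 4),
    2: (-5105.9, 8),
    3: (-5060.2, 12),
    4: (-5051.7, 16),
    5: (-5047.3, 20),
}
n_subjects = 600  # hypothetical sample size

table = {k: information_criteria(ll, p, n_subjects) for k, (ll, p) in fits.items()}

# For each index, the preferred model is the one with the smallest value.
best = {
    name: min(table, key=lambda k: table[k][name])
    for name in ("AIC", "BIC", "ABIC")
}

for k, row in table.items():
    print(k, {name: round(value, 1) for name, value in row.items()})
print(best)
```

With these made-up inputs the three indices do not agree (AIC favors 5 classes, BIC favors 3, and the adjusted BIC favors 4), which is the kind of conflict that substantive considerations then have to resolve.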

But selection of the number of latent classes should also be considered within the context of the study objectives and conceptual or theoretical perspectives (Acock 2005; Nagin 2005). Parsimony is often desired in order to facilitate interpretation. A solution with a large number of latent classes may statistically distinguish classes with little practical difference. In addition, latent class sizes may become too small for reliable interpretation. Pattern distinctions should be relevant to the study purpose. Note that when working with a very large sample size, our experience suggests that the BIC continues to decrease for models with ever-increasing numbers of classes. For decisions about numbers of classes in this case, the researcher should consider the interpretability of the classes, whether class distinctions have any important theoretical or practical value, and whether the numbers of cases estimated as belonging to the classes become too small for reliable interpretation.

How does one decide on which fit or test statistic to use? And what does one do when different indices or tests suggest different results? Simulation studies report inconsistent results in the performance of fit indices and test statistics. Some studies support the adjusted BIC (Yang 2006; Tofighi & Enders 2007) and a few support the BIC (Nylund 2006). However, it appears that simulation results (and the resulting inconsistencies) depend heavily on assumptions about the population (e.g. normality, separation of latent classes, number of classes), the estimation model, and the match between them. At the moment, there are few agreed-upon ways of assessing how well one's empirical data matches the simulation assumptions. The literature in this area is expanding. Several authors have recommended use of multiple statistics, along with theoretical and practical considerations (Acock 2005; Bauer & Curran, 2004; Nagin, 2005; Nagin & Tremblay, 2005).

A number of statistical criteria have been proposed in the literature to facilitate decisions about the number of classes in growth mixture modeling (GMM). The first category comprises the information-based indices; popular ones include AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion), and the sample size-adjusted BIC (ABIC) (Sclove, 1987). The second category comprises statistical tests, which are not yet commonly used; two examples are the Lo-Mendell-Rubin likelihood ratio test (LMR LRT; Lo, Mendell & Rubin, 2001) and the Bootstrap LRT (BLRT; McLachlan, 1987; McLachlan & Peel, 2000). In the context of mixture modeling, the likelihood ratio test statistic theoretically does not follow a traditional chi-square distribution. Lo, Mendell and Rubin (2001) attempted to overcome this limitation by analytically deriving a new asymptotic distribution of the LRT in that context. McLachlan (1987) and McLachlan and Peel (2000) took another approach to the same problem: they proposed the BLRT, which uses the bootstrap to approximate the sampling distribution of the LRT in the mixture context. Unlike the information-based indices, these tests can formally test a null hypothesis about the number of classes (e.g., H0: k-1 classes vs. k classes in the population). Nevertheless, because of continuing limitations, they have not become as popular in mixture modeling as the information-based indices, or as they are in other areas of statistics. For example, Jeffries (2003) criticized the LMR LRT for theoretical drawbacks in its mathematical proof. Since it is unclear to what extent the critique relates to its use in practice, the LMR LRT is still computed in Mplus (Muthen & Muthen, 2006), and further studies of its applicability are warranted. The BLRT is computationally intensive, and its results vary with the random sequence used for bootstrapping in each study. It was very recently implemented in Mplus, and research support for its use is also very limited. Finally, in addition to the two categories of criteria described above, Muthen (2003) proposed a multivariate skewness and kurtosis test (SK test), which is based on goodness of fit in terms of skewness and kurtosis. The SK test is analogous to the goodness-of-fit test used in structural equation models, but research and detailed documentation on its use in GMM are also lacking.
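The bootstrap idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the Mplus implementation: it tests 1 versus 2 classes in a univariate normal mixture (not a growth mixture model), uses a basic EM routine with a single starting point, and draws only 30 bootstrap samples. All function names, data values, and settings are hypothetical.

```python
import math
import random

def normal_logpdf(x, mu, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)

def fit_one_class(x):
    """Maximum likelihood fit of a single normal: (log-likelihood, mean, variance)."""
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    return sum(normal_logpdf(v, mu, var) for v in x), mu, var

def loglik_two_class(x, n_iter=60, var_floor=1e-3):
    """Approximate maximized log-likelihood of a 2-class normal mixture via EM."""
    xs, half = sorted(x), len(x) // 2
    mus = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    variances = [fit_one_class(x)[2]] * 2
    weights = [0.5, 0.5]

    def class_densities(v):
        return [weights[k] * math.exp(normal_logpdf(v, mus[k], variances[k]))
                for k in range(2)]

    for _ in range(n_iter):
        # E-step: posterior class probabilities for each observation.
        resp = []
        for v in x:
            dens = class_densities(v)
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and (floored) variances.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-300
            weights[k] = nk / len(x)
            mus[k] = sum(r[k] * v for r, v in zip(resp, x)) / nk
            ss = sum(r[k] * (v - mus[k]) ** 2 for r, v in zip(resp, x))
            variances[k] = max(ss / nk, var_floor)
    return sum(math.log(sum(class_densities(v)) or 1e-300) for v in x)

def bootstrap_lrt(x, n_boot=30, seed=1):
    """Parametric bootstrap p-value for H0: 1 class vs. H1: 2 classes."""
    rng = random.Random(seed)
    ll1, mu, var = fit_one_class(x)
    observed = 2 * (loglik_two_class(x) - ll1)
    exceed = 0
    for _ in range(n_boot):
        # Generate data under the fitted null (1-class) model and refit both models.
        sample = [rng.gauss(mu, math.sqrt(var)) for _ in x]
        stat = 2 * (loglik_two_class(sample) - fit_one_class(sample)[0])
        exceed += stat >= observed
    return observed, (exceed + 1) / (n_boot + 1)

# Hypothetical data: two well-separated subpopulations.
data_rng = random.Random(0)
data = ([data_rng.gauss(0.0, 1.0) for _ in range(60)]
        + [data_rng.gauss(5.0, 1.0) for _ in range(60)])
observed, p_value = bootstrap_lrt(data)
print(round(observed, 1), round(p_value, 3))
```

Changing `seed` or `n_boot` changes the bootstrap samples and can change the p-value, which illustrates why BLRT results depend on the random sequence used. Each bootstrap draw requires refitting both models, which is where the computational cost comes from.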

In Mplus and SAS proc traj, missing data on dependent variables (observations over time, y1, y2, y3…, from which to estimate trajectories) are assumed to be missing at random (MAR). Growth mixture models are estimated using all available observations on the dependent measure; thus, subjects are included in the analysis if they have at least one observation with valid data on the dependent variable. However, there is no missing data theory for covariates, because the model is estimated conditional on the covariates. Therefore, subjects with missing data on the covariates are not included in the analysis (communication from Linda Muthen for Mplus).

As an example, consider the missing pattern in the following data set with 3 subjects, 3 observations over time on dependent variable Y (y1, y2, y3), and a covariate X. Subject 1 has missing data on all Y observations, but has data available on covariate X. Subject 2 has data for all Y, but missing data on X; and subject 3 has missing data on y2.

Subject   y1        y2        y3        X
1         missing   missing   missing   data
2         data      data      data      missing
3         data      missing   data      data

Subject 1 would be excluded from estimation in either an unconditional growth mixture model (one without a covariate) or a model with a covariate, because no observations on Y are available. Subject 2 would be included in an unconditional model but excluded from a growth mixture model with a covariate, because X is missing. Subject 3 (using available data) would be included in either type of model.
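The inclusion rule for this example can be written out as a short Python sketch. The numeric values standing in for "data" are arbitrary placeholders, `None` marks a missing value, and the function and variable names are hypothetical.

```python
# Rows mirror the three-subject example: repeated outcomes y1-y3 and covariate X.
subjects = {
    1: {"y": [None, None, None], "x": 2.0},
    2: {"y": [1.0, 2.0, 3.0], "x": None},
    3: {"y": [1.0, None, 2.0], "x": 0.5},
}

def is_included(record, with_covariate):
    """Apply the rule described above for one subject."""
    # At least one valid observation on the dependent variable is required.
    has_outcome = any(value is not None for value in record["y"])
    # The covariate must be observed only if the model conditions on it.
    has_covariate = record["x"] is not None
    return has_outcome and (has_covariate or not with_covariate)

unconditional = [sid for sid, rec in subjects.items() if is_included(rec, False)]
conditional = [sid for sid, rec in subjects.items() if is_included(rec, True)]

print(unconditional)  # subjects used in a model without the covariate
print(conditional)    # subjects used in a model with the covariate
```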

Here is SAS code for computing predicted values at each time point for each subject, for a growth model based on 28 observations over time (dependent variables arrest18, arrest19, …, arrest45; time indicators age18, age19, …, age45). The estimated parameters are beta0, beta1, beta2, beta3 (used for the count part, AZ) and alpha0, alpha1, alpha2 (used for the logit part, AU, and the corresponding probability, AP). The computed predicted value is EST.

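array aa (28) arrest18-arrest45 ;   /* observed values at ages 18-45 */
array t (28) age18-age45 ;          /* time indicators */
do i=1 to 28 ;
  arrest=aa(i) ;                    /* observed value at time point i */
  tt=t(i) ;                         /* time value at time point i */
  /* AZ: exponentiated cubic function of time */
  AZ=EXP(beta0+beta1*tt+beta2*tt*tt+beta3*tt*tt*tt) ;
  /* AU: quadratic logit; AP: the corresponding probability */
  AU=alpha0+alpha1*tt+alpha2*tt*tt ;
  AP=EXP(AU)/(1+EXP(AU)) ;
  EST=(1-AP)*AZ ;                   /* predicted value at time point i */
  output ;                          /* assumed: keep one row per subject per time point */
end;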

The mean of predicted values at each time point for each latent class can then be computed as the average of the predicted values for subjects within the latent class. These means of predicted values are available in the output of the SAS proc traj and are shown on graphs available in Mplus.
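For readers who want to check the averaging step by hand, here is a minimal Python sketch that mirrors the SAS formulas above and then averages the predicted values within each latent class. The subject records, class labels, and parameter values are hypothetical, and the trajectories are deliberately flat (only beta0 is nonzero) so that the arithmetic is easy to verify.

```python
import math
from collections import defaultdict

def predicted_value(age, beta, alpha):
    """EST = (1 - AP) * AZ, following the SAS formulas above."""
    az = math.exp(beta[0] + beta[1] * age + beta[2] * age**2 + beta[3] * age**3)
    au = alpha[0] + alpha[1] * age + alpha[2] * age**2
    ap = math.exp(au) / (1 + math.exp(au))
    return (1 - ap) * az

# Hypothetical subjects: assigned latent class plus the parameters used for prediction.
subjects = [
    {"cls": 1, "beta": [math.log(2), 0.0, 0.0, 0.0], "alpha": [0.0, 0.0, 0.0]},
    {"cls": 1, "beta": [math.log(4), 0.0, 0.0, 0.0], "alpha": [0.0, 0.0, 0.0]},
    {"cls": 2, "beta": [0.0, 0.0, 0.0, 0.0], "alpha": [0.0, 0.0, 0.0]},
]
ages = [18, 19, 20]  # a subset of the 28 time points, for brevity

# Accumulate predicted values by class and time point, then average.
sums = defaultdict(lambda: [0.0] * len(ages))
counts = defaultdict(int)
for subject in subjects:
    counts[subject["cls"]] += 1
    for j, age in enumerate(ages):
        sums[subject["cls"]][j] += predicted_value(age, subject["beta"], subject["alpha"])

class_means = {cls: [total / counts[cls] for total in sums[cls]] for cls in sums}
print(class_means)
```

With all alpha values set to zero, AP is 0.5 at every age, so the two class 1 subjects have predicted values of about 1.0 and 2.0 and a class mean of about 1.5.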

Some Useful References

Acock A.C. (2005). Growth curves and extensions using Mplus. At http://oregonstate.edu/~acock/growth-curves/

Bauer, D.J. & Curran, P.J. (2003). Distributional assumptions of growth mixture models: Implications for overextraction of latent trajectory classes. Psychological Methods, 8(3), 338-363.

Bauer, D. & Curran, P. (2004). The integration of continuous and discrete latent variable models: potential problems and promising opportunities. Psychological Methods, 9(1), 3-29.

De Fraine, B., Van Damme, J., & Onghena, P. Predicting longitudinal trajectories of adolescent academic self-concept: An application of growth mixture models. Technical Report from IAP Statistics Network. At www.stat.ucl.ac.be/IAP

D’Unger, A.V., Land, K.C., & McCall, P.L. (2002). Sex differences in age patterns of delinquent/criminal careers: Results from Poisson latent class analyses of the Philadelphia Cohort Study. Journal of Quantitative Criminology, 18(4), 349-375.

Eggleston, E.P., Laub, J.H., & Sampson, R.J. (2004). Methodological sensitivities to latent class analysis of long-term criminal trajectories. Journal of Quantitative Criminology, 20(1).

Fergusson, D.M. & Horwood, L.J. (2002). Male and Female Offending Trajectories. Development and Psychopathology, 14, 159-177.

Jeffries, N. (2003). A note on “Testing the number of components in a normal mixture.” Biometrika, 90, 991-994.

Lo, Y., Mendell, N.R., & Rubin, D.B. (2001). Testing the number of components in a normal mixture. Biometrika, 88, 767-778.

McLachlan, G.J. (1987). On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. Applied Statistics, 36, 318-324.

McLachlan, G. J., & Peel, D. (2000). Finite mixture models. New York: John Wiley.

Muthén, B. (2003). Statistical and substantive checking in growth mixture modeling: Comment on Bauer and Curran. Psychological Methods, 8, 369-377.

Muthén, B. (2004). Latent variable analysis: Growth mixture modeling and related techniques for longitudinal data. In D. Kaplan (ed.), Handbook of quantitative methodology for the social sciences (pp. 345-368). Newbury Park, CA: Sage Publications.

Muthén, L.K., & Muthén, B.O. (2006). Mplus user’s guide [Computer software and manual], (4th Ed.). Los Angeles: Muthén & Muthén.

Nagin, D.S. (2005). Group-Based Modeling of Development. Cambridge, MA: Harvard University Press.

Nagin, D.S. & Tremblay, R.E. (2005). Developmental trajectory groups: Fact or a useful statistical fiction? Criminology, 43(4), 873-904.

Nagin, D., & Paternoster, R. (2000). Population heterogeneity and state dependence: State of the evidence and directions for future research. Journal of Quantitative Criminology, 16(2), 117-144.

Nylund, K.L., Asparouhov, T., & Muthén, B.O. (2006). Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. At http://www.statmodel.com/recpapers.shtml

Olsen, M.K. & Schafer, J.L. (2001). A two-part random effects model for semicontinuous longitudinal data. Journal of the American Statistical Association, 96, 730-745.

Sclove, S.L. (1987). Application of model-selection criteria to some problems in multivariate analysis. Psychometrika, 52, 333-343.

Tofighi, D & Enders, C. K. (2007). Identifying the correct number of classes in growth mixture models.

Duncan, T. (2002). Growth mixture modeling of adolescent alcohol use data. www.ori.org/methodology

Yang, C.C. (2006). Evaluating latent class analyses in qualitative phenotype identification. Computational Statistics & Data Analysis, 50, 1090-1104.

Yang, C.C. (1998). Finite mixture model selection with psychometrics applications. Ph.D. Dissertation. University of California, Los Angeles, CA.