.BG
.FN emclust1
.TL
BIC from hierarchical clustering followed by EM 
for a parameterized Gaussian mixture model.
.SH DESCRIPTION
Bayesian Information Criterion for various numbers of clusters computed from 
hierarchical clustering followed by EM for a selected parameterization
of Gaussian mixture models possibly with Poisson noise.
.CS
emclust1(data, nclus, modelid, k, equal=F, noise, Vinv)
.PP
.RA
.AG data
matrix of observations.
.OA
.AG nclus
An integer vector specifying the numbers of clusters for which the BIC is to
be calculated. Default: 1:9 without noise; 0:9 with noise.
.AG modelid
An integer or vector of two integers specifying the model(s) to be used in
the hierarchical clustering and EM phases of the BIC calculations.
The allowed values for `modelid' and their interpretation are as follows:
`"EI"' : uniform spherical, `"VI"' : spherical, `"EEE"' : uniform variance,
`"VVV"' : unconstrained variance, `"EEV"' : uniform shape and volume,
`"VEV"' : uniform shape.
Default: `c("VVV","VVV")' (unconstrained variance for both phases).
.AG k
If `k' is specified, the initial hierarchical clustering phase uses a
sample of size `k' drawn from the data. The default is to use the entire
data set.
.AG equal
Logical variable indicating whether or not the mixing proportions are
equal in the model. The default is to assume they are unequal.
.AG noise
A logical vector of length equal to the number of observations in the data,
whose elements indicate an initial estimate of noise (indicated by `T') in
the data. By default, `emclust1' fits Gaussian mixture models in which it is 
assumed there is no noise. If `noise' is specified, `emclust1' will fit a
Gaussian mixture with a Poisson term for noise in the EM phase.
.AG Vinv
An estimate of the inverse hypervolume of the data region (needed only if
`noise' is specified). Default: determined by the function `hypvol'.
.RT
Bayesian Information Criterion for the selected mixture model
parameterization and the specified numbers of clusters. Auxiliary
information is returned as attributes.
.SH NOTE
The reciprocal condition estimate returned as an attribute ranges in value
between 0 and 1. The closer this estimate is to zero, the more likely it is
that the corresponding EM result (and BIC) is contaminated by roundoff error.
.SH REFERENCES
C. Fraley and A. E. Raftery, How many clusters? Which clustering method?
Answers via model-based cluster analysis. \fIComputer Journal,
\fR41:578-588 (1998).

C. Fraley and A. E. Raftery, \fIMCLUST: Software for model-based cluster
and discriminant analysis. \fRTechnical Report No. 342, Department of
Statistics, University of Washington (1998).

R. Kass and A. E. Raftery, Bayes Factors. \fIJournal of the American
Statistical Association, \fR90:773-795 (1995).
.SA
`summary.emclust1', `emclust', `mhtree', `me'
.EX
> iris.matrix <- matrix(aperm(iris, c(1,3,2)), 150, 4)
> dimnames(iris.matrix) <- list(NULL, dimnames(iris)[[2]])
> emclust1(iris.matrix, modelid = c(2,4))
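> # A further sketch, assuming `nclus' and `equal' behave as documented
> # above: BIC for 1 to 5 clusters using the default model for both
> # phases, with the mixing proportions constrained to be equal.
> emclust1(iris.matrix, nclus = 1:5, equal = T)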

.KW clustering
.WR

