emclust1 {mclust} | R Documentation |
Bayesian Information Criterion for various numbers of clusters computed from hierarchical clustering followed by EM for a selected parameterization of Gaussian mixture models possibly with Poisson noise.
emclust1(data, nclus, modelid, k, equal=F, noise, Vinv)
data |
Matrix of observations. |
nclus |
An integer vector specifying the numbers of clusters for which the BIC is to be calculated. Default: 1:9 without noise; 0:9 with noise. |
modelid |
A character string or vector of two character strings specifying the
model(s) to be used in the hierarchical clustering and EM phases of the
BIC calculations. The allowed values for modelid and their
interpretation are as follows: "EI" : uniform spherical,
"VI" : spherical, "EEE" : uniform variance, "VVV" :
unconstrained variance, "EEV" : uniform shape and volume,
"VEV" : uniform shape.
Default: c("VVV","VVV") (unconstrained variance for both phases).
|
k |
If k is specified, the initial hierarchical clustering phase will use
a sample of size k of the data. The default is to use the entire
data set.
|
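To make the effect of k concrete, here is a minimal base R sketch of drawing k observations before a clustering step. It assumes a uniform sample without replacement, which this page does not confirm for emclust1; the names x, idx and x_sub are illustrative only.

```r
# Hypothetical illustration: draw k rows at random for an initial clustering.
# Assumption: uniform sampling without replacement.
set.seed(1)                          # make the sample reproducible
x <- as.matrix(iris[, 1:4])          # 150 observations, 4 variables
k <- 50
idx <- sample(nrow(x), k)            # k distinct row indices
x_sub <- x[idx, , drop = FALSE]      # subsample that would be clustered first
dim(x_sub)
```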
equal |
Logical variable indicating whether or not the mixing proportions are equal in the model. The default is to assume they are unequal. |
noise |
A logical vector of length equal to the number of observations in the
data, in which T marks the observations initially estimated to be
noise. By default, emclust1 fits Gaussian mixture
models in which it is assumed there is no noise. If noise is
specified, emclust1 will fit a Gaussian mixture with a Poisson
term for noise in the EM phase.
|
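As a small sketch of what an initial noise estimate might look like, the base R code below builds a logical vector with one entry per observation from a simple threshold rule. The simulated data and the cutoff of 60 are arbitrary assumptions chosen for illustration, not a recommended way to detect noise.

```r
# Hypothetical illustration: flag observations as initial noise with a threshold.
set.seed(1)
x <- cbind(runif(100, 0, 100), runif(100, 0, 100))  # simulated 2-d data
noise_init <- x[, 2] > 60            # TRUE marks a point as initial noise
length(noise_init) == nrow(x)        # one entry per observation
table(noise_init)
```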
Vinv |
An estimate of the inverse hypervolume of the data region (needed only
if noise is specified). Default: determined by the function
hypvol.
|
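For intuition about what an inverse hypervolume is, the sketch below uses the axis-aligned bounding box of the data as a simple stand-in for the data region. This is an assumption made for illustration; the method used by hypvol is not described on this page and may differ.

```r
# Hypothetical illustration: inverse hypervolume from the bounding box.
# Assumption: the data region is approximated by the axis-aligned box
# spanned by the per-variable ranges.
x <- as.matrix(iris[, 1:4])
side_lengths <- apply(x, 2, function(col) diff(range(col)))  # box edge lengths
Vinv_sketch <- 1 / prod(side_lengths)                         # 1 / box volume
Vinv_sketch
```

A value computed along these lines could be passed as Vinv when the default is not wanted.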
Bayesian Information Criterion for the selected mixture model and the specified numbers of clusters. Auxiliary information is returned as attributes.
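As a rough illustration of the quantity being returned, the sketch below computes a BIC value by hand in base R for a single unconstrained Gaussian (one cluster) fitted to the iris measurements. It assumes the sign convention BIC = 2 * loglik - npar * log(n), as in the Fraley and Raftery report cited on this page, under which larger values indicate stronger evidence for a model. The variable names are illustrative and this is not the internal code of emclust1.

```r
# Hypothetical illustration: BIC of one unconstrained Gaussian, by hand.
# Assumption: BIC = 2 * loglik - npar * log(n) (larger is better).
x <- as.matrix(iris[, 1:4])
n <- nrow(x)
d <- ncol(x)
mu <- colMeans(x)
Sigma <- crossprod(sweep(x, 2, mu)) / n      # maximum likelihood covariance
logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
# At the maximum likelihood estimate the quadratic terms sum to n * d.
loglik <- -0.5 * n * (d * log(2 * pi) + logdet + d)
npar <- d + d * (d + 1) / 2                  # mean plus covariance parameters
bic <- 2 * loglik - npar * log(n)
bic
```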
The reciprocal condition estimate returned as an attribute ranges in value between 0 and 1. The closer this estimate is to zero, the more likely it is that the corresponding EM result (and BIC) are contaminated by roundoff error.
C. Fraley and A. E. Raftery, How many clusters? Which clustering method? Answers via model-based cluster analysis. Technical Report No. 329, Dept. of Statistics, U. of Washington (February 1998).
R. Kass and A. E. Raftery, Bayes Factors. Journal of the American Statistical Association 90:773-795 (1995).
summary.emclust1, emclust, mhtree, me
data(iris)
emclust1(iris[,1:4], nclus=2:3, modelid = c("VVV","EEV"))

data(chevron)
# initial noise estimate: flag points whose second coordinate exceeds 60
noisevec <- rep(0, nrow(chevron))
noisevec[chevron[,2]>60] <- 1
bicvals <- emclust1(chevron, noise=noisevec, nclus=0:5)
sumry <- summary(bicvals, chevron)
# plot the data, colored and marked by the classification
plot(chevron, col=ztoc(sumry$z), pch=ztoc(sumry$z))