svm {e1071} | R Documentation |
svm
is used to train a support vector machine. It can be used to carry
out general regression and classification (of nu and epsilon-type), as
well as density-estimation. A formula interface is provided.
svm(formula, data = list(), subset, na.action=na.fail, ...)

svm(x, y=NULL, subset, na.action=na.fail, type=NULL, kernel="radial",
    degree=3, gamma=1/dim(x)[2], coef0=0, cost=1, nu=0.5,
    class.weights=NULL, cachesize=40, tolerance=0.001, epsilon=0.5,
    shrinking=TRUE, cross=0, fitted=TRUE, ...)
formula |
a symbolic description of the model to be fit. Note that an intercept is always included, whether given in the formula or not. |
data |
an optional data frame containing the variables in the model. By default the variables are taken from the environment which `svm' is called from. |
x |
a data matrix or a vector. |
y |
a response vector with one label for each row/component of x. Can be either
a factor (for classification tasks) or a numeric vector (for regression). |
type |
svm can be used as a classification
machine, as a regression machine or a density estimator. Depending on whether y is
a factor or not, the default setting for type is C-classification or eps-regression, respectively, but may be overridden by setting an explicit value. Valid options are:
|
kernel |
the kernel used in training and predicting. You
might consider changing some of the following parameters, depending
on the kernel type.
|
degree |
parameter needed for kernel of type polynomial (default: 3) |
gamma |
parameter needed for all kernels except linear
(default: 1/(data dimension)) |
coef0 |
parameter needed for kernels of type polynomial
and sigmoid (default: 0) |
cost |
cost of constraints violation (default: 1); it is the `C'-constant of the regularization term in the Lagrange formulation. |
nu |
parameter needed for nu-classification and one-classification |
class.weights |
a named vector of weights for the different classes, used for asymmetric class sizes. Not all factor levels have to be supplied (default weight: 1). All components have to be named. |
cachesize |
cache memory in MB (default 40) |
tolerance |
tolerance of termination criterion (default: 0.001) |
epsilon |
epsilon in the insensitive-loss function (default: 0.5) |
shrinking |
option whether to use the shrinking-heuristics
(default: TRUE) |
cross |
if an integer value k>0 is specified, a k-fold cross validation on the training data is performed to assess the quality of the model: the accuracy rate for classification and the mean squared error for regression |
fitted |
indicates whether the fitted values should be computed
and included in the model or not (default: TRUE) |
... |
additional parameters for the low level fitting function
svm.default |
subset |
An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.) |
na.action |
A function to specify the action to be taken if `NA's are found. The default action is for the procedure to fail. An alternative is `na.omit', which leads to rejection of cases with missing values on any required variable. (NOTE: If given, this argument must be named.) |
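As a rough illustration of how several of the arguments above fit together, the following sketch fits a classifier on the iris data with named class weights, 5-fold cross validation and an explicitly named na.action. The weight values, the fold count and the injected missing value are arbitrary choices made for this example, not recommendations.

```r
library(e1071)

data(iris)
d <- iris
d[1, "Sepal.Length"] <- NA        # inject one missing value (illustration only)

# class.weights: a named vector; levels left out keep the default weight 1
w <- c(versicolor = 2, virginica = 2)

# na.action is passed by name, as the note above requires
m <- svm(Species ~ ., data = d,
         na.action = na.omit,     # drop the incomplete case instead of failing
         class.weights = w,
         cross = 5)               # 5-fold cross validation on the training data

summary(m)                        # the summary includes the cross-validation results
```

The exact layout of the summary output may differ between versions of e1071.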
For multiclass-classification with k levels, k>2, libsvm
uses the
`one-against-one'-approach, in which k(k-1)/2 binary classifiers are
trained; the appropriate class is found by a voting scheme.
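The count k(k-1)/2 can be checked with a few lines of base R. The helper name n_pairwise is made up for this sketch and is not part of e1071 or libsvm.

```r
# Number of binary classifiers trained under one-against-one for k classes
# (n_pairwise is a hypothetical helper, defined here for illustration only)
n_pairwise <- function(k) k * (k - 1) / 2

k <- nlevels(iris$Species)               # the iris data has 3 classes
n_pairwise(k)                            # number of pairwise classifiers for k = 3

# The class pairs themselves, one pair per column
pairs <- combn(levels(iris$Species), 2)
pairs
```

Each column of pairs corresponds to one binary problem; a new observation would be assigned to the class that wins the most of these pairwise votes.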
If the predictor variables include factors, the formula interface must be used to get a
correct model matrix.
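To see why factors need the formula interface, the base R sketch below compares what the formula machinery produces with what a plain matrix coercion produces. The toy data frame is invented for this example, and model.matrix is used only as a stand-in for the expansion a formula interface has to perform; it is not claimed to be the exact call made inside svm.

```r
# Toy data with one numeric and one factor predictor (made up for illustration)
d <- data.frame(size  = c(1.2, 3.4, 2.2, 0.7),
                color = factor(c("red", "blue", "green", "red")))

# A formula expands the factor into 0/1 indicator columns
mm <- model.matrix(~ size + color, data = d)
colnames(mm)

# Coercing the data frame directly yields a character matrix,
# which is not usable as numeric input
raw <- as.matrix(d)
is.character(raw)
```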
An object of class "svm"
containing the fitted model, including:
SV |
the resulting support vectors |
index |
the index of the resulting support vectors in the data matrix |
coefs |
the corresponding coefficients times the training labels |
rho |
the negative intercept |
(Use summary
and print
to get some output).
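The components listed above can be inspected directly on a fitted object. The sketch below assumes a classifier fitted on the iris data; the shapes noted in the comments reflect recent e1071 releases and may differ in other versions.

```r
library(e1071)

data(iris)
m <- svm(Species ~ ., data = iris)

length(m$index)    # how many training rows became support vectors
head(m$index)      # their row positions in the training data
dim(m$coefs)       # one row of coefficients per support vector
m$rho              # negative intercept(s) of the decision function(s)

print(m)           # short overview of the fitted model
```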
David Meyer (based on C/C++-code by Chih-Chung Chang and Chih-Jen Lin)
david.meyer@ci.tuwien.ac.at
data(iris)
attach(iris)

## classification mode
# default with factor response:
model <- svm(Species ~ ., data = iris)

# alternatively the traditional interface:
x <- subset(iris, select = -Species)
y <- Species
model <- svm(x, y)

print(model)
summary(model)

# test with train data
pred <- predict(model, x)

# Check accuracy:
table(pred, y)

## try regression mode on two dimensions

# create data
x <- seq(0.1, 5, by = 0.05)
y <- log(x) + rnorm(x, sd = 0.2)

# estimate model and predict input values
m <- svm(x, y)
new <- predict(m, x)

# visualize
plot(x, y)
points(x, log(x), col = 2)
points(x, new, col = 4)

## density-estimation

# create 2-dim. normal with rho=0:
X <- data.frame(a = rnorm(1000), b = rnorm(1000))
attach(X)

# traditional way:
m <- svm(X, gamma = 0.1)

# formula interface:
m <- svm(~., data = X, gamma = 0.1)
# or:
m <- svm(~ a + b, gamma = 0.1)

# test:
newdata <- data.frame(a = c(0, 4), b = c(0, 4))
predict(m, newdata)

# visualization:
plot(as.matrix(X))
points(as.matrix(m$SV), col = 2)