# Support Vector Machine

A support vector machine (SVM) is a classification algorithm that selects a hyperplane that separates observations into distinct categories. Support vector machines optimize:

- The distance between the hyperplane and the nearest points of data (largest minimum margin)
- Classification rate (correctly separates the observations)

SVMs can perform not only linear classification but also, via the kernel trick, efficient nonlinear classification.
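To see the nonlinear case in action, here is a sketch (not from the original tutorial) using the `svm()` function from the **e1071** package introduced below, with a radial kernel on simulated two-class data that no straight line can separate; the variable names and the 1.5 radius threshold are our own choices:

```
library(e1071)

# Two-class toy data that is not linearly separable:
# the class depends on the distance from the origin.
set.seed(42)
x = matrix(rnorm(400), ncol = 2)
y = factor(ifelse(x[, 1]^2 + x[, 2]^2 > 1.5, "outer", "inner"))
d = data.frame(x1 = x[, 1], x2 = x[, 2], y = y)

# A radial (RBF) kernel lets the SVM draw a curved decision boundary
fit = svm(y ~ ., data = d, kernel = "radial", cost = 1)
mean(fitted(fit) == d$y)   # training accuracy
```

A linear kernel on the same data would do little better than guessing, because the true boundary is a circle.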

To illustrate, we will be using the iris dataset (Anderson 1935, Fisher 1936), which is part of the *datasets* package in R. If you type 'iris' into RStudio's Help search box, you will find that this dataset gives measurements in centimeters of sepal length and width and petal length and width for 50 flowers from each of three species of iris. As we see in the correlation plots shown below, it is difficult to classify the species of iris using only sepal length and sepal width. Instead, it is easier to separate the iris dataset by petal length and petal width.
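A quick way to reproduce such correlation plots is base R's `pairs()` scatterplot matrix; coloring the points by species (an arbitrary choice here) makes the separation visible:

```
# Scatterplot matrix of the four measurements, one panel per variable pair,
# with points colored by species. Petal panels separate the species cleanly;
# sepal panels overlap heavily.
pairs(iris[, 1:4], col = as.integer(iris$Species), pch = 19)
```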

```
# Randomly select 100 of the 150 rows for training; the remaining 50 are for testing
sp = sample(150, 100)
iris_train = iris[sp, c("Petal.Length", "Petal.Width", "Species")]
iris_test = iris[-sp, c("Petal.Length", "Petal.Width", "Species")]
```

We will take two thirds of the observations from the iris dataset as our training dataset; the remaining third will be our testing dataset. We will fit the support vector machine to the **iris_train** dataset. To use the function `svm()`, we first need to install and load the package **e1071**:

```
svm(formula, data = , kernel = ,
cost = , scale = )
```

- *formula*: a formula of the form outcome ~ predictor1 + predictor2 + predictor3 + …
- *data*: the data frame containing the variables
- *kernel*: the kernel used; can be "linear", "polynomial", etc.
- *cost*: cost of constraints violation
- *scale*: a logical vector (T or F) indicating the variables to be scaled

```
install.packages("e1071")
library(e1071)
svmfit = svm(Species~., data = iris_train, kernel = "linear", cost = 0.1, scale = FALSE)
print(svmfit)
##
## Call:
## svm(formula = Species ~ ., data = iris_train, kernel = "linear",
cost = 0.1, scale = FALSE)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.1
## gamma: 0.5
##
## Number of Support Vectors: 52
```

```
plot(svmfit, iris_train)
```

From the result above, we can see there are 52 support vectors. The plot above shows how the SVM fits the **iris_train** data. Because we are sampling randomly, the number of support vectors may change slightly each time we resample with `sample()`, so it is important to plot the support vector machine.
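Although the original fit stops here, the held-out **iris_test** data can be used to check out-of-sample accuracy. A sketch (the split and fit are repeated so the block is self-contained; the exact accuracy will vary with the random split):

```
library(e1071)

# Recreate the random two-thirds / one-third split
sp = sample(150, 100)
iris_train = iris[sp, c("Petal.Length", "Petal.Width", "Species")]
iris_test  = iris[-sp, c("Petal.Length", "Petal.Width", "Species")]

svmfit = svm(Species ~ ., data = iris_train, kernel = "linear",
             cost = 0.1, scale = FALSE)

# Predict species for the held-out rows and tabulate against the truth
pred = predict(svmfit, iris_test)
table(predicted = pred, actual = iris_test$Species)

# Overall test-set accuracy
mean(pred == iris_test$Species)
```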

Above we used **cost = 0.1**. The cost parameter tells the SVM optimizer how much to penalize misclassifying each training example. A large cost makes the optimizer choose a smaller-margin hyperplane that misclassifies fewer training points. A very small cost makes it look for a larger-margin separating hyperplane, even if that hyperplane misclassifies more points.

Below are the support vector machine plots with cost equal to 1, 10, and 100. As the cost grows, the margin boundaries move inward toward the separating hyperplane, so fewer points remain support vectors.
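The three fits summarized below can be reproduced along these lines (a sketch; the support-vector counts you get depend on the random split, and `tot.nSV` is the fitted model's total support-vector count):

```
library(e1071)

sp = sample(150, 100)
iris_train = iris[sp, c("Petal.Length", "Petal.Width", "Species")]

# Refit with increasing cost and report the number of support vectors
for (c in c(1, 10, 100)) {
  fit = svm(Species ~ ., data = iris_train, kernel = "linear",
            cost = c, scale = FALSE)
  cat("cost =", c, "-> support vectors:", fit$tot.nSV, "\n")
  plot(fit, iris_train)
}
```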

```
## Call:
## svm(formula = Species ~ ., data = iris_train, kernel = "linear",
cost = 1, scale = FALSE)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 1
## gamma: 0.5
##
## Number of Support Vectors: 25
```

```
## Call:
## svm(formula = Species ~ ., data = iris_train, kernel = "linear",
cost = 10, scale = FALSE)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 10
## gamma: 0.5
##
## Number of Support Vectors: 16
```

```
## Call:
## svm(formula = Species ~ ., data = iris_train, kernel = "linear",
cost = 100, scale = FALSE)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 100
## gamma: 0.5
##
## Number of Support Vectors: 12
```