For digit recognition, we use a Support Vector Machine (SVM) as the learning machine to perform multi-class classification.

An SVM maps feature vectors into an N-dimensional space and uses an (N-1)-dimensional hyperplane as a decision boundary to classify the data. The task of SVM modeling is to find the optimal hyperplane that separates objects of different class memberships.

### Example 1

Let's take a look at a simple schematic example where every object belongs to either the GREEN or the RED class.

The SVM finds the line defining the boundary between the two classes. It can then classify a new object according to which side of the line the object falls on.
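As a minimal sketch of this example (assuming scikit-learn is available; the point coordinates below are hypothetical), a linear SVM fit on two well-separated clouds classifies a new point by the side of the line it falls on:

```python
# Toy sketch: "GREEN" points (label 0) vs. "RED" points (label 1),
# separated by a linear SVM. Coordinates are made up for illustration.
from sklearn.svm import SVC

green = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.5]]   # hypothetical GREEN cluster
red   = [[5.0, 5.0], [5.5, 6.0], [6.0, 5.5]]   # hypothetical RED cluster
X = green + red
y = [0, 0, 0, 1, 1, 1]

clf = SVC(kernel="linear")
clf.fit(X, y)

# New objects are classified by which side of the dividing line they land on.
print(clf.predict([[1.2, 1.3], [5.8, 5.2]]))   # → [0 1]
```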

However, a linear dividing boundary is not always available. Rather than fitting nonlinear curves to the data, we can map each object into a different space via a kernel function, where a linear dividing hyperplane becomes feasible.

The kernel mapping is powerful enough that SVM can perform separation with very complex boundaries. The kernel function we use in this project is the radial basis function (RBF).
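The RBF kernel measures similarity between two vectors as K(x, z) = exp(-γ‖x − z‖²): identical vectors score 1 and distant vectors approach 0. A small self-contained sketch (the `gamma` value here is an arbitrary illustration, not the project's setting):

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """K(x, z) = exp(-gamma * ||x - z||^2): similarity in the mapped space."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

# Identical vectors have similarity 1; distant vectors decay toward 0.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # → 1.0
print(rbf_kernel([1.0, 2.0], [4.0, 6.0]))  # squared distance 25, so exp(-12.5) ≈ 3.7e-6
```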

**SVM Working Principles**

- Vectorize each instance into an array of features (attributes).
- Model with the training data to find the optimal dividing hyperplane with the maximal margin.
- Use SVM to map all the objects into a different space via a kernel function (see Figure 3 for examples).
- Classify each new object according to its position with respect to the hyperplane.
- Errors in training are allowed; the goal of training is to maximize the margin while minimizing the errors. Namely, find the solution to the optimization problem in Figure 4, where x is the attribute vector, y is the object label, ξ is the slack (error) variable, and φ is the mapping function.
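The steps above can be sketched end to end with scikit-learn's bundled 8×8 handwritten-digit dataset (an assumption for illustration; the project's own data and hyperparameters may differ). The images arrive already vectorized into 64 features, an RBF-kernel SVM is fit on a training split, and held-out digits are classified:

```python
# Minimal end-to-end sketch: vectorized digits -> RBF-SVM -> classification.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()  # 8x8 grayscale images, flattened to 64-feature vectors

X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# C trades margin width against training errors (the xi terms in Figure 4);
# the values here are illustrative defaults, not tuned project settings.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
```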