Application of Support Vector Machine in Binary Classification of Data

Abstract

This is the era of data; everything is data-driven. As a very promising field with huge growth potential, agricultural data classification is a hot topic in the agriculture and computer science communities. In recent years, many popular machine learning algorithms have been applied to agricultural data classification, such as Decision Tree (DT), Random Forest (RF) and Support Vector Machine (SVM). The classification of agricultural data is an important application of information technology in agriculture.

SVM is a powerful state-of-the-art classifier and has been applied in many fields.

In order to analyze SVM and gain a deeper understanding of it, a study was conducted on credit card defaulters and non-defaulters using 12391 observations and 31 variables collected over 15 days. To measure the accuracy, the SVM model was compared with other classification techniques such as DT, RF and Logistic Regression. The results highlighted that SVM (linear) has greater accuracy than the other methods.

Introduction

As a very promising field with enormous growth potential, agricultural data classification is a hot topic within the agriculture and computing communities.

In recent years, many popular machine learning algorithms have been applied to agricultural data classification, such as decision tree, KNN, artificial neural network and Support Vector Machine (SVM). The classification of agricultural data is an important application of information technology in agriculture. SVM is a powerful state-of-the-art classifier and has been applied in many fields. In this report, SVM is introduced to classify agricultural data in order to improve classification performance and forecast the data.

SVMs achieve good performance in terms of higher accuracy, better generalization and a globally optimal solution (S.R.Gunn 1998; Vapnik 2000; Doumpos 2004).

Another merit of SVMs is that training an SVM is equivalent to solving a linearly constrained quadratic program, which means that the solutions of SVMs are unique, optimal and free of local minima (Cortes and Vapnik 1995; Smola and Scholkopf 1998; Pai 2006). SVMs, which are based on statistical learning theory, have been successfully applied to classify images (Huang et al., 2002; Pal and Mather, 2004; Dixon and Candade, 2007; Kavzoglu and Colkesen, 2009).
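For reference, the quadratic program mentioned above is commonly written as follows for the linearly separable (hard-margin) case. The symbols are notation assumed here rather than taken from this essay: w is the weight vector, b the bias, x_i the i-th training point, y_i its label (+1 or -1) and n the number of training points.

```latex
\min_{\mathbf{w},\,b}\ \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i\,(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1, \qquad i = 1,\dots,n
```

The objective is convex and the constraints are linear, which is why no local minima other than the global one can occur.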

In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis. SVM is intrinsically a binary classifier that constructs a linear separating hyperplane to classify data instances. The classification capabilities of traditional SVMs can be substantially enhanced by transforming the original feature space into a feature space of a higher dimension using the “kernel trick”. Being based on global optimization, SVMs deal with the overfitting problems that appear in high-dimensional spaces, which makes them appealing in various applications. Commonly used SVM algorithms include support vector regression, the least squares support vector machine and the successive projection algorithm-support vector machine.

The question of whether an email is spam or not is an example of a classification problem. In these kinds of problems, the goal is to determine whether a given data point belongs to a particular class or not. After first training a classifier model on data points for which the class is known (e.g. a set of emails that are labelled as spam or not spam), you can then use the model to determine the class of new, unseen data points. A powerful technique for these kinds of problems is the Support Vector Machine (SVM).

Terminologies

Algorithm: A Machine Learning algorithm is a set of rules and statistical techniques used to learn patterns from data and draw significant information from it. It is the logic behind a Machine Learning model. An example of a Machine Learning algorithm is the Linear Regression algorithm.

Training Data: The Machine Learning model is built using the training data. The training data helps the model to identify key trends and patterns essential to predict the output.

Testing Data: After the model is trained, it must be tested to evaluate how accurately it can predict an outcome. This is done using the testing data set.

Hyperplane: Hyperplanes are decision boundaries that help classify the data points. Data points falling on either side of the hyperplane can be attributed to different classes. To separate the two classes of data points, there are many possible hyperplanes that could be chosen. Our objective is to find a plane that has the maximum margin, i.e. the maximum distance between data points of both classes. Maximizing the margin distance provides some reinforcement so that future data points can be classified with more confidence.

Also, the dimension of the hyperplane depends upon the number of features. If the number of input features is 2, then the hyperplane is simply a line. If the number of input features is 3, then the hyperplane becomes a two-dimensional plane. It becomes difficult to imagine when the number of features exceeds 3.

Margin: The hyperplane is drawn based on the support vectors, and an optimum hyperplane has the maximum distance from each of the support vectors. This distance between the hyperplane and the support vectors is known as the margin.
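As a minimal sketch of how a hyperplane acts as a decision boundary, the snippet below classifies a point by the sign of w.x + b. The names `w`, `b`, `decision_value` and `classify`, as well as the example line x1 + x2 - 3 = 0, are assumptions made for illustration and do not come from the study.

```python
def decision_value(w, b, x):
    """Signed score w.x + b; zero means the point lies on the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical hyperplane (a line, since there are two features): x1 + x2 - 3 = 0
w, b = [1.0, 1.0], -3.0

print(classify(w, b, [3.0, 2.0]))   # point on the positive side
print(classify(w, b, [0.5, 1.0]))   # point on the negative side
```

With two features the boundary is a line; the same two functions work unchanged for any number of features, because only the length of `w` and `x` changes.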

Support Vectors: Support vectors are the data points that are closest to the hyperplane and influence the position and orientation of the hyperplane. Using these support vectors, we maximize the margin of the classifier. Deleting the support vectors will change the position of the hyperplane. These are the points that help us build our SVM.

The Margin is defined with the help of the Support Vectors (hence the name). In our example, Yellow stars and Yellow circles are the Support Vectors defining the Margin. The larger the gap, the better the classifier works. Hence support vectors play a crucial role in developing the classifier.

Every new data point in the test data will be classified according to this Margin. If it lies on the right side of it, it will be classified as a Red circle, otherwise as a Blue star.
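To make the notion of margin concrete, here is a small sketch that computes the distance from a point to a hyperplane and the width of the gap between the two margin lines. It assumes the common convention that the hyperplane is scaled so that support vectors satisfy w.x + b = +1 or -1; the weights and points are hypothetical values chosen for easy arithmetic.

```python
import math


def distance_to_hyperplane(w, b, x):
    """Perpendicular distance |w.x + b| / ||w|| from point x to the hyperplane."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin_width(w):
    """Gap between the lines w.x + b = +1 and w.x + b = -1, i.e. 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical hyperplane with ||w|| = 5
w, b = [3.0, 4.0], -10.0

# Hypothetical support vectors: (1, 2) gives w.x + b = +1, (3, 0) gives -1
d_pos = distance_to_hyperplane(w, b, [1.0, 2.0])
d_neg = distance_to_hyperplane(w, b, [3.0, 0.0])
print(d_pos, d_neg, margin_width(w))
```

Each support vector sits at distance 1 / ||w|| from the boundary, so a smaller ||w|| means a wider gap.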

Kernel function: A Kernel function is always used by SVM, whether the data is linear or non-linear, but its main role comes into play when the data is inseparable in its current form. Here, the Kernel function adds dimensions to the problem for classification.

In this case, we are using a linear kernel, as you can see. Depending on the problem, we can use different types of Kernel functions such as Polynomial, Radial Basis Function, Gaussian, Laplace, Sigmoid and many more. Choosing the right kernel function is vital for building the classifier.

Kernels are functions that take a low-dimensional input space and transform it into a higher-dimensional space, i.e. they convert a non-separable problem into a separable problem. They are mostly useful in non-linear separation problems. Simply put, a kernel performs some extremely complex data transformations and then determines how to separate the data based on the labels or outputs you have defined.

Kernel-trick: Things get a little tricky in the case of non-linear data. Here SVM uses the ‘Kernel-trick’: it uses a kernel function to map the non-linear data to higher dimensions so that it becomes linearly separable, and finds the decision boundary there.
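The kernel families named above can be sketched as plain functions of two input vectors. The parameter names `degree`, `coef0` and `gamma` and their default values are assumptions for illustration; the essay does not state which settings were used.

```python
import math


def dot(x, z):
    """Ordinary dot product of two equal-length vectors."""
    return sum(xi * zi for xi, zi in zip(x, z))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, degree=3, coef0=1.0):
    return (dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function (Gaussian) kernel based on squared distance."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, z, gamma=0.5, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)


x, z = [1.0, 2.0], [2.0, 0.0]
for kernel in (linear_kernel, polynomial_kernel, rbf_kernel, sigmoid_kernel):
    print(kernel.__name__, kernel(x, z))
```

Each function returns a similarity score for a pair of points; an SVM only ever needs these pairwise scores, never the transformed coordinates themselves.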
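A small sketch can show why this is called a trick: for a degree-2 polynomial kernel on two features, the kernel value equals a dot product in a three-dimensional space, yet the kernel never builds that space explicitly. The function names and the sample points below are illustrative assumptions.

```python
import math


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: (x . z) ** 2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def feature_map(x):
    """Explicit 3-D mapping whose dot product reproduces poly2_kernel."""
    return [x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2]


x, z = [1.0, 2.0], [3.0, -1.0]

kernel_value = poly2_kernel(x, z)
explicit_value = sum(a * b for a, b in zip(feature_map(x), feature_map(z)))

# Both numbers agree up to floating-point rounding
print(kernel_value, explicit_value)
```

The kernel computes in the original two dimensions what the explicit mapping computes in three, which is what keeps high-dimensional mappings affordable.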

A Support Vector Machine (SVM) is a discriminative classifier formally defined by a separating hyperplane. In other words, given labelled training data (supervised learning), the algorithm outputs an optimal hyperplane which categorizes new examples, and it can handle multiple continuous and categorical variables.

The main idea behind SVM is that you try to find the boundary line that separates the two classes, but in such a way that the boundary line creates a maximum separation between the classes. To demonstrate this, we will use the following simple data for our classification problem:

Suppose there are two independent variables (features): x1 and x2. And there are two classes Class A and Class B. The following graphic shows the scatter diagram.

In this example, the green circles and the red squares could represent two different segments in a total set of customers (e.g. high potentials and low potentials), based on all kinds of properties for each of the customers. Any line that keeps the green circles on the left and the red squares on the right is considered a valid boundary line for the classification problem. There is an infinite number of such lines that can be drawn. Four different examples are presented below:

As stated before, SVM helps you to find the boundary line that maximizes the separation between the two classes. In the provided example, this can be drawn as follows:

The two dotted lines are the two parallel separation lines with the largest space between them. The actual classification boundary that is used will be the solid line exactly in the middle of the two dotted lines.

The name Support Vector Machine comes from the data points that are directly on either of those lines. These are the supporting vectors. In our example, there were three supporting vectors.

If any of the other data points (i.e. not a supporting vector) is moved slightly, the dotted boundary lines are not affected. However, if the position of any of the supporting vectors is slightly changed (e.g. data point 1 is moved slightly to the left), the position of the dotted boundary lines will change, and so will the position of the solid classification line. In real life, data is not as straightforward as in this simplified example. We usually work with more than two dimensions. Besides straight separation lines, the underlying mathematics of an SVM also allows for certain types of calculations, or kernels, that result in boundary lines that are non-linear.
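The effect of moving a supporting vector can be checked with a deliberately simplified sketch. It assumes a single feature and separable classes, in which case the maximum-margin boundary is just the midpoint between the two closest points of opposite classes. The data values are hypothetical, and this is not the two-feature example described above.

```python
def max_margin_threshold(left_class, right_class):
    """1-D maximum-margin boundary, assuming every left point < every right point."""
    left_sv = max(left_class)     # supporting vector of the left class
    right_sv = min(right_class)   # supporting vector of the right class
    if left_sv >= right_sv:
        raise ValueError("classes are not separable in this order")
    return (left_sv + right_sv) / 2.0, (left_sv, right_sv)


left = [1.0, 2.0, 3.0]
right = [7.0, 8.0, 9.5]

threshold, support = max_margin_threshold(left, right)

# Moving a point that is not a supporting vector leaves the boundary unchanged
moved_other, _ = max_margin_threshold([0.5, 2.0, 3.0], right)

# Moving the supporting vector 3.0 to 2.0 shifts the boundary
moved_sv, _ = max_margin_threshold([1.0, 2.0, 2.0], right)

print(threshold, support, moved_other, moved_sv)
```

Only the two points nearest the gap enter the calculation, which mirrors the statement that the remaining data points do not affect the boundary.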

Confusion matrix / Error matrix: A confusion matrix is a summary of prediction results on a classification problem.

Methodology

What does SVM do?

Given a set of training examples, each marked as belonging to one or the other of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.

How Classification System works

The numbers of correct and incorrect predictions are summarized with count values and broken down by each class. This is the key to the confusion matrix. The confusion matrix shows the ways in which your classification model is confused when it makes predictions. It gives us insight not only into the errors being made by a classifier but, more importantly, into the types of errors that are being made.

                  Class 1 Predicted    Class 2 Predicted
Class 1 Actual    TP                   FN
Class 2 Actual    FP                   TN

Here,

Class 1 : Positive

Class 2 : Negative

Definition of the Terms:

Positive (P) : Observation is positive (for example: is a defaulter).

Negative (N) : Observation is not positive (for example: is a non-defaulter).

True Positive (TP) : Observation is positive, and is predicted to be positive.

False Negative (FN) : Observation is positive, but is predicted negative.

True Negative (TN) : Observation is negative, and is predicted to be negative.

False Positive (FP) : Observation is negative, but is predicted positive.
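The four terms defined above can be counted directly from paired lists of actual and predicted labels. The following is a minimal sketch; the function name `confusion_counts` and the toy labels (1 for defaulter, 0 for non-defaulter) are assumptions for illustration and are not the study's data.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FN, FP and TN for a binary problem."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1          # positive, predicted positive
        elif a == positive:
            fn += 1          # positive, predicted negative
        elif p == positive:
            fp += 1          # negative, predicted positive
        else:
            tn += 1          # negative, predicted negative
    return {"TP": tp, "FN": fn, "FP": fp, "TN": tn}


# Hypothetical labels: 1 = defaulter (positive), 0 = non-defaulter (negative)
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 3, 'FN': 1, 'FP': 1, 'TN': 3}
```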

Classification Rate or Accuracy is given by the relation:

Accuracy =(TP+TN)/(TP+TN+FP+FN)

However, there are problems with accuracy. It assumes equal costs for both types of errors. A 99% accuracy can be excellent, good, mediocre, poor or terrible depending upon the problem.

Data Sources: In this study, the payment schedules of credit card holders of XYZ company were used. 12391 observations (credit card holders) and 31 variables were used, of which 80 percent (9912) of the data was used as the training set and 20 percent (2479) as the testing set.

The SVM model was compared against other classification techniques, namely Decision Tree, Random Forest and Logistic Regression. Python was used for the analysis.
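A short worked example may help show why 99% accuracy can be terrible. It assumes a hypothetical, heavily imbalanced test set of 1000 card holders with 10 defaulters, and a model that labels everyone as a non-defaulter; none of these numbers come from the study.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical model that predicts "non-defaulter" for every card holder
tp, fn = 0, 10      # all 10 defaulters are missed
tn, fp = 990, 0     # all 990 non-defaulters are correct

acc = accuracy(tp, tn, fp, fn)
defaulters_found = tp / (tp + fn)   # share of actual defaulters detected

print(acc, defaulters_found)  # 0.99 0.0
```

The model reaches 99% accuracy without identifying a single defaulter, which is why the confusion matrix is worth reading alongside the accuracy figure.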
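The 80/20 split sizes quoted above can be reproduced with a few lines of standard-library Python. The essay does not say how the rows were assigned to each set, so the random shuffle, the seed and the function name below are assumptions.

```python
import random


def train_test_split_indices(n_rows, train_fraction=0.8, seed=42):
    """Shuffle row indices and cut them into a training and a testing part."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)          # seeded for repeatability
    n_train = int(n_rows * train_fraction)        # rounds down
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(12391)
print(len(train_idx), len(test_idx))  # 9912 2479
```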

Decision Tree (DT) : As the name suggests, it uses a tree-like model of decisions. Decision Trees are a non-parametric supervised learning method used for both classification and regression tasks. The goal is to build a model that predicts the value of a target variable by learning simple decision rules inferred from the data features. The decision rules generally take the form of if-then-else statements. The deeper the tree, the more complex the rules and the more closely the model fits the data.

Random Forest (RF): Random forest, as its name implies, consists of a large number of individual decision trees that operate as an ensemble. Each individual tree in the random forest outputs a class prediction, and the class with the most votes becomes our model’s prediction.

Logistic regression: This is the appropriate regression analysis to conduct when the dependent variable is dichotomous (binary). Like all regression analyses, logistic regression is a predictive analysis. Logistic regression is used to describe data and to explain the relationship between one dependent binary variable and one or more nominal, ordinal, interval or ratio-level independent variables.
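The if-then-else form of decision rules can be sketched as a tiny hand-written tree. The feature names `missed_payments` and `utilization` and both thresholds are hypothetical; a real tree would learn its rules from the training data, and the study's 31 variables are not listed in this essay.

```python
def predict_default(missed_payments, utilization):
    """Two-level decision tree written out as nested if-then-else rules."""
    # Hypothetical rules and thresholds, for illustration only
    if missed_payments >= 2:
        return "defaulter"
    else:
        if utilization > 0.9:
            return "defaulter"
        else:
            return "non-defaulter"


print(predict_default(missed_payments=3, utilization=0.4))
print(predict_default(missed_payments=0, utilization=0.95))
print(predict_default(missed_payments=0, utilization=0.3))
```

Each extra level of nesting corresponds to a deeper tree and therefore to more specific rules.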
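The voting step can be illustrated on its own, leaving out how the individual trees are trained. The list of per-tree predictions below is a hypothetical example.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    # On a tie, Counter keeps the class that was encountered first
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical predictions from five trees for one card holder
votes = ["defaulter", "non-defaulter", "defaulter", "defaulter", "non-defaulter"]
print(majority_vote(votes))  # defaulter
```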
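As a rough sketch of how logistic regression turns independent variables into a binary prediction, the snippet below passes a weighted sum through the logistic (sigmoid) function and applies a 0.5 cut-off. The weights, bias, feature values and cut-off are hypothetical; in practice the weights are estimated from the training data.

```python
import math


def predict_probability(weights, bias, features):
    """Probability of the positive class from the logistic function."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def predict_class(weights, bias, features, threshold=0.5):
    """Map the probability to a binary label using a cut-off."""
    return 1 if predict_probability(weights, bias, features) >= threshold else 0


# Hypothetical fitted parameters for two independent variables
weights, bias = [0.8, -0.5], -0.2

p = predict_probability(weights, bias, [2.0, 1.0])
print(p, predict_class(weights, bias, [2.0, 1.0]))
```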

Result and Discussion

Table 1 shows the confusion matrices of the selected models, i.e. Support Vector Machine, Decision Tree, Random Forest and Logistic Regression, together with the accuracy rates of these models. For SVM, three types of kernels were selected: linear, sigmoid and Gaussian. The study found that SVM (linear) is the best model for this data because its accuracy rate is 82 percent, which is higher than that of the other classification methods: 73 percent (Decision Tree), 76 percent (Random Forest), 80 percent (SVM (Gaussian)), 78 percent (SVM (sigmoid)) and 79 percent (Logistic Regression).
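The reported accuracy rates can be ranked with a few lines of Python. The figures are the rounded percentages quoted above; the dictionary layout and variable names are assumptions for illustration.

```python
# Accuracy rates as reported in the text (rounded percentages)
reported_accuracy = {
    "SVM (linear)": 0.82,
    "SVM (Gaussian)": 0.80,
    "Logistic Regression": 0.79,
    "SVM (sigmoid)": 0.78,
    "Random Forest": 0.76,
    "Decision Tree": 0.73,
}

# Sort models from most to least accurate
ranked = sorted(reported_accuracy.items(), key=lambda item: item[1], reverse=True)
best_model, best_accuracy = ranked[0]

for name, value in ranked:
    print(f"{name}: {value:.0%}")
```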

Summary

SVM is a powerful tool for solving classification problems with small samples, nonlinearities and local minima, and has shown excellent performance. Support Vector Machine (SVM) is a novel learning method based on statistical learning theory. It is a widely used method for classification and is utilized in a variety of applications. In order to analyze SVM and gain a deeper understanding of it, a study was conducted on credit card defaulters and non-defaulters using 12391 observations and 31 variables over 15 days (Dec 15th to Jan 1st 2020).

To measure the accuracy, the SVM model was compared with other classification techniques such as DT, RF and Logistic Regression. The results highlighted that SVM (linear) has greater accuracy than the other methods. The results vary with the context and the kinds of data that researchers choose for their studies. Hence, it is preferable to use a ‘hybrid model’ for more accurate data classification.

Application of Support Vector Machine in Binary Classification of Data . (2024, Feb 13). Retrieved from https://studymoose.com/document/application-of-support-vector-machine-in-binary-classification-of-data