

Mathematical Modeling Algorithms: Support Vector Machines (Algorithms From Scratch: Support Vector Machine)

Published: 2023/12/15


Algorithms From Scratch

A popular algorithm capable of performing linear or non-linear classification and regression, the Support Vector Machine was the talk of the town before the rise of deep learning, largely due to the exciting kernel trick. If that terminology makes no sense to you right now, don't worry about it. By the end of this post you'll have a good understanding of the intuition behind SVMs, what is happening under the hood of linear SVMs, and how to implement one in Python.


To see the full Algorithms From Scratch series, click on the link below.


Intuition

In classification problems the objective of the SVM is to fit the largest possible margin between the two classes. Regression flips this objective and instead attempts to fit as many instances as possible within the margin. We will focus on classification first.


If we focus solely on the extremes of the data (the observations on the edges of each cluster) and define a threshold as the mid-point between the two extremes, we are left with a margin that we use to separate the two classes; the separating surface is often referred to as a hyperplane. When we apply the threshold that gives us the largest margin (meaning we are strict in ensuring that no instances land within the margin) to make classifications, this is called Hard Margin Classification (some texts refer to this as Maximal Margin Classification).

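To make the mid-point idea concrete, here is a minimal 1-D sketch; the data values and helper names are illustrative, not from the original post:

```python
import numpy as np

# two linearly separable 1-D classes
negative = np.array([1.0, 1.5, 2.0])   # class -1
positive = np.array([4.0, 4.5, 5.0])   # class +1

# the "extremes": the observation of each class closest to the other class
edge_neg = negative.max()              # 2.0
edge_pos = positive.min()              # 4.0

# hard margin threshold: the mid-point between the two extremes
threshold = (edge_neg + edge_pos) / 2  # 3.0
margin_width = edge_pos - edge_neg     # the widest possible margin: 2.0

def classify(x):
    return 1 if x > threshold else -1
```

Any other threshold between the two classes would leave a smaller gap on one side, which is exactly what hard margin classification forbids.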

When detailing hard margin classification it always helps to see what is happening visually, hence Figure 2 shows an example of hard margin classification. To do this we will use the iris dataset from scikit-learn and the utility function plot_svm(), which you can find in the full code on GitHub (link below).


Note: This story was written straight from Jupyter notebooks using the Python package jupyter_to_medium (for more information on this package click here), and the committed version on GitHub is a first draft, hence you may notice some alterations to this post.


import pandas as pd
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
%matplotlib inline

# store the data
iris = load_iris()
# convert to DataFrame
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
# store mapping of targets and target names
target_dict = dict(zip(set(iris.target), iris.target_names))
# add the target labels and the feature names
df["target"] = iris.target
df["target_names"] = df.target.map(target_dict)
# view the data
df.tail()

Figure 1: Original Dataset

# setting X and y
X = df.query("target_names == 'setosa' or target_names == 'versicolor'").loc[:, "petal length (cm)":"petal width (cm)"]
y = df.query("target_names == 'setosa' or target_names == 'versicolor'").loc[:, "target"]
# fit the model with hard margin (large C parameter)
svc = LinearSVC(loss="hinge", C=1000)
svc.fit(X, y)
plot_svm()

Figure 2: Visualizing the decision boundary by SVM

Figure 2 displays how the linear SVM uses hard margin classification to ensure that no instances fall within the margin. Although this looks good for our current scenario, we must be careful to take into account the pitfalls of hard margin classification:


  • Very sensitive to outliers

  • It only works when the classes are linearly separable

Dealing with Outliers and Non-Linear Data

A more flexible alternative to hard margin classification is soft margin classification, which is a good way to overcome the pitfalls listed above, mainly the sensitivity to outliers. When we allow for some misclassifications (meaning that some negative observations may be classified as positive and vice versa), the distance from the threshold to the observations is called a soft margin. In soft margin classification we aim to strike a good balance between maximizing the size of the margin and limiting the number of violations of the margin (the number of observations that land inside it).


Yes, linear SVM classifiers (hard-margin and soft-margin) are quite efficient and work really well in many cases, but when the dataset is not linearly separable, as is often the case, a better solution is to make use of the SVM's kernel trick (once you understand the kernel trick you may notice that it is not exclusive to SVMs). The kernel trick maps non-linearly separable data into a higher dimension and then uses a hyperplane to separate the classes. What makes this trick so exciting is that mapping the data into higher dimensions does not actually add the new features, yet we still get the same results as if we did. Since we do not have to add the new features to our data, our model is much more computationally efficient and works just as well.

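A minimal sketch of why this works, using a degree-2 polynomial kernel (the feature map phi below is the standard one for 2-D inputs; the names are illustrative):

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-D input (3 new features)."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly_kernel(x, y):
    """Degree-2 polynomial kernel: one dot product in the ORIGINAL space."""
    return np.dot(x, y) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

# identical results, but poly_kernel never materializes the 3-D features
explicit = np.dot(phi(x), phi(y))  # 16.0
implicit = poly_kernel(x, y)       # 16.0
```

The kernel gives the dot product that the higher-dimensional features would have produced, without ever computing those features.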

You’ll see an example of this phenomenon below.


Terminology

  • Decision Boundary: The hyperplane that separates the dataset into two classes


  • Support Vectors: The observations at the edge of the cluster (located nearest to the separating hyperplane)


  • Hard Margin: When we strictly impose that no observations fall within the margin


  • Soft Margin: When we allow for some misclassification. We seek a balance between keeping the margin as large as possible and limiting the number of violations (the bias/variance tradeoff)


from sklearn.datasets import make_moons
from mlxtend.plotting import plot_decision_regions
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# loading the data
X, y = make_moons(noise=0.3, random_state=0)
# scale features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# fit the model with polynomial kernel
svc_clf = SVC(kernel="poly", degree=3, C=5, coef0=1)
svc_clf.fit(X_scaled, y)
# plotting the decision regions
plt.figure(figsize=(10, 5))
plot_decision_regions(X_scaled, y, clf=svc_clf)
plt.show()

Figure 3: Kernel Trick applied to non-linear data

Note: We applied a polynomial kernel to this dataset; however, RBF is also a very popular kernel that is applied in many machine learning problems and is often used as a default when data is not linearly separable.

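Swapping the polynomial kernel for RBF is a one-line change on the same moons data (the gamma value here is illustrative; scikit-learn's SVC actually defaults to kernel="rbf" with gamma="scale"):

```python
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(noise=0.3, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# K(x, y) = exp(-gamma * ||x - y||^2); larger gamma -> more local fit
rbf_clf = SVC(kernel="rbf", gamma=0.5, C=5)
rbf_clf.fit(X_scaled, y)
```

The RBF kernel corresponds to an infinite-dimensional feature map, which is exactly the kind of mapping that would be impossible to compute explicitly.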

Creating the Model

Now that we have built up our conceptual understanding of what an SVM is doing, let's understand what is happening under the hood of the model. The linear SVM classifier computes the decision function w.T * x + b and predicts the positive class when the result is positive, and the negative class otherwise. Training a linear SVM classifier means finding the values of w and b that make the margin as wide as possible while avoiding margin violations (hard margin classification) or limiting them (soft margin classification).


Figure 4: Linear SVM Classifier Prediction
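The prediction rule can be sketched directly; the w and b values below are illustrative, not fitted:

```python
import numpy as np

w = np.array([0.5, -1.0])  # illustrative weight vector
b = 0.25                   # illustrative bias

def decision_function(x):
    return np.dot(w, x) + b

def predict_class(x):
    # positive class if w.T * x + b >= 0, otherwise negative class
    return 1 if decision_function(x) >= 0 else 0
```

For example, predict_class(np.array([2.0, 0.0])) lands on the positive side because 0.5 * 2.0 + 0.25 is positive.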

The slope of the decision function is equal to the norm of the weight vector, hence to achieve the largest possible margin we want to minimize the norm of the weight vector. There are different ways to set this up depending on whether we want hard margin or soft margin classification.

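Numerically, the two margin hyperplanes w.x + b = +1 and w.x + b = -1 sit at distance 1/‖w‖ either side of the decision boundary, so the margin width is 2/‖w‖ and shrinking ‖w‖ widens the margin (a sketch, not code from the original post):

```python
import numpy as np

def margin_width(w):
    # distance between the hyperplanes w.x + b = +1 and w.x + b = -1
    return 2 / np.linalg.norm(w)

margin_width(np.array([2.0, 0.0]))  # 1.0 -> large ||w||, narrow margin
margin_width(np.array([0.5, 0.0]))  # 4.0 -> small ||w||, wide margin
```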

The hard margin optimization problem is as follows:


Figure 5: Linear SVM (Hard margin classifier) objective

And soft margin:


Figure 6: Linear SVM (Soft margin classifier) objective; note that to achieve the soft margin we add a slack variable (zeta ≥ 0) for each instance, which measures how much each instance is allowed to violate the margin.
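For fixed w and b the slack of each instance has a closed form, zeta_i = max(0, 1 - t_i * (w.x_i + b)), which a small sketch makes concrete (the data values are illustrative):

```python
import numpy as np

def slack(X, t, w, b):
    """zeta_i = max(0, 1 - t_i * (w.x_i + b)) with labels t_i in {-1, +1}."""
    return np.maximum(0.0, 1 - t * (X @ w + b))

w = np.array([1.0, 0.0])
b = 0.0
X = np.array([[2.0, 0.0],    # safely outside the margin -> zeta = 0
              [0.5, 0.0],    # inside the margin         -> zeta = 0.5
              [-0.5, 0.0]])  # on the wrong side         -> zeta = 1.5
t = np.array([1, 1, 1])

zetas = slack(X, t, w, b)  # array([0. , 0.5, 1.5])
```

The soft margin objective in Figure 6 penalizes the sum of these zetas, weighted by C, on top of minimizing the norm of w.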

Implementation

Note: For this implementation I will be doing hard margin classification; however, further work will consist of Python implementations of soft margin and of the kernel trick applied to different datasets, including regression-based tasks. To be notified of these posts you can follow me on GitHub.


from sklearn.datasets import make_blobs

# generating a dataset
X, y = make_blobs(n_samples=50, n_features=2, centers=2, cluster_std=1.05, random_state=23)

def initialize_param(X):
    """
    Initializing the weight vector and bias
    """
    _, n_features = X.shape
    w = np.zeros(n_features)
    b = 0
    return w, b

def optimization(X, y, learning_rate=0.001, lambd=0.01, n_iters=1000):
    """
    finding the values of w and b that make the margin as large as possible
    while avoiding violations (hard margin classification)
    """
    t = np.where(y <= 0, -1, 1)
    w, b = initialize_param(X)
    for _ in range(n_iters):
        for idx, x_i in enumerate(X):
            condition = t[idx] * (np.dot(x_i, w) + b) >= 1
            if condition:
                # instance outside the margin: only the regularization gradient
                w -= learning_rate * (2 * lambd * w)
            else:
                # margin violation: hinge-loss gradient for w and b
                # (the + sign on b matches the w.x + b decision function)
                w -= learning_rate * (2 * lambd * w - np.dot(x_i, t[idx]))
                b += learning_rate * t[idx]
    return w, b

w, b = optimization(X, y)

def predict(X, w, b):
    """
    classify examples
    """
    decision = np.dot(X, w) + b
    return np.sign(decision)

# my implementation visualization
visualize_svm()
# convert X to DataFrame to easily copy code
X = pd.DataFrame(data=X, columns=["x1", "x2"])
# fit the model with hard margin (large C parameter)
svc = LinearSVC(loss="hinge", C=1000)
svc.fit(X, y)
# sklearn implementation visualization
plot_svm()
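As a quick sanity check, the same update rule can be run end-to-end and scored against the training labels; this compact, self-contained sketch mirrors the implementation above (the accuracy threshold one might expect is an assumption based on how well separated these blobs are):

```python
import numpy as np
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=50, n_features=2, centers=2,
                  cluster_std=1.05, random_state=23)
t = np.where(y <= 0, -1, 1)

# sub-gradient descent on the regularized hinge loss
w, b = np.zeros(X.shape[1]), 0.0
learning_rate, lambd = 0.001, 0.01
for _ in range(1000):
    for x_i, t_i in zip(X, t):
        if t_i * (np.dot(x_i, w) + b) >= 1:
            w -= learning_rate * (2 * lambd * w)
        else:
            w -= learning_rate * (2 * lambd * w - t_i * x_i)
            b += learning_rate * t_i

# training accuracy of the from-scratch classifier
accuracy = np.mean(np.sign(X @ w + b) == t)
```

On well-separated blobs like these, the learned boundary should classify nearly every training instance correctly.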

Pros

  • A very good linear classifier, because it finds the best decision boundary (in a hard margin classification sense)

  • Easy to transform into a non-linear model


Cons

  • Not suited for large datasets


Wrap Up

The SVM is quite a tricky algorithm to code, and a good reminder of why we should be grateful for machine learning libraries that let us implement one in a few lines of code. In this post I did not go into full detail on SVMs, and there are still quite a few gaps you may want to read up on, such as computing the support vector machine and empirical risk minimization.


Additionally, it may be worth watching Andrew Ng’s lectures on SVMs (Click Here).


Thank you for taking the time to read through this story (as it’s called on Medium). You now have a good conceptual understanding of Support Vector Machines, what happens under the hood of an SVM, and how to code a hard margin classifier in Python. If you’d like to get in contact with me, I am most accessible on LinkedIn.


Translated from: https://towardsdatascience.com/algorithms-from-scratch-support-vector-machine-6f5eb72fce10
