Reproducing a Classic: Statistical Learning Methods, Chapter 6 (Logistic Regression)
This article reproduces, in code, the book Statistical Learning Methods [1] by Li Hang.
Author: Huang Haiguang [2]
Note: all of the code can be downloaded from GitHub [3].
The code will be published chapter by chapter on the WeChat public account 机器学习初学者 (Machine Learning for Beginners); stay tuned.
Code directory
Chapter 1 Introduction to Statistical Learning Methods
Chapter 2 Perceptron
Chapter 3 k-Nearest Neighbors
Chapter 4 Naive Bayes
Chapter 5 Decision Trees
Chapter 6 Logistic Regression
Chapter 7 Support Vector Machines
Chapter 8 Boosting
Chapter 9 EM Algorithm and Its Extensions
Chapter 10 Hidden Markov Models
Chapter 11 Conditional Random Fields
Chapter 12 Summary of Supervised Learning Methods
Code references: wzyonggege [4], WenDesi [5], 火烫火烫的 [6]
Chapter 6 Logistic Regression
Logistic regression (LR) is a classic classification method.
1. The logistic regression model is a classification model defined by the following conditional probability distributions; it can be used for binary or multi-class classification:

$$P(Y = k \mid x) = \frac{\exp(w_k \cdot x)}{1 + \sum_{k=1}^{K-1} \exp(w_k \cdot x)}, \quad k = 1, 2, \dots, K-1$$

$$P(Y = K \mid x) = \frac{1}{1 + \sum_{k=1}^{K-1} \exp(w_k \cdot x)}$$

Here, $x$ is the input feature and $w_k$ is the weight of the feature.

The logistic regression model originates from the logistic distribution, whose distribution function is an S-shaped (sigmoid) curve. The model expresses the log odds of the output as a linear function of the input.
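As a quick numeric check of the binomial case ($K = 2$), the sketch below (not from the original notebook; the weight and input values are made up) evaluates both conditional probabilities and confirms they sum to 1:

```python
import numpy as np

# Hypothetical weights and input, purely to illustrate the two conditional probabilities.
w = np.array([1.0, -2.0, 0.5])   # assumed feature weights, bias folded in as w[0]
x = np.array([1.0, 0.3, 2.0])    # assumed input with x[0] = 1 for the bias term

p_y1 = np.exp(w @ x) / (1 + np.exp(w @ x))  # P(Y=1|x) = exp(w·x) / (1 + exp(w·x))
p_y0 = 1 / (1 + np.exp(w @ x))              # P(Y=0|x) = 1 / (1 + exp(w·x))
print(p_y1, p_y0, p_y1 + p_y0)              # the two probabilities sum to 1
```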
2. The maximum entropy model is a classification model defined by the following conditional probability distribution; it too can be used for binary or multi-class classification:

$$P_w(y \mid x) = \frac{1}{Z_w(x)} \exp\Bigl(\sum_{i=1}^{n} w_i f_i(x, y)\Bigr)$$

$$Z_w(x) = \sum_{y} \exp\Bigl(\sum_{i=1}^{n} w_i f_i(x, y)\Bigr)$$

Here, $Z_w(x)$ is the normalization factor, $f_i$ are the feature functions, and $w_i$ are the feature weights.
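To make the normalization concrete, here is a tiny sketch (with made-up binary feature functions and weights; it is not the author's code, but it mirrors the `_Zx` / `_model_pyx` methods that appear in the maximum entropy implementation later in the article):

```python
import math

# Hypothetical setup: two labels, and for each (x, y) pair a binary feature that may fire.
labels = ['yes', 'no']
w = {('sunny', 'yes'): 0.4, ('sunny', 'no'): 1.1, ('high', 'yes'): -0.2}  # assumed weights
x = ['sunny', 'high']  # an input described by its active feature values

def score(x, y):
    # sum of the weights of the feature functions f_i(x, y) that are active
    return sum(w.get((xi, y), 0.0) for xi in x)

Zx = sum(math.exp(score(x, y)) for y in labels)      # normalization factor Z_w(x)
p = {y: math.exp(score(x, y)) / Zx for y in labels}  # P_w(y|x)
print(p)  # probabilities over the labels, summing to 1
```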
3. The maximum entropy model can be derived from the maximum entropy principle, a criterion for learning or estimating probability models. The principle states that, among all possible probability models (distributions), the model with the largest entropy is the best one.

Applying the maximum entropy principle to learning a classification model gives the following constrained optimization problem:

$$\max_{P \in \mathcal{C}} \; H(P) = -\sum_{x, y} \tilde P(x)\, P(y \mid x) \log P(y \mid x)$$

$$\text{s.t.} \quad E_P(f_i) = E_{\tilde P}(f_i), \quad i = 1, 2, \dots, n; \qquad \sum_{y} P(y \mid x) = 1$$

Solving the dual of this optimization problem yields the maximum entropy model.
4. The logistic regression model and the maximum entropy model are both log-linear models.

5. Both models are usually learned by maximum likelihood estimation, or by regularized maximum likelihood estimation. Learning can be formalized as an unconstrained optimization problem, which can be solved by the improved iterative scaling (IIS) algorithm, gradient descent, or quasi-Newton methods.
Regression model:

$$f(x) = \frac{1}{1 + e^{-w \cdot x}}$$

where $w \cdot x$ is the linear function

$$w \cdot x = w_0 x_0 + w_1 x_1 + \dots + w_n x_n, \qquad x_0 = 1$$
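The `fit` routine below maximizes the log-likelihood by stochastic gradient ascent. For reference (a standard derivation, not spelled out in the original), the objective and the per-sample update it implies are:

$$L(w) = \sum_{i=1}^{N} \bigl[\, y_i (w \cdot x_i) - \log\bigl(1 + e^{\,w \cdot x_i}\bigr) \bigr], \qquad \nabla_w L \big|_{x_i} = \bigl(y_i - \sigma(w \cdot x_i)\bigr)\, x_i, \qquad w \leftarrow w + \eta\,\bigl(y_i - \sigma(w \cdot x_i)\bigr)\, x_i$$

which is exactly the weight update `self.weights += self.learning_rate * error * ...` in the code.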
```python
from math import exp
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split


# data
def create_data():
    # use the first 100 iris samples (two classes) and the first two features
    iris = load_iris()
    df = pd.DataFrame(iris.data, columns=iris.feature_names)
    df['label'] = iris.target
    df.columns = ['sepal length', 'sepal width', 'petal length', 'petal width', 'label']
    data = np.array(df.iloc[:100, [0, 1, -1]])
    # print(data)
    return data[:, :2], data[:, -1]


X, y = create_data()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)


class LogisticRegressionClassifier:
    def __init__(self, max_iter=200, learning_rate=0.01):
        self.max_iter = max_iter
        self.learning_rate = learning_rate

    def sigmoid(self, x):
        return 1 / (1 + exp(-x))

    def data_matrix(self, X):
        # prepend a constant 1.0 so the bias is folded into the weight vector
        data_mat = []
        for d in X:
            data_mat.append([1.0, *d])
        return data_mat

    def fit(self, X, y):
        # label = np.mat(y)
        data_mat = self.data_matrix(X)  # m*n
        self.weights = np.zeros((len(data_mat[0]), 1), dtype=np.float32)

        for iter_ in range(self.max_iter):
            for i in range(len(X)):
                # stochastic gradient ascent on the log-likelihood:
                # w <- w + lr * (y - sigmoid(w·x)) * x
                result = self.sigmoid(np.dot(data_mat[i], self.weights))
                error = y[i] - result
                self.weights += self.learning_rate * error * np.transpose([data_mat[i]])
        print('LogisticRegression Model(learning_rate={},max_iter={})'.format(
            self.learning_rate, self.max_iter))

    # def f(self, x):
    #     return -(self.weights[0] + self.weights[1] * x) / self.weights[2]

    def score(self, X_test, y_test):
        right = 0
        X_test = self.data_matrix(X_test)
        for x, y in zip(X_test, y_test):
            result = np.dot(x, self.weights)
            if (result > 0 and y == 1) or (result < 0 and y == 0):
                right += 1
        return right / len(X_test)


lr_clf = LogisticRegressionClassifier()
lr_clf.fit(X_train, y_train)
```

Output: LogisticRegression Model(learning_rate=0.01,max_iter=200)

```python
lr_clf.score(X_test, y_test)
```

Output: 1.0

```python
x_points = np.arange(4, 8)
y_ = -(lr_clf.weights[1] * x_points + lr_clf.weights[0]) / lr_clf.weights[2]
plt.plot(x_points, y_)

# lr_clf.show_graph()
plt.scatter(X[:50, 0], X[:50, 1], label='0')
plt.scatter(X[50:, 0], X[50:, 1], label='1')
plt.legend()
```

scikit-learn example
sklearn.linear_model.LogisticRegression

The solver parameter determines how the logistic regression loss function is optimized. Four algorithms are available (a usage sketch follows the list):

a) liblinear: uses the open-source liblinear library, which optimizes the loss with coordinate descent.
b) lbfgs: a quasi-Newton method that uses the matrix of second derivatives of the loss (the Hessian) to optimize it iteratively.
c) newton-cg: another member of the Newton family, also using the Hessian of the loss for iterative optimization.
d) sag: stochastic average gradient descent, a variant of gradient descent; unlike plain gradient descent it uses only a subset of the samples to compute the gradient at each iteration, which makes it suitable for large datasets.
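The original notebook's scikit-learn cell did not survive extraction here; the following is a minimal sketch of how the same two-feature iris split built above could be fit with `sklearn.linear_model.LogisticRegression` (the parameter values are assumptions, not necessarily the author's):

```python
from sklearn.linear_model import LogisticRegression

# Minimal sketch (assumed, not the original notebook cell):
# fit sklearn's logistic regression on the same train/test split built above.
clf = LogisticRegression(max_iter=200)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))   # accuracy on the held-out 30%
print(clf.coef_, clf.intercept_)   # weights of the learned decision boundary
```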
Maximum entropy model
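The `train` method below updates the weights with a generalized iterative scaling (GIS) style step, where $M$ is the constant `self._C`, the maximum number of active features in any sample:

$$w_i^{(t+1)} = w_i^{(t)} + \frac{1}{M} \log \frac{E_{\tilde P}(f_i)}{E_{P^{(t)}}(f_i)}$$

That is, each weight is nudged by the log-ratio of the feature's empirical expectation to its current model expectation, and training stops once every weight changes by less than `EPS`.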
```python
import math
from copy import deepcopy


class MaxEntropy:
    def __init__(self, EPS=0.005):
        self._samples = []
        self._Y = set()       # set of labels, i.e. the deduplicated y values
        self._numXY = {}      # key: (x, y), value: number of occurrences
        self._N = 0           # number of samples
        self._Ep_ = []        # empirical expectations of the features
        self._xyID = {}       # key: (x, y), value: id
        self._n = 0           # number of distinct (x, y) feature pairs
        self._C = 0           # maximum number of features in any sample
        self._IDxy = {}       # key: id, value: corresponding (x, y)
        self._w = []
        self._EPS = EPS       # convergence threshold
        self._lastw = []      # weights from the previous iteration

    def loadData(self, dataset):
        self._samples = deepcopy(dataset)
        for items in self._samples:
            y = items[0]
            X = items[1:]
            self._Y.add(y)    # duplicates are ignored automatically by the set
            for x in X:
                if (x, y) in self._numXY:
                    self._numXY[(x, y)] += 1
                else:
                    self._numXY[(x, y)] = 1

        self._N = len(self._samples)
        self._n = len(self._numXY)
        self._C = max([len(sample) - 1 for sample in self._samples])
        self._w = [0] * self._n
        self._lastw = self._w[:]

        self._Ep_ = [0] * self._n
        for i, xy in enumerate(self._numXY):  # expectation of f_i under the empirical distribution
            self._Ep_[i] = self._numXY[xy] / self._N
            self._xyID[xy] = i
            self._IDxy[i] = xy

    def _Zx(self, X):  # compute the normalization factor Z(x)
        zx = 0
        for y in self._Y:
            ss = 0
            for x in X:
                if (x, y) in self._numXY:
                    ss += self._w[self._xyID[(x, y)]]
            zx += math.exp(ss)
        return zx

    def _model_pyx(self, y, X):  # compute P(y|x)
        zx = self._Zx(X)
        ss = 0
        for x in X:
            if (x, y) in self._numXY:
                ss += self._w[self._xyID[(x, y)]]
        pyx = math.exp(ss) / zx
        return pyx

    def _model_ep(self, index):  # expectation of f_i under the model
        x, y = self._IDxy[index]
        ep = 0
        for sample in self._samples:
            if x not in sample:
                continue
            pyx = self._model_pyx(y, sample)
            ep += pyx / self._N
        return ep

    def _convergence(self):  # check whether every weight has converged
        for last, now in zip(self._lastw, self._w):
            if abs(last - now) >= self._EPS:
                return False
        return True

    def predict(self, X):  # predicted probability for each label
        Z = self._Zx(X)
        result = {}
        for y in self._Y:
            ss = 0
            for x in X:
                if (x, y) in self._numXY:
                    ss += self._w[self._xyID[(x, y)]]
            pyx = math.exp(ss) / Z
            result[y] = pyx
        return result

    def train(self, maxiter=1000):  # train the model
        for loop in range(maxiter):  # maximum number of training iterations
            print("iter:%d" % loop)
            self._lastw = self._w[:]
            for i in range(self._n):
                ep = self._model_ep(i)  # model expectation of the i-th feature
                self._w[i] += math.log(self._Ep_[i] / ep) / self._C  # update the weight
            print("w:", self._w)
            if self._convergence():  # stop once converged
                break


dataset = [['no', 'sunny', 'hot', 'high', 'FALSE'],
           ['no', 'sunny', 'hot', 'high', 'TRUE'],
           ['yes', 'overcast', 'hot', 'high', 'FALSE'],
           ['yes', 'rainy', 'mild', 'high', 'FALSE'],
           ['yes', 'rainy', 'cool', 'normal', 'FALSE'],
           ['no', 'rainy', 'cool', 'normal', 'TRUE'],
           ['yes', 'overcast', 'cool', 'normal', 'TRUE'],
           ['no', 'sunny', 'mild', 'high', 'FALSE'],
           ['yes', 'sunny', 'cool', 'normal', 'FALSE'],
           ['yes', 'rainy', 'mild', 'normal', 'FALSE'],
           ['yes', 'sunny', 'mild', 'normal', 'TRUE'],
           ['yes', 'overcast', 'mild', 'high', 'TRUE'],
           ['yes', 'overcast', 'hot', 'normal', 'FALSE'],
           ['no', 'rainy', 'mild', 'high', 'TRUE']]

maxent = MaxEntropy()
x = ['overcast', 'mild', 'high', 'FALSE']

maxent.loadData(dataset)
maxent.train()
```

Output (abridged; the weight vectors printed for iterations 1 through 663 are omitted here):

iter:0
w: [0.0455803891984887, -0.002832177999673058, 0.031103560672370825, -0.1772024616282862, -0.0037548445453157455, 0.16394435955437575, -0.02051493923938058, -0.049675901430111545, 0.08288783767234777, 0.030474400362443962, 0.05913652210443954, 0.08028783103573349, 0.1047516055195683, -0.017733409097415182, -0.12279936099838235, -0.2525211841208849, -0.033080678592754015, -0.06511302013721994, -0.08720030253991244]
...
iter:664
w: [3.8083642640626554, 0.03486819339595951, 1.6400224976589866, -4.463151671894514, 1.7883062251202617, 5.308526768308639, -0.13398764643967714, -2.2539799445450406, 1.4840784189709668, -1.890906591367886, 1.933249316738729, -1.2629454476069037, 1.7257519419059324, 2.967849703391228, 3.9061632698216244, -9.520241584621713, -1.8736788731126397, -3.483844660866203, -5.637874599559359]

```python
print('predict:', maxent.predict(x))
```

Output: predict: {'no': 2.819781341881656e-06, 'yes': 0.9999971802186581}

References
[1] Statistical Learning Methods: https://baike.baidu.com/item/统计学习方法/10430179
[2] Huang Haiguang: https://github.com/fengdu78
[3] github: https://github.com/fengdu78/lihang-code
[4] wzyonggege: https://github.com/wzyonggege/statistical-learning-method
[5] WenDesi: https://github.com/WenDesi/lihang_book_algorithm
[6] 火烫火烫的: https://blog.csdn.net/tudaodiaozhale