
Source: 生活随笔 | Published: 2025/3/21

Machine Learning, Ensemble Learning: Implementing Bagging and AdaBoost in Python

  • Abstract
  • Bagging Algorithm
  • AdaBoost Algorithm

Abstract

This article implements the Bagging and AdaBoost ensemble-learning algorithms in Python, wrapping each one in a reusable class so readers can call them directly.

Bagging Algorithm

import numpy as np
import pandas as pd
from sklearn.base import clone

class Cyrus_bagging(object):
    def __init__(self, estimator, n_estimators=20):
        self.estimator = estimator
        self.n_estimators = n_estimators
        self.models = None

    def fit(self, x, y):
        x = np.array(x)
        y = np.array(y).reshape((-1,))
        indices = np.arange(x.shape[0])
        self.models = []
        for i in range(self.n_estimators):
            # bootstrap sample: draw n rows with replacement
            index = np.random.choice(indices, x.shape[0])
            x0 = x[index]
            y0 = y[index]
            # clone the base estimator so each round trains an independent model
            # (re-fitting self.estimator would leave n_estimators references to one model)
            self.models.append(clone(self.estimator).fit(x0, y0))

    def predict(self, x):
        x = np.array(x)
        res = np.zeros([x.shape[0], self.n_estimators])
        for i in range(self.n_estimators):
            res[:, i] = self.models[i].predict(x)
        # majority vote across the n_estimators columns;
        # idxmax returns the winning class label (argmax would return a position)
        result = []
        for i in range(res.shape[0]):
            pd_res = pd.Series(res[i, :]).value_counts()
            result.append(int(pd_res.idxmax()))
        return np.array(result)

from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

# x_train, y_train, x_test, y_test are assumed to have been prepared beforehand
knn = KNeighborsClassifier()
model = Cyrus_bagging(knn)
model.fit(x_train, y_train)
y_pre = model.predict(x_test)
print(classification_report(y_test, y_pre))
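The key step in `fit` is the bootstrap draw `np.random.choice(indices, x.shape[0])`: n rows sampled with replacement, so on average only about 63.2% of the distinct rows appear in each round's training set, which is what makes the base models differ. A minimal self-contained sketch of that property (toy data, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
indices = np.arange(n)

# one bootstrap sample: n draws with replacement, as in Cyrus_bagging.fit
sample = rng.choice(indices, size=n)
unique_fraction = np.unique(sample).size / n

# theory: P(a given row appears at least once) = 1 - (1 - 1/n)^n -> 1 - 1/e ~ 0.632
print(round(unique_fraction, 3))
```

The rows left out of a given sample (the remaining ~36.8%) are what "out-of-bag" error estimates are computed from in standard Bagging implementations.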

To make the gap from a non-ensemble model visible, the example deliberately uses data with few features, so the absolute accuracy is not especially high. Compared with the same base model trained without ensembling, however, the accuracy is noticeably better.

              precision    recall  f1-score   support
           0       1.00      1.00      1.00        11
           1       0.67      0.67      0.67         9
           2       0.70      0.70      0.70        10
 avg / total       0.80      0.80      0.80        30
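The `predict` step above reduces each row of per-model predictions to a majority vote via `value_counts`. That voting step can be checked on its own with a few hypothetical base-model outputs (toy values, not from the article):

```python
import numpy as np
import pandas as pd

# each row: predictions from 5 hypothetical base models for one sample
res = np.array([[0, 0, 1, 0, 2],
                [1, 1, 2, 1, 1],
                [2, 0, 2, 2, 2]])

labels = []
for i in range(res.shape[0]):
    counts = pd.Series(res[i, :]).value_counts()  # most frequent class first
    labels.append(int(counts.idxmax()))           # idxmax returns the class label itself

print(labels)  # [0, 1, 2]
```

Note that `idxmax` is used rather than `argmax`: on a `value_counts` result, modern pandas' `argmax` returns the *position* of the maximum (always 0, since the Series is sorted), not the class label.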

AdaBoost Algorithm

import numpy as np
import pandas as pd
from sklearn.base import clone
from sklearn.metrics import accuracy_score

class CyrusAdaBoost(object):
    def __init__(self, estimator, n_estimators=20):
        self.estimator = estimator
        self.n_estimators = n_estimators
        self.error_rate = None
        self.model = None

    def update_w(self, y, pre_y, w):
        # raise the weight of misclassified samples, lower the rest, then renormalize
        error_rate = 1 - accuracy_score(y, pre_y)
        for i in range(w.shape[0]):
            if y[i] == pre_y[i]:
                w[i] = w[i] * np.exp(-error_rate)
            else:
                w[i] = w[i] * np.exp(error_rate)
        return w / w.sum()

    def cal_label(self, result, alpha):
        # weighted vote: each model's prediction counts with its weight alpha
        label = []
        for i in range(result.shape[0]):
            count = np.zeros(int(result[i, :].max() + 1))
            for j in range(result.shape[1]):
                count[int(result[i, j])] += alpha[j]
            label.append(count.argmax())
        return np.array(label)

    def fit(self, x, y):
        x = np.array(x)
        y = np.array(y).reshape((-1,))
        self.error_rate = []
        self.model = []
        w0 = np.ones(x.shape[0])
        w0 = w0 / w0.sum()
        indices = np.arange(x.shape[0])
        for i in range(self.n_estimators):
            # resample the training set according to the current weight distribution
            index = np.random.choice(indices, size=x.shape[0], p=w0)
            x0 = x[index]
            y0 = y[index]
            # clone so each round trains an independent model
            model0 = clone(self.estimator).fit(x0, y0)
            pre_y0 = model0.predict(x0)
            error_rate = 1 - accuracy_score(y0, pre_y0)
            self.error_rate.append(error_rate)
            self.model.append(model0)
            w0 = self.update_w(y0, pre_y0, w0)

    def predict(self, x):
        x = np.array(x)
        res = np.zeros([x.shape[0], self.n_estimators])
        for i in range(self.n_estimators):
            res[:, i] = self.model[i].predict(x)
        # each model votes with weight (1 - its training error rate)
        alpha = 1 - np.array(self.error_rate)
        return self.cal_label(res, alpha)

from sklearn.tree import DecisionTreeClassifier

# x_train, y_train, x_test, y_test are assumed to have been prepared beforehand
model = CyrusAdaBoost(estimator=DecisionTreeClassifier(), n_estimators=50)
model.fit(x_train, y_train)
y_pre = model.predict(x_test)
print(accuracy_score(y_pre, y_test))

0.932
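The effect of `update_w` is easiest to see on a handful of toy labels (hypothetical values, not from the article): each correctly classified sample's weight shrinks by exp(-error_rate), the misclassified one grows by exp(error_rate), and the weights are renormalized, so the next round's resampling favors the hard sample:

```python
import numpy as np

# toy labels: sample 2 is the only misclassification
y     = np.array([0, 1, 1, 0])
pre_y = np.array([0, 1, 0, 0])
w = np.ones(4) / 4
error_rate = np.mean(y != pre_y)  # 0.25

# the same update rule as CyrusAdaBoost.update_w
for i in range(w.shape[0]):
    if y[i] == pre_y[i]:
        w[i] *= np.exp(-error_rate)  # shrink weights of correct samples
    else:
        w[i] *= np.exp(error_rate)   # grow the weight of the misclassified one
w = w / w.sum()

print(w.round(3))  # [0.215 0.215 0.355 0.215] -- sample 2 now carries the largest weight
```

Note this is a simplified scheme: textbook AdaBoost (SAMME) uses a per-sample weighted error and the multiplier exp(±alpha) with alpha = 0.5·ln((1-err)/err), but the direction of the update, up-weighting mistakes, is the same.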

by CyrusMay 2020 06 12

All the beauty in this world
is no match for your lovely face
(Mayday, "The Look of Love")
