用XGBoost调XGBoost?我调我自己?
The previous post,《深惡痛絕的超參》, already covered many practical tuning techniques. Today we look at an even more interesting approach: using ML to tune an ML model. In other words, we use a model we know well to tune that very same model. Sounds dizzying? Let's see how XGBoost can tune XGBoost.
Model-based HP Tuning
The idea behind model-based tuning is simple: we need something to guide the hyperparameter search toward the best result. Training sets are large these days, so training a model is expensive, and the number of possible configuration combinations is often huge. So why not learn an estimator that scores a given configuration directly? Each real training run can then inform the direction of our exploration.
Model-based hyperparameter optimization can be summarized as the following loop:
- Randomly sample n configurations
- Score these configurations with the estimator
- Pick the configuration with the highest predicted score
- Train the model with that configuration
- Add the configuration and the model's actual result to the estimator's training data
- Retrain the estimator
- Go back to the first step unless the stopping condition is met
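The loop above can be sketched in a few lines of plain Python. This is a toy illustration, not code from the original post: the "model" being tuned is a 1-D quadratic whose score peaks at x = 3, and the estimator is a trivial 1-nearest-neighbour score predictor standing in for a real regressor.

```python
import random

def algo_score(x):
    # toy target: higher is better, best at x = 3
    return -(x - 3.0) ** 2

def estimate(history, x):
    # predict the score of x as the observed score of the closest config so far
    nearest = min(history, key=lambda h: abs(h[0] - x))
    return nearest[1]

random.seed(0)
x0 = random.uniform(0, 6)
history = [(x0, algo_score(x0))]        # (config, real score) pairs

for _ in range(30):
    # sample candidates, keep the one the estimator likes best,
    # then pay the cost of one real evaluation for it
    candidates = [random.uniform(0, 6) for _ in range(10)]
    best = max(candidates, key=lambda c: estimate(history, c))
    history.append((best, algo_score(best)))

best_x, best_s = max(history, key=lambda h: h[1])
```

Even with such a crude estimator, the evaluated points tend to cluster around the high-score region, which is exactly the behaviour the real setup exploits with a much stronger estimator.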
Sampling the parameter space
How do we sample from the parameter space? There is already a library for that:
```python
>>> import ConfigSpace as CS
>>> import ConfigSpace.hyperparameters as CSH
>>> cs = CS.ConfigurationSpace(seed=1234)
>>> a = CSH.UniformIntegerHyperparameter('a', lower=10, upper=100, log=False)
>>> b = CSH.CategoricalHyperparameter('b', choices=['red', 'green', 'blue'])
>>> cs.add_hyperparameters([a, b])
[a, Type: UniformInteger, Range: [10, 100], Default: 55, ...]
>>> cs.sample_configuration()
Configuration:
  a, Value: 27
  b, Value: 'blue'
```
"I" tune "myself"
Originally, Gaussian processes were used as the estimator for this kind of tuning, but recent work shows that tree models are also well suited to the role. Moreover, Gaussian processes do not support categorical features, which makes XGBoost a natural choice of estimator.
Next, let's build the hyperparameter optimizer:
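One practical consequence of categorical hyperparameters (like `b` above) is that most estimators cannot consume a raw string column. A simple workaround, sketched here with hypothetical column names rather than code from the post, is one-hot encoding via pandas before fitting the estimator:

```python
import pandas as pd

# two sampled configurations with a categorical hyperparameter
cfgs = pd.DataFrame([
    {"max_depth": 4, "booster": "gbtree"},
    {"max_depth": 7, "booster": "dart"},
])

# expand the categorical column into one indicator column per category,
# leaving the numeric column untouched
encoded = pd.get_dummies(cfgs, columns=["booster"])
```

After encoding, `encoded` has the columns `max_depth`, `booster_dart`, and `booster_gbtree`, all numeric and safe to feed to a tree-based estimator.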
```python
import pandas as pd
import numpy as np


class Optimizer:
    """Optimize an algorithm/model configuration with respect to a given score."""

    def __init__(self, algo_score, max_iter, max_intensification, model, cs):
        """
        :param algo_score: function called to evaluate the algorithm/model score
        :param max_iter: maximal number of trainings to perform
        :param max_intensification: number of candidate configurations to sample randomly
        :param model: class of the internal model used as score estimator
        :param cs: the configuration space to explore
        """
        self.algo_score = algo_score    # scoring function for the model being tuned
        self.max_iter = max_iter        # iteration budget; adapt the stopping condition as needed
        self.max_intensification = max_intensification  # number of random candidate configurations
        self.internal_model = model()   # estimator used to score configurations
        self.trajectory = []            # configuration chosen at each optimization step
        self.cfgs = []
        self.scores = {}
        self.best_cfg = None
        self.best_score = None
        self.cs = cs

    def cfg_to_dtf(self, cfgs):
        """Convert a list of configs into a pandas DataFrame to ease learning."""
        return pd.DataFrame([dict(cfg) for cfg in cfgs])

    def optimize(self):
        """Optimize the algo/model using the internal score estimator."""
        # initial run with a random configuration
        cfg = self.cs.sample_configuration()
        self.cfgs.append(cfg)
        self.trajectory.append(cfg)
        score = self.algo_score(cfg)
        self.scores[cfg] = score
        self.best_cfg = cfg
        self.best_score = score
        dtf = self.cfg_to_dtf(self.cfgs)

        for i in range(self.max_iter):
            # we need at least two data points to train the estimator
            if dtf.shape[0] > 1:
                scores = np.array([val for key, val in self.scores.items()])
                self.internal_model.fit(dtf, scores)
                # intensification: sample candidates and keep the best predicted one
                candidates = [self.cs.sample_configuration()
                              for _ in range(self.max_intensification)]
                candidate_scores = [self.internal_model.predict(self.cfg_to_dtf([c]))
                                    for c in candidates]
                cfg = candidates[np.argmax(candidate_scores)]
            else:
                cfg = self.cs.sample_configuration()

            # one real (expensive) evaluation of the chosen configuration
            self.cfgs.append(cfg)
            score = self.algo_score(cfg)
            self.scores[cfg] = score
            if score > self.best_score:
                self.best_cfg = cfg
                self.best_score = score
            self.trajectory.append(cfg)
            dtf = self.cfg_to_dtf(self.cfgs)

        # final refit on everything observed
        self.internal_model.fit(dtf, np.array([val for key, val in self.scores.items()]))
```

Plug the XGBoost model you want to tune into `algo_score`, use another XGBoost as the `internal_model`, and you get automatic hyperparameter search. What are you waiting for? Go try it!
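To make the wiring concrete, here is a hypothetical sketch of the shape `algo_score` is expected to have: a callable that receives one sampled configuration and returns a scalar score (higher is better). A dummy scoring rule stands in for XGBoost here; the real version would train an `xgboost.XGBRegressor` with the configuration's values and return, say, a cross-validation score.

```python
def algo_score(cfg):
    # cfg behaves like a mapping of hyperparameter names to values;
    # a real implementation would train the target model here.
    cfg = dict(cfg)
    # toy stand-in: pretend deeper trees help up to depth 6, then hurt
    return -abs(cfg.get("max_depth", 3) - 6)
```

Anything with this signature works, which is what lets the same `Optimizer` class tune arbitrary models, not just XGBoost.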