

Efficient Hyperparameter Optimization for XGBoost Models Using Optuna

發(fā)布時(shí)間:2023/12/15 编程问答 37 豆豆
生活随笔 收集整理的這篇文章主要介紹了 使用Optuna的XGBoost模型的高效超参数优化 小編覺得挺不錯(cuò)的,現(xiàn)在分享給大家,幫大家做個(gè)參考.

Introduction

Hyperparameter optimization is the science of tuning or choosing the best set of hyperparameters for a learning algorithm. A set of optimal hyperparameters has a big impact on the performance of any machine learning algorithm, and finding it is one of the most time-consuming yet crucial steps in a machine learning training pipeline.


A machine learning model has two types of tunable parameters:


  • Model parameters

  • Model hyperparameters


Model parameters are learned during the training phase of a model or classifier. For example:


  • coefficients in logistic regression or linear regression

  • weights in an artificial neural network

Model hyperparameters are set by the user before the model training phase. For example:


  • ‘C’ (regularization strength), ‘penalty’ and ‘solver’ in logistic regression

  • ‘learning rate’, ‘batch size’, ‘number of hidden layers’ etc. in an artificial neural network
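To make the distinction concrete, here is a tiny illustrative sketch in plain Python (not from the original article): the learning rate is a hyperparameter fixed by the user before training, while the weight w is a model parameter learned from the data.

```python
# Fit y = w * x to data generated with a true weight of 2.0,
# using gradient descent on the mean squared error.
data = [(x, 2.0 * x) for x in range(1, 11)]

learning_rate = 0.01  # hyperparameter: set by the user before training
w = 0.0               # model parameter: learned during training

for _ in range(200):
    # Gradient of the mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= learning_rate * grad

print(round(w, 3))  # → 2.0
```

Changing `learning_rate` changes how (and whether) training converges, but it is never itself adjusted by the training loop; that is exactly what makes it a hyperparameter.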

The choice of machine learning model depends on the dataset and the task at hand, i.e. prediction or classification. Each model has its own unique set of hyperparameters, and the task of finding the best combination of these parameters is known as hyperparameter optimization.


Various methods are available for solving the hyperparameter optimization problem. For example:


  • Grid Search

  • Random Search

  • Optuna

  • HyperOpt
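The difference between the first two methods can be sketched in a few lines of plain Python (a toy illustration, not from the article; the `score` function is a hypothetical stand-in for a cross-validation score):

```python
import itertools
import random

def score(params):
    # Hypothetical stand-in for a cross-validation score;
    # peaks at max_depth=6, learning_rate=0.1.
    depth, lr = params['max_depth'], params['learning_rate']
    return 1.0 - abs(depth - 6) * 0.05 - abs(lr - 0.1)

# Grid search: exhaustively evaluate every combination.
grid = {'max_depth': [2, 4, 6, 8], 'learning_rate': [0.01, 0.1, 0.5]}
grid_trials = [dict(zip(grid, values))
               for values in itertools.product(*grid.values())]
best_grid = max(grid_trials, key=score)

# Random search: sample the same number of configurations at random.
random.seed(42)
random_trials = [{'max_depth': random.randint(2, 8),
                  'learning_rate': random.uniform(0.01, 0.5)}
                 for _ in range(len(grid_trials))]
best_random = max(random_trials, key=score)

print(best_grid)  # → {'max_depth': 6, 'learning_rate': 0.1}
```

Grid search scales exponentially with the number of parameters, while random search keeps the trial budget fixed; Optuna and HyperOpt go further by using the results of past trials to guide what to try next.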

In this post, we will focus on the Optuna library, which has one of the most accurate and successful hyperparameter optimization strategies.


Optuna

Optuna is an open-source hyperparameter optimization (HPO) framework that automates the search over a hyperparameter space. To find an optimal set of hyperparameters, Optuna uses Bayesian methods. It supports the various types of samplers listed below:


  • GridSampler (using grid search)

  • RandomSampler (using random sampling)

  • TPESampler (using the Tree-structured Parzen Estimator algorithm)

  • CmaEsSampler (using the CMA-ES algorithm)

The use of Optuna for hyperparameter optimization is explained using the Credit Card Fraud Detection dataset on Kaggle. The problem statement is to classify a credit card transaction as fraudulent or genuine (binary classification). The data contains only numerical input variables, which are PCA transformations of the original features. Due to confidentiality issues, the original features and more background information about the data are not available.


In this case, we have used only a subset of the dataset to speed up training and to ensure the two classes are perfectly balanced. Here, the sampling method used is TPESampler. A subset of the dataset is shown in the figure below:


A subset of the Credit Card Fraud Detection dataset
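A balanced subset like the one above can be built by undersampling the majority class. Here is a minimal sketch with toy data (plain Python, not the actual Kaggle dataset):

```python
import random

random.seed(0)

# Toy transactions: (features, label) with heavy class imbalance,
# mimicking the fraud dataset (label 1 = fraud).
transactions = [([random.random()], 0) for _ in range(980)] + \
               [([random.random()], 1) for _ in range(20)]

fraud = [t for t in transactions if t[1] == 1]
genuine = [t for t in transactions if t[1] == 0]

# Undersample the majority class to match the minority class size.
balanced = fraud + random.sample(genuine, len(fraud))
random.shuffle(balanced)

print(len(balanced), sum(label for _, label in balanced))  # → 40 20
```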

Importing the required packages:


import joblib
import optuna
from optuna import Trial, visualization
from optuna.samplers import TPESampler
from sklearn.model_selection import train_test_split, cross_val_score
from xgboost import XGBClassifier

The following are the main steps involved in HPO using Optuna for an XGBoost model:


1. Define the Objective Function: The first important step is to define an objective function. It should return a real value to be minimized or maximized. In our case, we will train an XGBoost model and use the cross-validation score for evaluation. We will return this cross-validation score from the objective function, and it has to be maximized.


2. Define the Hyperparameter Search Space: Optuna supports five kinds of hyperparameter distributions, which are given as follows:


  • Integer parameter: a uniform distribution on integers.

    n_estimators = trial.suggest_int('n_estimators', 100, 500)

  • Categorical parameter: a categorical distribution.

    criterion = trial.suggest_categorical('criterion', ['gini', 'entropy'])

  • Uniform parameter: a uniform distribution in the linear domain.

    subsample = trial.suggest_uniform('subsample', 0.2, 0.8)

  • Discrete-uniform parameter: a discretized uniform distribution in the linear domain.

    max_features = trial.suggest_discrete_uniform('max_features', 0.05, 1, 0.05)

  • Loguniform parameter: a uniform distribution in the log domain.

    learning_rate = trial.suggest_loguniform('learning_rate', 1e-6, 1e-3)

The figure below shows the objective function and hyperparameters for our example.


def objective(trial: Trial, X, y) -> float:
    joblib.dump(study, 'study.pkl')  # checkpoint the study each trial (relies on the global `study`)
    train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=0.30, random_state=101)
    param = {
        'n_estimators': trial.suggest_int('n_estimators', 0, 1000),
        'max_depth': trial.suggest_int('max_depth', 2, 25),
        'reg_alpha': trial.suggest_int('reg_alpha', 0, 5),
        'reg_lambda': trial.suggest_int('reg_lambda', 0, 5),
        'min_child_weight': trial.suggest_int('min_child_weight', 0, 5),
        'gamma': trial.suggest_int('gamma', 0, 5),
        'learning_rate': trial.suggest_loguniform('learning_rate', 0.005, 0.5),
        'colsample_bytree': trial.suggest_discrete_uniform('colsample_bytree', 0.1, 1, 0.01),
        'nthread': -1,
    }
    model = XGBClassifier(**param)
    model.fit(train_X, train_y)
    return cross_val_score(model, test_X, test_y).mean()

3. Study Objective: We have to understand some important terminology from the Optuna docs, which will make our work easier. It is given as follows:


  • Trial: a single call of the objective function

  • Study: an optimization session, which is a set of trials

  • Parameter: a variable whose value is to be optimized, such as the value of “n_estimators”

The Study object is used to manage the optimization process. The method create_study() returns a study object, which has useful properties for analyzing the optimization outcome. In create_study(), we have to define the direction of the objective function, i.e. “maximize” or “minimize”, and a sampler, for example TPESampler(). After creating the study, we can call optimize().

Study對(duì)象用于管理優(yōu)化過程。 方法create_study()返回學(xué)習(xí)對(duì)象。 研究對(duì)象具有用于分析優(yōu)化結(jié)果的有用屬性。 在create_study()方法中,我們必須定義目標(biāo)函數(shù)的方向,即“最大化”或“最小化”以及采樣器,例如TPESampler() 。 創(chuàng)建研究后,我們可以調(diào)用Optimize() 。

study = optuna.create_study(direction='maximize', sampler=TPESampler())
study.optimize(lambda trial: objective(trial, X, y), n_trials=50)

4. Best Trial and Result: Once the optimization process is completed, we can obtain the best parameter values and the optimal value of the objective function.


print('Best trial: score {},\nparams {}'.format(study.best_trial.value, study.best_trial.params))

Best trial: score 0.9427118644067797,
params {'n_estimators': 396, 'max_depth': 6, 'reg_alpha': 3, 'reg_lambda': 3, 'min_child_weight': 2, 'gamma': 0, 'learning_rate': 0.09041583301198859, 'colsample_bytree': 0.45999999999999996}

5. Trial History: We can get the entire history of all the trials in the form of a data frame by just calling study.trials_dataframe().


hist = study.trials_dataframe()
hist.head()

6. Visualizations:



Visualizing the hyperparameter search space can be very useful. From the visualizations, we can gain useful information about the interactions between parameters and see where to search next. The optuna.visualization module includes a set of useful visualizations.


i) plot_optimization_history(study): plots the optimization history of all trials as well as the best score at each point.


Optimization History Plot

ii) plot_slice(study): plots the parameter relationships as slices; we can also see which parts of the search space were explored more.


Slice Plot

iii) plot_parallel_coordinate(study): plots an interactive visualization of the high-dimensional parameter relationships and scores in the study.


Parallel Coordinate Plot

iv) plot_contour(study): plots an interactive contour chart from which we can choose which part of the hyperparameter space to explore.


Contour Plot

Overall, the visualizations in Optuna are amazing!


Summary: Optuna is a good HPO framework and is easy to use. It has good documentation and visualization features. For me, it has become my first choice of hyperparameter optimization method.



References:


  • Optuna: A hyperparameter optimization framework: https://optuna.readthedocs.io/en/stable/index.html

  • Optuna GitHub Project: https://github.com/optuna

  • Translated from: https://medium.com/swlh/efficient-hyperparameter-optimization-for-xgboost-model-using-optuna-3ee9a02566b1
