Loan Risk Prediction with xgboost

Published: 2025/4/5

Now let's run the legendary xgboost on this dataset.
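The script in this post first turns every column into integer codes with sklearn's LabelEncoder before training. As a rough illustration of what that step does to a single column, here is a plain-Python sketch; `label_encode` and the sample values are hypothetical and not part of the original script.

```python
def label_encode(values):
    """Map each distinct value to an integer code, assigned in sorted order."""
    classes = sorted(set(values))
    index = {value: code for code, value in enumerate(classes)}
    codes = [index[value] for value in values]
    return codes, classes


# Hypothetical column values, for illustration only.
grades = ["B", "A", "C", "A", "B"]
codes, classes = label_encode(grades)
print(classes)
print(codes)
```

The codes carry an order that the original categories may not have. Tree-based models such as xgboost usually cope with this, but it is an assumption worth keeping in mind.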

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat Aug 19 13:19:26 2017

@author: luogan
"""
from collections import defaultdict

import matplotlib.pyplot as plt
import pandas as pd
import xgboost as xgb
from matplotlib import rcParams
from sklearn import metrics  # additional sklearn functions
from sklearn.model_selection import GridSearchCV  # grid search (formerly sklearn.grid_search; unused below)
from sklearn.preprocessing import LabelEncoder
from xgboost.sklearn import XGBClassifier

rcParams['figure.figsize'] = 12, 4

# Label-encode every column (one encoder per column) and save the result.
df = pd.read_csv('loans.csv')
d = defaultdict(LabelEncoder)
dff = df.apply(lambda col: d[col.name].fit_transform(col))
dff.to_excel('dff.xlsx')  # the original wrote 'dff.xls'; recent pandas can no longer write .xls

train = pd.read_excel('dff.xlsx')
target = 'safe_loans'
IDcol = 'id'


def modelfit(alg, dtrain, predictors, useTrainCV=True, cv_folds=5, early_stopping_rounds=50):
    if useTrainCV:
        # Pick the number of boosting rounds by cross-validated AUC.
        xgb_param = alg.get_xgb_params()
        xgtrain = xgb.DMatrix(dtrain[predictors].values, label=dtrain[target].values)
        cvresult = xgb.cv(xgb_param, xgtrain,
                          num_boost_round=alg.get_params()['n_estimators'],
                          nfold=cv_folds, metrics='auc',
                          early_stopping_rounds=early_stopping_rounds)
        alg.set_params(n_estimators=cvresult.shape[0])

    # Fit the algorithm on the data. The original passed eval_metric='auc' to fit();
    # with no eval_set it has no effect, and recent xgboost no longer accepts it there.
    alg.fit(dtrain[predictors], dtrain[target])

    # Predict on the training set.
    dtrain_predictions = alg.predict(dtrain[predictors])
    dtrain_predprob = alg.predict_proba(dtrain[predictors])[:, 1]

    # Disabled in the original: save the predictions.
    # pd.DataFrame(dtrain_predictions).to_excel('dtrain_predictions.xlsx')
    # pd.DataFrame(dtrain_predprob).to_excel('dtrain_predprob.xlsx')

    print(alg)

    # Print model report.
    print("\nModel Report")
    print("Accuracy : %.4g" % metrics.accuracy_score(dtrain[target].values, dtrain_predictions))
    print("AUC Score (Train): %f" % metrics.roc_auc_score(dtrain[target], dtrain_predprob))

    # Plot feature importances, labelled with the column names.
    ww = alg.feature_importances_
    print(ww)
    feat_imp = pd.Series(ww, index=predictors).sort_values(ascending=False)
    feat_imp.plot(kind='bar', title='Feature Importances')
    plt.ylabel('Feature Importance Score')
    plt.show()

    # Disabled in the original: an alternative importance plot.
    # model = alg
    # featureImportance = model.get_score()
    # features = pd.DataFrame()
    # features['features'] = featureImportance.keys()
    # features['importance'] = featureImportance.values()
    # features.sort_values(by=['importance'], ascending=False, inplace=True)
    # fig, ax = plt.subplots()
    # fig.set_size_inches(20, 10)
    # plt.xticks(rotation=60)
    # sn.barplot(data=features.head(30), x="features", y="importance", ax=ax, orient="v")


# Choose all predictors except the target and ID columns.
predictors = [x for x in train.columns if x not in [target, IDcol]]
xgb1 = XGBClassifier(
    learning_rate=0.1,
    n_estimators=1000,
    max_depth=18,
    min_child_weight=1,
    gamma=0,
    subsample=0.8,
    colsample_bytree=0.8,
    objective='binary:logistic',
    nthread=4,
    scale_pos_weight=1,
    seed=27)
modelfit(xgb1, train, predictors)

Output:

Model Report
Accuracy : 0.9533
AUC Score (Train): 0.990971
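For readers unsure what the two numbers in the Model Report measure, here is a small plain-Python sketch of both. It assumes labels coded as 0/1 and a 0.5 cut-off for turning probabilities into class labels; the sample values are made up, and the function names are hypothetical stand-ins for `metrics.accuracy_score` and `metrics.roc_auc_score`, not their actual implementation.

```python
def accuracy(y_true, y_pred):
    """Fraction of rows whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise loop is O(n^2), fine for a demo only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.4, 0.35, 0.8, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 cut-off

print("Accuracy :", accuracy(y_true, y_pred))
print("AUC      :", roc_auc(y_true, y_prob))
```

Accuracy depends on the cut-off, while AUC only depends on how the scores rank the rows, which is why the two numbers can differ a lot on imbalanced data.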

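Training-set accuracy is about 95%, streets ahead of the decision tree and the BP network!
Clearly the legendary xgboost lives up to its reputation; no wonder it is so widely used in industry.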
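One caveat: the Model Report is computed on the same rows the model was fitted on, so it probably flatters the model. A fairer estimate would hold some rows out. The sketch below is one way to build a stratified hold-out split in plain Python; `stratified_split`, the 20% test fraction and the sample labels are assumptions for illustration, not something the original script does.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=27):
    """Return (train_idx, test_idx), keeping each class's share roughly equal in both."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Made-up labels: 80 safe loans (1) and 20 risky loans (0).
labels = [1] * 80 + [0] * 20
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With the `train` DataFrame from the script, `train.iloc[train_idx]` could be passed to `modelfit` and the metrics recomputed on `train.iloc[test_idx]`. sklearn's `train_test_split` with its `stratify` argument does the same job.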
The figure below shows the importance of each feature. The download contains two files; run xgboost.py.

Code file download

Code on GitHub
