[Python-ML] Pipeline Workflows and K-Fold Cross-Validation with scikit-learn

Published: 2025/4/16

# -*- coding: utf-8 -*-
'''
Created on 2018-01-18
@author: Jason.F
@summary:
    Pipeline: a workflow that chains data transformations and model fitting.
    K-fold cross-validation: sampling without replacement. The dataset is split
    into k folds; each fold serves once as the test set while the remaining
    k-1 folds form the training set.
'''
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection replaces it.
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Load the data
df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data', header=None)
X = df.loc[:, 2:].values
y = df.loc[:, 1].values
le = LabelEncoder()
y = le.fit_transform(y)  # encode the class labels as integers
print(le.transform(['M', 'B']))

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=1)

# Chain standardization, PCA dimensionality reduction, and model training
pipe_lr = Pipeline([('scl', StandardScaler()),
                    ('pca', PCA(n_components=2)),
                    ('clf', LogisticRegression(random_state=1))])
pipe_lr.fit(X_train, y_train)
print('Test Accuracy:%.3f' % pipe_lr.score(X_test, y_test))

# k-fold cross-validation, written out by hand
# (no shuffling, so random_state is not needed)
kfold = StratifiedKFold(n_splits=10)
scores = []
for k, (train, test) in enumerate(kfold.split(X_train, y_train)):
    pipe_lr.fit(X_train[train], y_train[train])
    score = pipe_lr.score(X_train[test], y_train[test])
    scores.append(score)
    print('Fold: %s, Class dist.: %s,Acc: %.3f' % (k + 1, np.bincount(y_train[train]), score))
print('CV accuracy: %.3f +/- %.3f' % (np.mean(scores), np.std(scores)))

# k-fold cross-validation as implemented by scikit-learn
# n_jobs sets how many CPUs the folds are distributed across
scores = cross_val_score(estimator=pipe_lr, X=X_train, y=y_train, cv=10, n_jobs=1)
print('Test Accuracy:%s' % scores)
print('CV accuracy: %.3f +/- %.3f' % (np.mean(scores), np.std(scores)))
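To make clearer what the Pipeline does, here is a minimal sketch that performs the same three steps by hand and compares the result with a Pipeline. It runs on synthetic data from `make_classification` instead of the WDBC file (an assumption, so that it runs offline), and it fixes `random_state` on PCA for reproducibility, which the script above does not do. The variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: the article itself uses the WDBC dataset).
X, y = make_classification(n_samples=300, n_features=30, n_informative=5,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=1)

# Manual version: fit every transformer on the training data only,
# then reuse the fitted transformers on the test data.
scl = StandardScaler()
pca = PCA(n_components=2, random_state=1)
clf = LogisticRegression(random_state=1)
Z_train = pca.fit_transform(scl.fit_transform(X_train))
clf.fit(Z_train, y_train)
Z_test = pca.transform(scl.transform(X_test))
manual_acc = clf.score(Z_test, y_test)

# Pipeline version: fit() and score() run the same steps in the same order.
pipe = Pipeline([('scl', StandardScaler()),
                 ('pca', PCA(n_components=2, random_state=1)),
                 ('clf', LogisticRegression(random_state=1))])
pipe.fit(X_train, y_train)
pipe_acc = pipe.score(X_test, y_test)

# Both routes should give the same test accuracy.
same = bool(np.isclose(manual_acc, pipe_acc))
```

Because the Pipeline refits its transformers every time `fit` is called, passing it to a cross-validation loop keeps the scaler and PCA from seeing the held-out fold, which is easy to get wrong in the manual version.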

Output:

[1 0]
Test Accuracy:0.947
Fold: 1, Class dist.: [256 153],Acc: 0.891
Fold: 2, Class dist.: [256 153],Acc: 0.978
Fold: 3, Class dist.: [256 153],Acc: 0.978
Fold: 4, Class dist.: [256 153],Acc: 0.913
Fold: 5, Class dist.: [256 153],Acc: 0.935
Fold: 6, Class dist.: [257 153],Acc: 0.978
Fold: 7, Class dist.: [257 153],Acc: 0.933
Fold: 8, Class dist.: [257 153],Acc: 0.956
Fold: 9, Class dist.: [257 153],Acc: 0.978
Fold: 10, Class dist.: [257 153],Acc: 0.956
CV accuracy: 0.950 +/- 0.029
Test Accuracy:[ 0.89130435 0.97826087 0.97826087 0.91304348 0.93478261 0.97777778
  0.93333333 0.95555556 0.97777778 0.95555556]
CV accuracy: 0.950 +/- 0.029
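The summary line can be reproduced from the ten per-fold accuracies alone. The short sketch below copies the scores from the output and recomputes the mean and standard deviation with NumPy. Note that `np.std` defaults to the population standard deviation (`ddof=0`), which is what the script uses.

```python
import numpy as np

# Per-fold accuracies copied from the cross_val_score output.
scores = np.array([0.89130435, 0.97826087, 0.97826087, 0.91304348, 0.93478261,
                   0.97777778, 0.93333333, 0.95555556, 0.97777778, 0.95555556])

mean_acc = scores.mean()
std_acc = scores.std()  # population standard deviation (ddof=0), NumPy's default
summary = 'CV accuracy: %.3f +/- %.3f' % (mean_acc, std_acc)
print(summary)  # CV accuracy: 0.950 +/- 0.029
```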
