Machine Learning - Classification Algorithms - Model Selection and Tuning 09
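Model selection and tuning
Cross-validation: used to make the evaluation of a model more accurate and reliable. The training data is split into several folds; each fold serves once as the validation set while the remaining folds are used for training, and the scores are averaged.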
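As a minimal sketch of the idea, the snippet below runs 5-fold cross-validation on a k-nearest-neighbours classifier. The iris dataset and the choice of `n_neighbors=5` and `cv=5` are assumptions made only so that the example runs without the article's `train.csv`.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: iris stands in for the article's train.csv so the sketch is self-contained.
X, y = load_iris(return_X_y=True)

knn = KNeighborsClassifier(n_neighbors=5)

# cv=5 splits the data into 5 folds; each fold is used once for validation
# while the other 4 folds are used for training.
scores = cross_val_score(knn, X, y, cv=5)

print("Accuracy per fold:", scores)
print("Mean accuracy:", scores.mean())
```

The mean over the folds is usually a steadier estimate than the score from a single train/validation split, because every sample is used for validation exactly once.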
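Grid search: try each candidate hyperparameter value (or combination of values), score it with cross-validation, and keep the best-scoring setting.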
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
import pandas as pd


def knncls():
    # Read the data (the code below uses the columns row_id, x, y, time, place_id)
    data = pd.read_csv("train.csv")

    # Keep only a small x/y window to shrink the data set; copy so that the
    # new columns added below go into an independent DataFrame
    data = data.query("x > 1.0 & x < 1.25 & y > 2.5 & y < 2.75").copy()

    # Convert the timestamp (in seconds) to datetimes and derive time features
    time_value = pd.to_datetime(data["time"], unit="s")
    time_value = pd.DatetimeIndex(time_value)
    data["day"] = time_value.day
    data["hour"] = time_value.hour
    data["weekday"] = time_value.weekday
    data = data.drop(["time"], axis=1)

    # Keep only the places that appear in more than 3 rows
    place_count = data.groupby("place_id").count()
    tf = place_count[place_count.row_id > 3].reset_index()
    data = data[data["place_id"].isin(tf.place_id)]

    # row_id is an identifier, not a feature
    data = data.drop(["row_id"], axis=1)
    print(data)

    # Split into target and features, then into training and test sets
    y = data["place_id"]
    x = data.drop(["place_id"], axis=1)
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25)

    # Standardize: fit on the training set only, then apply to the test set
    std = StandardScaler()
    x_train = std.fit_transform(x_train)
    x_test = std.transform(x_test)

    # Grid search over n_neighbors with 10-fold cross-validation
    knn = KNeighborsClassifier()
    param = {"n_neighbors": [3, 5, 10]}
    gc = GridSearchCV(knn, param_grid=param, cv=10)
    gc.fit(x_train, y_train)

    print("Accuracy on the test set:", gc.score(x_test, y_test))
    print("Best result in cross-validation:", gc.best_score_)
    print("Best model:", gc.best_estimator_)
    print("Results of each cross-validation run for each hyperparameter:", gc.cv_results_)
    return None


if __name__ == "__main__":
    knncls()
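What `GridSearchCV` does in the code above can be pictured as a plain loop: score each candidate `n_neighbors` with cross-validation and keep the one with the highest mean score. The sketch below compares such a hand-written loop with `GridSearchCV` itself. The iris dataset and `cv=5` are assumptions chosen so the example runs without `train.csv`; `manual_scores` and `best_k` are illustrative names, not part of scikit-learn.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: iris stands in for the article's train.csv.
X, y = load_iris(return_X_y=True)
candidates = [3, 5, 10]

# Hand-written search: mean cross-validated accuracy for each candidate k.
manual_scores = {}
for k in candidates:
    knn = KNeighborsClassifier(n_neighbors=k)
    manual_scores[k] = cross_val_score(knn, X, y, cv=5).mean()
best_k = max(manual_scores, key=manual_scores.get)

# The same search expressed with GridSearchCV.
gc = GridSearchCV(KNeighborsClassifier(), param_grid={"n_neighbors": candidates}, cv=5)
gc.fit(X, y)

print("Manual mean scores:", manual_scores)
print("Manual best k:", best_k)
print("GridSearchCV best params:", gc.best_params_)
print("GridSearchCV best score:", gc.best_score_)
```

With the default `refit=True`, `GridSearchCV` also retrains the best setting on all the data passed to `fit`, which is why `gc.score(x_test, y_test)` can be called directly on the search object.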