Overview of Decision Trees in sklearn
決策樹(shù)是一類(lèi)常見(jiàn)的機(jī)器學(xué)習(xí)方法,決策樹(shù)學(xué)習(xí)的目的是為了產(chǎn)生一棵泛化能力強(qiáng),即處理未見(jiàn)示例能力強(qiáng)的決策樹(shù)。決策樹(shù)是個(gè)遞歸生成的過(guò)程,如何選擇最優(yōu)劃分屬性是決策樹(shù)學(xué)習(xí)的關(guān)鍵。我們希望決策樹(shù)的分支節(jié)點(diǎn)所包含的樣本盡可能屬于同一類(lèi)別,信息熵、增益率和基尼指數(shù)都可以用來(lái)形容節(jié)點(diǎn)的“純度”。
以下是簡(jiǎn)單的決策樹(shù)示例:
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeRegressor

# Create a random dataset: noisy samples of a sine curve
rng = np.random.RandomState(1)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(16))

# Fit regression models of two different depths
regr_1 = DecisionTreeRegressor(max_depth=2)
regr_2 = DecisionTreeRegressor(max_depth=5)
regr_1.fit(X, y)
regr_2.fit(X, y)

# Predict on a dense grid
X_test = np.arange(0.0, 5.0, 0.01)[:, np.newaxis]
y_1 = regr_1.predict(X_test)
y_2 = regr_2.predict(X_test)

# Plot the results
plt.figure()
plt.scatter(X, y, s=20, edgecolor="black", c="darkorange", label="data")
plt.plot(X_test, y_1, color="cornflowerblue", label="max_depth=2", linewidth=2)
plt.plot(X_test, y_2, color="yellowgreen", label="max_depth=5", linewidth=2)
plt.xlabel("data")
plt.ylabel("target")
plt.title("Decision Tree Regression")
plt.legend()
plt.show()
```

Taking the iris dataset as an example, we can build the following decision tree:
```python
from sklearn.datasets import load_iris
from sklearn import tree

iris = load_iris()
X, y = iris.data, iris.target
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, y)
tree.plot_tree(clf)
```

Decision tree code with result plots:
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree

# Parameters
n_classes = 3
plot_colors = "ryb"
plot_step = 0.02

# Load data
iris = load_iris()

for pairidx, pair in enumerate([[0, 1], [0, 2], [0, 3],
                                [1, 2], [1, 3], [2, 3]]):
    # We only take the two corresponding features
    X = iris.data[:, pair]
    y = iris.target

    # Train
    clf = DecisionTreeClassifier().fit(X, y)

    # Plot the decision boundary
    plt.subplot(2, 3, pairidx + 1)
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, plot_step),
                         np.arange(y_min, y_max, plot_step))
    plt.tight_layout(h_pad=0.5, w_pad=0.5, pad=2.5)

    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    cs = plt.contourf(xx, yy, Z, cmap=plt.cm.RdYlBu)

    plt.xlabel(iris.feature_names[pair[0]])
    plt.ylabel(iris.feature_names[pair[1]])

    # Plot the training points
    for i, color in zip(range(n_classes), plot_colors):
        idx = np.where(y == i)
        plt.scatter(X[idx, 0], X[idx, 1], c=color, label=iris.target_names[i],
                    cmap=plt.cm.RdYlBu, edgecolor='black', s=15)

plt.suptitle("Decision surface of a decision tree using paired features")
plt.legend(loc='lower right', borderpad=0, handletextpad=0)
plt.axis("tight")

plt.figure()
clf = DecisionTreeClassifier().fit(iris.data, iris.target)
plot_tree(clf, filled=True)
plt.show()
```

Summary
That concludes this overview of decision trees in sklearn; I hope it helps you solve the problems you run into.
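As a wrap-up, the pieces above can be combined into a minimal end-to-end sketch. The train/test split and the use of `export_text` are my additions for illustration, not from the examples above, which train on the full dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Hold out a test set (an assumption added here to measure generalization)
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0)

# A shallow tree keeps the printed rules readable and limits overfitting
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

# Held-out accuracy quantifies the generalization ability discussed above
print("test accuracy:", clf.score(X_test, y_test))

# The learned split rules, rendered as plain text instead of a plot
print(export_text(clf, feature_names=list(iris.feature_names)))
```

Printing the rules with `export_text` is a lightweight alternative to `plot_tree` when no plotting backend is available.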