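Generating rules from a Python decision tree: how do you extract decision rules from a scikit-learn decision tree?
I created my own function to extract the rules from the decision trees created by sklearn: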
import pandas as pd
import numpy as np
from sklearn.tree import DecisionTreeClassifier
# dummy data:
df = pd.DataFrame({'col1':[0,1,2,3],'col2':[3,4,5,6],'dv':[0,1,0,1]})
# create decision tree
dt = DecisionTreeClassifier(max_depth=5, min_samples_leaf=1)
# .ix has been removed from pandas; .iloc selects the first two columns by position
dt.fit(df.iloc[:, :2], df.dv)
This function starts with the leaf nodes (identified by -1 in the child arrays) and then recursively finds each node's parent. I call this a node's "lineage". Along the way, I grab the values I need to create if/then/else SAS logic:
def get_lineage(tree, feature_names):
    left = tree.tree_.children_left
    right = tree.tree_.children_right
    threshold = tree.tree_.threshold
    features = [feature_names[i] for i in tree.tree_.feature]

    # get ids of the leaf nodes (a leaf has no left child, marked by -1)
    idx = np.argwhere(left == -1)[:, 0]

    def recurse(left, right, child, lineage=None):
        if lineage is None:
            lineage = [child]
        # find the parent of this node and which branch leads to it
        if child in left:
            parent = np.where(left == child)[0].item()
            split = 'l'
        else:
            parent = np.where(right == child)[0].item()
            split = 'r'

        lineage.append((parent, split, threshold[parent], features[parent]))

        if parent == 0:
            # reached the root: reverse so the path reads root-to-leaf
            lineage.reverse()
            return lineage
        else:
            return recurse(left, right, parent, lineage)

    for child in idx:
        for node in recurse(left, right, child):
            print(node)
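The sets of tuples below contain everything I need to create SAS if/then/else statements. I do not like using do blocks in SAS, which is why I create logic that describes a node's entire path. The single integer after each set of tuples is the ID of the terminal node of that path. All of the preceding tuples combine to create that node.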
In [1]: get_lineage(dt, df.columns)
(0, 'l', 0.5, 'col1')
1
(0, 'r', 0.5, 'col1')
(2, 'l', 4.5, 'col2')
3
(0, 'r', 0.5, 'col1')
(2, 'r', 4.5, 'col2')
(4, 'l', 2.5, 'col1')
5
(0, 'r', 0.5, 'col1')
(2, 'r', 4.5, 'col2')
(4, 'r', 2.5, 'col1')
6
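To show how the tuples of one path combine into a single statement, here is a minimal sketch. It assumes each path has been collected into a list (the function above prints the items rather than returning them), and it relies on scikit-learn sending samples with `feature <= threshold` to the left child. The helper name `lineage_to_rule` and the SAS variable `node` are hypothetical names chosen for illustration, not part of the original function.

```python
def lineage_to_rule(lineage):
    """Join one root-to-leaf path into a single SAS-style if/then statement.

    `lineage` is a list of (parent_id, split, threshold, feature) tuples
    followed by the integer ID of the terminal node, as in the output above.
    """
    *steps, leaf_id = lineage
    conditions = []
    for _parent, split, threshold, feature in steps:
        # 'l' means the path took the left branch, i.e. feature <= threshold
        op = "<=" if split == "l" else ">"
        conditions.append(f"{feature} {op} {threshold}")
    # `node` is a hypothetical SAS variable that receives the leaf ID
    return f"if {' and '.join(conditions)} then node = {leaf_id};"


# The second path from the output above
path = [(0, 'r', 0.5, 'col1'), (2, 'l', 4.5, 'col2'), 3]
print(lineage_to_rule(path))
# prints: if col1 > 0.5 and col2 <= 4.5 then node = 3;
```

Because every statement spells out the full path from the root, the rules are mutually exclusive and can be emitted in any order without nesting them in do blocks.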