
28. Building Models for the Electricity Theft and Leakage Case Study

Published: 2024/9/16 · 豆豆


1. Building the models for the case study

Build the electricity theft and leakage user identification model;
Build the LM neural network model;
Build the CART decision tree model;
Evaluate the models.

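2. Building the theft and leakage user identification model

2.1 Building the expert samples

Once the expert samples are ready, they are split into training and test sets: 20% of the samples are chosen at random as the test set, and the remaining 80% form the training set.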

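2.3 Resources

Dataset

The expert sample dataset contains the time, the user ID, the electricity-usage downward-trend indicator, the line-loss indicator, the alarm indicator, and a label marking whether the user steals or leaks electricity. It holds 291 samples in total.

See model.xls for the dataset itself.

Libraries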

pandas==0.24.2

2.4 Steps

Import the data analysis library pandas

Import a shuffle function to randomize the row order

Set the training-set proportion and split the data

2.5 Code

Code for splitting the data

    import pandas as pd  # data analysis library
    # NumPy's shuffle permutes the rows of a 2-D array in place;
    # random.shuffle can silently duplicate rows of a NumPy array.
    from numpy.random import shuffle

    datafile = '../data/model.xls'  # dataset path
    data = pd.read_excel(datafile)  # first three columns are features, the fourth is the label
    data = data.values  # convert the table to a NumPy array (as_matrix() is deprecated)
    shuffle(data)  # randomly reorder the rows

    p = 0.8  # training-set proportion
    train = data[:int(len(data) * p), :]  # first 80% is the training set
    test = data[int(len(data) * p):, :]  # remaining 20% is the test set
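The 80/20 split can be checked without model.xls. The sketch below is an illustration only: the `split_rows` helper, the fixed seed, and the synthetic array standing in for the 291 expert samples are all assumptions, not part of the original code.

```python
import numpy as np


def split_rows(data, train_ratio=0.8, seed=0):
    """Shuffle rows with a seeded permutation, then cut into train/test parts."""
    rng = np.random.RandomState(seed)
    shuffled = data[rng.permutation(len(data))]
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Synthetic stand-in for the dataset: 291 rows, 3 feature columns + 1 label column
demo = np.column_stack([
    np.arange(291),       # hypothetical feature that doubles as a row id
    np.zeros(291),        # hypothetical feature
    np.ones(291),         # hypothetical feature
    np.arange(291) % 2,   # hypothetical 0/1 label
])

train_demo, test_demo = split_rows(demo)
print(train_demo.shape, test_demo.shape)  # (232, 4) (59, 4)
```

With 291 samples, `int(291 * 0.8)` gives 232 training rows and leaves 59 test rows, which matches the totals in the confusion matrices reported for the training set.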

3. Building the LM neural network

3.1 Model setup

The neural network is built with the Keras library. The LM neural network has three layers: an input layer with 3 nodes, a hidden layer with 10 nodes, and an output layer with 1 node. It is trained with the Adam optimizer.
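To make the 3-10-1 architecture concrete, here is a minimal NumPy sketch of a forward pass through such a network. It is not the Keras model itself: the helper names, the random initial weights, and the two input rows are assumptions made for illustration. The ReLU hidden activation and sigmoid output follow the activations used in the article's Keras code.

```python
import numpy as np


def init_params(n_in=3, n_hidden=10, n_out=1, seed=0):
    """Create small random weights and zero biases (illustrative values only)."""
    rng = np.random.RandomState(seed)
    return {
        "W1": rng.normal(scale=0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(params, x):
    """Input (3) -> ReLU hidden layer (10) -> sigmoid output (1)."""
    hidden = np.maximum(0.0, x @ params["W1"] + params["b1"])
    logits = hidden @ params["W2"] + params["b2"]
    return 1.0 / (1.0 + np.exp(-logits))


params = init_params()
n_params = sum(p.size for p in params.values())

# Two hypothetical samples with three indicator values each
probs = forward(params, np.array([[4.0, 1.0, 1.0], [0.0, 0.0, 0.0]]))
print(n_params, probs.shape)  # 51 (2, 1)
```

The network has 3*10 + 10 weights and biases in the first layer and 10*1 + 1 in the second, 51 trainable parameters in total. The output is one probability per sample.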

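Confusion matrix of the LM neural network

According to the confusion matrix, the model's classification accuracy is (161+58)/(161+58+6+7) = 94.4%. Normal users misclassified as theft or leakage users make up 7/(161+7) = 4.2% of all normal users, and theft or leakage users misclassified as normal make up 6/(6+58) = 9.4% of all theft or leakage users.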
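The three percentages above follow directly from the four counts in the confusion matrix. As a small worked check, the sketch below recomputes them. The function name `confusion_rates` and the labels `tn`, `fp`, `fn`, `tp` are naming assumptions, with "positive" meaning a theft or leakage user.

```python
def confusion_rates(tn, fp, fn, tp):
    """Return accuracy, false positive rate and false negative rate."""
    total = tn + fp + fn + tp
    accuracy = (tn + tp) / total
    false_positive_rate = fp / (tn + fp)   # normal users flagged as theft/leakage
    false_negative_rate = fn / (fn + tp)   # theft/leakage users judged normal
    return accuracy, false_positive_rate, false_negative_rate


# Counts reported for the LM neural network on the training set
acc, fpr, fnr = confusion_rates(tn=161, fp=7, fn=6, tp=58)
print(f"{acc:.1%} {fpr:.1%} {fnr:.1%}")  # 94.4% 4.2% 9.4%
```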

3.2 Resources

Libraries

pandas==0.24.2 scipy==1.1.0 keras==2.2.0 tensorflow==1.10

Dataset

See model.xls for the dataset itself.

3.3 Steps

Import the neural network initialization function

Import the layer and activation functions

Set the path where the model is stored

Create the neural network

Compile the model, using the Adam optimizer

Save the model

Display the confusion matrix and its visualization

3.4 Code

Code for the LM neural network

    # Build the LM neural network model (uses train from the data split in section 2.5)
    from keras.models import Sequential  # neural network initialization function
    from keras.layers.core import Dense, Activation  # layer and activation functions

    netfile = '../tmp/net.model'  # path where the trained network is stored

    net = Sequential()  # create the neural network
    net.add(Dense(units=10, input_dim=3))  # input layer (3 nodes) to hidden layer (10 nodes)
    net.add(Activation('relu'))  # ReLU activation on the hidden layer
    net.add(Dense(units=1))  # hidden layer (10 nodes) to output layer (1 node)
    net.add(Activation('sigmoid'))  # sigmoid activation on the output layer
    # Compile with the Adam optimizer; the obsolete class_mode argument is dropped
    net.compile(loss='binary_crossentropy', optimizer='adam')

    net.fit(train[:, :3], train[:, 3], epochs=1000, batch_size=1)  # train for 1000 epochs
    net.save_weights(netfile)  # save the model weights

    predict_result = net.predict_classes(train[:, :3]).reshape(len(train))  # flatten the n x 1 predictions

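4. Building the CART decision tree model

4.1 Building the decision tree with the sklearn machine learning library

A CART decision tree is built from the training samples with scikit-learn. From its confusion matrix, the classification accuracy is (160+56)/(160+56+3+13) = 93.1%. Normal users misclassified as theft or leakage users make up 13/(13+160) = 7.5% of all normal users, and theft or leakage users misclassified as normal make up 3/(3+56) = 5.1% of all theft or leakage users.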
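To show how these counts map onto scikit-learn's `confusion_matrix` layout (rows are true labels, columns are predicted labels), the sketch below rebuilds label vectors that reproduce the reported CART counts. The vectors are synthetic and constructed only to match the four numbers above. They are not the article's data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Synthetic labels arranged to give 160 TN, 13 FP, 3 FN and 56 TP
y_true = np.array([0] * 160 + [0] * 13 + [1] * 3 + [1] * 56)
y_pred = np.array([0] * 160 + [1] * 13 + [0] * 3 + [1] * 56)

cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()  # row-major order: [[tn, fp], [fn, tp]]

accuracy = (tn + tp) / cm.sum()
print(cm.tolist())           # [[160, 13], [3, 56]]
print(round(accuracy, 3))    # 0.931
```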

4.2 Code

Decision tree code for identifying theft and leakage users
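    # Build the CART decision tree model (uses train from the data split in section 2.5)
    from sklearn.tree import DecisionTreeClassifier  # decision tree model

    treefile = '../tmp/tree.pkl'  # output path for the model
    tree = DecisionTreeClassifier()  # create the decision tree model
    tree.fit(train[:, :3], train[:, 3])  # train

    # Save the model
    import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
    joblib.dump(tree, treefile)

    from cm_plot import cm_plot  # custom confusion-matrix plotting function (section 6.3)
    cm_plot(train[:, 3], tree.predict(train[:, :3])).show()  # show the confusion matrix plot
    # Note: scikit-learn's predict method returns the predicted classes directly.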
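For readers without model.xls, the following self-contained sketch runs the same scikit-learn calls on synthetic data. The random features, the labelling rule, and the `random_state` values are assumptions chosen only so the example runs. `criterion="gini"` is scikit-learn's default and is spelled out here because Gini impurity is the splitting measure associated with CART.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X = rng.rand(200, 3)                         # three hypothetical indicator columns
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)    # hypothetical labelling rule

tree = DecisionTreeClassifier(criterion="gini", random_state=0)
tree.fit(X, y)

classes = tree.predict(X[:5])                # hard 0/1 class predictions
scores = tree.predict_proba(X[:5])[:, 1]     # probability of class 1, as used for ROC curves
train_accuracy = tree.score(X, y)
print(classes.shape, scores.shape)           # (5,) (5,)
print(train_accuracy)
```

An unpruned tree like this one can fit its own training data almost perfectly, which is one reason training-set accuracy alone is a weak basis for comparing models.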

5. Evaluating the case study models

5.1 Using the test data

On the training samples, the LM neural network and the CART decision tree reach similar classification accuracy, about 94% and 93% respectively. To assess classification performance further, both models are evaluated on the test samples using ROC curves.
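As a minimal illustration of the ROC evaluation, the sketch below computes a ROC curve and its area with scikit-learn on four made-up samples. The labels and scores are toy values, not results from either model, and the use of `auc` as a single summary number is an addition to the curve-only comparison described above.

```python
from sklearn.metrics import roc_curve, auc

# Toy example: true labels and predicted scores for the positive class
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_curve(y_true, y_score, pos_label=1)
roc_auc = auc(fpr, tpr)  # area under the ROC curve by the trapezoidal rule
print(round(roc_auc, 2))  # 0.75
```

A curve that hugs the top-left corner (area close to 1) indicates a better classifier, while an area near 0.5 is no better than random guessing.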

6. Complete code

6.1 CART decision tree model

    # -*- coding: utf-8 -*-
    # Build and test the CART decision tree model
    import pandas as pd  # data analysis library
    import matplotlib.pyplot as plt  # needed for the ROC plot below
    from numpy.random import shuffle  # shuffles the rows of a 2-D array in place

    datafile = '../data/model.xls'  # dataset path
    data = pd.read_excel(datafile)  # first three columns are features, the fourth is the label
    data = data.values  # convert the table to a NumPy array
    shuffle(data)  # randomly reorder the rows

    p = 0.8  # training-set proportion
    train = data[:int(len(data) * p), :]  # first 80% is the training set
    test = data[int(len(data) * p):, :]  # remaining 20% is the test set

    # Build the CART decision tree model
    from sklearn.tree import DecisionTreeClassifier  # decision tree model

    treefile = '../tmp/tree.pkl'  # output path for the model
    tree = DecisionTreeClassifier()  # create the decision tree model
    tree.fit(train[:, :3], train[:, 3])  # train

    # Save the model
    import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
    joblib.dump(tree, treefile)

    from cm_plot import cm_plot  # custom confusion-matrix plotting function (section 6.3)
    cm_plot(train[:, 3], tree.predict(train[:, :3])).show()  # show the confusion matrix plot
    # Note: scikit-learn's predict method returns the predicted classes directly.

    from sklearn.metrics import roc_curve  # ROC curve function

    fpr, tpr, thresholds = roc_curve(test[:, 3], tree.predict_proba(test[:, :3])[:, 1], pos_label=1)
    plt.plot(fpr, tpr, linewidth=2, label='ROC of CART', color='green')  # draw the ROC curve
    plt.xlabel('False Positive Rate')  # axis label
    plt.ylabel('True Positive Rate')  # axis label
    plt.ylim(0, 1.05)  # axis range
    plt.xlim(0, 1.05)  # axis range
    plt.legend(loc=4)  # legend
    plt.show()  # display the plot

6.2 LM neural network model

    # -*- coding: utf-8 -*-
    import pandas as pd
    import matplotlib.pyplot as plt  # needed for the ROC plot below
    from numpy.random import shuffle  # shuffles the rows of a 2-D array in place

    datafile = '../data/model.xls'
    data = pd.read_excel(datafile)
    data = data.values
    shuffle(data)

    p = 0.8  # training-set proportion
    train = data[:int(len(data) * p), :]
    test = data[int(len(data) * p):, :]

    # Build the LM neural network model
    from keras.models import Sequential  # neural network initialization function
    from keras.layers.core import Dense, Activation  # layer and activation functions

    netfile = '../tmp/net.model'  # path where the trained network is stored

    net = Sequential()  # create the neural network
    net.add(Dense(units=10, input_dim=3))  # input layer (3 nodes) to hidden layer (10 nodes)
    net.add(Activation('relu'))  # ReLU activation on the hidden layer
    net.add(Dense(units=1))  # hidden layer (10 nodes) to output layer (1 node)
    net.add(Activation('sigmoid'))  # sigmoid activation on the output layer
    # Compile with the Adam optimizer; the obsolete class_mode argument is dropped
    net.compile(loss='binary_crossentropy', optimizer='adam')

    net.fit(train[:, :3], train[:, 3], epochs=1000, batch_size=1)  # train for 1000 epochs
    net.save_weights(netfile)  # save the model weights

    predict_result = net.predict_classes(train[:, :3]).reshape(len(train))  # flatten the predictions
    # Note: Keras' predict returns predicted probabilities, while predict_classes
    # returns predicted classes. Both return an n x 1 array rather than the usual 1 x n.

    from cm_plot import cm_plot  # custom confusion-matrix plotting function (section 6.3)
    cm_plot(train[:, 3], predict_result).show()  # show the confusion matrix plot

    from sklearn.metrics import roc_curve  # ROC curve function

    predict_result = net.predict(test[:, :3]).reshape(len(test))
    fpr, tpr, thresholds = roc_curve(test[:, 3], predict_result, pos_label=1)
    plt.plot(fpr, tpr, linewidth=2, label='ROC of LM')  # draw the ROC curve
    plt.xlabel('False Positive Rate')  # axis label
    plt.ylabel('True Positive Rate')  # axis label
    plt.ylim(0, 1.05)  # axis range
    plt.xlim(0, 1.05)  # axis range
    plt.legend(loc=4)  # legend
    plt.show()  # display the plot

6.3 Confusion matrix code

    def cm_plot(y, yp):
        from sklearn.metrics import confusion_matrix  # confusion matrix function
        cm = confusion_matrix(y, yp)  # rows are true labels, columns are predicted labels

        import matplotlib.pyplot as plt  # plotting library
        # Draw the matrix with the Greens colormap; see the matplotlib site for other colormaps
        plt.matshow(cm, cmap=plt.cm.Greens)
        plt.colorbar()  # color scale

        for i in range(len(cm)):  # annotate each cell with its count
            for j in range(len(cm)):
                # matshow puts columns on the x axis and rows on the y axis,
                # so the count for row i, column j belongs at xy=(j, i)
                plt.annotate(cm[i, j], xy=(j, i), horizontalalignment='center', verticalalignment='center')

        plt.ylabel('True label')  # axis label
        plt.xlabel('Predicted label')  # axis label
        return plt
