A Python Implementation of the Apriori Algorithm

Published: 2024/2/28

Input data format

Each line is one transaction: a space-separated list of integer item IDs.

25 52 164 240 274 328 368 448 538 561 630 687 730 775 825 834

39 120 124 205 401 581 704 814 825 834

35 249 674 712 733 759 854 950

39 422 449 704 825 857 895 937 954 964

15 229 262 283 294 352 381 708 738 766 853 883 966 978
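The driver code later in the article calls loadDataSet(), but the function itself is not shown. A minimal sketch of what it might look like is given here, assuming the transactions sit in a text file in the format above. The file name data.txt and the helper parseTransactions are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List


def parseTransactions(lines: Iterable[str]) -> List[List[int]]:
    """Turn lines of space-separated integers into a 2-D list of ints."""
    dataSet = []
    for line in lines:
        fields = line.split()
        if fields:  # skip the blank lines between transactions
            dataSet.append([int(x) for x in fields])
    return dataSet


def loadDataSet(path: str = "data.txt") -> List[List[int]]:
    """Read the transactions from a file (the path is an assumption)."""
    with open(path, encoding="utf-8") as f:
        return parseTransactions(f)


sample = ["39 120 124 205\n", "\n", "35 249 674\n"]
parsed = parseTransactions(sample)
print(parsed)
```

Splitting the parsing from the file access keeps the parsing step easy to try out on a few in-memory lines.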

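Function walkthrough

1. createC1(dataSet)

Build the candidate 1-itemsets. The occurrences of each item are counted in a dictionary, and every distinct item is returned as a one-element list.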

def createC1(dataSet):
    C1_dict = {}  # item -> occurrence count (the integer items are the dict keys)
    C1 = []
    for items in dataSet:
        for item in items:
            if item in C1_dict:
                C1_dict[item] += 1.
            else:
                C1_dict[item] = 1.
    for key in C1_dict:
        C1.append([key])
    print("C1: ", C1)
    return C1  # every candidate is a one-element list

# C1 = createC1(dataSet)
# print("C1:", C1)  # C1: [[25], [52], [164], [240], [274]]
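The counting loop in createC1 can also be written with collections.Counter from the standard library. This is only a sketch of an equivalent approach, not the article's code; the helper name countItems and the toy transactions are made up for illustration.

```python
from collections import Counter


def countItems(dataSet):
    """Count how often each item occurs across all transactions."""
    counts = Counter()
    for items in dataSet:
        counts.update(items)  # adds 1 for every item in this transaction
    return counts


toy = [[1, 2, 3], [1, 2], [2, 4]]  # made-up transactions
counts = countItems(toy)
C1 = [[item] for item in counts]  # same shape as the list returned by createC1
print(counts[2])
print(C1)
```

Counter is a dict subclass, so the keys come out in first-seen order, just as with the plain dictionary in createC1.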

2. selectLk(dataSet,Ck,minSupport)

Find the frequent k-itemsets: the candidates in Ck whose support is at least minSupport.

def selectLk(dataSet, Ck, minSupport):  # dataSet: the original transactions
    scan = {}  # candidate itemset (comma-joined string) -> occurrence count
    for tid in dataSet:
        # print("tid:", tid)  # tid: [33, 217, 283, 346, 496, 515, 626]
        for item in Ck:  # item: one candidate itemset, e.g. [29] or [1, 2]
            if set(item).issubset(tid):  # is the candidate contained in this transaction?
                # Sort before joining so the same itemset always maps to the same key
                key = ','.join(map(str, sorted(item)))
                if key not in scan:
                    scan[key] = 1
                else:
                    scan[key] += 1
    numItems = float(len(dataSet))
    Lk = {}  # frequent itemsets and their support
    for key in scan:
        support = scan[key] / numItems
        if support >= minSupport:
            Lk[key] = support
    return Lk  # e.g. L1: {'368': 0.08, '120': 0.056}
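The support that selectLk compares against minSupport is the fraction of transactions that contain the itemset. The calculation for a single itemset can be sketched as follows; the helper name support and the toy data are assumptions made for this example.

```python
def support(itemset, dataSet):
    """Fraction of transactions that contain every item of itemset."""
    target = set(itemset)
    hits = sum(1 for tid in dataSet if target.issubset(tid))
    return hits / len(dataSet)


toy = [[1, 2, 3], [1, 2], [2, 4], [1, 3]]  # made-up transactions
s12 = support([1, 2], toy)  # contained in 2 of the 4 transactions
s4 = support([4], toy)      # contained in 1 of the 4 transactions
print(s12, s4)
```

With minSupport = 0.3, the itemset [1, 2] would be kept as frequent and [4] would be dropped.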

3. createCk(Lk,k)

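Build the candidate k-itemsets by joining frequent (k-1)-itemsets. Lk holds the frequent itemsets of the previous level, and k is the size of the candidates to create.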

def createCk(Lk, k):  # Lk: frequent (k-1)-itemsets, as returned by selectLk
    Ck = []
    Lk = list(Lk.keys())
    print("Lk:", Lk)  # Lk: ['368', '120', '283', '766', '529', '217', '177', '354', '684', '829', '460', '438']
    lenLk = len(Lk)
    for i in range(lenLk):
        for j in range(i + 1, lenLk):
            # Merge two itemsets when their first k-2 items are the same.
            # Each itemset is sorted before its prefix is taken, so the
            # comparison does not depend on the order of items in the key.
            L1 = sorted(map(int, Lk[i].split(',')))  # str -> sorted list[int]
            L2 = sorted(map(int, Lk[j].split(',')))
            if L1[:k-2] == L2[:k-2]:
                Ck.append(sorted(set(L1).union(L2)))
    print("Ck: ", Ck)
    return Ck
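The join rule used in createCk (merge two itemsets whose first k-2 items agree) is easier to follow on a small example. The sketch below works on sorted tuples instead of comma-joined strings; the name joinStep and the toy itemsets are hypothetical.

```python
def joinStep(frequent, k):
    """Join sorted (k-1)-tuples that share their first k-2 items."""
    prev = sorted(frequent)
    candidates = set()
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            a, b = prev[i], prev[j]
            if a[:k - 2] == b[:k - 2]:
                candidates.add(tuple(sorted(set(a) | set(b))))
    return sorted(candidates)


L2 = [(1, 2), (1, 3), (2, 3), (2, 4)]  # made-up frequent 2-itemsets
C3 = joinStep(L2, 3)
print(C3)
```

Here (2, 3, 4) becomes a candidate even though (3, 4) is not frequent. As in the article's createCk, no pruning happens at this stage; such a candidate cannot reach the minimum support and is dropped by the following support scan.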

def apriori(dataSet, minSupport):
    C1 = createC1(dataSet)
    L1 = selectLk(dataSet, C1, minSupport)  # L1 is a dict
    L = [L1]
    k = 2
    while len(L[k-2]) > 0:
        Ck = createCk(L[k-2], k)
        Lk = selectLk(dataSet, Ck, minSupport)
        L.append(Lk)
        k += 1
    return L  # the last entry is the empty dict that ended the loop

from pprint import pprint

# loadDataSet() is not shown in this article; it must return the raw data as a 2-D list
dataSet = loadDataSet()
L = apriori(dataSet, 0.01)
pprint(L)
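To sanity-check results on a small input, a compact level-wise search can be written with frozensets. Note that this is a different formulation from the article's code: candidates are formed by taking the union of any two frequent itemsets that yields exactly one more item, and no trailing empty level is returned. The name aprioriSketch and the toy data are assumptions.

```python
from itertools import combinations


def aprioriSketch(dataSet, minSupport):
    """Return one dict {frozenset: support} per itemset size."""
    transactions = [frozenset(t) for t in dataSet]
    n = len(transactions)

    def frequentOf(candidates):
        result = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= minSupport:
                result[c] = s
        return result

    items = {item for t in transactions for item in t}
    level = frequentOf({frozenset([i]) for i in items})
    levels = []
    while level:
        levels.append(level)
        size = len(next(iter(level))) + 1
        # union of two frequent itemsets that is exactly one item larger
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == size}
        level = frequentOf(candidates)
    return levels


toy = [[1, 2, 3], [1, 2], [2, 4], [1, 3]]  # made-up transactions
levels = aprioriSketch(toy, 0.5)
print(levels)
```

On this toy input the items 1, 2 and 3 and the pairs {1, 2} and {1, 3} are frequent at a minimum support of 0.5.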

4. Save the frequent itemsets to Excel

# write list[dictionary] to an excel file
import xlsxwriter

workbook = xlsxwriter.Workbook('out.xlsx')
worksheet = workbook.add_worksheet()
row = 0
col = 0
# header row
worksheet.write(row, col, "Itemset")
worksheet.write(row, col + 1, "Support")
# one row per frequent itemset, across all levels
for items in L:
    for key in items.keys():
        row += 1
        worksheet.write(row, col, key)
        worksheet.write(row, col + 1, items[key])
workbook.close()
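The export loop walks a list of dictionaries and emits one row per itemset. The same flattening can be tried without xlsxwriter by collecting the rows first and writing them as CSV with the standard library. This is a sketch of an alternative, not the article's method; toRows is a hypothetical helper, and the support value of the two-item entry is made up.

```python
import csv
import io


def toRows(L):
    """Flatten a list of {itemset: support} dicts into table rows."""
    rows = [("Itemset", "Support")]
    for items in L:
        for key, support in items.items():
            rows.append((key, support))
    return rows


# Shape of the list returned by apriori(); the last value is illustrative only
L = [{'368': 0.08, '120': 0.056}, {'120,368': 0.02}, {}]
rows = toRows(L)

buffer = io.StringIO()  # an in-memory file keeps the example self-contained
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

The empty dictionary at the end of L contributes no rows, so it needs no special handling.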
