Implementing a simple filmClassify with the k-nearest neighbors algorithm

Published: 2025/3/16


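The k-nearest neighbors (k-NN) algorithm classifies a sample by measuring the distance between its feature values and those of labeled samples. It is a supervised learning method.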

The main code is as follows:

    import operator
    from numpy import array, tile

    def createDataSet():
        # Four 2-D training samples and their class labels
        group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
        labels = ['A', 'A', 'B', 'B']
        return group, labels

    def classify0(inX, dataSet, labels, k):
        dataSetSize = dataSet.shape[0]
        # Euclidean distance from inX to every training sample
        diffMat = tile(inX, (dataSetSize, 1)) - dataSet
        sqDiffMat = diffMat ** 2
        sqDistances = sqDiffMat.sum(axis=1)
        distances = sqDistances ** 0.5
        sortedDistIndicies = distances.argsort()
        # Count the labels of the k nearest samples
        classCount = {}
        for i in range(k):
            voteIlabel = labels[sortedDistIndicies[i]]
            classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
        # Python 3: items() replaces Python 2's iteritems()
        sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
        return sortedClassCount[0][0]

createDataSet() loads the sample data. The rest of this post walks through classify0(inX, dataSet, labels, k):

It takes four input parameters: inX, the input vector to classify; dataSet, the training sample set (an array); labels, the label vector; and k, the number of nearest neighbors to use.

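dataSetSize = dataSet.shape[0] uses NumPy's shape attribute (an attribute, not a function), which returns a tuple holding the length of each dimension of the array. The first element, shape[0], is the length of the first dimension, i.e. the number of rows, which here is the number of training samples.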
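As a quick check of what shape returns, the sketch below builds the four-sample array from createDataSet() and reads its dimensions. The names n_samples and n_features are chosen only for this illustration.

```python
import numpy as np

# The four training samples from createDataSet(), two features each.
group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])

n_samples = group.shape[0]   # length of the first dimension (rows)
n_features = group.shape[1]  # length of the second dimension (columns)
print(group.shape, n_samples, n_features)
```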

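diffMat = tile(inX, (dataSetSize,1)) - dataSet uses NumPy's tile(A, reps) function, which builds a larger array by repeating A. For example, numpy.tile([0,0],(3,1)) repeats [0,0] 3 times along the rows and once along the columns, giving a 3x2 array. Subtracting dataSet then yields the per-feature difference between inX and every training sample.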
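The effect of tile can be seen in a small sketch. It also shows, as an aside that is not part of the original code, that NumPy broadcasting produces the same difference matrix without the explicit copy. The variable names are illustrative.

```python
import numpy as np

in_x = [0, 0]

# 3 copies stacked as rows, 1 copy across columns: shape (3, 2)
tiled = np.tile(in_x, (3, 1))

group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])

# Difference matrix as computed in classify0
diff_tile = np.tile(in_x, (group.shape[0], 1)) - group

# Same result via broadcasting: the (2,) vector is stretched over the 4 rows
diff_broadcast = np.array(in_x) - group
```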

sqDiffMat = diffMat**2 squares each element.

sqDistances = sqDiffMat.sum(axis=1) sums along axis 1, i.e. across the columns of each row, giving one squared distance per training sample.

distances = sqDistances**0.5 takes the square root. Together, the steps above compute the Euclidean distance from inX to each training sample.
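The distance steps can be traced on the sample data. The sketch below follows the same sequence as classify0 for the input [0, 0] and cross-checks the result against numpy.linalg.norm, which is used here only as a check and is not part of the original code.

```python
import numpy as np

group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
in_x = [0, 0]

diff_mat = np.tile(in_x, (group.shape[0], 1)) - group
sq_diff_mat = diff_mat ** 2
sq_distances = sq_diff_mat.sum(axis=1)  # one sum per row (per training sample)
distances = sq_distances ** 0.5

# Cross-check with NumPy's row-wise Euclidean norm.
check = np.linalg.norm(diff_mat, axis=1)
print(distances)
```

For the first sample, the squared differences are 1.0 and 1.21, so its distance is the square root of 2.21.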

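sortedDistIndicies = distances.argsort() uses argsort(), which sorts in ascending order and returns the indices of the array values from smallest to largest, rather than the values themselves.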
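A short sketch of argsort() on made-up distance values shows how the returned indices point back at the training samples, nearest first.

```python
import numpy as np

# Illustrative distances for samples 0..3 (values are made up for this example)
distances = np.array([1.49, 1.41, 0.0, 0.1])

order = distances.argsort()   # sample indices from nearest to farthest
nearest_two = order[:2]       # indices of the two closest samples
print(order, nearest_two)
```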

classCount = {} is a dict that stores the number of times each label appears.

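    for i in range(k):
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1

This loop picks the k points with the smallest distances. Since sortedDistIndicies is already sorted, it only needs to take the labels of the first k samples in turn and count how often each label occurs. It relies on dict.get(key, default=None), where key is the dict key voteIlabel. If the key is missing, get() returns the default 0, so the assignment stores 1; if the key is present, get() returns the current count, which is incremented by 1 and stored back.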
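The voting loop can be run in isolation. In the sketch below, sorted_indices stands in for the output of argsort() and is an assumed value for illustration.

```python
labels = ['A', 'A', 'B', 'B']
sorted_indices = [2, 3, 1, 0]  # assumed argsort() output, nearest first
k = 3

class_count = {}
for i in range(k):
    vote = labels[sorted_indices[i]]
    # get() returns 0 for an unseen label; the assignment stores the new count
    class_count[vote] = class_count.get(vote, 0) + 1

print(class_count)
```

With k = 3 the three nearest samples carry the labels B, B and A, so B receives two votes and A one.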

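sortedClassCount = sorted(classCount.iteritems(), key=operator.itemgetter(1), reverse=True) uses the sorted() function, whose Python 2 signature is sorted(iterable, cmp=None, key=None, reverse=False). iteritems() yields the dict's entries as (key, value) tuples, and operator.itemgetter(1) sorts those tuples by their second element, the vote count. Sorting is ascending by default; reverse=True reverses it to descending order, so sortedClassCount[0][0] is the label with the most votes. Note that iteritems() and the cmp argument exist only in Python 2. In Python 3, use classCount.items(), and the signature is sorted(iterable, *, key=None, reverse=False).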
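To tie the steps together, here is a minimal Python 3 sketch. It first shows the sorting step with items(), then a compact variant of the classifier. The function name classify_knn is hypothetical, and it swaps in NumPy broadcasting and collections.Counter for the original's tile() and sorted(); it is an alternative formulation, not the code from this post.

```python
import operator
from collections import Counter

import numpy as np

# Python 3 form of the sorting step on a sample vote table.
class_count = {'A': 1, 'B': 2}
sorted_class_count = sorted(class_count.items(),
                            key=operator.itemgetter(1), reverse=True)


def classify_knn(in_x, data_set, labels, k):
    """Return the majority label among the k nearest training samples."""
    diff = np.asarray(data_set) - np.asarray(in_x)   # broadcasting, no tile()
    distances = np.sqrt((diff ** 2).sum(axis=1))     # Euclidean distance per row
    nearest = distances.argsort()[:k]                # indices of the k closest
    votes = Counter(labels[i] for i in nearest)      # label -> vote count
    return votes.most_common(1)[0][0]


group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
labels = ['A', 'A', 'B', 'B']
result = classify_knn([0, 0], group, labels, 3)
print(sorted_class_count, result)
```

For the input [0, 0], the two B samples are the closest, so the majority vote among the three nearest neighbors is B.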
