Python semantic analysis: a discrepancy in latent semantic analysis in Python
I am trying to follow the Wikipedia article on latent semantic indexing in Python, using the following code:
documentTermMatrix = array([[ 0.,1.,0.,1.],[ 0.,0.],[ 1.,0.]])
u,s,vt = linalg.svd(documentTermMatrix,full_matrices=False)
sigma = diag(s)
## remove extra dimensions...
numberOfDimensions = 4
for i in range(numberOfDimensions, len(sigma) - 1):
    sigma[i][i] = 0
# same as first column in documentTermMatrix
queryVector = array([[ 0.],[ 0.],[ 0.],[ 1.],[ 1.]])
What the math says should work:
dtMatrixToQueryAgainst = dot(u,dot(s,vt))
queryVector = dot(inv(s),dot(transpose(u),queryVector))
similarityToFirst = cosineDistance(queryVector,dtMatrixToQueryAgainst[:,0])
# gives 'matrices are not aligned' error. should be 1 because they're the same
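One way to see where the alignment error may come from is to print the shapes involved. The sketch below is an assumption-laden illustration, not the code from the question: the 5-term by 4-document matrix `A` is hypothetical, `sigma` (the diagonal matrix) is used where the question writes `s` because `numpy.linalg.svd` returns `s` as a 1-D array, and a plain `np.dot` stands in for `cosineDistance`, which the question does not define.

```python
import numpy as np

# Hypothetical 5-term x 4-document matrix, used only to inspect shapes.
A = np.array([[1., 0., 0., 1.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.],
              [1., 1., 0., 0.],
              [0., 0., 1., 1.]])

u, s, vt = np.linalg.svd(A, full_matrices=False)
sigma = np.diag(s)          # s is 1-D; sigma is the 4x4 diagonal matrix
query = A[:, [0]]           # same as the first column of A, shape (5, 1)

# Folding the query moves it into the 4-dimensional concept space.
folded_query = np.linalg.inv(sigma) @ u.T @ query

# Rebuilding u * sigma * vt returns to the 5-dimensional term space.
first_column = (u @ sigma @ vt)[:, 0]

print(folded_query.shape)   # (4, 1)
print(first_column.shape)   # (5,)

# The two vectors live in different spaces, so their dot product is undefined.
try:
    np.dot(folded_query.ravel(), first_column)
    mismatch = False
except ValueError:
    mismatch = True
print(mismatch)             # True
```

Under these assumptions the folded query has one entry per retained dimension, while a column of the rebuilt matrix has one entry per term, which would explain an alignment error whenever the two counts differ.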
What does work, although the math looks incorrect (from here):
dtMatrixToQueryAgainst = dot(s,vt)
queryVector = dot(transpose(u),queryVector)
similarityToFirst = cosineDistance(queryVector,dtMatrixToQueryAgainst[:,0])
# gives 1,which is correct
Why does the second route work and the first one not, when everything I can find about the math of LSA shows the first as correct? I feel like I'm missing something obvious...
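A possible reading of the discrepancy is that both routes are consistent as long as the query and the documents are compared in the same space. The sketch below rests on several assumptions: the matrix `A` is hypothetical and has full column rank so that `sigma` is invertible, `cosine_similarity` is a stand-in for the undefined `cosineDistance`, and no dimensions are truncated. Since `A = u @ sigma @ vt`, folding a column of `A` with `inv(sigma) @ u.T` gives back the matching column of `vt`, so the folded query would be compared against `vt` rather than against the rebuilt matrix.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (hypothetical helper)."""
    a, b = np.ravel(a), np.ravel(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 5-term x 4-document matrix with full column rank.
A = np.array([[1., 0., 0., 1.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.],
              [1., 1., 0., 0.],
              [0., 0., 1., 1.]])

u, s, vt = np.linalg.svd(A, full_matrices=False)
sigma = np.diag(s)
query = A[:, [0]]  # identical to the first document

# Route 1: fold the query, then compare with the document coordinates in vt.
folded_query = np.linalg.inv(sigma) @ u.T @ query
sim_folded = cosine_similarity(folded_query, vt[:, 0])

# Route 2: leave the scaling by sigma on both sides.
scaled_query = u.T @ query
scaled_docs = sigma @ vt
sim_scaled = cosine_similarity(scaled_query, scaled_docs[:, 0])

print(round(sim_folded, 6), round(sim_scaled, 6))
```

In this sketch both similarities come out as 1 up to floating-point error, because each route applies the same transformation to the query and to the documents. The two routes can rank other documents differently, since the second one weights each dimension by its singular value.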