Python semantic analysis: Latent Semantic Analysis in Python discrepancy
I am trying to follow the Wikipedia article on latent semantic indexing in Python, using the following code:
documentTermMatrix = array([[ 0.,1.,0.,1.],[ 0.,0.],[ 1.,0.]])
u,s,vt = linalg.svd(documentTermMatrix,full_matrices=False)
sigma = diag(s)
## remove extra dimensions...
numberOfDimensions = 4
for i in range(4, len(sigma) - 1):
    sigma[i][i] = 0
queryVector = array([[ 0.],# same as first column in documentTermMatrix
[ 0.],[ 0.],[ 1.],[ 1.]])
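For reference, the truncation step can also be written with slicing. This is only a sketch under assumptions: the 5x4 matrix `A` and the value `k = 2` are made up for illustration and are not the data from the post. Note that `range(4, len(sigma) - 1)` as written in the post stops one index short, so the last diagonal entry is left untouched.

```python
import numpy as np

# Hypothetical 5-term x 4-document matrix, invented for illustration only.
A = np.array([[0., 1., 0., 1.],
              [0., 1., 1., 0.],
              [0., 0., 1., 1.],
              [1., 0., 0., 1.],
              [1., 0., 1., 0.]])

# With full_matrices=False: u is (5, 4), s is a 1-D array of length 4, vt is (4, 4).
u, s, vt = np.linalg.svd(A, full_matrices=False)

k = 2  # hypothetical number of dimensions to keep

s_trunc = s.copy()
s_trunc[k:] = 0.0          # zero every singular value from index k to the end
sigma = np.diag(s_trunc)   # 2-D diagonal matrix built from the 1-D array

A_k = u @ sigma @ vt       # rank-k approximation of A
```

Because `s` comes back as a 1-D array, it is `sigma` (the diagonal matrix), not `s`, that plays the role of Σ in the matrix products.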
What the math says should work:
dtMatrixToQueryAgainst = dot(u,dot(s,vt))
queryVector = dot(inv(s),dot(transpose(u),queryVector))
similarityToFirst = cosineDistance(queryVector,dtMatrixToQueryAgainst[:,0])
# gives 'matrices are not aligned' error. should be 1 because they're the same
What works, although the math looks incorrect (from here):
dtMatrixToQueryAgainst = dot(s,vt)
queryVector = dot(transpose(u),queryVector)
similarityToFirst = cosineDistance(queryVector,dtMatrixToQueryAgainst[:,0])
# gives 1,which is correct
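Why does the second route work and the first one not, when everything I can find about the math of LSA shows the first as correct? I feel like I'm missing something obvious...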
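One possible reading of the discrepancy, sketched under assumptions: the folded-in query `inv(Sigma) U^T q` lives in the reduced k-dimensional space, so it would be compared against a column of `V^T`, not against a column of the reconstructed matrix `U Sigma V^T`, which has one entry per term. The second route compares `U^T q` with a column of `Sigma V^T`; both sides are k-dimensional and scaled by the same `Sigma`, so the shapes line up. The matrix `A`, the value `k = 3`, and the helper `cosine_similarity` below are hypothetical stand-ins (the post's own data and its `cosineDistance` function are not shown in full).

```python
import numpy as np

# Hypothetical 5-term x 4-document matrix, invented for illustration only.
A = np.array([[0., 1., 0., 1.],
              [0., 1., 1., 0.],
              [0., 0., 1., 1.],
              [1., 0., 0., 1.],
              [1., 0., 1., 0.]])

u, s, vt = np.linalg.svd(A, full_matrices=False)

k = 3  # hypothetical number of dimensions to keep
u_k = u[:, :k]             # (5, k)
sigma_k = np.diag(s[:k])   # (k, k), invertible because the kept values are nonzero
vt_k = vt[:k, :]           # (k, 4)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (hypothetical helper)."""
    a, b = np.ravel(a), np.ravel(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


query = A[:, [0]]  # same as the first column of A, kept as a (5, 1) column

# Route 1: fold the query in as inv(Sigma) U^T q, compare with columns of V^T.
q1 = np.linalg.inv(sigma_k) @ u_k.T @ query
sim1 = cosine_similarity(q1, vt_k[:, 0])

# Route 2: project the query as U^T q, compare with columns of Sigma V^T.
q2 = u_k.T @ query
sim2 = cosine_similarity(q2, (sigma_k @ vt_k)[:, 0])

# The mismatch from the post: q1 has k entries, while a column of the
# reconstructed matrix has one entry per term (5 here).
reconstructed_column = (u_k @ sigma_k @ vt_k)[:, 0]
```

In this sketch both `sim1` and `sim2` come out as 1 up to floating-point error, while `q1` and `reconstructed_column` have different lengths, which is consistent with the "matrices are not aligned" error described above.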