[Recommended] A Perfect Tutorial on LSI (Latent Semantic Indexing)

Published: 2025/4/14

"instead of lecturing about SVD I want to show you how things work -- step by step"

-- If you agree with this statement, then this tutorial by Dr. E. Garcia is the LSI / LSA tutorial best suited to you.

The original is fairly long, so here is the link:

http://www.miislita.com/information-retrieval-tutorial/svd-lsi-tutorial-1-understanding.html

If the original feels too long, you can also read the condensed versions written by Garcia:

Latent Semantic Indexing (LSI) Fast Track Tutorial
Singular Value Decomposition (SVD) Fast Track Tutorial

Selected excerpts:

I. Common misconceptions about LSI:

1) is theming (analysis of themes).

2) is used by search engines to find all the nouns and verbs, and then associate them with related (substitution-useful) nouns and verbs.

3) allows search engines to "learn" which words are related and which noun concepts relate to one another.

4) is a form of on-topic analysis (term scope/subject analysis).

5) can be applied to collections of any size.

6) has no problem addressing polysemy (terms with different meanings).

Pasted from <http://www.miislita.com/information-retrieval-tutorial/svd-lsi-tutorial-1-understanding.html>

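II. In essence, LSI identifies terms that have second-order co-occurrence at the document level and places them in the same subspace. Therefore:

1) Terms that fall in the same subspace are not necessarily synonyms, and are not even necessarily terms that occur in the same context; this is especially true for long documents.

2) LSI cannot handle polysemous terms (terms with more than one meaning) at all, and polysemous terms degrade LSI's results.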

A persistent myth in search marketing circles is that LSI grants contextuality; i.e., terms occurring in the same context. This is not always the case. Consider two documents X and Y and three terms A, B and C and wherein:

A and B do not co-occur.

X mentions terms A and C

Y mentions terms B and C.

∴ A---C---B

The common denominator is C, so we define this relation as an in-transit co-occurrence since both A and B occur while in transit with C. This is called second-order co-occurrence and is a special case of high-order co-occurrence.
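
The A---C---B relation can be checked mechanically. The following is a minimal sketch and is not part of Garcia's tutorial; the `docs` dictionary and the helper names `first_order_pairs` and `second_order_pairs` are hypothetical, chosen only for illustration. It treats each document as a plain set of terms and reports term pairs that never share a document but are linked through a common term.

```python
from itertools import combinations

# Toy collection from the excerpt: X mentions A and C, Y mentions B and C.
# Representing a document as a set of terms is an assumption of this sketch.
docs = {"X": {"A", "C"}, "Y": {"B", "C"}}

def first_order_pairs(docs):
    """Return the term pairs that appear together in at least one document."""
    pairs = set()
    for terms in docs.values():
        pairs.update(frozenset(p) for p in combinations(sorted(terms), 2))
    return pairs

def second_order_pairs(docs):
    """Map each pair that never co-occurs directly to the terms that bridge it."""
    first = first_order_pairs(docs)
    vocab = sorted(set().union(*docs.values()))
    result = {}
    for t1, t2 in combinations(vocab, 2):
        if frozenset((t1, t2)) in first:
            continue  # direct (first-order) co-occurrence, not of interest here
        bridges = [c for c in vocab
                   if c not in (t1, t2)
                   and frozenset((t1, c)) in first
                   and frozenset((t2, c)) in first]
        if bridges:
            result[(t1, t2)] = bridges
    return result

first = first_order_pairs(docs)
second = second_order_pairs(docs)
print(second)
```

For this toy collection the only pair reported is (A, B), bridged by C. Note that the check says nothing about whether A and B are used in the same context, which is exactly the caveat the excerpt goes on to make.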

However, only because terms A and B are in-transit with C this does not grant contextuality, as the terms can be mentioned in different contexts in documents X and Y. For example, this would be the case of X and Y discussing different topics. Long documents are more prone to this.

Even if X and Y are monotopic these might be discussing different subjects. Thus, it would be fallacious to assume that high-order co-occurrence between A and B while in-transit with C equates to a contextuality relationship between terms. Add polysemy to this and the scenario worsens, as LSI can fail to address polysemy.

Pasted from <http://www.miislita.com/information-retrieval-tutorial/svd-lsi-tutorial-1-understanding.html>
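
To see how a rank-reduced decomposition pulls A and B together even though they never share a document, here is a small sketch that is not taken from the tutorial. It assumes a raw 0/1 term-document matrix with no term weighting, keeps only the largest singular value (rank 1), and uses power iteration as a stand-in for a full SVD routine so that it runs on the standard library alone. The names `M`, `top_singular_triplet` and `rank1` are hypothetical.

```python
import math

# Term-document matrix for the toy example: rows are terms A, B, C;
# columns are documents X, Y. Raw 0/1 occurrence is an assumption.
M = [[1.0, 0.0],   # A occurs only in X
     [0.0, 1.0],   # B occurs only in Y
     [1.0, 1.0]]   # C occurs in both

def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]

def transpose(mat):
    return [list(col) for col in zip(*mat)]

def norm(vec):
    return math.sqrt(sum(v * v for v in vec))

def top_singular_triplet(mat, iterations=100):
    """Approximate the largest singular value and its left/right singular
    vectors by power iteration on M^T M (a stand-in for a full SVD)."""
    mt = transpose(mat)
    v = [1.0] + [0.0] * (len(mat[0]) - 1)  # arbitrary start vector
    for _ in range(iterations):
        w = matvec(mt, matvec(mat, v))
        n = norm(w)
        v = [x / n for x in w]
    mv = matvec(mat, v)
    sigma = norm(mv)
    u = [x / sigma for x in mv]
    return sigma, u, v

sigma, u, v = top_singular_triplet(M)

# Rank-1 reconstruction: sigma * u * v^T, one row per term.
rank1 = [[sigma * ui * vj for vj in v] for ui in u]
for term, row in zip("ABC", rank1):
    print(term, [round(x, 3) for x in row])
```

Under these assumptions the reconstructed rows for A and B come out identical, with a weight of about 0.5 in each document, so A receives a nonzero weight in Y and B in X purely because of their shared link through C. The decomposition has no information about whether X and Y use the terms in the same context, which is why the same-subspace result should not be read as contextuality.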
