
Elasticsearch in Plain Language 18: A Deep Dive into Search Techniques, Proximity Matching with the slop Parameter and How It Works

Published: 2025/3/21

Contents

  • Overview
  • Official documentation
  • What slop means
  • Examples
    • Example 1
    • Example 2
    • Example 3

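Overview

Continuing to learn ES from the instructor 中華石杉; this is part 18.

Course link: https://www.roncoo.com/view/55


This post follows on from the previous one, 白話Elasticsearch17 (match_phrase query, phrase-match search).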


Official documentation

https://www.elastic.co/guide/en/elasticsearch/reference/current/full-text-queries.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query-phrase.html


What slop means

The official documentation says:

A phrase query matches terms up to a configurable slop (which defaults to 0) in any order. Transposed terms have a slop of 2.

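So what is slop?

The terms in the query string (the search text) may have to be moved a certain number of times before they line up with a document. That number of moves is the slop.

  • A phrase match with slop is a proximity match.

  • If slop is specified, the search terms are allowed to move in order to try to match the doc.

  • The search terms may be some distance apart, but the closer together they are, the higher the doc ranks in the results.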
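The "number of moves" can be sketched in a few lines of Python. This is a simplified model, not Elasticsearch's actual code: it assumes the document is already tokenized, and that the moves needed equal the spread of each term's document position minus its position in the query. The function name `required_slop` is hypothetical.

```python
from itertools import product


def required_slop(query_terms, doc_terms):
    """Smallest slop at which the phrase would match, or None if it cannot.

    Assumed model: subtract each term's position in the query from its
    position in the document; the spread (max - min) of those shifted
    positions is the number of moves needed.
    """
    # Every document position of each query term, in query order.
    candidates = [
        [pos for pos, term in enumerate(doc_terms) if term == wanted]
        for wanted in query_terms
    ]
    if any(not positions for positions in candidates):
        return None  # at least one query term is absent from the document

    best = None
    for combo in product(*candidates):
        if len(set(combo)) != len(combo):
            continue  # two query terms may not share one document position
        shifted = [doc_pos - query_pos for query_pos, doc_pos in enumerate(combo)]
        spread = max(shifted) - min(shifted)
        if best is None or spread < best:
            best = spread
    return best


print(required_slop(["quick", "fox"], ["quick", "fox"]))           # 0
print(required_slop(["fox", "quick"], ["quick", "fox"]))           # 2
print(required_slop(["quick", "fox"], ["quick", "brown", "fox"]))  # 1
```

Under this model an exact phrase needs slop 0 and swapping two adjacent terms needs slop 2, which agrees with the "Transposed terms have a slop of 2" statement quoted from the documentation.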


Examples

Work out how many moves a query string needs before it matches a document, then set slop accordingly.

Suppose we have this doc:

hello world, java is very good, spark is also very good.

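Searching for java spark with a plain match_phrase query is certain to find nothing, because match_phrase treats java spark as a single unit: the terms must appear next to each other, in that order.

If slop is specified, java spark is allowed to move in order to try to match the doc.

The slop needed here is 3: within the phrase java spark, spark has to move 3 positions before the phrase lines up with the doc.

slop is not the exact number of moves the query terms make to match a doc. It is the maximum number of moves the query terms are allowed to make while trying to match a doc.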

With slop set to 3, the query works:

GET /forum/article/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "java spark",
        "slop": 3
      }
    }
  }
}

This matches the doc above, and the doc is returned as a result.

If slop is set to 2, however, spark can move at most 2 positions, which is not enough to match the doc, so the doc is not returned.
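The count of 3 can be checked by hand with a short Python sketch. It is an illustration only: `tokenize` is a rough stand-in for the analyzer (lowercase, split on non-alphanumeric characters), `moves_needed` is a hypothetical helper, and the code assumes each query term occurs once in the doc, which holds for java and spark here.

```python
import re


def tokenize(text):
    # Rough stand-in for the analyzer: lowercase, split on non-alphanumerics.
    return re.findall(r"[a-z0-9]+", text.lower())


def moves_needed(query, text):
    # Shift each term's document position by its position in the query;
    # the spread of the shifted positions is the number of moves needed.
    doc_terms = tokenize(text)
    shifted = [doc_terms.index(term) - i for i, term in enumerate(tokenize(query))]
    return max(shifted) - min(shifted)


DOC = "hello world, java is very good, spark is also very good."

needed = moves_needed("java spark", DOC)
print(needed)       # 3  (java is at position 2, spark at position 6)
print(needed <= 3)  # True:  slop 3 is enough
print(needed <= 2)  # False: slop 2 is not
```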


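Example 1

Let's verify this against our test data:

GET /forum/article/_search
{
  "query": {
    "match_phrase": {
      "content": {
        "query": "spark data",
        "slop": 3
      }
    }
  }
}

Analyzing the slop:

data has to move 3 times before spark data matches, so a slop of 3 is enough. Any value larger than 3 also finds the doc. In other words, 3 is the smallest slop that matches this doc; slop itself is the maximum number of moves allowed, not a minimum.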


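Example 2

What if we search for data spark instead? Will it still match? The answer is yes, provided slop is large enough: the terms may match in any order, but reversing them costs extra moves (in the documentation quoted above, transposed terms have a slop of 2).

The analysis works the same way as before: count how many moves the terms need.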


Example 3

In a search with slop, the closer together the search terms are in a doc, the higher its relevance score.

GET /forum/article/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "java blog",
        "slop": 5
      }
    }
  }
}

Response:

{
  "took": 2,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": 3,
    "max_score": 0.81487787,
    "hits": [
      {
        "_index": "forum", "_type": "article", "_id": "2", "_score": 0.81487787,
        "_source": {
          "articleID": "KDKE-B-9947-#kL5", "userID": 1, "hidden": false,
          "postDate": "2017-01-02", "tag": ["java"], "tag_cnt": 1, "view_cnt": 50,
          "title": "this is java blog",
          "content": "i think java is the best programming language",
          "sub_title": "learned a lot of course",
          "author_first_name": "Smith", "author_last_name": "Williams",
          "new_author_last_name": "Williams", "new_author_first_name": "Smith"
        }
      },
      {
        "_index": "forum", "_type": "article", "_id": "1", "_score": 0.31424814,
        "_source": {
          "articleID": "XHDK-A-1293-#fJ3", "userID": 1, "hidden": false,
          "postDate": "2017-01-01", "tag": ["java", "hadoop"], "tag_cnt": 2, "view_cnt": 30,
          "title": "this is java and elasticsearch blog",
          "content": "i like to write best elasticsearch article",
          "sub_title": "learning more courses",
          "author_first_name": "Peter", "author_last_name": "Smith",
          "new_author_last_name": "Smith", "new_author_first_name": "Peter"
        }
      },
      {
        "_index": "forum", "_type": "article", "_id": "4", "_score": 0.31424814,
        "_source": {
          "articleID": "QQPX-R-3956-#aD8", "userID": 2, "hidden": true,
          "postDate": "2017-01-02", "tag": ["java", "elasticsearch"], "tag_cnt": 2, "view_cnt": 80,
          "title": "this is java, elasticsearch, hadoop blog",
          "content": "elasticsearch and hadoop are all very good solution, i am a beginner",
          "sub_title": "both of them are good",
          "author_first_name": "Robbin", "author_last_name": "Li",
          "new_author_last_name": "Li", "new_author_first_name": "Robbin"
        }
      }
    ]
  }
}

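As the scores show:

Highest: doc 2, title "this is java blog" (score 0.81487787), where java and blog are adjacent.

Next: doc 1, title "this is java and elasticsearch blog" (score 0.31424814), where two words sit between java and blog.

Last: doc 4, title "this is java, elasticsearch, hadoop blog" (also 0.31424814, returned last), where two words again sit between java and blog.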
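The ranking can be related to term distance with a small Python sketch. It rests on an assumption that should be checked against the Lucene version in use: that a sloppy phrase match is weighted by 1 / (1 + distance). The final `_score` also depends on the similarity model and field statistics, so this does not reproduce the scores in the response; it only shows the ordering. `tokenize`, `distance` and `proximity_weight` are hypothetical helper names.

```python
import re


def tokenize(text):
    # Rough stand-in for the analyzer: lowercase, split on non-alphanumerics.
    return re.findall(r"[a-z0-9]+", text.lower())


def distance(query, text):
    # Moves needed for the phrase to line up with the text
    # (assumes each query term occurs once in the text).
    doc_terms = tokenize(text)
    shifted = [doc_terms.index(term) - i for i, term in enumerate(tokenize(query))]
    return max(shifted) - min(shifted)


def proximity_weight(dist):
    # Assumed weighting: closer matches count for more.
    return 1.0 / (1.0 + dist)


# Titles of the three hits, keyed by _id, in the order they were returned.
TITLES = {
    "2": "this is java blog",
    "1": "this is java and elasticsearch blog",
    "4": "this is java, elasticsearch, hadoop blog",
}

for doc_id, title in TITLES.items():
    d = distance("java blog", title)
    print(doc_id, d, round(proximity_weight(d), 3))
# 2 0 1.0
# 1 2 0.333
# 4 2 0.333
```

Docs 1 and 4 have the same distance and the same title length in tokens, which is consistent with their identical scores in the response.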
