Search API

Published: 2025/3/17

Search

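Search criteria can be passed either in the query string or in the request body.

The search API can also look up documents across multiple indices.

Basic format:

# Search documents in a single index
GET /{index}/_search?q={field}:xxx

# Search documents in several indices
GET /{index1},{index2}/_search?q={field}:xxx

# Search documents in all indices
GET /_all/_search?q={field}:xxx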

URI Search

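Passing parameters in the URI is the simpler approach, but it does not expose every search option.

The parameters supported in the URI are listed at: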
https://www.elastic.co/guide/en/elasticsearch/reference/7.2/search-uri-request.html
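
The URI form described above can also be assembled in code. The following is a minimal sketch using only the Python standard library; the helper name build_uri_search and the base URL are assumptions made for illustration, not part of any Elasticsearch client.

```python
from urllib.parse import quote, urlencode


def build_uri_search(base_url, indices, field, value, **options):
    """Build a URI search URL of the form /{index}/_search?q={field}:{value}."""
    # An empty index list falls back to _all, i.e. every index.
    index_path = ",".join(quote(name, safe="") for name in indices) or "_all"
    params = {"q": f"{field}:{value}", **options}
    # Keep ':' readable; other reserved characters in the values are encoded.
    return f"{base_url}/{index_path}/_search?{urlencode(params, safe=':')}"


# Hypothetical local node; prints the URL without sending any request.
print(build_uri_search("http://localhost:9200", ["twitter"], "user", "kimchy", size=5))
```

Extra keyword arguments such as size are appended as additional query-string parameters.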

Request Body Search

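A search request can carry the search DSL, which includes the Query DSL, in its body. For example:

GET /twitter/_search
{
  "query": {
    "term": { "user": "kimchy" }
  }
}

Note: a GET request can in fact carry a body. Because not every client supports GET with a body, the same request can also be sent with POST.

The response looks like this: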

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 0.47000363,
    "hits": [
      {
        "_index": "twitter",
        "_type": "_doc",
        "_id": "A6y0umsBAkV3IICsYCLL",
        "_score": 0.47000363,
        "_source": {
          "user": "kimchy",
          "post_date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch"
        }
      },
      {
        "_index": "twitter",
        "_type": "_doc",
        "_id": "BKzNumsBAkV3IICsUCJn",
        "_score": 0.47000363,
        "_routing": "kimchy",
        "_source": {
          "user": "kimchy",
          "post_date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch"
        }
      }
    ]
  }
}

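If you only need to know whether any document matches, set size to 0 to indicate that you do not need the hits themselves. You can additionally set terminate_after to 1, so that the query stops as soon as the first matching document is found (per shard).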

curl -X GET "localhost:9200/_search?q=message:number&size=0&terminate_after=1"

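If the request specifies no query at all, every document matches (only the first 10 hits are returned by default). For example: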

GET /twitter/_search

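Query types:

match query: full-text search
term query: exact match on a term

Boolean logic in queries:

When a search contains several keywords, ES combines them with OR by default:

GET twitter/_search
{
  "query": {
    "match": { "message": "out me" }
  }
}

The search above finds documents whose message field contains out or me.

To combine conditions with AND, use a bool query:

GET twitter/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "field1": "out" } },
        { "match": { "field2": "me" } }
      ]
    }
  }
}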
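
The OR and AND variants can be compared side by side by building the request bodies as plain dictionaries. This is only a sketch: the function names are made up for illustration, and match_all_terms relies on the match query's operator parameter, which is an alternative to the bool query when all terms target the same field.

```python
def match_any(field, text):
    """Default match query: the analyzed terms are combined with OR."""
    return {"query": {"match": {field: text}}}


def match_all_terms(field, text):
    """Same field, but every term must be present (operator set to and)."""
    return {"query": {"match": {field: {"query": text, "operator": "and"}}}}


def must_all(conditions):
    """AND across fields: one match clause per (field, text) pair."""
    clauses = [{"match": {field: text}} for field, text in conditions]
    return {"query": {"bool": {"must": clauses}}}


body = must_all([("field1", "out"), ("field2", "me")])
```

The resulting dictionaries can be serialized with json.dumps and sent as the request body.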

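docvalue_fields: doc values of fields

Returns the doc value representation of a field for each hit. For example:

GET /_search
{
  "query": { "match_all": {} },
  "docvalue_fields": [
    "my_ip_field",                    // plain field name
    {
      "field": "my_keyword_field"     // object notation is also accepted
    },
    {
      "field": "my_date_field",       // field names also accept wildcards, e.g. *_date_field
      "format": "epoch_millis"        // object notation allows a custom format
    }
  ]
}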

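docvalue_fields accepts two notations:

a plain field name
object notation

docvalue_fields works on every field that has doc values enabled, regardless of whether the field is stored.

If docvalue_fields names a field that does not have doc values enabled, ES tries to load the value from the fielddata cache. This loads the terms of that field into memory and increases memory consumption.

In addition, most field types do not support custom formats, with these exceptions:

Date fields can take a date format
Numeric fields can take a DecimalFormat pattern

Note:

docvalue_fields cannot load fields inside nested objects. If a field's path contains a nested object, no data is returned for it. To access nested fields, use docvalue_fields inside an inner_hits block.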
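
Since the two notations can be mixed in one list, client code that inspects a request may want to bring them into a single shape first. A minimal sketch, assuming a hypothetical helper name normalize_docvalue_fields:

```python
def normalize_docvalue_fields(entries):
    """Convert every docvalue_fields entry to object notation."""
    normalized = []
    for entry in entries:
        if isinstance(entry, str):
            # Plain field name: wrap it in an object.
            normalized.append({"field": entry})
        elif isinstance(entry, dict) and "field" in entry:
            # Already in object notation: copy it, keeping any format.
            normalized.append(dict(entry))
        else:
            raise ValueError(f"unsupported docvalue_fields entry: {entry!r}")
    return normalized


fields = normalize_docvalue_fields(
    ["my_ip_field", {"field": "my_date_field", "format": "epoch_millis"}]
)
```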

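What are doc values?

Most fields are indexed by default, which makes them searchable. The inverted index lets a query look up a term and find the documents that contain it. Sorting, aggregations, and access to field values in scripts need a different access pattern: instead of looking up a term and finding documents, we need to look up a document and find the terms it has in a field.

Doc values are an on-disk data structure built at index time. They store the same values as _source, but in a column-oriented layout that is more efficient for sorting and aggregations. Almost all field types support doc values; the notable exception is analyzed string fields (the text type).

Every field that supports doc values has them enabled by default. If you are sure that you will never sort or aggregate on a field, or access its value from a script, you can disable doc values to save disk space.

// Equivalent curl: curl -X PUT "localhost:9200/my_index" -H 'Content-Type: application/json' -d '{...}'
PUT my_index
{
  "mappings": {
    "properties": {
      "status_code": {            // doc values enabled by default
        "type": "keyword"
      },
      "session_id": {
        "type": "keyword",
        "doc_values": false       // doc values disabled
      }
    }
  }
}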

explain: scoring details

Set explain to true to see in detail how the score of each hit was computed:

GET twitter/_search
{
  "explain": true,
  "query": {
    "term": { "user": "kimchy" }
  }
}

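collapse: field collapsing

Collapses search results based on the values of a field, which amounts to grouping the hits, sorting them, and keeping only the top document of each group. For example, the query below returns one top tweet per user, using the number of likes as the sort key:

GET /twitter/_search
{
  "query": {
    "match": { "message": "elasticsearch" }
  },
  "collapse": {
    "field": "user"     // collapse on the user field
  },
  "sort": ["likes"]
}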

From, Size

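from sets the offset and size sets the number of hits to return; together they can be used for paging:

GET /_search
{
  "from": 0,
  "size": 10,
  "query": {
    "term": { "user": "kimchy" }
  }
}

Note: from + size cannot exceed index.max_result_window, which defaults to 10000.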
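
The paging arithmetic and the window limit can be checked on the client before a request is sent. A minimal sketch; page_params is a hypothetical helper, and the limit is passed in because index.max_result_window can be changed per index.

```python
MAX_RESULT_WINDOW = 10_000  # default value of index.max_result_window


def page_params(page, page_size, max_window=MAX_RESULT_WINDOW):
    """Translate a 1-based page number into from/size parameters."""
    if page < 1 or page_size < 0:
        raise ValueError("page must be >= 1 and page_size >= 0")
    offset = (page - 1) * page_size
    # The search is rejected when from + size exceeds the window.
    if offset + page_size > max_window:
        raise ValueError(
            f"from + size = {offset + page_size} exceeds the window of {max_window}"
        )
    return {"from": offset, "size": page_size}


print(page_params(3, 10))
```

With a page size of 10, page 1000 is the last page that fits into the default window.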

highlight

Basic usage: highlighting requires the fields to be specified.

GET /_search
{
  "query": {
    "match": { "content": "mate" }
  },
  "highlight": {
    "fields": {
      "content": {}     // highlight the content field with the default style
    }
  }
}

The response looks like this; by default each match is wrapped in <em> tags:

{
  "took": 261,
  "timed_out": false,
  "_shards": {
    "total": 18,
    "successful": 18,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 0.8328784,
    "hits": [
      {
        "_index": "article",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.8328784,
        "_source": {
          "content": "huawei mate pro 銷量一飛沖天"
        },
        "highlight": {
          "content": ["huawei <em>mate</em> pro 銷量一飛沖天"]
        }
      },
      {
        "_index": "article",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.7942397,
        "_source": {
          "content": "mate pro 銷量以前銷量普通"
        },
        "highlight": {
          "content": ["<em>mate</em> pro 銷量以前銷量普通"]
        }
      }
    ]
  }
}

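Controlling the number and length of highlighted fragments:

GET post001/_search
{
  "query": {
    "multi_match": {
      "fields": ["title", "content"],
      "query": "考研"
    }
  },
  "highlight": {
    "fields": {
      "content": {
        "number_of_fragments": 3,   // how many matching fragments to return
        "fragment_size": 50         // length of each fragment, in characters
      },
      "title": {}
    }
  }
}

See the official documentation for more customization options.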

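indices_boost: index boosting

Sets a different boost level for each index (or set of indices). This is useful when hits from one index matter more than hits from another.

GET /_search
{
  "indices_boost": [
    { "index1": 1.4 },
    { "index2": 1.3 }
  ]
}

An error is returned if a specified index does not exist.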

inner_hits

Parent-join fields and nested fields (arrays of objects) allow a search to return documents that match in a different scope.

inner_hits tells you which nested objects or parent/child documents caused a particular hit to be returned. inner_hits can be defined on nested, has_child, and has_parent queries and filters.

Nested inner hits

Example:

// Create test001 with comments mapped as a nested field
PUT test001
{
  "mappings": {
    "properties": {
      "comments": { "type": "nested" }
    }
  }
}

// Index a document
PUT test001/_doc/1
{
  "title": "Test title",
  "comments": [                 // array of objects
    { "author": "kimchy", "number": 1 },
    { "author": "nik9000", "number": 2 }
  ]
}

// Nested query with inner_hits
POST test001/_search
{
  "query": {
    "nested": {
      "path": "comments",
      "query": {
        "match": { "comments.number": 2 }
      },
      "inner_hits": {}
    }
  }
}

// The response:
{
  "took": 34,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": { "value": 1, "relation": "eq" },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "test001",
        "_type": "_doc",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "title": "Test title",
          "comments": [
            { "author": "kimchy", "number": 1 },
            { "author": "nik9000", "number": 2 }
          ]
        },
        "inner_hits": {
          "comments": {
            "hits": {
              "total": { "value": 1, "relation": "eq" },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "test001",
                  "_type": "_doc",
                  "_id": "1",
                  "_nested": {          // which nested object produced the inner hit
                    "field": "comments",
                    "offset": 1
                  },
                  "_score": 1.0,
                  "_source": {          // the nested object that _nested points to
                    "author": "nik9000",
                    "number": 2
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

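In the example above, the _nested metadata is crucial, because it identifies which nested object the inner hit came from: here, the nested object at offset 1 of the comments field. Note that, because of sorting and scoring, the position of an object in the inner_hits list is usually not the position at which it was defined in the document.

It is important to note that nested objects are stored in the root document, under its _source field; a nested object does not really have a _source of its own. Returning the source for nested objects therefore has a performance cost, especially when size and the inner hits' size are set higher than their defaults. To avoid this, you can exclude _source in inner_hits and rely on docvalue_fields instead, like this:

POST test/_search
{
  "query": {
    "nested": {
      "path": "comments",
      "query": {
        "match": { "comments.text": "words" }
      },
      "inner_hits": {
        "_source": false,
        "docvalue_fields": ["comments.number"]
      }
    }
  }
}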

Multi-level nested fields and inner hits

Example: here the nested comments field itself contains a nested votes field:

// Define the mapping
PUT test
{
  "mappings": {
    "properties": {
      "comments": {
        "type": "nested",
        "properties": {
          "votes": { "type": "nested" }
        }
      }
    }
  }
}

// Index a document
PUT test/_doc/1?refresh
{
  "title": "Test title",
  "comments": [
    {
      "author": "kimchy",
      "text": "comment text",
      "votes": []
    },
    {
      "author": "nik9000",
      "text": "words words words",
      "votes": [
        { "value": 1, "voter": "kimchy" },
        { "value": -1, "voter": "other" }
      ]
    }
  ]
}

// Query with inner hits
POST test/_search
{
  "query": {
    "nested": {
      "path": "comments.votes",
      "query": {
        "match": { "comments.votes.voter": "kimchy" }
      },
      "inner_hits": {}
    }
  }
}

The response:

{
  ...,
  "hits": {
    "total": { "value": 1, "relation": "eq" },
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "test",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.6931472,
        "_source": ...,
        "inner_hits": {
          "comments.votes": {
            "hits": {
              "total": { "value": 1, "relation": "eq" },
              "max_score": 0.6931472,
              "hits": [
                {
                  "_index": "test",
                  "_type": "_doc",
                  "_id": "1",
                  "_nested": {
                    "field": "comments",
                    "offset": 1,
                    "_nested": {
                      "field": "votes",
                      "offset": 0
                    }
                  },
                  "_score": 0.6931472,
                  "_source": {
                    "value": 1,
                    "voter": "kimchy"
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
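
The _nested metadata is effectively a path of field/offset pairs, so a client can follow it down from the root document's _source to the object that matched. A minimal sketch, assuming the response shape shown above; resolve_nested is a hypothetical helper name.

```python
def resolve_nested(root_source, nested_meta):
    """Follow a (possibly multi-level) _nested entry down from the root _source."""
    current = root_source
    while nested_meta is not None:
        # Each level names a field and the offset of the object inside it.
        current = current[nested_meta["field"]][nested_meta["offset"]]
        nested_meta = nested_meta.get("_nested")
    return current


root = {
    "title": "Test title",
    "comments": [
        {"author": "kimchy", "text": "comment text", "votes": []},
        {
            "author": "nik9000",
            "text": "words words words",
            "votes": [
                {"value": 1, "voter": "kimchy"},
                {"value": -1, "voter": "other"},
            ],
        },
    ],
}
meta = {"field": "comments", "offset": 1, "_nested": {"field": "votes", "offset": 0}}
print(resolve_nested(root, meta))
```

For the document indexed above, this resolves to the vote cast by kimchy, which matches the _source of the inner hit.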

Parent/child inner hits

Example:

PUT test
{
  "mappings": {
    "properties": {
      "my_join_field": {
        "type": "join",
        "relations": {
          "my_parent": "my_child"
        }
      }
    }
  }
}

// Index the parent document
PUT test/_doc/1?refresh
{
  "content": "from parent balabala",
  "my_join_field": "my_parent"
}

// Index the child document; routing in the URL must be the id of the parent
PUT test/_doc/2?routing=1&refresh
{
  "content": "from child balabala",
  "my_join_field": {
    "name": "my_child",
    "parent": "1"
  }
}

// has_child query: search child documents to find their parents
POST test/_search
{
  "query": {
    "has_child": {
      "type": "my_child",
      "query": {
        "match": { "content": "from child balabala" }
      },
      "inner_hits": {}
    }
  }
}

// has_parent query: search parent documents to find their children
POST test/_search
{
  "query": {
    "has_parent": {
      "type": "my_parent",
      "query": {
        "match": { "content": "from parent balabala" }
      },
      "inner_hits": {}
    }
  }
}

The response:

{
  ...,
  "hits": {
    "total": { "value": 1, "relation": "eq" },
    "max_score": 1.0,
    "hits": [                       // the parent document that was hit
      {
        "_index": "test",
        "_type": "_doc",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "number": 1,
          "my_join_field": "my_parent"
        },
        "inner_hits": {
          "my_child": {
            "hits": {
              "total": { "value": 1, "relation": "eq" },
              "max_score": 1.0,
              "hits": [             // the child document that was hit
                {
                  "_index": "test",
                  "_type": "_doc",
                  "_id": "2",
                  "_score": 1.0,
                  "_routing": "1",
                  "_source": {
                    "number": 1,
                    "my_join_field": {
                      "name": "my_child",
                      "parent": "1"
                    }
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

min_score

Excludes documents whose score is below the given value:

GET /_search
{
  "min_score": 0.5,
  "query": {
    "term": { "user": "kimchy" }
  }
}

_name: named queries

A _name can be assigned to queries in both filter context and query context:

GET /_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "name.first": { "query": "shay", "_name": "first" } } },
        { "match": { "name.last": { "query": "banon", "_name": "last" } } }
      ],
      "filter": {
        "terms": {
          "name.last": ["banon", "kimchy"],
          "_name": "test"
        }
      }
    }
  }
}

post_filter

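Filters the hits after the search results have been computed, as opposed to filtering inside the query. See the example in the official documentation.

preference

Controls which shard copies the search is executed on.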

rescore

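Rescoring can help improve precision. It applies an additional algorithm to reorder only the top N documents returned by the query and post_filter phases, rather than all matching documents, to keep the cost down. A rescore request is executed on each shard before that shard returns its results to the coordinating node (the node that handles the request and merges the results).

N is set with the window_size parameter and defaults to 10.

The score of the original query and the score of the rescore query are combined into the final score of the document.

The relative weights of the original query and the rescore query are controlled with query_weight and rescore_query_weight; both default to 1.

Example:

POST /_search
{
  "query": {
    "match": {
      "message": {
        "operator": "or",
        "query": "the quick brown"
      }
    }
  },
  "rescore": {
    "window_size": 50,
    "query": {
      "rescore_query": {
        "match_phrase": {
          "message": {
            "query": "the quick brown",
            "slop": 2
          }
        }
      },
      "query_weight": 0.7,
      "rescore_query_weight": 1.2
    }
  }
}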

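score_mode controls how the two scores are combined:

total: add the scores (default)
multiply: multiply the scores
avg: average the scores
max: take the higher score
min: take the lower score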
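
The effect of the weights and of score_mode can be illustrated with a little arithmetic. The sketch below is a simplified model, not the engine's code: it assumes each score is multiplied by its weight first and the two weighted scores are then combined according to score_mode. combine_scores is a hypothetical name.

```python
def combine_scores(original, rescore, query_weight=1.0,
                   rescore_query_weight=1.0, score_mode="total"):
    """Combine an original score and a rescore score (simplified model)."""
    # Assumption: weights are applied before the scores are combined.
    primary = original * query_weight
    secondary = rescore * rescore_query_weight
    combiners = {
        "total": lambda a, b: a + b,
        "multiply": lambda a, b: a * b,
        "avg": lambda a, b: (a + b) / 2,
        "max": max,
        "min": min,
    }
    if score_mode not in combiners:
        raise ValueError(f"unknown score_mode: {score_mode!r}")
    return combiners[score_mode](primary, secondary)


# Weights taken from the rescore example; the scores 2.0 and 1.0 are made up.
final = combine_scores(2.0, 1.0, query_weight=0.7, rescore_query_weight=1.2)
```

Under this model, documents outside the rescore window would keep only their weighted original score.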

Several rescores can also be executed in sequence:

POST /_search
{
  "query": {
    "match": {
      // ...
    }
  },
  "rescore": [
    {
      "window_size": 100,
      "query": {
        "rescore_query": {
          // ...
        },
        "query_weight": 0.7,
        "rescore_query_weight": 1.2
      }
    },
    {
      "window_size": 10,
      "query": {
        "score_mode": "multiply",
        "rescore_query": {
          // ...
        }
      }
    }
  ]
}

script fields

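Defines custom fields whose values are computed by a script:

GET /_search
{
  "query": { "match_all": {} },
  "script_fields": {
    "test1": {
      "script": {
        "lang": "painless",
        "source": "doc['price'].value * 2"
      }
    },
    "test2": {
      "script": {
        "lang": "painless",
        "source": "doc['price'].value * params.factor",
        "params": {
          "factor": 2.0
        }
      }
    }
  }
}

The recommended way to access a field is doc['field'].value, although params['_source']['field'] also works. For example:

GET /_search
{
  "query": { "match_all": {} },
  "script_fields": {
    "test1": {
      "script": "params['_source']['message']"
    }
  }
}

The difference is that doc['field'].value loads the terms of the field into memory, where they are cached, so it executes faster. It only works for simple-valued fields (it cannot return a JSON object, for example) and only makes sense for non-analyzed or single-term fields (so not for the text type).

Going through _source is not recommended, because the whole document has to be loaded and parsed, which is slow.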

scroll

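Similar to a cursor in a traditional database. scroll is meant for processing large amounts of data in batches, for example paging through a large result set or reindexing the documents of one index into another.

Example:

POST /twitter/_search?scroll=1m     // keep the search context alive for 1 minute
{
  "size": 100,                      // batch size
  "query": {
    "match": { "title": "elasticsearch" }
  }
}

To use scroll, the first request must specify ?scroll in the query string, which tells ES to keep the search context alive. The response contains a _scroll_id; pass this id to the scroll API to retrieve the next batch of results:

POST /_search/scroll
{
  "scroll": "1m",
  "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}

Note that this request does not need an index name, because the first request already specified it (presumably the scroll_id carries that information). The scroll parameter tells ES to keep the context alive for another minute.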
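
The two-step protocol above turns naturally into a loop: one initial search, then repeated calls to the scroll API until a batch comes back empty. The sketch below is transport-agnostic; send is a hypothetical callable (method, path, body) that returns the parsed JSON response and stands in for whatever HTTP client is in use.

```python
def scroll_all(send, index, query, size=100, keep_alive="1m"):
    """Yield every hit of a scrolled search, batch by batch."""
    # First request: ?scroll=... asks ES to keep the search context alive.
    response = send(
        "POST",
        f"/{index}/_search?scroll={keep_alive}",
        {"size": size, "query": query},
    )
    while response["hits"]["hits"]:
        yield from response["hits"]["hits"]
        # Always pass on the most recent _scroll_id; it may change between calls.
        response = send(
            "POST",
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": response["_scroll_id"]},
        )
```

A caller would iterate with for hit in scroll_all(send, "twitter", {"match_all": {}}); each follow-up call also renews the context for another keep_alive period.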

See the official documentation for more.

search_after

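Paging can be done with from and size, but the cost becomes prohibitive as the page depth grows. This is why index.max_result_window defaults to 10000. For deep paging, scroll is the better choice, but it carries the cost of keeping a search context alive and is not recommended for real-time user requests. search_after works around this by providing a live cursor: the idea is to use the results of the previous page to help retrieve the next one.

Example:

GET twitter/_search
{
  "size": 10,
  "query": {
    "match": { "title": "elasticsearch" }
  },
  "sort": [
    { "date": "asc" },
    { "tie_breaker_id": "asc" }
  ]
}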

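Note:

The tiebreaker of the sort should be a field that uniquely identifies each document. Otherwise the order of documents with equal sort values is undefined, which can lead to missing or duplicate results. The _id field is unique, but using it directly as the tiebreaker is not recommended. Be aware that search_after looks for the first document that fully or partially matches the tiebreaker value provided. So if a document has the tiebreaker value "654323" and you pass "654" in search_after, that document still matches and the results after it are returned. The advice is to duplicate the content of _id in another field and use that new field as the sort tiebreaker.

The response to the request above contains an array of sort values for each document. These sort values can be used in the search_after parameter. For example, passing the sort values of the last document as "search_after" retrieves the next page:

GET twitter/_search
{
  "size": 10,
  "query": {
    "match": { "title": "elasticsearch" }
  },
  "search_after": [1463538857, "654323"],
  "sort": [
    { "date": "asc" },
    { "tie_breaker_id": "asc" }
  ]
}

When search_after is used, from must be set to 0 or -1.

search_after does not allow jumping to an arbitrary page; in that respect it resembles the scroll API. The difference is that search_after is stateless, so updates or deletes in the index between requests may change the sort order.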
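
Deriving the request for the next page from the previous one is mechanical: keep the query and sort unchanged and copy the sort values of the last hit into search_after. A minimal sketch with a hypothetical helper name, next_page_body:

```python
import copy


def next_page_body(body, last_hit):
    """Build the request body for the page that follows last_hit."""
    if not body.get("sort"):
        raise ValueError("search_after requires an explicit sort")
    next_body = copy.deepcopy(body)
    # The sort values of the last hit become the cursor for the next page.
    next_body["search_after"] = list(last_hit["sort"])
    # from has to stay at 0 (or -1) with search_after, so simply drop it.
    next_body.pop("from", None)
    return next_body


first_page = {
    "size": 10,
    "query": {"match": {"title": "elasticsearch"}},
    "sort": [{"date": "asc"}, {"tie_breaker_id": "asc"}],
}
last_hit = {"_id": "1", "sort": [1463538857, "654323"]}
second_page = next_page_body(first_page, last_hit)
```

Because nothing is kept on the server between pages, the same helper can be applied again to the last hit of every following page.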

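The score (searches sort by descending score by default) and the id can also be used as sort keys:

"sort": [
  { "_score": { "order": "desc" } },
  { "_id": { "order": "asc" } }
]

Note: search_after and collapse cannot be used together.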

seq_no_primary_term

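Returns the sequence number and primary term of the last modification to each matching document (these relate to concurrency control; see the ES Document API for details):

GET /_search
{
  "seq_no_primary_term": true,
  "query": {
    "term": { "user": "kimchy" }
  }
}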

sort

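Sorts are defined per field. Two special names are also available: _score sorts by score and _doc sorts by index order.

Example:

PUT /my_index
{
  "mappings": {
    "properties": {
      "post_date": { "type": "date" },
      "user": { "type": "keyword" },
      "name": { "type": "keyword" },
      "age": { "type": "integer" }
    }
  }
}

GET /my_index/_search
{
  "sort": [
    { "post_date": { "order": "asc" } },
    "user",
    { "name": "desc" },
    { "age": "desc" },
    "_score"
  ],
  "query": {
    "term": { "user": "kimchy" }
  }
}

_doc has no real use other than being the most efficient sort order. If you do not care about the order in which documents are returned, sort by _doc, especially when scrolling.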

sort values

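The sort values of each document are returned in the response and can be used with the search_after API.

sort order

asc: ascending, the default order
desc: descending, the default when sorting on _score

sort mode option

ES supports sorting on array or multi-valued fields. The mode option controls which value is used for sorting:

min: pick the lowest value
max: pick the highest value
sum: use the sum of all values (numeric arrays only)
avg: use the average of all values (numeric arrays only)
median: use the median of all values (numeric arrays only)

The default mode is min when sorting in ascending order and max when sorting in descending order.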
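
The way a multi-valued field is reduced to a single sort key can be mimicked in a few lines. This is an illustrative model only; it ignores any type-specific rounding the engine may apply, and sort_value is a hypothetical name.

```python
from statistics import median

_MODES = {
    "min": min,
    "max": max,
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
    "median": median,
}


def sort_value(values, order="asc", mode=None):
    """Reduce the values of a multi-valued numeric field to one sort key."""
    if not values:
        raise ValueError("values must not be empty")
    if mode is None:
        # Default mode: min for ascending sorts, max for descending sorts.
        mode = "min" if order == "asc" else "max"
    return _MODES[mode](values)


prices = [5, 1, 9]
print(sort_value(prices), sort_value(prices, order="desc"))
```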

See the official documentation for more.

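_source filtering

Controls how the _source field is returned. It can be disabled by setting it to false, or you can specify which fields to include and which to exclude:

GET twitter/_search
{
  "_source": {
    "includes": ["user", "post_date"],    // fields to return
    "excludes": "message"                 // fields to drop
  },
  "query": { "match_all": {} }
}

Field names also support wildcard patterns.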
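
The include/exclude semantics can be approximated on the client to get a feel for them. The sketch below is an assumption-laden model rather than the server's algorithm: it only looks at top-level field names and uses Python's fnmatch for wildcards, which accepts a slightly richer pattern syntax than plain *.

```python
from fnmatch import fnmatchcase


def filter_source(source, includes=None, excludes=None):
    """Keep top-level fields matching includes and not matching excludes."""

    def as_list(patterns):
        if patterns is None:
            return []
        return [patterns] if isinstance(patterns, str) else list(patterns)

    include_patterns = as_list(includes)
    exclude_patterns = as_list(excludes)
    result = {}
    for name, value in source.items():
        # With no includes given, every field is a candidate.
        if include_patterns and not any(
            fnmatchcase(name, pattern) for pattern in include_patterns
        ):
            continue
        if any(fnmatchcase(name, pattern) for pattern in exclude_patterns):
            continue
        result[name] = value
    return result


tweet = {
    "user": "kimchy",
    "post_date": "2009-11-15T14:12:12",
    "message": "trying out Elasticsearch",
}
print(filter_source(tweet, includes=["user", "post_date"], excludes="message"))
```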

stored_fields

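stored_fields selects fields that are explicitly marked as stored in the mapping (store is false by default). Its use is not recommended; prefer _source filtering instead.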

track_total_hits

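Controls how the number of matching documents is counted: set it to true for an exact count, or to an integer to count accurately up to that number. See the official documentation for details.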

version

Returns the version of each hit:

GET /_search
{
  "version": true,
  "query": {
    "term": { "user": "kimchy" }
  }
}

Search Template

The _search/template endpoint pre-renders a search request from a mustache template and its parameters before executing it:

GET /_search/template
{
  "source": {
    "query": { "match": { "{{my_field}}": "{{my_value}}" } },
    "size": "{{my_size}}"
  },
  "params": {
    "my_field": "message",
    "my_value": "some message",
    "my_size": 5
  }
}

Or:

GET _search/template
{
  "source": {
    "query": {
      "term": {
        "message": "{{query_string}}"
      }
    }
  },
  "params": {
    "query_string": "search for these words"
  }
}

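Converting parameters to JSON

The toJson function converts a dictionary or an array into its JSON representation. For example:

GET _search/template
{
  "source": "{ \"query\": { \"terms\": {{#toJson}}statuses{{/toJson}} }}",
  "params": {
    "statuses": {
      "status": ["pending", "published"]
    }
  }
}

which is rendered as:

{
  "query": {
    "terms": {
      "status": ["pending", "published"]
    }
  }
}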
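
To see why the toJson parameter has to be an object keyed by the field name for the rendered output above to come out, the substitution can be imitated with two regular expressions. This is a toy model of the rendering, not the mustache engine itself: it performs no escaping and supports only plain variables and toJson sections. render_template is a hypothetical name.

```python
import json
import re


def render_template(source, params):
    """Render {{name}} and {{#toJson}}name{{/toJson}} placeholders (toy model)."""
    # {{#toJson}}name{{/toJson}} becomes the JSON encoding of params[name].
    rendered = re.sub(
        r"\{\{#toJson\}\}(\w+)\{\{/toJson\}\}",
        lambda match: json.dumps(params[match.group(1)]),
        source,
    )
    # A plain {{name}} becomes the string form of params[name].
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda match: str(params[match.group(1)]),
        rendered,
    )


template = '{ "query": { "terms": {{#toJson}}statuses{{/toJson}} }}'
params = {"statuses": {"status": ["pending", "published"]}}
query = json.loads(render_template(template, params))
```

Passing a bare array as statuses would instead render "terms": ["pending", "published"], which is not a valid terms query.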

_msearch

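_msearch executes several search requests within a single API call.

Other notes:

copy_to copies the content of several fields into one new field, which can then be used in queries.
Exact values versus full text
- Exact values need no analysis; in ES this is the keyword type
  - Numbers, dates, statuses, and specific strings (such as "apple store") do not need to be tokenized.
- Full text is analyzed (tokenized); in ES this is the text type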
