
01. Search API Overview

Published: 2024/2/28


Contents

- 1. Search API overview
  - 1. Routing
- 2. How Elasticsearch selects a replica
- 3. Stats Groups
- 3. Global Search Timeout
- 4. Search Cancellation
- 5. Search concurrency and parallelism
- 6. Searching multiple indices with the search API

1. Search API overview

Most search APIs can be run against multiple indices at once. The Explain API endpoints are the exception.

1. Routing

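When executing a search, Elasticsearch picks the "best" copy of the data using the adaptive replica selection formula. You can also control which shards are searched by supplying a routing parameter. For example, when indexing tweets, the routing value can be the user name: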

POST /twitter/_doc?routing=kimchy
{
  "user": "kimchy",
  "post_date": "2009-11-15T14:12:12",
  "message": "trying out Elasticsearch"
}

This suits the case where documents are always looked up by user name: the search request is routed only to the relevant shard, which speeds up the query.

POST /twitter/_search?routing=kimchy
{
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "some query string here"
        }
      },
      "filter": {
        "term": { "user": "kimchy" }
      }
    }
  }
}

The routing parameter can hold multiple values, written as a comma-separated string.
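As a rough illustration of why a routing value narrows a search to specific shards, the sketch below maps routing values to shard numbers with a hash taken modulo the number of primary shards. Elasticsearch uses its own hash function internally; `zlib.crc32` is only a stand-in here, and `NUM_PRIMARY_SHARDS`, `shard_for`, and `shards_for` are names assumed for this example.

```python
import zlib

NUM_PRIMARY_SHARDS = 5  # assumed index setting, for illustration only


def shard_for(routing: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Map one routing value to a shard number (stand-in hash, not Elasticsearch's)."""
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def shards_for(routing_param: str, num_shards: int = NUM_PRIMARY_SHARDS) -> set:
    """Resolve a comma-separated routing parameter to the set of shards to search."""
    values = [v.strip() for v in routing_param.split(",") if v.strip()]
    return {shard_for(v, num_shards) for v in values}
```

Calling `shards_for("kimchy")` yields a single shard, while `shards_for("kimchy,elasticsearch")` yields at most two, so the search touches only a subset of the index.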

2. How Elasticsearch selects a replica

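By default, Elasticsearch uses adaptive replica selection. The coordinating node decides which shard copy (on which target node) to forward the request to based on the following factors:

- The response time of earlier requests between the coordinating node and the target node.

- The time earlier search requests took to execute on the target node (excluding the time spent passing the request between the coordinating node and the target node).

- The number of search requests queued in the thread pool on the target node.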
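To make the three factors above concrete, here is a minimal sketch that ranks shard copies with a made-up cost function. This is not the formula Elasticsearch uses; the `CopyStats` fields, the `score` weighting, and the node names are all assumptions chosen only to show how the inputs could be combined.

```python
from dataclasses import dataclass


@dataclass
class CopyStats:
    node: str
    response_time_ms: float  # coordinating node <-> target node, from earlier requests
    service_time_ms: float   # time the target node spent executing earlier searches
    queue_size: int          # search requests queued in the target node's thread pool


def score(stats: CopyStats) -> float:
    """Hypothetical cost, lower is better. Not Elasticsearch's real formula."""
    return stats.response_time_ms + stats.service_time_ms * (1 + stats.queue_size)


def pick_copy(copies: list) -> CopyStats:
    """Choose the shard copy with the lowest illustrative cost."""
    return min(copies, key=score)
```

With this toy cost, a node that answers slightly slower but has an empty queue is preferred over a faster node with a long queue.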

Adaptive replica selection can be turned off as follows:

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.use_adaptive_replica_selection": false
  }
}

If adaptive replica selection is turned off, searches are sent to the shards of the index or indices in round-robin fashion across all copies of the data (primaries and replicas).
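Round-robin selection over the copies of one shard can be sketched in a few lines. The copy labels below are hypothetical, and the real selection happens inside the coordinating node rather than in client code.

```python
from itertools import cycle


def round_robin(copies):
    """Return an endless iterator over shard copies (primary and replicas) in order."""
    return cycle(copies)


def next_targets(copies, request_count):
    """List which copy would serve each of the next `request_count` requests."""
    picker = round_robin(copies)
    return [next(picker) for _ in range(request_count)]
```

For three copies and four requests, the fourth request wraps around to the first copy again.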

3. Stats Groups

A search can be associated with stats groups. A statistics aggregation is maintained per group and can later be retrieved using the indices stats API. For example, here is a search request body that associates the request with two different groups:

POST /_search
{
  "query": { "match_all": {} },
  "stats": ["group1", "group2"]
}
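The sketch below builds both halves of that workflow: the search body tagged with groups, and the path used to read the per-group statistics back through the indices stats API with its `groups` parameter. The helper names are assumptions for this example, and sending the requests is left to whatever HTTP client you use.

```python
def search_body_with_groups(groups):
    """Build a match_all search body tagged with the given stats groups."""
    return {"query": {"match_all": {}}, "stats": list(groups)}


def stats_path(groups):
    """Build the indices stats path that returns search statistics for the groups."""
    return "/_stats/search?groups=" + ",".join(groups)
```

For the two groups used above, `stats_path(["group1", "group2"])` gives `/_stats/search?groups=group1,group2`.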

3. Global Search Timeout

Individual searches can set a timeout in the request body. Because search requests can originate from many sources, Elasticsearch also has a dynamic cluster-level setting for a global search timeout, which applies to every search request that does not set a timeout in its request body. Such requests are cancelled after the specified time using the mechanism described in the Search Cancellation section below, so the same caveats about timeout responsiveness apply.

The setting key is search.default_search_timeout, and it can be set using the Cluster Update Settings API. By default there is no global timeout. Setting the value to -1 resets the global search timeout to no timeout.
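The precedence between the two timeouts can be sketched as follows. `cluster_settings_body` only builds the body for a Cluster Update Settings request (the `transient` scope is an assumption, mirroring the other settings example in this article), and `effective_timeout` is a hypothetical helper that models which timeout applies to a given request.

```python
from typing import Optional

NO_TIMEOUT = "-1"  # value that resets the global timeout to "no timeout"


def cluster_settings_body(timeout: str) -> dict:
    """Build a Cluster Update Settings body for search.default_search_timeout."""
    return {"transient": {"search.default_search_timeout": timeout}}


def effective_timeout(request_body: dict, global_default: Optional[str]) -> Optional[str]:
    """Return the timeout that applies to a search, or None if it never times out."""
    # A timeout in the request body always wins over the global setting.
    if "timeout" in request_body:
        return request_body["timeout"]
    # Otherwise fall back to the global default, unless it is unset or -1.
    if global_default is None or global_default == NO_TIMEOUT:
        return None
    return global_default
```

A request with `"timeout": "2s"` keeps its own value even when the global default is `"30s"`, while a request without one inherits `"30s"`.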

4. Search Cancellation

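Searches can be cancelled using the standard task cancellation mechanism. By default, a running search only checks whether it has been cancelled at segment boundaries, so a large segment can delay cancellation. Cancellation responsiveness can be improved by setting the dynamic cluster-level setting search.low_level_cancellation to true. However, this adds the overhead of more frequent cancellation checks, which can be noticeable on large, fast-running search queries.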
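The effect of check granularity can be illustrated with a toy simulation. It counts how many documents are processed before a search notices a cancel request, first with checks only at segment boundaries and then with extra checks inside each segment. The function, its parameters, and the check interval of 100 documents are assumptions for illustration and do not reflect how Elasticsearch schedules its checks internally.

```python
def docs_processed(segment_sizes, cancel_after, low_level=False, check_every=100):
    """Return how many docs are processed before the search notices cancellation.

    cancel_after: number of processed docs at which the cancel request arrives.
    """
    processed = 0
    for size in segment_sizes:
        # Segment-boundary check: performed before starting each segment.
        if processed >= cancel_after:
            return processed
        for i in range(1, size + 1):
            processed += 1
            # Low-level check: also performed every `check_every` docs in a segment.
            if low_level and i % check_every == 0 and processed >= cancel_after:
                return processed
    return processed
```

With segments of 1,000 and 50,000 documents and a cancel arriving after 1,500 documents, the boundary-only variant finishes the whole large segment before stopping, while the low-level variant stops at the next in-segment check.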

5. Search concurrency and parallelism

By default, Elasticsearch does not reject any search request based on the number of shards the request hits. Although Elasticsearch optimizes search execution on the coordinating node, a large number of shards can have a significant impact on CPU and memory. It is usually better to organize data so that there are fewer, larger shards. If you would like to configure a soft limit, you can update the action.search.shard_count.limit cluster setting to reject search requests that hit too many shards.
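A soft limit of this kind amounts to a simple guard before the search fans out. The sketch below is a hypothetical model of that check: `TooManyShardsError` and `check_shard_count` are invented names, and the real rejection happens inside Elasticsearch, not in client code.

```python
from typing import Optional


class TooManyShardsError(Exception):
    """Hypothetical error type for this sketch."""


def check_shard_count(shards_hit: int, limit: Optional[int] = None) -> None:
    """Raise if a search would hit more shards than the configured soft limit."""
    # No limit configured: mirror the default behaviour and accept everything.
    if limit is not None and shards_hit > limit:
        raise TooManyShardsError(
            f"search hits {shards_hit} shards, limit is {limit}"
        )
```

With no limit set, a search hitting 5,000 shards passes; with a limit of 1,000, a search hitting 1,001 shards is rejected.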

The request parameter max_concurrent_shard_requests controls the maximum number of concurrent shard requests the search API will execute per node for the request. Use it to keep a single request from overloading a cluster (for example, a default request hits all indices in a cluster, which can cause shard request rejections if the number of shards per node is high). The default value is 5.
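As a simplified picture of the per-node cap, the sketch below splits the shards a request must visit on each node into groups no larger than the limit. Real execution is closer to a sliding window than to strict batches, so treat `plan_batches` and its input layout as assumptions made for illustration.

```python
def plan_batches(shards_by_node, max_concurrent_shard_requests=5):
    """Group each node's shard requests so that no group exceeds the per-node cap."""
    plan = {}
    for node, shards in shards_by_node.items():
        plan[node] = [
            shards[i:i + max_concurrent_shard_requests]
            for i in range(0, len(shards), max_concurrent_shard_requests)
        ]
    return plan
```

A node holding 12 of the targeted shards is served in groups of 5, 5, and 2 with the default cap, instead of receiving all 12 shard requests at once.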

6. Searching multiple indices with the search API

GET /twitter/_search?q=user:kimchy
GET /kimchy,elasticsearch/_search?q=tag:wow
GET /_all/_search?q=tag:wow
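The three request forms above differ only in the index part of the path. A small helper can build them; `search_path` is a name assumed for this sketch, and it does no URL encoding, so it is only suitable for simple index names and queries like the ones shown.

```python
def search_path(indices=(), query=None):
    """Build a URI search path for one index, several indices, or all indices."""
    # An empty index list falls back to _all, matching the last request above.
    target = ",".join(indices) if indices else "_all"
    path = f"/{target}/_search"
    if query:
        path += f"?q={query}"
    return path
```

For example, `search_path(["kimchy", "elasticsearch"], "tag:wow")` produces the second request path.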
