03. Shard allocation and cluster-level routing settings

Published: 2024/2/28

Contents

- 1. Overview
- 2. Cluster-level shard allocation settings
  - 1. Shard allocation settings
    - 1. cluster.routing.allocation.enable
    - 2. cluster.routing.allocation.node_concurrent_incoming_recoveries
    - 3. cluster.routing.allocation.node_concurrent_outgoing_recoveries
    - 4. cluster.routing.allocation.node_concurrent_recoveries
    - 5. cluster.routing.allocation.node_initial_primaries_recoveries
    - 6. cluster.routing.allocation.same_shard.host
  - 2. Shard rebalancing settings
    - 1. cluster.routing.rebalance.enable
    - 2. cluster.routing.allocation.allow_rebalance
    - 3. cluster.routing.allocation.cluster_concurrent_rebalance
  - 3. Shard balancing heuristics
    - 1. cluster.routing.allocation.balance.shard
    - 2. cluster.routing.allocation.balance.index
    - 3. cluster.routing.allocation.balance.threshold
  - 4. How allocation and rebalancing differ and relate
- 3. Disk-based shard allocation limits
  - 1. cluster.routing.allocation.disk.threshold_enabled
  - 2. cluster.routing.allocation.disk.watermark.low
  - 3. cluster.routing.allocation.disk.watermark.high
  - 4. cluster.routing.allocation.disk.watermark.flood_stage
  - 5. cluster.info.update.interval
  - 6. cluster.routing.allocation.disk.include_relocations
  - 7. A usage example
- 4. Shard allocation awareness through node attributes
  - 1. Enabling cluster allocation awareness
  - 2. What is forced awareness?
- 5. Cluster-level shard allocation filters
  - 1. include
  - 2. require
  - 3. exclude
  - 4. Wildcards are also supported
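1. Overview

This part looks at how the master manages shards: the master decides which node each shard is allocated to, and when to move shards between nodes to rebalance the cluster.

2. Cluster-level shard allocation settings

Shard allocation is the process of creating a shard on a node. It happens during initial recovery, replica allocation, rebalancing, and when nodes are added or removed.

1. Shard allocation settings

The settings below control shard allocation and recovery.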

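1. cluster.routing.allocation.enable

Enables or disables allocation for specific kinds of shards.

all : (default) allows allocation of all kinds of shards

primaries : allows allocation only of primary shards

new_primaries : allows allocation only of primary shards of newly created indices

none : allows no shard allocation of any kind

This setting does not affect the recovery of local primary shards when a node restarts. If a restarted node holds a copy of an unassigned primary shard, that copy is recovered as the primary immediately, provided its allocation id matches one of the active allocation ids recorded in the cluster state.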

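An allocation id is the unique identifier assigned to each shard copy; the cluster state records which ids belong to active copies.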
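The four values above can be applied through the cluster settings API. Below is a minimal Python sketch that only builds and validates the request body; the helper name `allocation_enable_body` is hypothetical, and sending the request (for example with an HTTP client) is left out on purpose.

```python
import json

# The four values described above for cluster.routing.allocation.enable.
ALLOWED_VALUES = ("all", "primaries", "new_primaries", "none")


def allocation_enable_body(value, persistent=False):
    """Build the JSON body for PUT _cluster/settings (hypothetical helper)."""
    if value not in ALLOWED_VALUES:
        raise ValueError(f"unsupported value: {value!r}")
    scope = "persistent" if persistent else "transient"
    return {scope: {"cluster.routing.allocation.enable": value}}


if __name__ == "__main__":
    # Example: allow only primaries to be allocated, e.g. before a restart.
    print("PUT _cluster/settings")
    print(json.dumps(allocation_enable_body("primaries"), indent=2))
```

Validating the value before sending keeps a typo such as `replicas` (which is valid for the rebalance setting but not for this one) from reaching the cluster.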

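2. cluster.routing.allocation.node_concurrent_incoming_recoveries

The number of concurrent incoming shard recoveries allowed on a node, that is, how many shards on the node may be receiving recovery data at the same time.

In most cases the receiving shard is a replica that is recovering from its primary.

During a relocation, the receiving shard can also be a primary.

Defaults to 2.

3. cluster.routing.allocation.node_concurrent_outgoing_recoveries

The counterpart of the previous setting: the number of shards on a node that may be sending recovery data to other nodes at the same time.

In most cases the sending shard is a primary that is transferring data to a replica.

During a relocation, the sending shard can also be a replica.

Defaults to 2.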

4. cluster.routing.allocation.node_concurrent_recoveries

A shortcut that sets both of the settings above, cluster.routing.allocation.node_concurrent_incoming_recoveries and cluster.routing.allocation.node_concurrent_outgoing_recoveries, to the same value.

5. cluster.routing.allocation.node_initial_primaries_recoveries

Replica shards are normally recovered over the network from the primary, whereas an unassigned primary can only be recovered from local data when a node that holds a copy of it is restarted. Local recovery is fast, so this value can be set somewhat higher so that more unassigned primaries recover in parallel. Defaults to 4.

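6. cluster.routing.allocation.same_shard.host

Enables a check that prevents multiple copies of the same shard from being allocated on a single host. It matters when several nodes run on one host: if that host goes down, every copy of a shard that lived there is lost with it. Nodes of one cluster usually run on separate hosts, but be careful with virtual machines: the VMs may look like separate servers and still share one physical machine, which should be avoided where possible because it carries the same risk of data loss.

Defaults to false, meaning the check is disabled.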
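The effect of the host check can be pictured with a small Python model. This is a sketch under simplifying assumptions, not Elasticsearch's implementation: the `Node` type and `can_place_copy` function are hypothetical, and a host is reduced to a single string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    host: str  # host name or address of the machine the node runs on


def can_place_copy(nodes_with_copy, candidate, same_host_check=False):
    """Return True if another copy of the shard may go on `candidate`.

    `nodes_with_copy` are the nodes that already hold a copy of the shard.
    """
    for node in nodes_with_copy:
        if node.name == candidate.name:
            return False  # two copies of one shard never share a node
        if same_host_check and node.host == candidate.host:
            return False  # the optional check also rejects a shared host
    return True


if __name__ == "__main__":
    n1, n2, n3 = Node("n1", "host-a"), Node("n2", "host-a"), Node("n3", "host-b")
    print(can_place_copy([n1], n2))                        # check off
    print(can_place_copy([n1], n2, same_host_check=True))  # check on
    print(can_place_copy([n1], n3, same_host_check=True))  # different host
```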

2. Shard rebalancing settings

The following dynamic settings control the rebalancing of shards across the cluster.

1. cluster.routing.rebalance.enable

Enables or disables rebalancing for specific kinds of shards.

all - (default) Allows shard balancing for all kinds of shards.

primaries - Allows shard balancing only for primary shards.

replicas - Allows shard balancing only for replica shards.

none - No shard balancing of any kind is allowed for any indices.

2. cluster.routing.allocation.allow_rebalance

Specifies when rebalancing is allowed to start.

always - Rebalancing is allowed at any time.

indices_primaries_active - Only when all primary shards in the cluster have been allocated.

indices_all_active - (default) Only when all shards (primaries and replicas) in the cluster have been allocated.

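3. cluster.routing.allocation.cluster_concurrent_rebalance

Controls how many shard rebalances may run concurrently across the cluster. Defaults to 2.

Note that this only limits concurrent shard relocations caused by imbalance in the cluster. It does not limit relocations caused by allocation filtering or forced awareness.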

3. Shard balancing heuristics

The following settings are used together to determine where to place each shard. The cluster is balanced when no allowed rebalancing operation can bring the weight of any node closer to the weight of any other node by more than the balance.threshold.

1. cluster.routing.allocation.balance.shard

Defines the weight factor for the total number of shards allocated on a node (float). Defaults to 0.45f. Raising this raises the tendency to equalize the number of shards across all nodes in the cluster.

2. cluster.routing.allocation.balance.index

Defines the weight factor for the number of shards per index allocated on a specific node (float). Defaults to 0.55f. Raising this raises the tendency to equalize the number of shards per index across all nodes in the cluster.

3. cluster.routing.allocation.balance.threshold

Minimal optimization value of operations that should be performed (non negative float). Defaults to 1.0f. Raising this will cause the cluster to be less aggressive about optimizing the shard balance, so rebalancing is triggered less readily.
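As a rough illustration of how the two factors and the threshold could interact, here is a small Python sketch. The formula (a normalized, weighted sum of how far a node deviates from the cluster averages) is a simplified assumption made for this example and not necessarily the exact computation Elasticsearch performs; `node_weight` and `worth_rebalancing` are hypothetical names.

```python
def node_weight(node_shards, avg_shards, node_index_shards, avg_index_shards,
                shard_balance=0.45, index_balance=0.55):
    """Weight of a node for one index under the assumed model.

    Positive means the node holds more than its share, negative means less.
    """
    total = shard_balance + index_balance
    theta_shard = shard_balance / total  # share given to balance.shard
    theta_index = index_balance / total  # share given to balance.index
    return (theta_shard * (node_shards - avg_shards)
            + theta_index * (node_index_shards - avg_index_shards))


def worth_rebalancing(weight_a, weight_b, threshold=1.0):
    """Assumed rule: act only if the weight gap exceeds balance.threshold."""
    return abs(weight_a - weight_b) > threshold


if __name__ == "__main__":
    # Two nodes, one index with 4 shards, all of them on node A.
    a = node_weight(node_shards=4, avg_shards=2, node_index_shards=4, avg_index_shards=2)
    b = node_weight(node_shards=0, avg_shards=2, node_index_shards=0, avg_index_shards=2)
    print(a, b, worth_rebalancing(a, b))
```

In this model a larger threshold tolerates a larger gap between nodes before any shard is moved, which matches the description of balance.threshold above.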

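4. How allocation and rebalancing differ and relate

The relationship between allocation and rebalancing is worth restating. It comes down to two settings:

cluster.routing.allocation.enable

cluster.routing.rebalance.enable

Allocation is about assigning a shard to a node, whatever the reason: a node fails and its now-unassigned shards have to be assigned elsewhere, a manual relocation has to allocate the shard on the target node, and a rebalance likewise has to allocate the shard on its target node.

Suppose a node goes down (and its data is wiped), leaving some shards unassigned. With cluster.routing.allocation.enable: none, those shards are not assigned to other nodes even if cluster.routing.rebalance.enable is all, because the underlying allocation operation is disabled.

Now suppose the settings are instead

cluster.routing.rebalance.enable: none
cluster.routing.allocation.enable: all

The unassigned shards are then allocated to the other nodes. If the failed node (now holding no data) is restarted after the cluster has turned green, its shard count stays at 0, because rebalancing is off. Only after cluster.routing.rebalance.enable is set back to all are some shards moved to the restarted node.

In short, rebalancing depends on allocation being enabled: with allocation disabled no rebalance can take place (and, by the same reasoning, neither can a manual relocation), and allocation additionally governs whether lost shards are reassigned.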
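The dependency described above can be written down as a tiny Python model. It is a sketch of the behaviour as this article describes it, not of Elasticsearch's decider code; the function names are hypothetical.

```python
def allocation_allows(allocation_enable, is_primary, is_new_index=False):
    """Does cluster.routing.allocation.enable permit allocating this shard?"""
    if allocation_enable == "all":
        return True
    if allocation_enable == "primaries":
        return is_primary
    if allocation_enable == "new_primaries":
        return is_primary and is_new_index
    if allocation_enable == "none":
        return False
    raise ValueError(f"unknown allocation value: {allocation_enable!r}")


def rebalance_allows(rebalance_enable, is_primary):
    """Does cluster.routing.rebalance.enable permit moving this shard?"""
    if rebalance_enable == "all":
        return True
    if rebalance_enable == "primaries":
        return is_primary
    if rebalance_enable == "replicas":
        return not is_primary
    if rebalance_enable == "none":
        return False
    raise ValueError(f"unknown rebalance value: {rebalance_enable!r}")


def can_rebalance_shard(allocation_enable, rebalance_enable, is_primary):
    """A rebalance must also allocate on the target, so both must agree."""
    return (rebalance_allows(rebalance_enable, is_primary)
            and allocation_allows(allocation_enable, is_primary))


if __name__ == "__main__":
    # First scenario: allocation off, rebalance on -> nothing moves.
    print(can_rebalance_shard("none", "all", is_primary=False))
    # Second scenario: allocation on, rebalance off -> lost shards are
    # reassigned, but the restarted empty node receives nothing.
    print(allocation_allows("all", is_primary=False))
    print(can_rebalance_shard("all", "none", is_primary=False))
```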

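3. Disk-based shard allocation limits

Elasticsearch takes a node's available disk space into account when deciding whether to allocate a new shard to that node, and whether to actively relocate shards away from it.

The disk-related settings below are all dynamic: they can be set in elasticsearch.yml or updated through the cluster settings API.

1. cluster.routing.allocation.disk.threshold_enabled

Defaults to true. Set to false to disable the disk allocation decider, so that disk usage is ignored during shard allocation.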

2. cluster.routing.allocation.disk.watermark.low

Controls the low watermark for disk usage. It defaults to 85%, meaning that Elasticsearch will not allocate shards to nodes that have more than 85% disk used. This setting has no effect on the primary shards of newly-created indices but will prevent their replicas from being allocated.

It can also be set to an absolute byte value (like 500mb), which then means the amount of free space that must remain, not the amount already used. This is useful when disks are large. For example, if each node has 3 TB of disk and the operating system really only needs about 50 GB, a percentage is wasteful: 10% of 3 TB is already about 300 GB. In that case the watermark can simply be set to 50gb.

3. cluster.routing.allocation.disk.watermark.high

Controls the high watermark. It defaults to 90%, meaning that Elasticsearch will attempt to relocate shards away from a node whose disk usage is above 90%. It can also be set to an absolute byte value (similarly to the low watermark) to relocate shards away from a node if it has less than the specified amount of free space.

This setting affects the allocation of all shards, whether previously allocated or not.

4. cluster.routing.allocation.disk.watermark.flood_stage

Controls the flood stage watermark. It defaults to 95%, meaning that Elasticsearch enforces a read-only index block (index.blocks.read_only_allow_delete) on every index that has one or more shards allocated on the node that has at least one disk exceeding the flood stage. Such an index accepts only reads and deletes. This is a last resort to prevent nodes from running out of disk space. The index block must be released manually once there is enough disk space available to allow indexing operations to continue.

One point deserves particular attention. The three settings

cluster.routing.allocation.disk.watermark.low
cluster.routing.allocation.disk.watermark.high
cluster.routing.allocation.disk.watermark.flood_stage

must all use the same kind of value: either all percentages or all byte values. You cannot mix the two, so that Elasticsearch can validate that the settings are internally consistent (that is, the low disk threshold is not more than the high disk threshold, and the high disk threshold is not more than the flood stage threshold).

Note also that a percentage refers to used disk space, whereas a byte value refers to the free disk space remaining.
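The consistency rule can be checked before a settings update is sent. The following Python sketch is one possible client-side check, written for this article; `parse_watermark` and `validate_watermarks` are hypothetical helpers, and the unit table covers only the common suffixes.

```python
import re

# Byte-size suffixes, assumed to be powers of 1024.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3, "tb": 1024 ** 4}


def parse_watermark(value):
    """Return ("percent", used_percent) or ("bytes", free_bytes)."""
    text = value.strip().lower()
    if text.endswith("%"):
        return ("percent", float(text[:-1]))
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(b|kb|mb|gb|tb)", text)
    if match is None:
        raise ValueError(f"cannot parse watermark: {value!r}")
    return ("bytes", float(match.group(1)) * _UNITS[match.group(2)])


def validate_watermarks(low, high, flood_stage):
    """Check that the three watermarks share a kind and are ordered."""
    parsed = [parse_watermark(v) for v in (low, high, flood_stage)]
    kinds = {kind for kind, _ in parsed}
    if len(kinds) != 1:
        raise ValueError("percentages and byte values must not be mixed")
    amounts = [amount for _, amount in parsed]
    if kinds == {"percent"}:
        # Percentages describe used space, so they must not decrease.
        ordered = amounts[0] <= amounts[1] <= amounts[2]
    else:
        # Byte values describe free space, so they must not increase.
        ordered = amounts[0] >= amounts[1] >= amounts[2]
    if not ordered:
        raise ValueError("low, high and flood_stage are out of order")
    return kinds.pop()


if __name__ == "__main__":
    print(validate_watermarks("85%", "90%", "95%"))
    print(validate_watermarks("100gb", "50gb", "10gb"))
```

The ordering test flips between the two kinds because a percentage counts used space while a byte value counts free space.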

5. cluster.info.update.interval

How often Elasticsearch should check on disk usage for each node in the cluster. Defaults to 30s.

6. cluster.routing.allocation.disk.include_relocations

Defaults to true, which means that Elasticsearch will take into account shards that are currently being relocated to the target node when computing a node’s disk usage.

Taking relocating shards’ sizes into account may, however, mean that the disk usage for a node is incorrectly estimated on the high side, since the relocation could be 90% complete and a recently retrieved disk usage would include the total size of the relocating shard as well as the space already used by the running relocation.

7. A usage example

To set the low watermark at 100 GB of free space, the high watermark at 50 GB free, and the flood stage watermark at 10 GB free, and to refresh the cluster's disk information every minute:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "100gb",
    "cluster.routing.allocation.disk.watermark.high": "50gb",
    "cluster.routing.allocation.disk.watermark.flood_stage": "10gb",
    "cluster.info.update.interval": "1m"
  }
}

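4. Shard allocation awareness through node attributes

At first glance this configuration looks much like the index-level filters covered earlier, but the two are similar rather than identical.

The settings here apply to the cluster as a whole.

1. Enabling cluster allocation awareness

1. Set a custom attribute on each node. For example, tag every node with a rack_id attribute that names the rack it sits in:

node.attr.rack_id: rack_one, or on the command line: `./bin/elasticsearch -Enode.attr.rack_id=rack_one`

2. Enable the setting in elasticsearch.yml on every master-eligible node:

cluster.routing.allocation.awareness.attributes: rack_id

It can also be set dynamically through the cluster settings API.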

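With this in place, suppose you do the following:

Start 2 nodes configured with node.attr.rack_id: rack_one.

Create an index with 5 primary shards and 1 replica of each primary.

All 10 shards are allocated across these two nodes. A primary and its replica are never placed on the same node, but because both nodes share one rack_id, awareness cannot separate the copies by rack: both copies of every shard sit in rack_one.

If you then add two nodes configured with node.attr.rack_id: rack_two, Elasticsearch moves some shards to the new nodes and ensures, where possible, that the primary and replica of a shard are not on nodes with the same rack_id.

If the rack_two nodes go down, Elasticsearch by default allocates all the shards to the rack_one nodes.

To prevent the primary and replica of a shard from ending up on nodes with the same rack_id in that situation, enable forced awareness.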

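2. What is forced awareness?

Forced awareness stops nodes that share an attribute value from holding both the primary and the replica of a shard. Machines with the same attribute value are assumed to be closely coupled and may fail together, so forced awareness lowers the risk of data loss.

It is configured like this:

cluster.routing.allocation.awareness.attributes: zone
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2

This declares zone1 and zone2 as the values that the awareness attribute zone is expected to take.

Taking the earlier example again:

Start 2 nodes configured with node.attr.zone: zone1.

Create an index with 5 primary shards and 1 replica of each primary.

Only the 5 primaries are allocated across the two nodes. The replicas stay unassigned until a node with node.attr.zone: zone2 joins the cluster.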
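The difference between plain and forced awareness in the two walkthroughs can be reproduced with a short Python model. It assumes that each attribute value may hold at most the ceiling of (copies / number of expected values) copies of a shard, and that every zone has enough distinct nodes; both are simplifying assumptions for this sketch, and the function names are hypothetical.

```python
import math


def max_copies_per_value(copies, present_values, forced_values=()):
    """Assumed cap on copies of one shard per attribute value.

    Forced values count as expected even when no node carries them yet.
    """
    expected = set(present_values) | set(forced_values)
    return math.ceil(copies / len(expected))


def assignable_copies(copies, present_values, forced_values=()):
    """How many copies of one shard can be assigned right now."""
    cap = max_copies_per_value(copies, present_values, forced_values)
    return min(copies, cap * len(set(present_values)))


if __name__ == "__main__":
    copies = 2  # one primary plus one replica
    # Plain awareness, only zone1 nodes running: both copies are assigned.
    print(assignable_copies(copies, ["zone1", "zone1"]))
    # Forced awareness with zone1,zone2 but only zone1 nodes running.
    print(assignable_copies(copies, ["zone1", "zone1"], ["zone1", "zone2"]))
    # A zone2 node joins: the replica can now be assigned as well.
    print(assignable_copies(copies, ["zone1", "zone1", "zone2"], ["zone1", "zone2"]))
```

With forced values configured, the missing zone still counts towards the divisor, which is why the replica waits instead of being packed into zone1.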

5. Cluster-level shard allocation filters

Filters at the cluster level are used much like index-level filters, except that they apply to the whole cluster.

They take the following form:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.exclude._ip": "10.0.0.1"
  }
}

The attribute can be a custom node attribute or one of the built-in _name, _ip, and _host attributes.

The available settings are:

1. include

cluster.routing.allocation.include.{attribute}

Allocate shards to a node whose {attribute} has at least one of the comma-separated values.

2. require

cluster.routing.allocation.require.{attribute}

Only allocate shards to a node whose {attribute} has all of the comma-separated values.

3. exclude

cluster.routing.allocation.exclude.{attribute}

Do not allocate shards to a node whose {attribute} has any of the comma-separated values.

4. Wildcards are also supported

The values accept wildcard patterns (not full regular expressions):

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.exclude._ip": "192.168.2.*"
  }
}
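To make the three rules concrete, here is a minimal Python sketch that evaluates them for a single attribute value, reading the wording above literally (include: any, require: all, exclude: none). The `node_allowed` function is hypothetical, and `fnmatchcase` from the standard library stands in for Elasticsearch's wildcard matching, which is an assumption: it accepts a few more pattern forms than a plain `*`.

```python
from fnmatch import fnmatchcase


def _patterns(spec):
    """Split a comma-separated setting value into trimmed patterns."""
    return [p.strip() for p in spec.split(",") if p.strip()]


def node_allowed(value, include="", require="", exclude=""):
    """Decide whether a node with this attribute value may receive shards."""
    inc, req, exc = _patterns(include), _patterns(require), _patterns(exclude)
    if inc and not any(fnmatchcase(value, p) for p in inc):
        return False  # include: at least one pattern has to match
    if not all(fnmatchcase(value, p) for p in req):
        return False  # require: every pattern has to match
    if any(fnmatchcase(value, p) for p in exc):
        return False  # exclude: no pattern may match
    return True


if __name__ == "__main__":
    # Mirrors the exclude._ip example: the 192.168.2.x node is filtered out.
    print(node_allowed("192.168.2.7", exclude="192.168.2.*"))
    print(node_allowed("10.0.0.1", exclude="192.168.2.*"))
```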
