Mycat Sharding Rules Explained

Published: 2023/11/27

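1. Enumeration sharding

You list the possible enumeration ids in a configuration file and assign each one to a shard yourself. This rule suits specific scenarios, for example businesses that store data by province or county: because the set of provinces and counties nationwide is fixed, such data can be sharded with this rule. The configuration is as follows: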

<tableRule name="sharding-by-intfile"><rule><columns>user_id</columns><algorithm>hash-int</algorithm></rule>
</tableRule>
<function name="hash-int" class="io.mycat.route.function.PartitionByFileMap"><property name="mapFile">partition-hash-int.txt</property><property name="type">0</property><property name="defaultNode">0</property>
</function>

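Configuration notes:

| Tag / attribute | Description |
| ----------- | ----------------------------------------------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| mapFile | Name of the mapping file |
| type | Defaults to 0; 0 means Integer, non-zero means String |
| defaultNode | Default node: a value less than 0 means no default node is set; a value greater than or equal to 0 sets the default node |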

partition-hash-int.txt configuration:

10000=0
10010=1
DEFAULT_NODE=1      // default node

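Note:
The default node handles unrecognized values: during enumeration sharding, a value that is not listed in the mapping file is routed to the default node.
If no default node is configured (a defaultNode value less than 0), an unrecognized enumeration value causes an error such as:
can't find datanode for sharding column:column_name val:ffffffff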
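
The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Mycat's implementation; the helper names `load_enum_map` and `route_enum` are hypothetical, and the sample text mirrors the partition-hash-int.txt content shown earlier.

```python
def load_enum_map(text):
    """Parse 'value=node' lines; returns (mapping, default_node or None)."""
    mapping, default_node = {}, None
    for raw in text.splitlines():
        line = raw.split("//")[0].strip()  # drop trailing // comments
        if not line or line.startswith("#"):
            continue
        key, _, node = line.partition("=")
        if key.strip() == "DEFAULT_NODE":
            default_node = int(node)
        else:
            mapping[int(key)] = int(node)
    return mapping, default_node


def route_enum(value, mapping, default_node=None):
    """Return the node index for an enumeration value."""
    if value in mapping:
        return mapping[value]
    # No usable default node: mirror the "can't find datanode" failure.
    if default_node is None or default_node < 0:
        raise LookupError(f"can't find datanode for sharding column value: {value}")
    return default_node


MAP_TEXT = """10000=0
10010=1
DEFAULT_NODE=1      // default node
"""
mapping, default_node = load_enum_map(MAP_TEXT)
```

With this sample file, `route_enum(10000, mapping, default_node)` yields node 0, while an unlisted value falls back to node 1.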

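2. Fixed-partition hash

This rule resembles a decimal modulo operation, except that it works on the binary representation: it takes the low 10 bits of the id, that is, id & 1111111111 in binary (id & 1023).
The advantage is this: with a decimal modulo, consecutive inserts of ids 1-10 are spread across as many as 10 different shards, which makes transaction control for the insert harder. Because this algorithm works on the binary value, consecutive ids tend to fall into contiguous partitions, which makes transaction control for the insert easier.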

<tableRule name="rule1"><rule><columns>user_id</columns><algorithm>func1</algorithm></rule>
</tableRule>
<function name="func1" class="io.mycat.route.function.PartitionByLong"><property name="partitionCount">2,1</property><property name="partitionLength">256,512</property>
</function>

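Configuration notes:

| Tag / attribute | Description |
| --------------- | -------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| partitionCount | List of partition counts |
| partitionLength | List of partition lengths (ranges) |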

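Partition length:
The default maximum is 2^n = 1024, so at most 1024 partitions are supported.

Constraints:
The count and length arrays must have the same length;
1024 = sum(count[i] * length[i]);
in other words, the dot product of the count and length vectors must always equal 1024.

To distribute data evenly, for example across 4 partitions, set partitionCount * partitionLength = 1024: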

<function name="func1" class="io.mycat.route.function.PartitionByLong"><property name="partitionCount">4</property><property name="partitionLength">256</property>
</function>
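
To make the count/length constraint concrete, here is a minimal Python sketch of the idea: expand the two lists into a 1024-slot lookup table, then index it with the low 10 bits of the id. The function names are hypothetical and the code is an illustration under those assumptions, not Mycat's own code.

```python
def build_segments(counts, lengths, total=1024):
    """Expand partitionCount/partitionLength lists into a slot -> partition table."""
    if len(counts) != len(lengths):
        raise ValueError("count and length lists must have the same length")
    if sum(c * l for c, l in zip(counts, lengths)) != total:
        raise ValueError(f"dot product of counts and lengths must equal {total}")
    segments = []
    partition = 0
    for count, length in zip(counts, lengths):
        for _ in range(count):
            # Each partition owns `length` consecutive slots.
            segments.extend([partition] * length)
            partition += 1
    return segments


def route_fixed_hash(value, segments):
    """Use the low bits of the id (id & 1023 for 1024 slots) as the slot index."""
    return segments[value & (len(segments) - 1)]


# partitionCount=2,1 and partitionLength=256,512 give three partitions.
segments = build_segments([2, 1], [256, 512])
```

With this layout, ids 0-255 go to partition 0, 256-511 to partition 1, and 512-1023 to partition 2, so a batch of ids 1-10 stays in one partition.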

3. Range-based sharding

This rule applies when you plan in advance which range of the sharding column belongs to which shard.

<tableRule name="auto-sharding-long"><rule><columns>user_id</columns><algorithm>rang-long</algorithm></rule>
</tableRule>
<function name="rang-long" class="io.mycat.route.function.AutoPartitionByLong"><property name="mapFile">autopartition-long.txt</property><property name="defaultNode">0</property>
</function>

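Configuration notes:

| Tag / attribute | Description |
| ----------- | -------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| mapFile | Name of the mapping file |
| defaultNode | Default node for values outside the configured ranges |

All node indexes start from 0, so 0 denotes the first node. The configuration is very simple: you assign each expected id range to a shard in advance: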

# range start-end ,data node index
# K=1000,M=10000.
0-500M=0
500M-1000M=1
1000M-1500M=2

or

0-10000000=0
10000001-20000000=1
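
A small Python sketch can show how such a range file might be read and applied. It assumes the unit suffixes stated in the file comment (K=1000, M=10000) and resolves the shared boundary between adjacent ranges by taking the first matching line; both the helper names and that tie-breaking choice are assumptions of this sketch, not confirmed Mycat behaviour.

```python
UNITS = {"K": 1000, "M": 10000}  # as stated in the comment of the mapping file


def parse_bound(token):
    """Turn '500M' into 5000000 and '100' into 100."""
    token = token.strip()
    unit = UNITS.get(token[-1:].upper())
    return int(token[:-1]) * unit if unit else int(token)


def load_ranges(text):
    """Parse 'start-end=node' lines into (start, end, node) tuples."""
    ranges = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        span, _, node = line.partition("=")
        start, _, end = span.partition("-")
        ranges.append((parse_bound(start), parse_bound(end), int(node)))
    return ranges


def route_range(value, ranges, default_node=None):
    """Return the node of the first range containing value."""
    for start, end, node in ranges:
        if start <= value <= end:
            return node
    if default_node is None or default_node < 0:
        raise LookupError(f"no range configured for value {value}")
    return default_node


ranges = load_ranges("""# range start-end ,data node index
0-500M=0
500M-1000M=1
1000M-1500M=2
""")
```

Writing the ranges without overlapping endpoints, as in the second form of the file, avoids depending on any tie-breaking rule.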

4. Modulo

This rule applies a modulo operation to the sharding column.

<tableRule name="mod-long"><rule><columns>user_id</columns><algorithm>mod-long</algorithm></rule>
</tableRule>
<function name="mod-long" class="io.mycat.route.function.PartitionByMod"><!-- how many data nodes --><property name="count">3</property>
</function>

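Configuration notes:

| Tag / attribute | Description |
| --------- | -------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| count | Number of shards |

The rule takes the decimal modulo of the id. Compared with the fixed-partition hash, a batch insert inside a single transaction is likely to touch several data shards, which makes transactional consistency harder to maintain.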
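
The difference for batch inserts is easy to see in a short Python sketch. The function name `route_mod` is hypothetical, and the comparison line assumes a fixed-partition layout of four partitions of 256 slots each.

```python
def route_mod(value, count):
    """Decimal modulo routing: node index is value % count."""
    return value % count


batch = range(1, 11)  # ids 1..10 inserted in one transaction

# With count=3, the batch touches every node.
mod_nodes = sorted({route_mod(i, 3) for i in batch})

# For comparison: low 10 bits, four partitions of 256 slots each.
fixed_hash_nodes = sorted({(i & 1023) // 256 for i in batch})
```

Here `mod_nodes` covers all three nodes, while `fixed_hash_nodes` contains a single partition, which is why the modulo rule makes single-transaction batch inserts harder to keep consistent.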

5. Sharding by date (day)

This rule shards by day.

<tableRule name="sharding-by-date"><rule><columns>create_time</columns><algorithm>sharding-by-date</algorithm></rule>
</tableRule>
<function name="sharding-by-date" class="io.mycat.route.function.PartitionByDate"><property name="dateFormat">yyyy-MM-dd</property><property name="sBeginDate">2014-01-01</property><property name="sEndDate">2014-01-02</property><property name="sPartionDay">10</property>
</function>

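Configuration notes:

| Tag / attribute | Description |
| ----------- | -------------------------------------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| dateFormat | Date format |
| sBeginDate | Start date |
| sEndDate | End date |
| sPartionDay | Days per partition; counting from the start date, every 10 days form one partition in this example |

If sEndDate is configured, then once the data reaches the shard for that date, inserts wrap around and start again from the first shard.
Note:
When querying a time range, use between ... and; using >= or <= queries all shards.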
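
The day arithmetic can be sketched as follows in Python. The function name and the exact wrap-around rule (modulo the number of partitions between the start and end dates) are assumptions made for illustration; check them against the Mycat version you run.

```python
from datetime import datetime

DAY_FORMAT = "%Y-%m-%d"  # Python equivalent of dateFormat yyyy-MM-dd


def route_by_date(value, begin, days_per_partition, end=None):
    """Partition index = whole days since begin // days per partition."""
    target = datetime.strptime(value, DAY_FORMAT)
    start = datetime.strptime(begin, DAY_FORMAT)
    index = (target - start).days // days_per_partition
    if end is not None:
        last = datetime.strptime(end, DAY_FORMAT)
        partitions = (last - start).days // days_per_partition + 1
        if target > last:
            # Past the end date: wrap around and reuse the existing partitions.
            index %= partitions
    return index
```

For example, with a start date of 2014-01-01, 10 days per partition and an end date of 2014-01-30, there are three partitions, and a row dated 2014-02-05 wraps back to partition 0.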

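6. Modulo with range constraints

This rule combines the modulo operation with range constraints. Its main purpose is to prepare for later data migration, because you decide yourself how the modulo results are distributed across nodes.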

<tableRule name="sharding-by-pattern"><rule><columns>user_id</columns><algorithm>sharding-by-pattern</algorithm></rule>
</tableRule>
<function name="sharding-by-pattern" class="io.mycat.route.function.PartitionByPattern"><property name="patternValue">256</property><property name="defaultNode">2</property><property name="mapFile">partition-pattern.txt</property>
</function>

partition-pattern.txt

# id partition range start-end ,data node index
###### first host configuration
1-32=0
33-64=1
65-96=2
97-128=3
######## second host configuration
129-160=4
161-192=5
193-224=6
225-256=7
0-0=7

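Configuration notes:

| Tag / attribute | Description |
| ------------ | -------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| patternValue | Modulus base |
| defaultNode | Default node |
| mapFile | Path of the mapping file |

In the mapping file, 1-32 is a range of id % 256 results: a result between 1 and 32 goes to the first partition (node index 0), and so on for the other ranges.
If the id is not numeric, the row is assigned to the defaultNode.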
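
As an illustration, the routing can be sketched in Python using the ranges from partition-pattern.txt above. The function name is hypothetical, and treating anything that is not a plain run of ASCII digits as "non-numeric" is an assumption of this sketch.

```python
# (start, end, node) triples copied from partition-pattern.txt
PATTERN_RANGES = [
    (1, 32, 0), (33, 64, 1), (65, 96, 2), (97, 128, 3),
    (129, 160, 4), (161, 192, 5), (193, 224, 6), (225, 256, 7),
    (0, 0, 7),
]


def route_pattern(value, ranges=PATTERN_RANGES, pattern_value=256, default_node=2):
    """Take value % pattern_value and look the remainder up in the ranges."""
    text = str(value)
    if not (text.isascii() and text.isdigit()):
        return default_node  # non-numeric ids go to the default node
    remainder = int(text) % pattern_value
    for start, end, node in ranges:
        if start <= remainder <= end:
            return node
    return default_node
```

Because the node for each remainder range is chosen in the file, ranges can later be reassigned to other nodes without changing the modulus.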

7. Prefix hash modulo with range constraints

This rule is similar to modulo with range constraints, but it also supports values that contain digits, symbols, and letters.

<tableRule name="sharding-by-prefixpattern"><rule><columns>user_id</columns><algorithm>sharding-by-prefixpattern</algorithm></rule>
</tableRule>
<function name="sharding-by-prefixpattern" class="io.mycat.route.function.PartitionByPrefixPattern"><property name="patternValue">256</property><property name="prefixLength">5</property><property name="mapFile">partition-pattern.txt</property>
</function>

Configuration notes:

| Tag / attribute | Description |
| ------------ | -------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| patternValue | Modulus base |
| prefixLength | Number of leading characters whose ASCII codes are used |
| mapFile | Path of the mapping file |

partition-pattern.txt

# range start-end ,data node index
# ASCII
# 48-57=0-9 (Arabic digits)
# 64, 65-90=@, A-Z
# 97-122=a-z
###### first host configuration
1-4=0
5-8=1
9-12=2
13-16=3
###### second host configuration
17-20=4
21-24=5
25-28=6
29-32=7
0-0=7

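In the mapping file, each range (1-4, 5-8, and so on up to 29-32) covers values of the modulo result: a result between 1 and 4 goes to the first partition (node index 0), and so on for the other ranges.
This method resembles modulo with range constraints, except that it sums the ASCII codes of the first prefixLength characters of the column value and takes the modulo of that sum:
sum % patternValue gives a value, and the shard is the one whose range contains it.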
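
The ASCII-sum step can be sketched in Python as below. Note one deliberate assumption: the sketch uses a modulus of 32 so that every remainder falls inside the 0-32 ranges of the sample mapping file, whereas the configuration above sets patternValue to 256. The function name and the fallback node are also hypothetical.

```python
# (start, end, node) triples copied from the sample mapping file
PREFIX_RANGES = [
    (1, 4, 0), (5, 8, 1), (9, 12, 2), (13, 16, 3),
    (17, 20, 4), (21, 24, 5), (25, 28, 6), (29, 32, 7),
    (0, 0, 7),
]


def route_prefix_pattern(value, prefix_length=5, pattern_value=32,
                         ranges=PREFIX_RANGES, default_node=0):
    """Sum the character codes of the prefix, then apply modulo and range lookup."""
    ascii_sum = sum(ord(ch) for ch in value[:prefix_length])
    remainder = ascii_sum % pattern_value
    for start, end, node in ranges:
        if start <= remainder <= end:
            return node
    return default_node  # assumption: fall back when no range matches
```

For the value "gf89f9a", the first five characters sum to 103 + 102 + 56 + 57 + 102 = 420, and 420 % 32 = 4, which lies in the range 1-4 and therefore maps to node 0.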

8. Application-specified sharding

With this rule, the application decides at run time which shard a row is routed to.

<tableRule name="sharding-by-substring"><rule><columns>user_id</columns><algorithm>sharding-by-substring</algorithm></rule>
</tableRule>
<function name="sharding-by-substring" class="io.mycat.route.function.PartitionDirectBySubString"><property name="startIndex">0</property><!-- zero-based --><property name="size">2</property><property name="partitionCount">8</property><property name="defaultPartition">0</property>
</function>

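Configuration notes:

| Tag / attribute | Description |
| ---------------- | -------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| partitionCount | Number of partitions |
| defaultPartition | Default partition |

This method computes the partition number directly from a substring of the column value, which must be numeric. The application passes the value and thereby specifies the partition explicitly.

For example, with id=05-100000002 and this configuration, 2 digits are taken starting at startIndex=0 (size=2), giving 05, so the row goes to partition 5. If no partition number is supplied, the row goes to defaultPartition.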
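
A minimal Python sketch of this rule follows. The function name is hypothetical, and reducing the parsed number modulo partitionCount is an assumption of the sketch rather than something the text above states.

```python
def route_by_substring(value, start_index=0, size=2,
                       partition_count=8, default_partition=0):
    """Read the partition number directly from a numeric substring of the id."""
    digits = value[start_index:start_index + size]
    if len(digits) != size or not (digits.isascii() and digits.isdigit()):
        return default_partition  # missing or non-numeric prefix
    partition = int(digits)
    # Assumption: keep the result inside the configured number of partitions.
    return partition % partition_count if partition_count > 0 else partition
```

Calling `route_by_substring("05-100000002")` extracts "05" and routes the row to partition 5.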

9. Hash on a string slice

This rule hashes an int value sliced out of a string and shards on the result.

<tableRule name="sharding-by-stringhash"><rule><columns>user_id</columns><algorithm>sharding-by-stringhash</algorithm></rule>
</tableRule>
<function name="sharding-by-stringhash" class="io.mycat.route.function.PartitionByString"><property name="partitionLength">512</property><!-- zero-based --><property name="partitionCount">2</property><property name="hashSlice">0:2</property>
</function>

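Configuration notes:

| Tag / attribute | Description |
| --------------- | ----------------------------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| partitionLength | Modulus base for the string hash |
| partitionCount | Number of partitions |
| hashSlice | The slice to hash, that is, the hash is computed over the int value in this substring. 0 means str.length(), -1 means str.length()-1 |

Note:
hashSlice can be read as substring(start, end); a start of 0 simply means index 0.
Example 1: for the value "45abc" with hashSlice 0:2, the hash is computed over 45.
Example 2: for the value "aaaabbb2345" with hashSlice -4:0, the hash is computed over 2345.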
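
The slice semantics are the subtle part of this rule, so here is a small Python sketch of how a start:end specification could be resolved. It covers only the slicing, not the hash itself, and the helper names are hypothetical.

```python
def parse_hash_slice(spec):
    """Split a 'start:end' specification into two integers."""
    start, _, end = spec.partition(":")
    return int(start or 0), int(end or 0)


def slice_for_hash(value, spec):
    """Resolve the slice: end 0 means len(value), negative values count from the end."""
    start, end = parse_hash_slice(spec)
    length = len(value)
    if start < 0:
        start += length
    if end <= 0:
        end += length
    # Clamp both bounds to the string.
    start = max(0, min(start, length))
    end = max(0, min(end, length))
    return value[start:end] if start < end else ""
```

This reproduces the two examples: `slice_for_hash("45abc", "0:2")` gives "45", and `slice_for_hash("aaaabbb2345", "-4:0")` gives "2345".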

10. Consistent hash

Consistent hashing effectively addresses the problem of scaling out distributed data.

<tableRule name="sharding-by-murmur"><rule><columns>user_id</columns><algorithm>murmur</algorithm></rule>
</tableRule>
<function name="murmur" class="io.mycat.route.function.PartitionByMurmurHash">
<!-- defaults to 0 --><property name="seed">0</property>
<!-- number of database nodes to shard across; required, otherwise sharding cannot work --><property name="count">2</property>
<!-- each physical database node is mapped to this many virtual nodes; the default is 160, so there are 160 times as many virtual nodes as physical nodes --><property name="virtualBucketTimes">160</property>
<!-- node weights; a node without a specified weight defaults to 1. Written in properties-file format: the key is the node index, an integer from 0 to count-1, and the value is the node weight. Every weight must be a positive integer, otherwise 1 is used instead --><property name="weightMapFile">weightMapFile</property>
<!-- for observing the distribution of physical and virtual nodes during testing: if this property is set, the mapping from each virtual node's murmur hash to its physical node is written to this file line by line. There is no default; if it is not set, nothing is written --><property name="bucketMapPath">/etc/mycat/bucketMapPath</property>
</function>
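
Why consistent hashing helps with scaling out can be shown with a minimal hash-ring sketch in Python. This is not Mycat's code: it substitutes a truncated MD5 for MurmurHash, ignores seed and weights, and uses hypothetical names throughout.

```python
import bisect
import hashlib


def _hash(key):
    # Stand-in hash (MD5 truncated to 64 bits); Mycat itself uses MurmurHash.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, node_count, virtual_per_node=160):
        # Every physical node contributes virtual_per_node points on the ring.
        self._points = sorted(
            (_hash(f"node-{node}-vn-{replica}"), node)
            for node in range(node_count)
            for replica in range(virtual_per_node)
        )
        self._hashes = [h for h, _ in self._points]

    def route(self, key):
        # The first ring point at or after the key's hash owns the key (wrapping around).
        idx = bisect.bisect_left(self._hashes, _hash(str(key)))
        return self._points[idx % len(self._points)][1]


keys = [f"user-{i}" for i in range(1000)]
before = HashRing(2)
after = HashRing(3)
moved = [k for k in keys if before.route(k) != after.route(k)]
```

When the ring grows from 2 to 3 nodes, only part of the keys change owner, and every key in `moved` now belongs to the new node; keys never shuffle between the existing nodes.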

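11. Sharding by hour within a single month

This rule splits the data of a single month by hour. The finest granularity is one hour, so a day can have at most 24 shards and at least 1. When a month ends, the next month wraps around and starts again from the first shard, so the data has to be cleaned up manually at the end of every month.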

<tableRule name="sharding-by-hour"><rule><columns>create_time</columns><algorithm>sharding-by-hour</algorithm></rule>
</tableRule>
<function name="sharding-by-hour" class="io.mycat.route.function.LatestMonthPartion"><property name="splitOneDay">24</property>
</function>

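Configuration notes:

| Tag / attribute | Description |
| ----------- | -------------------------------------------- |
| columns | The table column to shard on (a string in yyyyMMddHH format) |
| algorithm | The sharding function |
| splitOneDay | Number of shards per day |

Note:
The sharding column must be a string; otherwise sharding fails and the row is stored in the first shard by default.
The stored time must be exactly in 'yyyyMMddHH' format, with no extra or missing characters; otherwise sharding fails and the row is stored in the first shard by default.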
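
A possible reading of this rule in Python: the day of the month selects a block of splitOneDay shards, and the hour selects the shard within that block. The function name and the exact index formula are assumptions made for illustration; the fallback to shard 0 follows the note above.

```python
def route_by_hour(value, split_one_day=24):
    """Map a 'yyyyMMddHH' string to a shard index within the month."""
    if not isinstance(value, str) or len(value) != 10:
        return 0  # wrong type or length: falls into the first shard
    if not (value.isascii() and value.isdigit()):
        return 0
    day, hour = int(value[6:8]), int(value[8:10])
    if not (1 <= day <= 31 and 0 <= hour <= 23):
        return 0
    hours_per_shard = 24 // split_one_day
    # Assumed formula: a block of split_one_day shards per day of the month.
    return (day - 1) * split_one_day + hour // hours_per_shard
```

With splitOneDay=24, the value "2019010200" (2 January, hour 00) maps to shard 24 under this formula, which means a 31-day month needs up to 744 shards.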

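12. Range-then-modulo sharding

This rule first applies range sharding to find the shard group, then takes the modulo within the group.
Its advantage is that it avoids data migration when scaling out while also reducing the hotspot problem of pure range sharding.
It combines the strengths of range sharding and modulo sharding: the modulo inside a shard group keeps the data in that group fairly even, while range sharding between groups still supports range queries.
It is best to plan the number of shards in advance. If you scale out by adding whole shard groups, the data in existing shard groups does not need to be migrated. Because data within a shard group is fairly even, hot data inside a group is avoided as well.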

<tableRule name="auto-sharding-rang-mod"><rule><columns>id</columns><algorithm>rang-mod</algorithm></rule>
</tableRule>
<function name="rang-mod" class="io.mycat.route.function.PartitionByRangeMod"><property name="mapFile">partition-range-mod.txt</property><property name="defaultNode">21</property>
</function>

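Configuration notes:

| Tag / attribute | Description |
| ----------- | ------------------------------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| mapFile | Path of the mapping file |
| defaultNode | Index of the default node for values outside the configured ranges; node indexes start from 0 |

partition-range-mod.txt

# Each range below is one shard group; the number after = is the number of shards that group owns.
# range start-end ,data node group size
0-200M=5 // 5 shard nodes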
200M1-400M=1
400M1-600M=4
600M1-800M=4
800M1-1000M=6

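Note:
In the example above, ids in 0-200M are stored across 5 shards. Each line has the form start-end=number of shards in that shard group. If ids exceed the configured ranges, add more shard groups.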
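
The two-step lookup can be sketched in Python as follows. The group boundaries are written as plain integers chosen for illustration (they are not parsed from the file above), and the function name is hypothetical.

```python
# Hypothetical groups: (range start, range end, number of shards in the group)
GROUPS = [
    (0, 2_000_000, 5),
    (2_000_001, 4_000_000, 1),
    (4_000_001, 6_000_000, 4),
]


def route_range_mod(value, groups=GROUPS, default_node=None):
    """Find the shard group by range, then pick a shard inside it by modulo."""
    first_node = 0  # index of the first shard of the current group
    for start, end, size in groups:
        if start <= value <= end:
            return first_node + value % size
        first_node += size
    if default_node is None or default_node < 0:
        raise LookupError(f"no shard group configured for value {value}")
    return default_node
```

Adding a new group at the end only appends new node indexes, so rows in the earlier groups keep their shard.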

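13. Date-range hash sharding

The idea is the same as range-then-modulo sharding, but because taking the modulo of dates tends to concentrate data, a hash is used instead.
Rows are first grouped by date and then hashed by time, so that data within a short period is spread more evenly.
The advantage is that it avoids data migration when scaling out while also reducing the hotspot problem of pure range sharding. The date format should be as precise as possible; otherwise the data will not be spread evenly within a group.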

<tableRule name="range-date-hash"><rule><columns>col_date</columns><algorithm>range-date-hash</algorithm></rule>
</tableRule>
<function name="range-date-hash" class="io.mycat.route.function.PartitionByRangeDateHash"><property name="sBeginDate">2014-01-01 00:00:00</property><property name="sPartionDay">365</property><property name="dateFormat">yyyy-MM-dd HH:mm:ss</property><property name="groupPartionSize">3</property>
</function>

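Configuration notes:

| Tag / attribute | Description |
| ---------------- | -------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| sBeginDate | Start date |
| sPartionDay | Number of days per shard |
| dateFormat | Date format |
| groupPartionSize | Size of a shard group |

Note:
Counting from sBeginDate, every sPartionDay days of data form one shard group, and each shard group can be spread across groupPartionSize shards. In the example above, sharding covers at most three days; data beyond that raises the following exception.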

Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Can't find a valid data node for specified node index :ALAN_TEST -> RANGE_DATE -> 2019-01-11 12:00:00 -> Index : 4
The error may involve com.mycat.test.model.AlanTest.insert-Inline
The error occurred while setting parameters
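
How a timestamp might be turned into a node index under this rule is sketched below in Python. The group index comes from whole days since the start date; for the position inside the group, the sketch uses elapsed seconds modulo the group size as a simple stand-in for the hash, which is an assumption and not the hash Mycat uses.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # Python equivalent of yyyy-MM-dd HH:mm:ss


def route_range_date_hash(value, begin="2014-01-01 00:00:00",
                          days_per_group=365, group_size=3):
    """Group by date range first, then spread rows inside the group."""
    elapsed = datetime.strptime(value, TS_FORMAT) - datetime.strptime(begin, TS_FORMAT)
    group = elapsed.days // days_per_group
    # Stand-in for the hash step: elapsed seconds modulo the group size.
    inner = int(elapsed.total_seconds()) % group_size
    return group * group_size + inner
```

If the resulting index is larger than the number of data nodes configured for the table, no node can be found for it, which is the situation the exception above reports.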

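14. Hot/cold data sharding

Log data is queried by date and split into hot and cold data: rows from the most recent n months are queried from the real-time transaction database, and rows older than n months are sharded by every m days.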

<tableRule name="sharding-by-date"><rule><columns>create_time</columns><algorithm>sharding-by-hotdate</algorithm></rule>
</tableRule>
<function name="sharding-by-hotdate" class="io.mycat.route.function.PartitionByHotDate"><property name="dateFormat">yyyy-MM-dd</property><property name="sLastDay">10</property><property name="sPartionDay">30</property>
</function>

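Configuration notes:

| Tag / attribute | Description |
| ----------- | -------------------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| dateFormat | Date format |
| sLastDay | Time span of the hot data |
| sPartionDay | Days per shard for cold data (sharded by number of days) |

Note:
Cold data is sharded according to this range. With the rule configured above, suppose today is 21 January 2019. Going back 10 days gives 12 January 2019, so data before 12 January 2019 is cold. Cold data is sharded at 30 days per shard: data from 2018-12-12 to 2019-01-11 goes into the 1st shard, data from 2018-11-12 to 2018-12-11 goes into the 2nd shard, and so on. If there are not enough database partitions, the following exception is thrown when saving: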

Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Can't find a valid data node for specified node index :ALAN_TEST -> CREATE_DATE -> 2018-11-09 12:00:00 -> Index : 3
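
The hot/cold split can be sketched in Python with date-only arithmetic. The function name, the choice of node 0 for hot data, and the exact boundary handling are assumptions of this sketch; the real boundaries depend on the time of day, so they may differ by a day from the ranges quoted in the note.

```python
from datetime import date, timedelta


def route_hot_cold(day, today, hot_days=10, cold_days_per_shard=30):
    """Hot rows go to node 0; cold rows are split every cold_days_per_shard days."""
    age = (today - day).days + 1  # today itself counts as day 1
    if age <= hot_days:
        return 0
    # Assumption: cold shards are counted backwards from today - hot_days.
    boundary = today - timedelta(days=hot_days)
    return (boundary - day).days // cold_days_per_shard + 1
```

With today set to 2019-01-21, a row dated 2018-11-09 gets index 3 under this sketch, the same index that appears in the exception message above.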

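15. Natural-month sharding

This rule partitions by a month column, with one shard per calendar month. It also serves as an example of how between operations are parsed.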

<tableRule name="sharding-by-month"><rule><columns>create_time</columns><algorithm>sharding-by-month</algorithm></rule>
</tableRule>
<function name="sharding-by-month" class="io.mycat.route.function.PartitionByMonth"><property name="dateFormat">yyyy-MM-dd</property><property name="sBeginDate">2014-01-01</property>
</function>

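Configuration notes:

| Tag / attribute | Description |
| ---------- | -------------------- |
| columns | The table column to shard on |
| algorithm | The sharding function |
| dateFormat | Date format |
| sBeginDate | Start date (no default) |
| sEndDate | End date (no default) |

Note:

By default, the number of nodes must be 12, and inserts wrap around to the first shard every 12 months.
If sBeginDate is configured (for example 2019-01), that month is shard 0 and the index increases by one per month, with no upper limit on nodes.
With sBeginDate = "2015-01-01" and sEndDate = "2015-12-01", the configuration behaves the same as the first case.
With sBeginDate = "2015-01-01" and sEndDate = "2015-03-01", there are only 3 nodes; the shards no longer correspond neatly to months, and data is spread evenly across the 3 nodes.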
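
The three cases in the note can be sketched in Python. The function name is hypothetical, and the wrap-around rule (modulo the number of months between the start and end dates, inclusive) is an assumption made for illustration.

```python
from datetime import date


def route_by_month(day, begin=None, end=None):
    """Shard index by calendar month, optionally relative to a start date."""
    if begin is None:
        return day.month - 1  # default: 12 shards, January is shard 0
    index = (day.year - begin.year) * 12 + (day.month - begin.month)
    if end is not None:
        # Months between begin and end (inclusive) form the available shards.
        partitions = (end.year - begin.year) * 12 + (end.month - begin.month) + 1
        index %= partitions
    return index
```

With a start of 2015-01-01 and an end of 2015-03-01, only three shards exist, so April 2015 wraps back to shard 0 and the shards stop lining up with specific months.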

Reposted from: https://www.cnblogs.com/alan319/p/10556979.html
