

Redis: Shrinking, Expanding, and Incremental Rehash

Published 2023/12/14 in Databases, by 豆豆

Contents

1. Shrinking and expanding

2. Incremental rehash


1. Shrinking and expanding

As Redis operations keep executing, the number of key-value pairs held in a hash table gradually grows or shrinks. To keep the hash table's load factor (ratio) within a reasonable range, the program expands or shrinks the table when it holds too many or too few key-value pairs.

ratio = ht[0].used / ht[0].size

For example, if the hash table's size is 4 and 4 key-value pairs have already been inserted, the ratio is 1.

Redis's default load-factor threshold is 1, but the load factor may be allowed to climb as high as 5: during persistence Redis forks a child process and tries not to allocate memory (to avoid copy-on-write overhead), so the source code only forces an immediate expansion once the number of entries exceeds five times the table size (used/size > 5).

Expanding and shrinking the hash table are carried out by a rehash operation. Redis's rehash policy for a dictionary's hash tables is as follows:

1. If the ratio falls below 0.1 (i.e. less than 10% of buckets are filled), the hash table is shrunk.

#define HASHTABLE_MIN_FILL 10 /* Minimal hash table fill 10% */

/* Expand or create the hash table,
 * when malloc_failed is non-NULL, it'll avoid panic if malloc fails
 * (in which case it'll be set to 1).
 * Returns DICT_OK if expand was performed, and DICT_ERR if skipped. */
int _dictExpand(dict *d, unsigned long size, int* malloc_failed)
{
    if (malloc_failed) *malloc_failed = 0;

    /* the size is invalid if it is smaller than the number of
     * elements already inside the hash table */
    if (dictIsRehashing(d) || d->ht[0].used > size)
        return DICT_ERR;

    dictht n; /* the new hash table */
    unsigned long realsize = _dictNextPower(size);

    /* Rehashing to the same table size is not useful. */
    if (realsize == d->ht[0].size) return DICT_ERR;

    /* Allocate the new hash table and initialize all pointers to NULL */
    n.size = realsize;
    n.sizemask = realsize-1;
    if (malloc_failed) {
        n.table = ztrycalloc(realsize*sizeof(dictEntry*));
        *malloc_failed = n.table == NULL;
        if (*malloc_failed)
            return DICT_ERR;
    } else
        n.table = zcalloc(realsize*sizeof(dictEntry*));

    n.used = 0;

    /* Is this the first initialization? If so it's not really a rehashing
     * we just set the first hash table so that it can accept keys. */
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    /* Prepare a second hash table for incremental rehashing */
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}

/* return DICT_ERR if expand was not performed */
int dictExpand(dict *d, unsigned long size) {
    return _dictExpand(d, size, NULL);
}

/* Resize the table to the minimal size that contains all the elements,
 * but with the invariant of a USED/BUCKETS ratio near to <= 1 */
int dictResize(dict *d)
{
    unsigned long minimal;

    if (!dict_can_resize || dictIsRehashing(d)) return DICT_ERR;
    minimal = d->ht[0].used;
    if (minimal < DICT_HT_INITIAL_SIZE)
        minimal = DICT_HT_INITIAL_SIZE;
    return dictExpand(d, minimal);
}

int htNeedsResize(dict *dict) {
    long long size, used;

    size = dictSlots(dict);
    used = dictSize(dict);
    return (size > DICT_HT_INITIAL_SIZE &&
            (used*100/size < HASHTABLE_MIN_FILL));
}

/* When the hash table's fill drops to 10%, shrink it and start a rehash. */
/* If the percentage of used slots in the HT reaches HASHTABLE_MIN_FILL
 * we resize the hash table to save memory */
void tryResizeHashTables(int dbid) {
    if (htNeedsResize(server.db[dbid].dict))
        dictResize(server.db[dbid].dict);
    if (htNeedsResize(server.db[dbid].expires))
        dictResize(server.db[dbid].expires);
}

2. If the server is not currently running a BGSAVE or BGREWRITEAOF command and the hash table's load factor is greater than or equal to 1, the table is expanded, with a requested size based on the current ht[0].used * 2.

3. If the server is currently running a BGSAVE or BGREWRITEAOF command and the hash table's load factor is greater than or equal to 5, the table is likewise expanded, with a requested size based on the current ht[0].used * 2.

The statement above is slightly imprecise: the requested size is indeed passed in that way, but the actual new size is rounded up to a power of two. For example, if ht[0].used is 5 and 10 is passed in, the table actually grows to 2^4 = 16; the real expanded size is always 2^n.

/* Our hash table capability is a power of two */
static unsigned long _dictNextPower(unsigned long size)
{
    unsigned long i = DICT_HT_INITIAL_SIZE;

    if (size >= LONG_MAX) return LONG_MAX + 1LU;
    while(1) {
        if (i >= size)
            return i;
        i *= 2; /* The final size is not the requested size (ht[0].used*2)
                 * but the nearest power of two (2^n) no smaller than it. */
    }
}

#define dictIsRehashing(d) ((d)->rehashidx != -1)

/* Because we may need to allocate huge memory chunk at once when dict
 * expands, we will check this allocation is allowed or not if the dict
 * type has expandAllowed member function. */
static int dictTypeExpandAllowed(dict *d) {
    if (d->type->expandAllowed == NULL) return 1;
    return d->type->expandAllowed(
                    _dictNextPower(d->ht[0].used + 1) * sizeof(dictEntry*),
                    (double)d->ht[0].used / d->ht[0].size);
}

/* Expand the hash table if needed */
static int _dictExpandIfNeeded(dict *d)
{
    /* Incremental rehashing already in progress. Return. */
    if (dictIsRehashing(d)) return DICT_OK;

    /* If the hash table is empty expand it to the initial size. */
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

    /* If we reached the 1:1 ratio, and we are allowed to resize the hash
     * table (global setting) or we should avoid it but the ratio between
     * elements/buckets is over the "safe" threshold, we resize doubling
     * the number of buckets. */
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio) &&
        dictTypeExpandAllowed(d))
    {
        return dictExpand(d, d->ht[0].used + 1);
    }
    return DICT_OK;
}

The expansion proceeds as follows:

1. Allocate space of the appropriate size for the dictionary's ht[1] hash table;

2. Rehash all key-value pairs in ht[0] into ht[1]: rehashing means recomputing each key's hash value and index, then placing the key-value pair into its slot in ht[1];

3. Once every key-value pair in ht[0] has been migrated to ht[1] (ht[0] becomes an empty table), release ht[0], make ht[1] the new ht[0], and create a fresh empty hash table at ht[1] in preparation for the next rehash.


2. Incremental rehash

Expanding or shrinking the hash table requires rehashing every key-value pair in ht[0] into ht[1]. This rehash, however, is not done all at once in a single burst; it is completed over many small, incremental steps.

The reason is this: if ht[0] holds only four key-value pairs, the server can rehash them all into ht[1] in an instant. But if the table holds not four but four million, forty million, or even four hundred million key-value pairs, rehashing them all at once would involve so much computation that the server could stop serving requests for a noticeable period.

Therefore, to prevent rehashing from hurting server performance, the server does not migrate everything in ht[0] to ht[1] in one go; instead it moves the key-value pairs from ht[0] into ht[1] gradually, over many steps.

The detailed steps of incremental rehash are:

1. Allocate space for ht[1], so that the dictionary holds both ht[0] and ht[1] at the same time.

2. Maintain an index counter variable rehashidx in the dictionary and set its value to 0, marking the formal start of the rehash.

3. While the rehash is in progress, every add, delete, lookup, or update performed on the dictionary, besides carrying out the requested operation, also migrates all key-value pairs in the bucket at index rehashidx of ht[0] over to ht[1]; once that bucket's migration is done, the program increments rehashidx by one.

4. As dictionary operations keep executing, at some point every key-value pair in ht[0] will have been rehashed into ht[1]; the program then sets rehashidx to -1, indicating the rehash is complete.

The benefit of incremental rehash is that it divides and conquers: the work of rehashing the key-value pairs is amortized over every CRUD operation on the dictionary, and the server can even let a background timer help out, doing only about a millisecond of rehash work per event-loop cycle. This avoids the huge burst of computation a one-shot rehash would cause.

Take dictAddRaw as an example:

#define dictIsRehashing(d) ((d)->rehashidx != -1)

/* Low level add or find:
 * This function adds the entry but instead of setting a value returns the
 * dictEntry structure to the user, that will make sure to fill the value
 * field as they wish.
 *
 * This function is also directly exposed to the user API to be called
 * mainly in order to store non-pointers inside the hash value, example:
 *
 * entry = dictAddRaw(dict,mykey,NULL);
 * if (entry != NULL) dictSetSignedIntegerVal(entry,1000);
 *
 * Return values:
 *
 * If key already exists NULL is returned, and "*existing" is populated
 * with the existing entry if existing is not NULL.
 *
 * If key was added, the hash entry is returned to be manipulated by the caller. */
dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing)
{
    long index;
    dictEntry *entry;
    dictht *ht;

    if (dictIsRehashing(d)) _dictRehashStep(d); /* if a rehash is still pending, perform one rehash step */

    /* Get the index of the new element, or -1 if
     * the element already exists. */
    if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
        return NULL;

    /* Allocate the memory and store the new entry.
     * Insert the element in top, with the assumption that in a database
     * system it is more likely that recently added entries are accessed
     * more frequently. */
    ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
    entry = zmalloc(sizeof(*entry)); /* allocate the new entry (it goes into ht[1] while rehashing) */
    entry->next = ht->table[index];
    ht->table[index] = entry;
    ht->used++;

    /* Set the hash entry fields. */
    dictSetKey(d, entry, key);
    return entry;
}

As the add path above shows, the function first checks whether a rehash is still pending; if so, it performs one rehash step along the way. So how much does each step actually move?

/* This function performs just a step of rehashing, and only if hashing has
 * not been paused for our hash table. When we have iterators in the
 * middle of a rehashing we can't mess with the two hash tables otherwise
 * some element can be missed or duplicated.
 *
 * This function is called by common lookup or update operations in the
 * dictionary so that the hash table automatically migrates from H1 to H2
 * while it is actively used. */
static void _dictRehashStep(dict *d) {
    if (d->pauserehash == 0) dictRehash(d,1);
}

/* Performs N steps of incremental rehashing. Returns 1 if there are still
 * keys to move from the old to the new hash table, otherwise 0 is returned.
 *
 * Note that a rehashing step consists in moving a bucket (that may have more
 * than one key as we use chaining) from the old to the new hash table, however
 * since part of the hash table may be composed of empty spaces, it is not
 * guaranteed that this function will rehash even a single bucket, since it
 * will visit at max N*10 empty buckets in total, otherwise the amount of
 * work it does would be unbound and the function may block for a long time. */
int dictRehash(dict *d, int n) {
    int empty_visits = n*10; /* Max number of empty buckets to visit. */
    if (!dictIsRehashing(d)) return 0;

    while(n-- && d->ht[0].used != 0) {
        dictEntry *de, *nextde;

        /* Note that rehashidx can't overflow as we are sure there are more
         * elements because ht[0].used != 0 */
        assert(d->ht[0].size > (unsigned long)d->rehashidx);
        while(d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        de = d->ht[0].table[d->rehashidx];
        /* Move all the keys in this bucket from the old to the new hash HT */
        while(de) {
            uint64_t h;

            nextde = de->next;
            /* Get the index in the new hash table */
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    /* Check if we already rehashed the whole table... */
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);   /* release ht[0]'s space once everything has moved to ht[1] */
        d->ht[0] = d->ht[1];     /* make the old ht[1] the new ht[0] */
        _dictReset(&d->ht[1]);   /* reset ht[1] to an empty table */
        d->rehashidx = -1;
        return 0;
    }

    /* More to rehash... */
    return 1;
}

From the code we can see that _dictRehashStep calls dictRehash(d, 1): each step migrates at most one bucket, together with every key chained in it, and scans at most 1*10 = 10 empty buckets before giving up, so the cost of any single step stays bounded.
