Redis Source Code Analysis: Dictionary Traversal

Published: 2023/11/27

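The previous two posts covered the basics of the dictionary library; this one covers its traversal operations. Traversal gets a post of its own because it differs noticeably from the basic operations discussed earlier. The section on advanced traversal in particular is full of clever algorithm design. (If you repost this, please credit breaksoftware's CSDN blog.)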

Iterator traversal

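Because the Redis dictionary library has a rehash mechanism, and an incremental one at that, one might expect the iterator to rely on some special technique to guarantee that every element is visited. After reading the source, however, it turns out that the iterator is a restricted one and its implementation is quite simple. Let's start with its basic structure: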

typedef struct dictIterator {
    dict *d;
    long index;
    int table, safe;
    dictEntry *entry, *nextEntry;
    /* unsafe iterator fingerprint for misuse detection. */
    long long fingerprint;
} dictIterator;

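The member d points to the dictionary the iterator works on. index is the subscript into the table array of a dictht. table is the subscript into the dictht array of the dict structure, that is, it tells whether the iterator is in ht[0] or ht[1]. The safe field marks whether the iterator is a safe iterator. If it is, functions such as dictDelete and dictFind may be called during iteration; if it is not, only dictNext may be used. entry and nextEntry point to the current element and the next element respectively. fingerprint is the fingerprint of the dictionary. Here is how the fingerprint is computed: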

long long dictFingerprint(dict *d) {
    long long integers[6], hash = 0;
    int j;

    integers[0] = (long) d->ht[0].table;
    integers[1] = d->ht[0].size;
    integers[2] = d->ht[0].used;
    integers[3] = (long) d->ht[1].table;
    integers[4] = d->ht[1].size;
    integers[5] = d->ht[1].used;

    /* We hash N integers by summing every successive integer with the integer
     * hashing of the previous sum. Basically:
     *
     * Result = hash(hash(hash(int1)+int2)+int3) ...
     *
     * This way the same set of integers in a different order will (likely) hash
     * to a different number. */
    for (j = 0; j < 6; j++) {
        hash += integers[j];
        /* For the hashing step we use Tomas Wang's 64 bit integer hash. */
        hash = (~hash) + (hash << 21); // hash = (hash << 21) - hash - 1;
        hash = hash ^ (hash >> 24);
        hash = (hash + (hash << 3)) + (hash << 8); // hash * 265
        hash = hash ^ (hash >> 14);
        hash = (hash + (hash << 2)) + (hash << 4); // hash * 21
        hash = hash ^ (hash >> 28);
        hash = hash + (hash << 31);
    }
    return hash;
}

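As you can see, it hashes information about ht[0] and ht[1] to obtain the fingerprint of the dictionary. If any one of a dictht's table, size, or used values changes, the fingerprint changes as well (barring a hash collision). This means that growing, shrinking, rehashing, adding an element, and deleting an element all change the fingerprint; only modifying the content of an existing element does not.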
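To make the sensitivity of the fingerprint concrete, here is a minimal sketch that applies the same mixing steps to six plain integers. The names dict_state, mix64, state_fingerprint and state_changed are hypothetical and not part of Redis, and the sketch uses uint64_t instead of long long so that overflow wraps around. Under these assumptions a change to a single field always changes the result, because every mixing step is invertible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the six values dictFingerprint() reads:
 * table pointer, size and used for each of the two hash tables. */
typedef struct {
    uint64_t table0, size0, used0;
    uint64_t table1, size1, used1;
} dict_state;

/* Same mixing steps as the listing above, on uint64_t so that
 * overflow wraps around instead of being undefined behaviour. */
static uint64_t mix64(uint64_t h) {
    h = (~h) + (h << 21);
    h ^= h >> 24;
    h = (h + (h << 3)) + (h << 8);
    h ^= h >> 14;
    h = (h + (h << 2)) + (h << 4);
    h ^= h >> 28;
    h += h << 31;
    return h;
}

/* Result = mix(mix(mix(v0) + v1) + v2) ... over the six fields. */
uint64_t state_fingerprint(const dict_state *s) {
    assert(s != NULL);
    const uint64_t v[6] = { s->table0, s->size0, s->used0,
                            s->table1, s->size1, s->used1 };
    uint64_t h = 0;
    for (int j = 0; j < 6; j++)
        h = mix64(h + v[j]);
    return h;
}

/* Returns 1 when the state no longer matches an earlier fingerprint. */
int state_changed(const dict_state *s, uint64_t earlier) {
    return state_fingerprint(s) != earlier;
}
```

An unsafe iterator would take the fingerprint before the first step and compare it again on release, which is what state_changed imitates.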
Creating an iterator is simple, and the library offers two ways to do it:

dictIterator *dictGetIterator(dict *d)
{
    dictIterator *iter = zmalloc(sizeof(*iter));

    iter->d = d;
    iter->table = 0;
    iter->index = -1;
    iter->safe = 0;
    iter->entry = NULL;
    iter->nextEntry = NULL;
    return iter;
}

dictIterator *dictGetSafeIterator(dict *d) {
    dictIterator *i = dictGetIterator(d);

    i->safe = 1;
    return i;
}

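Next comes the function that advances the iterator. On the first call it checks whether the iterator is a safe one: if so, it increments the iterators counter of the dictionary; if not, it records the current fingerprint of the dictionary.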

dictEntry *dictNext(dictIterator *iter)
{
    while (1) {
        if (iter->entry == NULL) {
            dictht *ht = &iter->d->ht[iter->table];
            if (iter->index == -1 && iter->table == 0) {
                if (iter->safe)
                    iter->d->iterators++;
                else
                    iter->fingerprint = dictFingerprint(iter->d);
            }

Because the dictionary may already be in the middle of a rehash when it is traversed, the elements in ht[1] have to be traversed as well.

            iter->index++;
            if (iter->index >= (long) ht->size) {
                if (dictIsRehashing(iter->d) && iter->table == 0) {
                    iter->table++;
                    iter->index = 0;
                    ht = &iter->d->ht[1];
                } else {
                    break;
                }
            }
            iter->entry = ht->table[iter->index];
        } else {
            iter->entry = iter->nextEntry;
        }

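Callers often delete an element from the dictionary right after obtaining it from the iterator. Once that happens, the next element can no longer be reached through the deleted entry, so the author added nextEntry to remember the pointer to the element that follows the current one.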

        if (iter->entry) {
            /* We need to save the 'next' here, the iterator user
             * may delete the entry we are returning. */
            iter->nextEntry = iter->entry->next;
            return iter->entry;
        }
    }
    return NULL;
}
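The "save the next pointer before returning" idea can be tried out without the dict library. The following is a minimal sketch on a plain singly linked list; node, list_iter, iter_init, iter_next, push_front, remove_node and delete_even are hypothetical names chosen for this illustration, not Redis functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked node, standing in for dictEntry. */
typedef struct node {
    int value;
    struct node *next;
} node;

typedef struct {
    node *entry;      /* node returned by the last call */
    node *next_entry; /* successor saved before entry was returned */
} list_iter;

void iter_init(list_iter *it, node *head) {
    it->entry = NULL;
    it->next_entry = head;
}

node *iter_next(list_iter *it) {
    it->entry = it->next_entry;
    if (it->entry != NULL)
        it->next_entry = it->entry->next; /* save it: the caller may free entry */
    return it->entry;
}

node *push_front(node *head, int value) {
    node *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->value = value;
    n->next = head;
    return n;
}

/* Unlinks and frees n; returns the (possibly new) head. */
node *remove_node(node *head, node *n) {
    node **pp = &head;
    while (*pp != NULL && *pp != n)
        pp = &(*pp)->next;
    if (*pp == n) {
        *pp = n->next;
        free(n);
    }
    return head;
}

/* Deletes every even value while iterating; returns how many nodes remain. */
int delete_even(node **head) {
    list_iter it;
    node *n;
    int remaining = 0;

    iter_init(&it, *head);
    while ((n = iter_next(&it)) != NULL) {
        if (n->value % 2 == 0)
            *head = remove_node(*head, n); /* n is freed here */
        else
            remaining++;
    }
    return remaining;
}
```

Because iter_next never touches the node it handed out on the previous call, freeing that node inside the loop is harmless, which mirrors what nextEntry achieves in dictNext.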

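When the traversal is finished, the iterator must be released with the function below. Note that for a safe iterator the iterators counter of the dictionary is decremented to restore it, while for an unsafe iterator the function checks that the fingerprint of the dictionary is the same before and after the traversal.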

void dictReleaseIterator(dictIterator *iter)
{
    if (!(iter->index == -1 && iter->table == 0)) {
        if (iter->safe)
            iter->d->iterators--;
        else
            assert(iter->fingerprint == dictFingerprint(iter->d));
    }
    zfree(iter);
}

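Finally, what exactly is a safe iterator? The source shows that when safe is 1 the iterators counter of the dictionary is incremented, and while that counter is non-zero the operations of the dict library no longer trigger an incremental rehash step. This keeps the structure of the dictionary stable to a degree: it removes the effect of rehashing, although it cannot stop the user from deleting elements. With an unsafe iterator, only dictNext may be used to walk the elements. Even dictFetchValue, which merely reads a value, must not be called, because under the hood it calls _dictRehashStep, which can change the structure of the dictionary.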

static void _dictRehashStep(dict *d) {
    if (d->iterators == 0) dictRehash(d,1);
}
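The gating effect of the iterators counter can be shown with a heavily reduced model. This is only a sketch: toy_dict, toy_rehash_step, toy_safe_iter_open and toy_safe_iter_close are hypothetical names, and the "rehash" here is just a counter of buckets that still have to be moved.

```c
#include <assert.h>

/* Hypothetical, heavily reduced model of a dict: it only tracks how many
 * buckets still have to be migrated and how many safe iterators are open. */
typedef struct {
    int buckets_to_move;
    int iterators;
} toy_dict;

/* Same shape as _dictRehashStep: migrate one bucket only when no safe
 * iterator is active. Returns 1 if a bucket was moved, 0 otherwise. */
int toy_rehash_step(toy_dict *d) {
    if (d->iterators == 0 && d->buckets_to_move > 0) {
        d->buckets_to_move--;
        return 1;
    }
    return 0;
}

/* What the first dictNext call does for a safe iterator. */
void toy_safe_iter_open(toy_dict *d) {
    d->iterators++;
}

/* What dictReleaseIterator does for a safe iterator. */
void toy_safe_iter_close(toy_dict *d) {
    assert(d->iterators > 0);
    d->iterators--;
}
```

While at least one safe iterator is open, every call to toy_rehash_step is a no-op, so lookups performed in the middle of a traversal cannot move entries between the two tables.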

I looked at how the library is used inside Redis. Traversals are done either to read values or to delete values, never to add elements. For example:

void clusterBlacklistCleanup(void) {
    dictIterator *di;
    dictEntry *de;

    di = dictGetSafeIterator(server.cluster->nodes_black_list);
    while((de = dictNext(di)) != NULL) {
        int64_t expire = dictGetUnsignedIntegerVal(de);

        if (expire < server.unixtime)
            dictDelete(server.cluster->nodes_black_list,dictGetKey(de));
    }
    dictReleaseIterator(di);
}

Advanced traversal

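Advanced traversal allows the dictionary to be traversed while data is being migrated between ht[0] and ht[1], and its algorithm guarantees that every element can still be visited. Let's look at the implementation first: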

unsigned long dictScan(dict *d,unsigned long v,dictScanFunction *fn,void *privdata)

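The parameter d is a pointer to the dictionary. v is the cursor: its initial value is 0, every call to dictScan returns a new cursor, and that new value has to be passed in on the next call. fn is a function pointer that is called on every element visited, and privdata is passed through to fn unchanged.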

typedef void (dictScanFunction)(void *privdata, const dictEntry *de);

Here is an example of how Redis calls this function:

        do {
            cursor = dictScan(ht, cursor, scanCallback, privdata);
        } while (cursor &&
                 maxiterations-- &&
                 listLength(keys) < (unsigned long)count);

For a dictionary that is not rehashing, it is enough to walk the chain in ht[0] that the cursor points to.

    dictht *t0, *t1;
    const dictEntry *de;
    unsigned long m0, m1;

    if (dictSize(d) == 0) return 0;

    if (!dictIsRehashing(d)) {
        t0 = &(d->ht[0]);
        m0 = t0->sizemask;

        /* Emit entries at cursor */
        de = t0->table[v & m0];
        while (de) {
            fn(privdata, de);
            de = de->next;
        }

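If the dictionary is rehashing, both ht[0] and ht[1] have to be traversed. First the code determines which dictht.table is shorter (assume its length is len = 8). It walks the chain that the cursor (assume iter = 4) selects in the shorter table, and then moves on to the larger one. In the larger dictht it walks not only the chain at the cursor itself (iter = 4) but also the chain at the cursor plus len (4 + 8 = 12), and in general every index of the larger table whose low bits equal the cursor.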

    } else {
        t0 = &d->ht[0];
        t1 = &d->ht[1];

        /* Make sure t0 is the smaller and t1 is the bigger table */
        if (t0->size > t1->size) {
            t0 = &d->ht[1];
            t1 = &d->ht[0];
        }

        m0 = t0->sizemask;
        m1 = t1->sizemask;

        /* Emit entries at cursor */
        de = t0->table[v & m0];
        while (de) {
            fn(privdata, de);
            de = de->next;
        }

        /* Iterate over indices in larger table that are the expansion
         * of the index pointed to by the cursor in the smaller table */
        do {
            /* Emit entries at cursor */
            de = t1->table[v & m1];
            while (de) {
                fn(privdata, de);
                de = de->next;
            }

            /* Increment bits not covered by the smaller mask */
            v = (((v | m0) + 1) & ~m0) | (v & m0);

            /* Continue while bits covered by mask difference is non-zero */
        } while (v & (m0 ^ m1));
    }
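The inner do/while loop above can be isolated to see which buckets of the larger table belong to one cursor of the smaller table. This is a minimal sketch; expand_indices is a hypothetical helper, and it assumes that both masks have the form 2^k - 1 with m0 <= m1.

```c
#include <assert.h>
#include <stddef.h>

/* Writes to out[] every index of the larger table (mask m1) whose low bits
 * match cursor v under the smaller mask m0. Returns the number of indices
 * written. Assumes m0 and m1 are of the form 2^k - 1 and m0 <= m1. */
size_t expand_indices(unsigned long v, unsigned long m0, unsigned long m1,
                      unsigned long *out, size_t cap) {
    size_t n = 0;
    do {
        assert(n < cap);
        out[n++] = v & m1;
        /* Increment only the bits above the smaller mask. */
        v = (((v | m0) + 1) & ~m0) | (v & m0);
    } while (v & (m0 ^ m1));
    return n;
}
```

With v = 4, m0 = 7 and m1 = 15 the function yields 4 and 12, matching the 4 + 8 = 12 example above. With m0 = 3 and m1 = 15, cursor 1 expands to 1, 5, 9 and 13.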

Finally, the cursor for the next call is computed and returned.

    /* Set unmasked bits so incrementing the reversed cursor
     * operates on the masked bits of the smaller table */
    v |= ~m0;

    /* Increment the reverse cursor */
    v = rev(v);
    v++;
    v = rev(v);

    return v;
}

It follows from this design that no other thread may operate on the dictionary while dictScan is running, otherwise elements may be missed. Between two calls to dictScan, however, the dictionary may be modified.

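The heart of this traversal is the way the cursor v is computed. Starting from v = 0, we OR it with ~m0 (the inverted size mask of the shorter ht.table), reverse its bits, add 1, and reverse the bits again. Repeating these steps visits every index from 0 to m0. For example, with m0 = 7 and v = 0: v | ~m0 is 1...1000, its bit reversal is 0001...1, adding 1 gives 0010...0, and reversing again gives 0...0100, that is, 4. (The figure in the original post had a typo: the decimal value in its first row should be 4 and the one in its third row should be 6.)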
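The cursor arithmetic can be run on its own to see the visiting order. The sketch below reuses the bit-reversal approach of rev from the test code later in this post; next_cursor and scan_order are hypothetical helper names introduced for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the bits of v by swapping halves, quarters, and so on. */
static unsigned long rev(unsigned long v) {
    unsigned long s = 8 * sizeof(v);
    unsigned long mask = ~0UL;
    while ((s >>= 1) > 0) {
        mask ^= (mask << s);
        v = ((v >> s) & mask) | ((v << s) & ~mask);
    }
    return v;
}

/* Advance cursor v for a table whose size mask is m0 (size - 1). */
unsigned long next_cursor(unsigned long v, unsigned long m0) {
    v |= ~m0;   /* set every bit above the mask */
    v = rev(v); /* the masked bits are now the most significant ones */
    v++;        /* the carry ripples through the ones into the masked bits */
    v = rev(v);
    return v;
}

/* Records the bucket indices visited by one full scan of a table with
 * mask m0. Returns how many indices were written to out[]. */
size_t scan_order(unsigned long m0, unsigned long *out, size_t cap) {
    unsigned long v = 0;
    size_t n = 0;
    do {
        assert(n < cap);
        out[n++] = v & m0;
        v = next_cursor(v, m0);
    } while (v != 0);
    return n;
}
```

For m0 = 7 the recorded order is 0, 4, 2, 6, 1, 5, 3, 7. In other words the cursor counts upward when its bits are read from the most significant masked bit down, which is why the high bit of the index changes first.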

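I have never been able to work out why this algorithm has this property, and I would need a mathematician to explain it. It also shows that the author of the algorithm, Pieter Noordhuis, has a solid mathematical background.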

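The key point is that this algorithm not only completes a traversal, it also guarantees that every element is visited while the size of the array changes. I extracted the code and simulated an array of length 8 growing to length 16, and an array of length 16 shrinking to length 8. To keep things simple, let's ignore the two-table situation for now and assume that the array grows or shrinks in an instant.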

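First, the traversal before any expansion. With 8 buckets, the indices are visited in this order (derived from the test code at the end of this post):

0, 4, 2, 6, 1, 5, 3, 7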

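Suppose the array grows instantly once the 8th iteration is reached. The traversal then becomes (derived from the test code at the end of this post):

0, 4, 2, 6, 1, 5, 3, 7, 15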

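There is now one extra traversal, of index 15. One would expect it to repeat elements already returned by the traversal of index 15 % 8 = 7 (the 8th traversal), at least if that traversal still read the old array. So dictScan can potentially visit an element more than once. Now consider an instant expansion at the 7th iteration (sequence derived from the test code at the end of this post):

0, 4, 2, 6, 1, 5, 3, 11, 7, 15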

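Here the traversal of index 11 (the 8th traversal) partly repeats the elements of the traversal of index 3 (the 7th traversal), again assuming the 7th traversal still read the old array. The traversals after that do not repeat anything.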

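Now let's look at shrinking. Before the array shrinks, the traversal is (derived from the test code at the end of this post):

0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15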

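If the array suddenly shrinks at the 16th traversal, the process becomes (derived from the test code at the end of this post):

0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 7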

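The 16th traversal visits index 7 of the new array, which is not the same set of elements as index 7 of the old array visited by the 15th traversal: the new bucket contains the old one, because it also holds the elements that used to sit at index 15. So elements can be visited repeatedly in this case too.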

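Next, the traversal when the array suddenly shrinks at the 15th traversal (derived from the test code at the end of this post):

0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7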

Because the array has shrunk to 8, the final traversal of index 7 covers both the elements from index 7 of the old array and those from index 15 of the old array. This traversal therefore does not visit any element twice.

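Finally, the traversal when the array suddenly shrinks at the 14th traversal (derived from the test code at the end of this post):

0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 3, 7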

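The 14th traversal was originally going to visit index 11. Because of the shrink, it visits index 3 of the new array instead, so the 14th traversal includes the elements of the 13th traversal.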
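The single-array experiments above can be replayed with a small model. This is a sketch under explicit assumptions: there is one hypothetical element for each 4-bit hash value, an element with hash h lives in bucket h & mask, and the array switches size instantly before call number switch_at (counted from 0). simulate_resize and next_cursor are hypothetical names; rev follows the test code later in this post.

```c
#include <assert.h>
#include <string.h>

#define HASHES 16 /* one hypothetical element per 4-bit hash value */

/* Reverse the bits of v by swapping halves, quarters, and so on. */
static unsigned long rev(unsigned long v) {
    unsigned long s = 8 * sizeof(v);
    unsigned long mask = ~0UL;
    while ((s >>= 1) > 0) {
        mask ^= (mask << s);
        v = ((v >> s) & mask) | ((v << s) & ~mask);
    }
    return v;
}

/* Advance cursor v for a table whose size mask is m0. */
static unsigned long next_cursor(unsigned long v, unsigned long m0) {
    v |= ~m0;
    v = rev(v);
    v++;
    v = rev(v);
    return v;
}

/* Scans with mask `before` for the first switch_at calls and with mask
 * `after` from then on. seen[h] counts how often the element with hash h
 * was emitted. Returns the number of elements that were never emitted. */
int simulate_resize(unsigned long before, unsigned long after,
                    int switch_at, int seen[HASHES]) {
    unsigned long v = 0;
    int calls = 0, missed = 0;

    assert(switch_at >= 0);
    memset(seen, 0, HASHES * sizeof(int));
    do {
        unsigned long m = (calls < switch_at) ? before : after;
        /* Emit every element stored in the bucket the cursor selects. */
        for (unsigned long h = 0; h < HASHES; h++) {
            if ((h & m) == (v & m)) seen[h]++;
        }
        v = next_cursor(v, m);
        calls++;
    } while (v != 0);

    for (int h = 0; h < HASHES; h++) {
        if (seen[h] == 0) missed++;
    }
    return missed;
}
```

Calling simulate_resize(15, 7, s, seen) models the shrink cases and simulate_resize(7, 15, s, seen) the growth cases. A return value of 0 means no element was skipped, and entries of seen greater than 1 mark elements that were emitted more than once.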

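That covers a single array. The situation in the dict structure, which holds two dictht tables, is a little more complicated. The original post illustrated with a figure (not preserved here) that no matter when ht[0] grows or shrinks, every element is still visited.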

The code used for the tests above is:

#include <stdio.h>

#define TWO_FOUR_MASK 15
#define TWO_THREE_MASK 7

/* Reverse the bits of v. */
static unsigned long rev(unsigned long v) {
    unsigned long s = 8 * sizeof(v);
    unsigned long mask = ~0UL;
    while ((s >>= 1) > 0) {
        mask ^= (mask << s);
        v = ((v >> s) & mask) | ((v << s) & ~mask);
    }
    return v;
}

/* Single array: the mask switches from the old size to the new size
 * as soon as change is set. */
unsigned long loop_single_expand_shrinks(unsigned long v, int change, int expand) {
    unsigned long m0 = 0;
    if (expand) {
        if (change) {
            m0 = TWO_FOUR_MASK;
        } else {
            m0 = TWO_THREE_MASK;
        }
    } else {
        if (change) {
            m0 = TWO_THREE_MASK;
        } else {
            m0 = TWO_FOUR_MASK;
        }
    }

    unsigned long t0idx = v & m0;
    printf(" t0Index: %lu ", t0idx);

    v |= ~m0;
    v = rev(v);
    v++;
    v = rev(v);
    return v;
}

/* Two tables (8 and 16 buckets), as during a rehash. */
unsigned long loop(unsigned long v) {
    unsigned long m0 = TWO_THREE_MASK;
    unsigned long m1 = TWO_FOUR_MASK;

    unsigned long t0idx = v & m0;
    printf(" t0Index: %lu ", t0idx);

    printf(" t1Index: ");
    do {
        unsigned long t1idx = v & m1;
        printf("%lu ", t1idx);
        v = (((v | m0) + 1) & ~m0) | (v & m0);
    } while (v & (m0 ^ m1));

    v |= ~m0;
    v = rev(v);
    v++;
    v = rev(v);
    return v;
}

/* Two tables until change is set, a single table afterwards. */
unsigned long loop_expand_shrinks(unsigned long v, int change, int expand) {
    unsigned long m0 = 0;
    unsigned long m1 = 0;
    if (!change) {
        m0 = TWO_THREE_MASK;
        m1 = TWO_FOUR_MASK;

        unsigned long t0idx = v & m0;
        if (expand) {
            printf(" t0Index: %lu ", t0idx);
            printf(" t1Index: ");
        } else {
            printf(" t1Index: %lu ", t0idx);
            printf(" t0Index: ");
        }

        do {
            unsigned long t1idx = v & m1;
            printf("%lu ", t1idx);
            v = (((v | m0) + 1) & ~m0) | (v & m0);
        } while (v & (m0 ^ m1));
    } else {
        if (expand) {
            m0 = TWO_FOUR_MASK;
        } else {
            m0 = TWO_THREE_MASK;
        }
        unsigned long t0idx = v & m0;
        printf(" t0Index: %lu ", t0idx);
    }

    v |= ~m0;
    v = rev(v);
    v++;
    v = rev(v);
    return v;
}

/* Print the low 32 bits of v in binary. The original used the MSVC-only
 * _itoa_s and a "0x" prefix; this version is portable and prints "0b". */
void print_binary(unsigned long v) {
    char s[33];
    for (int i = 0; i < 32; i++) {
        s[i] = ((v >> (31 - i)) & 1UL) ? '1' : '0';
    }
    s[32] = '\0';
    printf("0b%s", s);
}

void check_loop_normal() {
    unsigned long v = 0;
    do {
        print_binary(v);
        v = loop(v);
        printf("\n");
    } while (v != 0);
}

void check_loop_expand_shrinks(int expand) {
    int loop_count = 9;
    for (int n = 0; n < loop_count; n++) {
        unsigned long v = 0;
        int change = 0;
        int call_count = 0;
        do {
            if (call_count == n) {
                change = 1;
            }
            print_binary(v);
            v = loop_expand_shrinks(v, change, expand);
            call_count++;
            printf("\n");
        } while (v != 0);
        printf("\n");
    }
}

void check_loop_single_expand_shrinks(int expand) {
    int loop_count = 17;
    for (int n = 0; n < loop_count; n++) {
        unsigned long v = 0;
        int change = 0;
        int call_count = 0;
        do {
            if (call_count == n) {
                change = 1;
            }
            print_binary(v);
            v = loop_single_expand_shrinks(v, change, expand);
            call_count++;
            printf("\n");
        } while (v != 0);
        printf("\n");
    }
}

/* Assumed driver: the original post did not show how the checks were run. */
int main(void) {
    check_loop_normal();
    check_loop_expand_shrinks(1);
    check_loop_expand_shrinks(0);
    check_loop_single_expand_shrinks(1);
    check_loop_single_expand_shrinks(0);
    return 0;
}
