Redis 数据类型介绍
Redis 數(shù)據(jù)類(lèi)型介紹
你也許已經(jīng)知道Redis并不是簡(jiǎn)單的key-value存儲(chǔ),實(shí)際上他是一個(gè)數(shù)據(jù)結(jié)構(gòu)服務(wù)器,支持不同類(lèi)型的值。也就是說(shuō),你不必僅僅把字符串當(dāng)作鍵所指向的值。下列這些數(shù)據(jù)類(lèi)型都可作為值類(lèi)型:
- 二進(jìn)制安全的字符串
- Lists: 按插入順序排序的字符串元素的集合。他們基本上就是鏈表(linked lists)。
- Sets: 不重復(fù)且無(wú)序的字符串元素的集合。
- Sorted sets,類(lèi)似Sets,但是每個(gè)字符串元素都關(guān)聯(lián)到一個(gè)叫score浮動(dòng)數(shù)值(floating number value)。里面的元素總是通過(guò)score進(jìn)行著排序,所以不同的是,它是可以檢索的一系列元素。(例如你可能會(huì)問(wèn):給我前面10個(gè)或者后面10個(gè)元素)。
- Hashes,由field和關(guān)聯(lián)的value組成的map。field和value都是字符串的。這和Ruby、Python的hashes很像。
- Bit arrays (或者說(shuō) simply bitmaps): 通過(guò)特殊的命令,你可以將 String 值當(dāng)作一系列 bits 處理:可以設(shè)置和清除單獨(dú)的 bits,數(shù)出所有設(shè)為 1 的 bits 的數(shù)量,找到最前的被設(shè)為 1 或 0 的 bit,等等。
- HyperLogLogs: 這是被用于估計(jì)一個(gè) set 中元素?cái)?shù)量的概率性的數(shù)據(jù)結(jié)構(gòu)。別害怕,它比看起來(lái)的樣子要簡(jiǎn)單…參見(jiàn)本教程的 HyperLogLog 部分。D
學(xué)習(xí)這些數(shù)據(jù)類(lèi)型的原理,以及如何使用它們解決?command reference?中的特定問(wèn)題,并不總是不關(guān)緊要的。所以,本文檔是一個(gè)關(guān)于 Redis 數(shù)據(jù)類(lèi)型和它們最常見(jiàn)特性的導(dǎo)論。 在所有的例子中,我們將使用?redis-cli?工具。它是一個(gè)簡(jiǎn)單而有用的命令行工具,用于向 Redis 服務(wù)器發(fā)出命令。
Redis keys
Redis key值是二進(jìn)制安全的,這意味著可以用任何二進(jìn)制序列作為key值,從形如”foo”的簡(jiǎn)單字符串到一個(gè)JPEG文件的內(nèi)容都可以。空字符串也是有效key值。
關(guān)于key的幾條規(guī)則:
- 太長(zhǎng)的鍵值不是個(gè)好主意,例如1024字節(jié)的鍵值就不是個(gè)好主意,不僅因?yàn)橄膬?nèi)存,而且在數(shù)據(jù)中查找這類(lèi)鍵值的計(jì)算成本很高。
- 太短的鍵值通常也不是好主意,如果你要用”u:1000:pwd”來(lái)代替”user:1000:password”,這沒(méi)有什么問(wèn)題,但后者更易閱讀,并且由此增加的空間消耗相對(duì)于key object和value object本身來(lái)說(shuō)很小。當(dāng)然,沒(méi)人阻止您一定要用更短的鍵值節(jié)省一丁點(diǎn)兒空間。
- 最好堅(jiān)持一種模式。例如:”object-type:id:field”就是個(gè)不錯(cuò)的注意,像這樣”user:1000:password”。我喜歡對(duì)多單詞的字段名中加上一個(gè)點(diǎn),就像這樣:”comment:1234:reply.to”。
Redis Strings
這是最簡(jiǎn)單Redis類(lèi)型。如果你只用這種類(lèi)型,Redis就像一個(gè)可以持久化的memcached服務(wù)器(注:memcache的數(shù)據(jù)僅保存在內(nèi)存中,服務(wù)器重啟后,數(shù)據(jù)將丟失)。
我們用redis-cli來(lái)玩一下字符串類(lèi)型:
> set mykey somevalue OK > get mykey "somevalue"正如你所見(jiàn)到的,通常用SET command 和 GET command來(lái)設(shè)置和獲取字符串值。
值可以是任何種類(lèi)的字符串(包括二進(jìn)制數(shù)據(jù)),例如你可以在一個(gè)鍵下保存一副jpeg圖片。值的長(zhǎng)度不能超過(guò)512 MB。
SET?命令有些有趣的操作,例如,當(dāng)key存在時(shí)SET會(huì)失敗,或相反的,當(dāng)key不存在時(shí)它只會(huì)成功。
> set mykey newval nx (nil) > set mykey newval xx OK雖然字符串是Redis的基本值類(lèi)型,但你仍然能通過(guò)它完成一些有趣的操作。例如:原子遞增:
> set counter 100 OK > incr counter (integer) 101 > incr counter (integer) 102 > incrby counter 50 (integer) 152INCR?命令將字符串值解析成整型,將其加一,最后將結(jié)果保存為新的字符串值,類(lèi)似的命令有INCRBY,?DECR?和?DECRBY。實(shí)際上他們?cè)趦?nèi)部就是同一個(gè)命令,只是看上去有點(diǎn)兒不同。
INCR是原子操作意味著什么呢?就是說(shuō)即使多個(gè)客戶(hù)端對(duì)同一個(gè)key發(fā)出INCR命令,也決不會(huì)導(dǎo)致競(jìng)爭(zhēng)的情況。例如如下情況永遠(yuǎn)不可能發(fā)生:『客戶(hù)端1和客戶(hù)端2同時(shí)讀出“10”,他們倆都對(duì)其加到11,然后將新值設(shè)置為11』。最終的值一定是12,read-increment-set操作完成時(shí),其他客戶(hù)端不會(huì)在同一時(shí)間執(zhí)行任何命令。
對(duì)字符串,另一個(gè)的令人感興趣的操作是GETSET命令,行如其名:他為key設(shè)置新值并且返回原值。這有什么用處呢?例如:你的系統(tǒng)每當(dāng)有新用戶(hù)訪(fǎng)問(wèn)時(shí)就用INCR命令操作一個(gè)Redis key。你希望每小時(shí)對(duì)這個(gè)信息收集一次。你就可以GETSET這個(gè)key并給其賦值0并讀取原值。
為減少等待時(shí)間,也可以一次存儲(chǔ)或獲取多個(gè)key對(duì)應(yīng)的值,使用MSET和MGET命令:
> mset a 10 b 20 c 30 OK > mget a b c 1) "10" 2) "20" 3) "30"MGET 命令返回由值組成的數(shù)組。
修改或查詢(xún)鍵空間
有些指令不是針對(duì)任何具體的類(lèi)型定義的,而是用于和整個(gè)鍵空間交互的。因此,它們可被用于任何類(lèi)型的鍵。
使用EXISTS命令返回1或0標(biāo)識(shí)給定key的值是否存在,使用DEL命令可以刪除key對(duì)應(yīng)的值,DEL命令返回1或0標(biāo)識(shí)值是被刪除(值存在)或者沒(méi)被刪除(key對(duì)應(yīng)的值不存在)。
> set mykey hello OK > exists mykey (integer) 1 > del mykey (integer) 1 > exists mykey (integer) 0TYPE命令可以返回key對(duì)應(yīng)的值的存儲(chǔ)類(lèi)型:
> set mykey x OK > type mykey string > del mykey (integer) 1 > type mykey noneRedis超時(shí):數(shù)據(jù)在限定時(shí)間內(nèi)存活
在介紹復(fù)雜類(lèi)型前我們先介紹一個(gè)與值類(lèi)型無(wú)關(guān)的Redis特性:超時(shí)。你可以對(duì)key設(shè)置一個(gè)超時(shí)時(shí)間,當(dāng)這個(gè)時(shí)間到達(dá)后會(huì)被刪除。精度可以使用毫秒或秒。
> set key some-value OK > expire key 5 (integer) 1 > get key (immediately) "some-value" > get key (after some time) (nil)上面的例子使用了EXPIRE來(lái)設(shè)置超時(shí)時(shí)間(也可以再次調(diào)用這個(gè)命令來(lái)改變超時(shí)時(shí)間,使用PERSIST命令去除超時(shí)時(shí)間 )。我們也可以在創(chuàng)建值的時(shí)候設(shè)置超時(shí)時(shí)間:
> set key 100 ex 10 OK > ttl key (integer) 9TTL命令用來(lái)查看key對(duì)應(yīng)的值剩余存活時(shí)間。
Redis Lists
要說(shuō)清楚列表數(shù)據(jù)類(lèi)型,最好先講一點(diǎn)兒理論背景,在信息技術(shù)界List這個(gè)詞常常被使用不當(dāng)。例如”P(pán)ython Lists”就名不副實(shí)(名為L(zhǎng)inked Lists),但他們實(shí)際上是數(shù)組(同樣的數(shù)據(jù)類(lèi)型在Ruby中叫數(shù)組)
一般意義上講,列表就是有序元素的序列:10,20,1,2,3就是一個(gè)列表。但用數(shù)組實(shí)現(xiàn)的List和用Linked List實(shí)現(xiàn)的List,在屬性方面大不相同。
Redis lists基于Linked Lists實(shí)現(xiàn)。這意味著即使在一個(gè)list中有數(shù)百萬(wàn)個(gè)元素,在頭部或尾部添加一個(gè)元素的操作,其時(shí)間復(fù)雜度也是常數(shù)級(jí)別的。用LPUSH 命令在十個(gè)元素的list頭部添加新元素,和在千萬(wàn)元素list頭部添加新元素的速度相同。
那么,壞消息是什么?在數(shù)組實(shí)現(xiàn)的list中利用索引訪(fǎng)問(wèn)元素的速度極快,而同樣的操作在linked list實(shí)現(xiàn)的list上沒(méi)有那么快。
Redis Lists用linked list實(shí)現(xiàn)的原因是:對(duì)于數(shù)據(jù)庫(kù)系統(tǒng)來(lái)說(shuō),至關(guān)重要的特性是:能非常快的在很大的列表上添加元素。另一個(gè)重要因素是,正如你將要看到的:Redis lists能在常數(shù)時(shí)間取得常數(shù)長(zhǎng)度。
如果快速訪(fǎng)問(wèn)集合元素很重要,建議使用可排序集合(sorted sets)。可排序集合我們會(huì)隨后介紹。
Redis lists 入門(mén)
LPUSH?命令可向list的左邊(頭部)添加一個(gè)新元素,而RPUSH命令可向list的右邊(尾部)添加一個(gè)新元素。最后LRANGE?命令可從list中取出一定范圍的元素:
> rpush mylist A (integer) 1 > rpush mylist B (integer) 2 > lpush mylist first (integer) 3 > lrange mylist 0 -1 1) "first" 2) "A" 3) "B"注意:LRANGE?帶有兩個(gè)索引,一定范圍的第一個(gè)和最后一個(gè)元素。這兩個(gè)索引都可以為負(fù)來(lái)告知Redis從尾部開(kāi)始計(jì)數(shù),因此-1表示最后一個(gè)元素,-2表示list中的倒數(shù)第二個(gè)元素,以此類(lèi)推。
上面的所有命令的參數(shù)都可變,方便你一次向list存入多個(gè)值。
> rpush mylist 1 2 3 4 5 "foo bar" (integer) 9 > lrange mylist 0 -1 1) "first" 2) "A" 3) "B" 4) "1" 5) "2" 6) "3" 7) "4" 8) "5" 9) "foo bar"還有一個(gè)重要的命令是pop,它從list中刪除元素并同時(shí)返回刪除的值。可以在左邊或右邊操作。
> rpush mylist a b c (integer) 3 > rpop mylist "c" > rpop mylist "b" > rpop mylist "a"我們?cè)黾恿巳齻€(gè)元素,并彈出了三個(gè)元素,因此,在這最后 列表中的命令序列是空的,沒(méi)有更多的元素可以被彈出。如果我們嘗試彈出另一個(gè)元素,這是我們得到的結(jié)果:
> rpop mylist (nil)當(dāng)list沒(méi)有元素時(shí),Redis 返回了一個(gè)NULL。
List的常用案例
正如你可以從上面的例子中猜到的,list可被用來(lái)實(shí)現(xiàn)聊天系統(tǒng)。還可以作為不同進(jìn)程間傳遞消息的隊(duì)列。關(guān)鍵是,你可以每次都以原先添加的順序訪(fǎng)問(wèn)數(shù)據(jù)。這不需要任何SQL ORDER BY 操作,將會(huì)非常快,也會(huì)很容易擴(kuò)展到百萬(wàn)級(jí)別元素的規(guī)模。
例如在評(píng)級(jí)系統(tǒng)中,比如社會(huì)化新聞網(wǎng)站 reddit.com,你可以把每個(gè)新提交的鏈接添加到一個(gè)list,用LRANGE可簡(jiǎn)單的對(duì)結(jié)果分頁(yè)。
在博客引擎實(shí)現(xiàn)中,你可為每篇日志設(shè)置一個(gè)list,在該list中推入博客評(píng)論,等等。
Capped lists
可以使用LTRIM把list從左邊截取指定長(zhǎng)度。
> rpush mylist 1 2 3 4 5 (integer) 5 > ltrim mylist 0 2 OK > lrange mylist 0 -1 1) "1" 2) "2" 3) "3"List上的阻塞操作
可以使用Redis來(lái)實(shí)現(xiàn)生產(chǎn)者和消費(fèi)者模型,如使用LPUSH和RPOP來(lái)實(shí)現(xiàn)該功能。但會(huì)遇到這種情景:list是空,這時(shí)候消費(fèi)者就需要輪詢(xún)來(lái)獲取數(shù)據(jù),這樣就會(huì)增加redis的訪(fǎng)問(wèn)壓力、增加消費(fèi)端的cpu時(shí)間,而很多訪(fǎng)問(wèn)都是無(wú)用的。為此redis提供了阻塞式訪(fǎng)問(wèn)?BRPOP?和?BLPOP?命令。 消費(fèi)者可以在獲取數(shù)據(jù)時(shí)指定如果數(shù)據(jù)不存在阻塞的時(shí)間,如果在時(shí)限內(nèi)獲得數(shù)據(jù)則立即返回,如果超時(shí)還沒(méi)有數(shù)據(jù)則返回null, 0表示一直阻塞。
同時(shí)redis還會(huì)為所有阻塞的消費(fèi)者以先后順序排隊(duì)。
如需了解詳細(xì)信息請(qǐng)查看?RPOPLPUSH?和?BRPOPLPUSH。
key 的自動(dòng)創(chuàng)建和刪除
目前為止,在我們的例子中,我們沒(méi)有在推入元素之前創(chuàng)建空的 list,或者在 list 沒(méi)有元素時(shí)刪除它。在 list 為空時(shí)刪除 key,并在用戶(hù)試圖添加元素(比如通過(guò)?LPUSH)而鍵不存在時(shí)創(chuàng)建空 list,是 Redis 的職責(zé)。
這不光適用于 lists,還適用于所有包括多個(gè)元素的 Redis 數(shù)據(jù)類(lèi)型 – Sets, Sorted Sets 和 Hashes。
基本上,我們可以用三條規(guī)則來(lái)概括它的行為:
規(guī)則 1 示例:
> del mylist (integer) 1 > lpush mylist 1 2 3 (integer) 3但是,我們不能對(duì)存在但類(lèi)型錯(cuò)誤的 key 做操作: ? > set foo bar OK > lpush foo 1 2 3 (error) WRONGTYPE Operation against a key holding the wrong kind of value > type foo string
規(guī)則 2 示例:
> lpush mylist 1 2 3 (integer) 3 > exists mylist (integer) 1 > lpop mylist "3" > lpop mylist "2" > lpop mylist "1" > exists mylist (integer) 0所有的元素被彈出之后, key 不復(fù)存在。
規(guī)則 3 示例:
> del mylist (integer) 0 > llen mylist (integer) 0 > lpop mylist (nil)Redis Hashes —
Redis hash 看起來(lái)就像一個(gè) “hash” 的樣子,由鍵值對(duì)組成:
> hmset user:1000 username antirez birthyear 1977 verified 1 OK > hget user:1000 username "antirez" > hget user:1000 birthyear "1977" > hgetall user:1000 1) "username" 2) "antirez" 3) "birthyear" 4) "1977" 5) "verified" 6) "1"Hash 便于表示?objects,實(shí)際上,你可以放入一個(gè) hash 的域數(shù)量實(shí)際上沒(méi)有限制(除了可用內(nèi)存以外)。所以,你可以在你的應(yīng)用中以不同的方式使用 hash。
HMSET?指令設(shè)置 hash 中的多個(gè)域,而?HGET?取回單個(gè)域。HMGET?和?HGET?類(lèi)似,但返回一系列值:
> hmget user:1000 username birthyear no-such-field 1) "antirez" 2) "1977" 3) (nil)也有一些指令能夠?qū)为?dú)的域執(zhí)行操作,比如?HINCRBY:
> hincrby user:1000 birthyear 10 (integer) 1987 > hincrby user:1000 birthyear 10 (integer) 1997你可以在文檔中找到?hash 指令的完整列表。
值得注意的是,小的 hash 被用特殊方式編碼,非常節(jié)約內(nèi)存。
Redis Sets —
Redis Set 是 String 的無(wú)序排列。SADD?指令把新的元素添加到 set 中。對(duì) set 也可做一些其他的操作,比如測(cè)試一個(gè)給定的元素是否存在,對(duì)不同 set 取交集,并集或差,等等。
> sadd myset 1 2 3 (integer) 3 > smembers myset 1. 3 2. 1 3. 2現(xiàn)在我已經(jīng)把三個(gè)元素加到我的 set 中,并告訴 Redis 返回所有的元素。可以看到,它們沒(méi)有被排序 —— Redis 在每次調(diào)用時(shí)可能按照任意順序返回元素,因?yàn)閷?duì)于元素的順序并沒(méi)有規(guī)定。
Redis 有檢測(cè)成員的指令。一個(gè)特定的元素是否存在?
> sismember myset 3 (integer) 1 > sismember myset 30 (integer) 0“3” 是 set 的一個(gè)成員,而 “30” 不是。
Sets 適合用于表示對(duì)象間的關(guān)系。 例如,我們可以輕易使用 set 來(lái)表示標(biāo)記。
一個(gè)簡(jiǎn)單的建模方式是,對(duì)每一個(gè)希望標(biāo)記的對(duì)象使用 set。這個(gè) set 包含和對(duì)象相關(guān)聯(lián)的標(biāo)簽的 ID。
假設(shè)我們想要給新聞打上標(biāo)簽。 假設(shè)新聞 ID 1000 被打上了 1,2,5 和 77 四個(gè)標(biāo)簽,我們可以使用一個(gè) set 把 tag ID 和新聞條目關(guān)聯(lián)起來(lái):
> sadd news:1000:tags 1 2 5 77 (integer) 4但是,有時(shí)候我可能也會(huì)需要相反的關(guān)系:所有被打上相同標(biāo)簽的新聞列表:
> sadd tag:1:news 1000 (integer) 1 > sadd tag:2:news 1000 (integer) 1 > sadd tag:5:news 1000 (integer) 1 > sadd tag:77:news 1000 (integer) 1獲取一個(gè)對(duì)象的所有 tag 是很方便的:
> smembers news:1000:tags 1. 5 2. 1 3. 77 4. 2注意:在這個(gè)例子中,我們假設(shè)你有另一個(gè)數(shù)據(jù)結(jié)構(gòu),比如一個(gè) Redis hash,把標(biāo)簽 ID 對(duì)應(yīng)到標(biāo)簽名稱(chēng)。
使用 Redis 命令行,我們可以輕易實(shí)現(xiàn)其它一些有用的操作。比如,我們可能需要一個(gè)含有 1, 2, 10, 和 27 標(biāo)簽的對(duì)象的列表。我們可以用?SINTER?命令來(lái)完成這件事。它獲取不同 set 的交集。我們可以用:
> sinter tag:1:news tag:2:news tag:10:news tag:27:news ... results here ...不光可以取交集,還可以取并集,差集,獲取隨機(jī)元素,等等。
獲取一個(gè)元素的命令是?SPOP,它很適合對(duì)特定問(wèn)題建模。比如,要實(shí)現(xiàn)一個(gè)基于 web 的撲克游戲,你可能需要用 set 來(lái)表示一副牌。假設(shè)我們用一個(gè)字符的前綴來(lái)表示不同花色:
> sadd deck C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 CJ CQ CKD1 D2 D3 D4 D5 D6 D7 D8 D9 D10 DJ DQ DK H1 H2 H3H4 H5 H6 H7 H8 H9 H10 HJ HQ HK S1 S2 S3 S4 S5 S6S7 S8 S9 S10 SJ SQ SK(integer) 52現(xiàn)在,我們想要給每個(gè)玩家 5 張牌。SPOP?命令刪除一個(gè)隨機(jī)元素,把它返回給客戶(hù)端,因此它是完全合適的操作。
但是,如果我們對(duì)我們的牌直接調(diào)用它,在下一盤(pán)我們就需要重新充滿(mǎn)這副牌。開(kāi)始,我們可以復(fù)制?deck?鍵中的內(nèi)容,并放入?game:1:deck?鍵中。
這是通過(guò)?SUNIONSTORE?實(shí)現(xiàn)的,它通常用于對(duì)多個(gè)集合取并集,并把結(jié)果存入另一個(gè) set 中。但是,因?yàn)橐粋€(gè) set 的并集就是它本身,我可以這樣復(fù)制我的牌:
> sunionstore game:1:deck deck (integer) 52現(xiàn)在,我已經(jīng)準(zhǔn)備好給 1 號(hào)玩家發(fā)五張牌:
> spop game:1:deck "C6" > spop game:1:deck "CQ" > spop game:1:deck "D1" > spop game:1:deck "CJ" > spop game:1:deck "SJ"One pair of jacks, not great…
Now it’s a good time to introduce the set command that provides the number of elements inside a set. This is often called the?cardinality of a set?in the context of set theory, so the Redis command is called?SCARD.
> scard game:1:deck (integer) 47The math works: 52 - 5 = 47.
When you need to just get random elements without removing them from the set, there is the?SRANDMEMBER?command suitable for the task. It also features the ability to return both repeating and non-repeating elements.
Redis Sorted sets —
Sorted sets are a data type which is similar to a mix between a Set and a Hash. Like sets, sorted sets are composed of unique, non-repeating string elements, so in some sense a sorted set is a set as well.
However while elements inside sets are not ordered, every element in a sorted set is associated with a floating point value, called?the score?(this is why the type is also similar to a hash, since every element is mapped to a value).
Moreover, elements in a sorted sets are?taken in order?(so they are not ordered on request, order is a peculiarity of the data structure used to represent sorted sets). They are ordered according to the following rule:
- If A and B are two elements with a different score, then A > B if A.score is > B.score.
- If A and B have exactly the same score, then A > B if the A string is lexicographically greater than the B string. A and B strings can’t be equal since sorted sets only have unique elements.
Let’s start with a simple example, adding a few selected hackers names as sorted set elements, with their year of birth as “score”.
> zadd hackers 1940 "Alan Kay" (integer) 1 > zadd hackers 1957 "Sophie Wilson" (integer 1) > zadd hackers 1953 "Richard Stallman" (integer) 1 > zadd hackers 1949 "Anita Borg" (integer) 1 > zadd hackers 1965 "Yukihiro Matsumoto" (integer) 1 > zadd hackers 1914 "Hedy Lamarr" (integer) 1 > zadd hackers 1916 "Claude Shannon" (integer) 1 > zadd hackers 1969 "Linus Torvalds" (integer) 1 > zadd hackers 1912 "Alan Turing" (integer) 1As you can see?ZADD?is similar to?SADD, but takes one additional argument (placed before the element to be added) which is the score.?ZADD?is also variadic, so you are free to specify multiple score-value pairs, even if this is not used in the example above.
With sorted sets it is trivial to return a list of hackers sorted by their birth year because actually?they are already sorted.
Implementation note: Sorted sets are implemented via a dual-ported data structure containing both a skip list and a hash table, so every time we add an element Redis performs an O(log(N)) operation. That’s good, but when we ask for sorted elements Redis does not have to do any work at all, it’s already all sorted:
> zrange hackers 0 -1 1) "Alan Turing" 2) "Hedy Lamarr" 3) "Claude Shannon" 4) "Alan Kay" 5) "Anita Borg" 6) "Richard Stallman" 7) "Sophie Wilson" 8) "Yukihiro Matsumoto" 9) "Linus Torvalds"Note: 0 and -1 means from element index 0 to the last element (-1 works here just as it does in the case of the?LRANGE?command).
What if I want to order them the opposite way, youngest to oldest? Use?ZREVRANGE?instead of?ZRANGE:
> zrevrange hackers 0 -1 1) "Linus Torvalds" 2) "Yukihiro Matsumoto" 3) "Sophie Wilson" 4) "Richard Stallman" 5) "Anita Borg" 6) "Alan Kay" 7) "Claude Shannon" 8) "Hedy Lamarr" 9) "Alan Turing"It is possible to return scores as well, using the?WITHSCORES?argument:
> zrange hackers 0 -1 withscores 1) "Alan Turing" 2) "1912" 3) "Hedy Lamarr" 4) "1914" 5) "Claude Shannon" 6) "1916" 7) "Alan Kay" 8) "1940" 9) "Anita Borg" 10) "1949" 11) "Richard Stallman" 12) "1953" 13) "Sophie Wilson" 14) "1957" 15) "Yukihiro Matsumoto" 16) "1965" 17) "Linus Torvalds" 18) "1969"Operating on ranges
Sorted sets are more powerful than this. They can operate on ranges. Let’s get all the individuals that were born up to 1950 inclusive. We use the?ZRANGEBYSCORE?command to do it:
> zrangebyscore hackers -inf 1950 1) "Alan Turing" 2) "Hedy Lamarr" 3) "Claude Shannon" 4) "Alan Kay" 5) "Anita Borg"We asked Redis to return all the elements with a score between negative infinity and 1950 (both extremes are included).
It’s also possible to remove ranges of elements. Let’s remove all the hackers born between 1940 and 1960 from the sorted set:
> zremrangebyscore hackers 1940 1960 (integer) 4ZREMRANGEBYSCORE?is perhaps not the best command name, but it can be very useful, and returns the number of removed elements.
Another extremely useful operation defined for sorted set elements is the get-rank operation. It is possible to ask what is the position of an element in the set of the ordered elements.
> zrank hackers "Anita Borg" (integer) 4The?ZREVRANK?command is also available in order to get the rank, considering the elements sorted a descending way.
Lexicographical scores
With recent versions of Redis 2.8, a new feature was introduced that allows getting ranges lexicographically, assuming elements in a sorted set are all inserted with the same identical score (elements are compared with the C?memcmp?function, so it is guaranteed that there is no collation, and every Redis instance will reply with the same output).
The main commands to operate with lexicographical ranges are?ZRANGEBYLEX,?ZREVRANGEBYLEX,?ZREMRANGEBYLEX?and?ZLEXCOUNT.
For example, let’s add again our list of famous hackers, but this time use a score of zero for all the elements:
> zadd hackers 0 "Alan Kay" 0 "Sophie Wilson" 0 "Richard Stallman" 0"Anita Borg" 0 "Yukihiro Matsumoto" 0 "Hedy Lamarr" 0 "Claude Shannon"0 "Linus Torvalds" 0 "Alan Turing"Because of the sorted sets ordering rules, they are already sorted lexicographically:
> zrange hackers 0 -1 1) "Alan Kay" 2) "Alan Turing" 3) "Anita Borg" 4) "Claude Shannon" 5) "Hedy Lamarr" 6) "Linus Torvalds" 7) "Richard Stallman" 8) "Sophie Wilson" 9) "Yukihiro Matsumoto"Using?ZRANGEBYLEX?we can ask for lexicographical ranges:
> zrangebylex hackers [B [P 1) "Claude Shannon" 2) "Hedy Lamarr" 3) "Linus Torvalds"Ranges can be inclusive or exclusive (depending on the first character), also string infinite and minus infinite are specified respectively with the?+?and?-?strings. See the documentation for more information.
This feature is important because it allows us to use sorted sets as a generic index. For example, if you want to index elements by a 128-bit unsigned integer argument, all you need to do is to add elements into a sorted set with the same score (for example 0) but with an 8 byte prefix consisting of?the 128 bit number in big endian. Since numbers in big endian, when ordered lexicographically (in raw bytes order) are actually ordered numerically as well, you can ask for ranges in the 128 bit space, and get the element’s value discarding the prefix.
If you want to see the feature in the context of a more serious demo, check the?Redis autocomplete demo.
Updating the score: leader boards
Just a final note about sorted sets before switching to the next topic. Sorted sets’ scores can be updated at any time. Just calling?ZADD?against an element already included in the sorted set will update its score (and position) with O(log(N)) time complexity. As such, sorted sets are suitable when there are tons of updates.
Because of this characteristic a common use case is leader boards. The typical application is a Facebook game where you combine the ability to take users sorted by their high score, plus the get-rank operation, in order to show the top-N users, and the user rank in the leader board (e.g., “you are the #4932 best score here”).
Bitmaps —
Bitmaps are not an actual data type, but a set of bit-oriented operations defined on the String type. Since strings are binary safe blobs and their maximum length is 512 MB, they are suitable to set up to 2^32 different bits.
Bit operations are divided into two groups: constant-time single bit operations, like setting a bit to 1 or 0, or getting its value, and operations on groups of bits, for example counting the number of set bits in a given range of bits (e.g., population counting).
One of the biggest advantages of bitmaps is that they often provide extreme space savings when storing information. For example in a system where different users are represented by incremental user IDs, it is possible to remember a single bit information (for example, knowing whether a user wants to receive a newsletter) of 4 billion of users using just 512 MB of memory.
Bits are set and retrieved using the?SETBIT?and?GETBIT?commands:
> setbit key 10 1 (integer) 1 > getbit key 10 (integer) 1 > getbit key 11 (integer) 0The?SETBIT?command takes as its first argument the bit number, and as its second argument the value to set the bit to, which is 1 or 0. The command automatically enlarges the string if the addressed bit is outside the current string length.
GETBIT?just returns the value of the bit at the specified index. Out of range bits (addressing a bit that is outside the length of the string stored into the target key) are always considered to be zero.
There are three commands operating on group of bits:
Both?BITPOS?and?BITCOUNT?are able to operate with byte ranges of the string, instead of running for the whole length of the string. The following is a trivial example of?BITCOUNT?call:
> setbit key 0 1 (integer) 0 > setbit key 100 1 (integer) 0 > bitcount key (integer) 2Common user cases for bitmaps are:
- Real time analytics of all kinds.
- Storing space efficient but high performance boolean information associated with object IDs.
For example imagine you want to know the longest streak of daily visits of your web site users. You start counting days starting from zero, that is the day you made your web site public, and set a bit with?SETBIT?every time the user visits the web site. As a bit index you simply take the current unix time, subtract the initial offset, and divide by 3600*24.
This way for each user you have a small string containing the visit information for each day. With?BITCOUNT?it is possible to easily get the number of days a given user visited the web site, while with a few?BITPOS?calls, or simply fetching and analyzing the bitmap client-side, it is possible to easily compute the longest streak.
Bitmaps are trivial to split into multiple keys, for example for the sake of sharding the data set and because in general it is better to avoid working with huge keys. To split a bitmap across different keys instead of setting all the bits into a key, a trivial strategy is just to store M bits per key and obtain the key name with?bit-number/M?and the Nth bit to address inside the key with?bit-number MOD M.
HyperLogLogs —
A HyperLogLog is a probabilistic data structure used in order to count unique things (technically this is referred to estimating the cardinality of a set). Usually counting unique items requires using an amount of memory proportional to the number of items you want to count, because you need to remember the elements you have already seen in the past in order to avoid counting them multiple times. However there is a set of algorithms that trade memory for precision: you end with an estimated measure with a standard error, in the case of the Redis implementation, which is less than 1%. The magic of this algorithm is that you no longer need to use an amount of memory proportional to the number of items counted, and instead can use a constant amount of memory! 12k bytes in the worst case, or a lot less if your HyperLogLog (We’ll just call them HLL from now) has seen very few elements.
HLLs in Redis, while technically a different data structure, is encoded as a Redis string, so you can call?GET?to serialize a HLL, and?SET?to deserialize it back to the server.
Conceptually the HLL API is like using Sets to do the same task. You would?SADD?every observed element into a set, and would use?SCARD?to check the number of elements inside the set, which are unique since?SADD?will not re-add an existing element.
While you don’t really?add items?into an HLL, because the data structure only contains a state that does not include actual elements, the API is the same:
- Every time you see a new element, you add it to the count with?PFADD.
-
Every time you want to retrieve the current approximation of the unique elements?added?with?PFADD?so far, you use the?PFCOUNT.
> pfadd hll a b c d(integer) 1> pfcount hll(integer) 4
An example of use case for this data structure is counting unique queries performed by users in a search form every day.
Redis is also able to perform the union of HLLs, please check the?full documentation?for more information.
Other notable features
There are other important things in the Redis API that can’t be explored in the context of this document, but are worth your attention:
- It is possible to?iterate the key space of a large collection incrementally.
- It is possible to run?Lua scripts server side?to win latency and bandwidth.
- Redis is also a?Pub-Sub server.
Learn more
This tutorial is in no way complete and has covered just the basics of the API. Read the?command reference?to discover a lot more.
Thanks for reading, and have fun hacking with Redis!
總結(jié)
以上是生活随笔為你收集整理的Redis 数据类型介绍的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問(wèn)題。
- 上一篇: face recognition[翻译]
- 下一篇: Redis FAQ