The LZ4 compression algorithm: the king of speed
Introduction
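LZ4 is, all things considered, one of the most efficient compression algorithms available today. It favors compression and decompression speed, and its compression ratio is not the best. Current Android and Apple operating systems use LZ4 for memory compression, compressing memory pages on the fly to free up more usable memory. In essence it trades time for space.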
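Compression principle
The LZ4 algorithm is actually quite simple. Here is a compression example:
Input: abcde_bcdefgh_abcdefghxxxxxxx
Output: abcde_(5,4)fgh_(14,5)fghxxxxxxx
The two pairs in parentheses are the repeats detected during compression. (5,4) means "go back 5 bytes and copy a match of length 4", i.e. "bcde" is a repeat. You could equally say that "cde" is a repeat, but because of the order in which the implementation scans the input, we take the first match found and extend it to its longest length.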
1. Compression format
The compressed data has the following format:
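Input: abcde_bcdefgh_abcdefghxxxxxxx
Output: [token]abcde_(5,4)[token]fgh_(14,5)[token]fghxxxxxxx
Format: [token]literals(offset,match length)[token]literals(offset,match length)....
Matches can also follow one another directly:
Input: fghabcde_bcdefgh_abcdefghxxxxxxx
Output: [token]fghabcde_(5,4)[token](13,3)[token]_(14,5)[token]fghxxxxxxx
Format: [token]literals(offset,match length)[token](offset,match length)....
Strictly speaking, (13,3) with a length of 3 is not valid here: the minimum match length is 4.
Literals are bytes that have not been seen before, i.e. the incompressible part.
Match is a repeat, i.e. the compressible part.
Token is one byte that records the literal length (high 4 bits) and the match length (low 4 bits). The decompressor uses these as the sizes for its memcpy calls.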
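To make the layout concrete, here is a minimal sketch of how one sequence could be written out. It assumes the standard LZ4 block layout: a token, the literals, a 2-byte little-endian offset, and lengths of 15 or more continued in extra bytes. The names emit_sequence and store_extra_length are hypothetical, chosen for illustration, and are not taken from any lz4 source.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MIN_MATCH 4  /* shortest match the format can represent */

/* Write the part of a length that did not fit in a 4-bit token field:
 * a run of 255 bytes followed by one final byte below 255. */
static uint8_t *store_extra_length(uint8_t *dst, size_t extra)
{
    while (extra >= 255) { *dst++ = 255; extra -= 255; }
    *dst++ = (uint8_t)extra;
    return dst;
}

/* Hypothetical helper: encode one sequence into dst and return the new write
 * position. The caller must provide enough room in dst and must pass
 * match_len >= MIN_MATCH and an offset of at least 1. */
static uint8_t *emit_sequence(uint8_t *dst, const uint8_t *literals, size_t lit_len,
                              uint16_t offset, size_t match_len)
{
    size_t m = match_len - MIN_MATCH;       /* the stored value omits the minimum */
    uint8_t *token = dst++;
    *token = (uint8_t)(((lit_len < 15 ? lit_len : 15) << 4) | (m < 15 ? m : 15));
    if (lit_len >= 15) dst = store_extra_length(dst, lit_len - 15);
    memcpy(dst, literals, lit_len);         /* literals are copied verbatim */
    dst += lit_len;
    *dst++ = (uint8_t)(offset & 0xff);      /* offset, low byte first */
    *dst++ = (uint8_t)(offset >> 8);
    if (m >= 15) dst = store_extra_length(dst, m - 15);
    return dst;
}
```

For the first sequence of the example, emit_sequence(buf, (const uint8_t *)"abcde_", 6, 5, 4) writes 9 bytes: the token 0x60, the six literals, then the offset bytes 05 00.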
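2. Compression ratio
As you would expect, the more repeats there are, or the longer they are, the higher the compression ratio. In the example above, "bcde" is represented as (5,4) after compression, so 4 bytes shrink to 3 bytes: 2 bytes for the offset and 1 byte for the match length (in the actual format this byte is the token, whose low 4 bits hold the match length). That saves 1 byte.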
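The arithmetic can be sketched as a small function. This is an illustration only: match_cost is a hypothetical name, and it assumes the whole token byte is charged to the match and that match lengths of 19 or more spill into extra length bytes, as in the standard LZ4 block format.

```c
#include <stddef.h>

/* Bytes needed to encode one match: 1 token + 2 offset bytes + any extra
 * length bytes. Assumes match_len >= 4. */
static size_t match_cost(size_t match_len)
{
    size_t stored = match_len - 4;       /* the minimum length of 4 is implicit */
    size_t cost = 1 + 2;                 /* token + offset */
    if (stored >= 15)                    /* does not fit in the 4-bit field */
        cost += 1 + (stored - 15) / 255; /* one extra byte per 255, plus the last */
    return cost;
}
```

A 4-byte match costs 3 bytes and saves 1, while a 3043-byte match costs only 15 bytes, which is why long runs of identical data compress so well.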
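3. Compression algorithm implementation
The overall flow: compression looks for matches using a scan window of at least 4 bytes, advances 1 byte at a time, and compresses whenever it finds a repeat.
Because the offset is stored in 2 bytes, a match can only be found within a distance of 2^16 bytes (64 KB). For compressing a 4 KB kernel page, only 12 bits are needed.
The 1-byte scan step is adjustable, which corresponds to the acceleration mechanism of LZ4_compress_fast. A larger step speeds up compression at the cost of a lower compression ratio.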
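The scan described above can be sketched as follows. To keep the behavior easy to trace, this sketch replaces the hash table used by real implementations with a plain backward search for the nearest earlier occurrence, which is far slower. The names match_t, common_length and find_matches are hypothetical, and the end-of-block restrictions of real LZ4 are not applied.

```c
#include <stddef.h>
#include <stdint.h>

#define MIN_MATCH  4      /* shortest match worth encoding */
#define MAX_OFFSET 65535  /* largest distance a 2-byte offset can hold */

typedef struct { size_t pos, offset, length; } match_t;

/* Length of the common prefix of a and b, looking at no more than limit bytes. */
static size_t common_length(const uint8_t *a, const uint8_t *b, size_t limit)
{
    size_t n = 0;
    while (n < limit && a[n] == b[n]) n++;
    return n;
}

/* Greedy scan: at each position take the nearest earlier occurrence that
 * matches at least MIN_MATCH bytes, extend it as far as it goes, then jump
 * past it. Returns the number of matches written to out (at most max_out). */
static size_t find_matches(const uint8_t *src, size_t n, match_t *out, size_t max_out)
{
    size_t count = 0, pos = 0;
    while (pos + MIN_MATCH <= n && count < max_out) {
        size_t found = 0;
        for (size_t off = 1; off <= pos && off <= MAX_OFFSET; off++) {
            size_t len = common_length(src + pos - off, src + pos, n - pos);
            if (len >= MIN_MATCH) {
                out[count++] = (match_t){ pos, off, len };
                found = len;
                break;
            }
        }
        pos += found ? found : 1;  /* advance 1 byte when nothing matched */
    }
    return count;
}
```

On abcde_bcdefgh_abcdefghxxxxxxx this reports (5,4) at position 6 and (14,5) at position 14, as in the example. It also reports the trailing run of x as an overlapping match (1,6), which the simplified example leaves as literals.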
Let's look at Apple's lz4 implementation:
    // src is the input stream and dst is the output. A hash table records the strings
    // seen within a certain distance behind the cursor, so earlier matches can be looked up.
    void lz4_encode_2gb(uint8_t ** dst_ptr,
                        size_t dst_size,
                        const uint8_t ** src_ptr,
                        const uint8_t * src_begin,
                        size_t src_size,
                        lz4_hash_entry_t hash_table[LZ4_COMPRESS_HASH_ENTRIES],
                        int skip_final_literals)
    {
      uint8_t *dst = *dst_ptr;        // current output stream position
      uint8_t *end = dst + dst_size - LZ4_GOFAST_SAFETY_MARGIN;
      const uint8_t *src = *src_ptr;  // current input stream literal to encode
      const uint8_t *src_end = src + src_size - LZ4_GOFAST_SAFETY_MARGIN;
      const uint8_t *match_begin = 0; // first byte of matched sequence
      const uint8_t *match_end = 0;   // first byte after matched sequence
      // Apple uses an early abort mechanism here: if the scan reaches position
      // LZ4_EARLY_ABORT_EVAL and nothing has matched yet, the input is treated as
      // incompressible and encoding stops early instead of scanning everything.
    #if LZ4_EARLY_ABORT
      uint8_t * const dst_begin = dst;
      uint32_t lz4_do_abort_eval = lz4_do_early_abort;
    #endif

      while (dst < end)
      {
        ptrdiff_t match_distance = 0;
        // Each pass of the for loop that finds a match jumps out to EXPAND_FORWARD.
        for (match_begin = src; match_begin < src_end; match_begin += 1) {
          const uint32_t pos = (uint32_t)(match_begin - src_begin);
          // Apple's implementation is a little unusual here: it looks up matches for
          // four consecutive positions at once. This appears to be a manual 4x unroll
          // of the search loop (see the DRKTODO note below).
          const uint32_t w0 = load4(match_begin); // the 4 bytes at this position
          const uint32_t w1 = load4(match_begin + 1);
          const uint32_t w2 = load4(match_begin + 2);
          const uint32_t w3 = load4(match_begin + 3);
          const int i0 = lz4_hash(w0);
          const int i1 = lz4_hash(w1);
          const int i2 = lz4_hash(w2);
          const int i3 = lz4_hash(w3);
          const uint8_t *c0 = src_begin + hash_table[i0].offset;
          const uint8_t *c1 = src_begin + hash_table[i1].offset;
          const uint8_t *c2 = src_begin + hash_table[i2].offset;
          const uint8_t *c3 = src_begin + hash_table[i3].offset;
          const uint32_t m0 = hash_table[i0].word; // the word previously stored in this hash slot
          const uint32_t m1 = hash_table[i1].word;
          const uint32_t m2 = hash_table[i2].word;
          const uint32_t m3 = hash_table[i3].word;
          hash_table[i0].offset = pos;
          hash_table[i0].word = w0;
          hash_table[i1].offset = pos + 1;
          hash_table[i1].word = w1;
          hash_table[i2].offset = pos + 2;
          hash_table[i2].word = w2;
          hash_table[i3].offset = pos + 3;
          hash_table[i3].word = w3;

          match_distance = (match_begin - c0);
          // Compare the word stored in the hash table with the word at the current position.
          if (w0 == m0 && match_distance < 0x10000 && match_distance > 0) {
            match_end = match_begin + 4;
            goto EXPAND_FORWARD;
          }

          match_begin++;
          match_distance = (match_begin - c1);
          if (w1 == m1 && match_distance < 0x10000 && match_distance > 0) {
            match_end = match_begin + 4;
            goto EXPAND_FORWARD;
          }

          match_begin++;
          match_distance = (match_begin - c2);
          if (w2 == m2 && match_distance < 0x10000 && match_distance > 0) {
            match_end = match_begin + 4;
            goto EXPAND_FORWARD;
          }

          match_begin++;
          match_distance = (match_begin - c3);
          if (w3 == m3 && match_distance < 0x10000 && match_distance > 0) {
            match_end = match_begin + 4;
            goto EXPAND_FORWARD;
          }

    #if LZ4_EARLY_ABORT
          //DRKTODO: Evaluate unrolling further. 2xunrolling had some modest benefits
          if (lz4_do_abort_eval && ((pos) >= LZ4_EARLY_ABORT_EVAL)) {
            ptrdiff_t dstd = dst - dst_begin;
            // Still nothing matched by this point: give up.
            if (dstd == 0) {
              lz4_early_aborts++;
              return;
            }
            /* if (dstd >= pos) { */
            /*   return; */
            /* } */
            /* ptrdiff_t cbytes = pos - dstd; */
            /* if ((cbytes * LZ4_EARLY_ABORT_MIN_COMPRESSION_FACTOR) > pos) { */
            /*   return; */
            /* } */
            lz4_do_abort_eval = 0;
          }
    #endif
        }

        // Reaching this point means the for loop found no further match, so the
        // rest of src is copied to dst as literals.
        if (skip_final_literals) { *src_ptr = src; *dst_ptr = dst; return; } // do not emit the final literal sequence

        // Emit a trailing literal that covers the remainder of the source buffer,
        // if we can do so without exceeding the bounds of the destination buffer.
        size_t src_remaining = src_end + LZ4_GOFAST_SAFETY_MARGIN - src;
        if (src_remaining < 15) {
          *dst++ = (uint8_t)(src_remaining << 4);
          memcpy(dst, src, 16); dst += src_remaining;
        } else {
          *dst++ = 0xf0;
          dst = lz4_store_length(dst, end, (uint32_t)(src_remaining - 15));
          if (dst == 0 || dst + src_remaining >= end) return;
          memcpy(dst, src, src_remaining); dst += src_remaining;
        }
        *dst_ptr = dst;
        *src_ptr = src + src_remaining;
        return;

      EXPAND_FORWARD:

        // Expand match forward: see whether the match can be extended forward
        // to increase its length.
        {
          const uint8_t * ref_end = match_end - match_distance;
          while (match_end < src_end)
          {
            size_t n = lz4_nmatch(LZ4_MATCH_SEARCH_LOOP_SIZE, ref_end, match_end);
            if (n < LZ4_MATCH_SEARCH_LOOP_SIZE) { match_end += n; break; }
            match_end += LZ4_MATCH_SEARCH_LOOP_SIZE;
            ref_end += LZ4_MATCH_SEARCH_LOOP_SIZE;
          }
        }

        // Expand match backward: see whether the match can be extended backward
        // to increase its length.
        {
          // match_begin_min = max(src_begin + match_distance,literal)
          const uint8_t * match_begin_min = src_begin + match_distance;
          match_begin_min = (match_begin_min < src)?src:match_begin_min;
          const uint8_t * ref_begin = match_begin - match_distance;
          while (match_begin > match_begin_min && ref_begin[-1] == match_begin[-1] ) { match_begin -= 1; ref_begin -= 1; }
        }

        // Emit match: once the offset and length are settled, encode them in the
        // compressed format.
        dst = lz4_emit_match((uint32_t)(match_begin - src), (uint32_t)(match_end - match_begin), (uint32_t)match_distance, dst, end, src);
        if (!dst) return;

        // Update state
        src = match_end;

        // Update return values to include the last fully encoded match
        // (refresh the src and dst positions, then go back to the while loop and
        // start the for loop again).
        *dst_ptr = dst;
        *src_ptr = src;
      }
    }

A compression example from Android memory
This example is a 4 KB page starting at 0xffffffc06185f000 that consists mostly of 0s and 1s. Because the lengths and offsets are large, the encoding involves some extra handling, which can be found in Android's lz4 source. Two matches are found and the compressed data is 31 bytes. An overview of the compressed result follows. Note that the overview prints only the last length byte of each match (219 and 7), not the full match length.

    09-15 14:35:06.821 <3>[138, kswapd0][ 638.194336] src 0xffffffc06185f000 literallen 1
    09-15 14:35:06.821 <3>[138, kswapd0][ 638.194349] src 0xffffffc06185f000 (1,219) #(offset,match length)
    09-15 14:35:06.821 <3>[138, kswapd0][ 638.194359] src 0xffffffc06185f000 literallen 1
    09-15 14:35:06.821 <3>[138, kswapd0][ 638.194386] src 0xffffffc06185f000 (3044,7)
    09-15 14:35:06.821 <3>[138, kswapd0][ 638.194400] src 0xffffffc06185f000 count 2 compressed 31
    --------------------------- Corresponding raw compressed data -----------------------------
    First match:
    09-15 14:35:06.821 <3>[138, kswapd0][ 638.194411] 0xffffffc06185f000 31 #token: 0001 1111; the high 4 bits give literal length 1, the low 4 bits (15) mean the match length overflows into the following bytes
    09-15 14:35:06.821 <3>[138, kswapd0][ 638.194422] 0xffffffc06185f000 0 #literal
    09-15 14:35:06.821 <3>[138, kswapd0][ 638.194433] 0xffffffc06185f000 1 #offset, little-endian 01
    09-15 14:35:06.821 <3>[138, kswapd0][ 638.194444] 0xffffffc06185f000 0 #offset
    09-15 14:35:06.821 <3>[138, kswapd0][ 638.194459] 0xffffffc06185f000 255 #matchLength begin
    09-15 14:35:06.821 <3>[138, kswapd0][ 638.194469] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194483] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194494] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194505] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194551] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194565] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194579] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194590] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194602] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194612] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194624] 0xffffffc06185f000 219 #matchLength end: 219+255*11 = 3024 in the extra bytes; adding the 15 from the token and the implicit minimum of 4 gives a match length of 3043
    Second match:
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194635] 0xffffffc06185f000 31 #token: 0001 1111; the high 4 bits give literal length 1
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194646] 0xffffffc06185f000 1 #literal
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194657] 0xffffffc06185f000 228 #offset
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194667] 0xffffffc06185f000 11 #offset: bytes 228 (1110 0100) and 11 (1011), read as little-endian (1011 1110 0100) = 3044
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194678] 0xffffffc06185f000 255 #matchLength begin
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194689] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194701] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194712] 0xffffffc06185f000 255
    09-15 14:35:06.822 <3>[138, kswapd0][ 638.194747] 0xffffffc06185f000 7 #matchLength end: 255*4+7 = 1027 in the extra bytes; adding the 15 from the token and the implicit minimum of 4 gives a match length of 1046

As a check, the two sequences cover 1 + 3043 + 1 + 1046 = 4091 bytes of the page and occupy 25 compressed bytes. The remaining 6 compressed bytes are not logged; they are consistent with a final token followed by 5 trailing literals, which brings the total to 4096 bytes.

Decompression algorithm
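Once compression is understood, decompression is simple too.
Input: [token]abcde_(5,4)[token]fgh_(14,5)[token]fghxxxxxxx
Output: abcde_bcdefgh_abcdefghxxxxxxx
Walking through the compressed stream, read the lengths from the token and copy the literals straight to the output, i.e. memcpy(dst, src, length).
On a match, copy the bytes from the already-written output, offset bytes back, onto the end of the output.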
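The two steps above can be sketched as a small decoder. This is a minimal illustration assuming the standard LZ4 block layout, not the kernel's implementation. The names read_length and decode_block are hypothetical, and the sketch does not enforce the end-of-block rules of the real format. Matches are copied byte by byte because the source may overlap the destination, for example when the offset is 1.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a length: start from the 4-bit token field and, if it is 15, keep
 * adding the following bytes until one of them is below 255. */
static size_t read_length(size_t nibble, const uint8_t **ip, const uint8_t *iend)
{
    size_t len = nibble;
    if (nibble == 15) {
        uint8_t b = 0;
        do {
            if (*ip >= iend) break;   /* truncated input: stop reading */
            b = *(*ip)++;
            len += b;
        } while (b == 255);
    }
    return len;
}

/* Decode src[0..src_len) into dst. Returns the number of bytes written,
 * or 0 on malformed input or when dst is too small. */
static size_t decode_block(const uint8_t *src, size_t src_len,
                           uint8_t *dst, size_t dst_cap)
{
    const uint8_t *ip = src, *iend = src + src_len;
    size_t op = 0;
    while (ip < iend) {
        uint8_t token = *ip++;
        size_t lit = read_length(token >> 4, &ip, iend);
        if (lit > (size_t)(iend - ip) || lit > dst_cap - op) return 0;
        memcpy(dst + op, ip, lit);          /* literals: plain copy */
        ip += lit;
        op += lit;
        if (ip == iend) break;              /* the last sequence has no match */
        if (iend - ip < 2) return 0;
        size_t offset = ip[0] | ((size_t)ip[1] << 8);  /* little-endian */
        ip += 2;
        size_t mlen = read_length(token & 15, &ip, iend) + 4;
        if (offset == 0 || offset > op || mlen > dst_cap - op) return 0;
        for (size_t i = 0; i < mlen; i++, op++)
            dst[op] = dst[op - offset];     /* byte-wise: ranges may overlap */
    }
    return op;
}
```

Feeding it the 25 bytes dumped in the kernel log above yields 4091 bytes: a 0, then 3043 more zeros copied with offset 1, a single 1 at index 3044, and 1046 zeros copied from 3044 bytes back.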