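WebSocket binary data parsing: key points on how WebSocket is implemented
WebSocket is built on top of TCP/IP and is an application-layer protocol, whereas a Socket is an abstraction layer between the application layer and the transport layer: it wraps the complex operations of the TCP/IP layers into a few simple interfaces for the application layer to call. Why did we make this replacement? Our server side was being reworked and the web version of our IM already used WebSocket, so if the client adopted it as well, the server would only have one codebase to maintain. WebSocket also has certain advantages in size, real-time behavior, and extensibility.
The latest WebSocket protocol is version 13, RFC 6455. To understand a WebSocket implementation, you really have to understand the protocol.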
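Preface
A WebSocket implementation is divided into three parts: the handshake, sending/reading data, and closing the connection.
First, here is a flowchart put together by @省长 from our team (I recommend reading his blog, it has a lot of substance) to make this easier to follow: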
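Handshake
The handshake is best understood by starting from the request headers.
WebSocket first issues an HTTP request with an Upgrade header. This header is used to change the HTTP protocol version or to switch to another protocol; here we set the value of Upgrade to websocket to upgrade the connection to the WebSocket protocol.
Also pay attention to the Sec-WebSocket-Key header. It is generated by the client and sent to the server to prove that the server received a valid WebSocket opening handshake, which helps the server reject connections initiated by non-WebSocket clients. The value is a random nonce encoded as a base64 string.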
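As a minimal sketch of how such a key can be produced: RFC 6455 specifies a 16-byte random nonce that is then base64-encoded. The function name `ws_base64_encode` is hypothetical and not part of the code in this article, and the caller is assumed to supply the 16 random bytes from a suitable random source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Base64-encode `len` bytes from `in` into `out` and NUL-terminate it.
 * `out` must hold at least 4 * ((len + 2) / 3) + 1 bytes.
 * Returns the number of characters written, not counting the NUL.
 * (Hypothetical helper, for illustration only.) */
size_t ws_base64_encode(const uint8_t *in, size_t len, char *out)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t i, o = 0;

    assert(in != NULL && out != NULL);

    /* Full 3-byte groups become 4 output characters. */
    for (i = 0; i + 2 < len; i += 3) {
        uint32_t v = ((uint32_t)in[i] << 16) | ((uint32_t)in[i + 1] << 8) | in[i + 2];
        out[o++] = tbl[(v >> 18) & 0x3f];
        out[o++] = tbl[(v >> 12) & 0x3f];
        out[o++] = tbl[(v >> 6) & 0x3f];
        out[o++] = tbl[v & 0x3f];
    }

    /* A trailing group of 1 or 2 bytes is padded with '='. */
    if (i < len) {
        uint32_t v = (uint32_t)in[i] << 16;
        if (i + 1 < len) {
            v |= (uint32_t)in[i + 1] << 8;
        }
        out[o++] = tbl[(v >> 18) & 0x3f];
        out[o++] = tbl[(v >> 12) & 0x3f];
        out[o++] = (i + 1 < len) ? tbl[(v >> 6) & 0x3f] : '=';
        out[o++] = '=';
    }

    out[o] = '\0';
    return o;
}
```

A 16-byte nonce always encodes to 24 characters, so a 25-byte buffer is enough for the key.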
    GET /chat HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
    Origin: http://example.com
    Sec-WebSocket-Protocol: chat, superchat
    Sec-WebSocket-Version: 13

We can simplify the request headers and send the request as a plain string. Every header line ends with CRLF, and don't forget the final empty line (one more CRLF) that marks the end of the request:

    const char *fmt = "GET %s HTTP/1.1\r\n"
                      "Upgrade: websocket\r\n"
                      "Connection: Upgrade\r\n"
                      "Host: %s\r\n"
                      "Sec-WebSocket-Key: %s\r\n"
                      "Sec-WebSocket-Version: 13\r\n"
                      "\r\n";
    /* +1 leaves room for the terminating NUL */
    size = strlen(fmt) + strlen(path) + strlen(host) + strlen(ws->key) + 1;
    buf = (char *)malloc(size);
    snprintf(buf, size, fmt, path, host, ws->key);
    size = strlen(buf);
    nbytes = ws->io_send(ws, ws->context, buf, size);

After receiving the request, the server sends a response:

    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

The important part is Sec-WebSocket-Accept. The server reads Sec-WebSocket-Key from the client's request headers, concatenates it with a globally unique identifier string (commonly called the magic string) "258EAFA5-E914-47DA-95CA-C5AB0DC85B11", computes the 160-bit SHA-1 digest of the result, base64-encodes the digest, and returns it to the client as the value of Sec-WebSocket-Accept.
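The Sec-WebSocket-Accept computation can be sketched as follows. This is an illustration under stated assumptions, not the article's code: all function names (`ws_compute_accept`, `sha1_short`, and so on) are hypothetical, and the compact SHA-1 here only handles inputs of up to 119 bytes, which is enough for a 24-character key plus the 36-character magic string. In real code a vetted crypto library would be the better choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WS_GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

static uint32_t rol32(uint32_t x, int n) { return (x << n) | (x >> (32 - n)); }

/* Run the SHA-1 compression function over one 64-byte block. */
static void sha1_block(uint32_t h[5], const uint8_t *p)
{
    uint32_t w[80], a, b, c, d, e, f, k, t;
    int i;
    for (i = 0; i < 16; i++)
        w[i] = ((uint32_t)p[4 * i] << 24) | ((uint32_t)p[4 * i + 1] << 16) |
               ((uint32_t)p[4 * i + 2] << 8) | p[4 * i + 3];
    for (i = 16; i < 80; i++)
        w[i] = rol32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
    a = h[0]; b = h[1]; c = h[2]; d = h[3]; e = h[4];
    for (i = 0; i < 80; i++) {
        if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999; }
        else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1; }
        else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
        else             { f = b ^ c ^ d;                   k = 0xCA62C1D6; }
        t = rol32(a, 5) + f + e + k + w[i];
        e = d; d = c; c = rol32(b, 30); b = a; a = t;
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
}

/* SHA-1 of a short message that fits in two blocks (len <= 119). */
static void sha1_short(const uint8_t *msg, size_t len, uint8_t digest[20])
{
    uint32_t h[5] = { 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0 };
    uint8_t buf[128] = { 0 };
    size_t total = (len + 9 <= 64) ? 64 : 128;
    uint64_t bits = (uint64_t)len * 8;
    int i;
    assert(len <= 119);
    memcpy(buf, msg, len);
    buf[len] = 0x80;                               /* padding marker */
    for (i = 0; i < 8; i++)                        /* big-endian bit length */
        buf[total - 1 - i] = (uint8_t)(bits >> (8 * i));
    sha1_block(h, buf);
    if (total == 128) sha1_block(h, buf + 64);
    for (i = 0; i < 20; i++)
        digest[i] = (uint8_t)(h[i / 4] >> (24 - 8 * (i % 4)));
}

/* Base64-encode `len` bytes into `out` (NUL-terminated). */
static void base64_encode(const uint8_t *in, size_t len, char *out)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t i, o = 0;
    for (i = 0; i < len; i += 3) {
        size_t rem = len - i;
        uint32_t v = (uint32_t)in[i] << 16;
        if (rem > 1) v |= (uint32_t)in[i + 1] << 8;
        if (rem > 2) v |= in[i + 2];
        out[o++] = tbl[(v >> 18) & 0x3f];
        out[o++] = tbl[(v >> 12) & 0x3f];
        out[o++] = (rem > 1) ? tbl[(v >> 6) & 0x3f] : '=';
        out[o++] = (rem > 2) ? tbl[v & 0x3f] : '=';
    }
    out[o] = '\0';
}

/* Compute the Sec-WebSocket-Accept value expected for `key`.
 * `accept` must hold at least 29 bytes. Returns 0 on success,
 * -1 if the key is too long for this sketch. */
int ws_compute_accept(const char *key, char accept[29])
{
    uint8_t cat[120], digest[20];
    size_t klen = strlen(key), glen = strlen(WS_GUID);
    if (klen + glen > 119) return -1;
    memcpy(cat, key, klen);
    memcpy(cat + klen, WS_GUID, glen);
    sha1_short(cat, klen + glen, digest);
    base64_encode(digest, sizeof digest, accept);
    return 0;
}
```

A client can compute this value from the key it sent and compare it with the header in the server's response; a mismatch means the handshake should be rejected.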
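To parse the handshake HTTP response, you can use the http-parser from Node.js. Parsing is fairly simple: it reads the header data character by character and processes it as it goes; see its state machine implementation for the details. Once parsing is done, you need to inspect the parsed content to check that the response is correct, and manage your handshake state accordingly.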
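The check that follows parsing can be sketched with plain libc string functions. This does not use http-parser and is deliberately simplified: the function name `ws_check_handshake_response` is hypothetical, and the header name is matched case-sensitively, whereas a real HTTP parser treats header names as case-insensitive.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `resp` is a complete "101" response whose
 * Sec-WebSocket-Accept header equals `expected_accept`, otherwise 0.
 * Simplified sketch: exact-case header name, HTTP/1.1 only. */
int ws_check_handshake_response(const char *resp, const char *expected_accept)
{
    static const char status[] = "HTTP/1.1 101";
    static const char name[] = "\r\nSec-WebSocket-Accept:";
    const char *p, *end;
    size_t n;

    assert(resp != NULL && expected_accept != NULL);

    /* The status line must announce the protocol switch. */
    if (strncmp(resp, status, sizeof status - 1) != 0) return 0;

    /* The header block must be complete (terminated by an empty line). */
    end = strstr(resp, "\r\n\r\n");
    if (end == NULL) return 0;

    /* The header must appear inside the header block, not after it. */
    p = strstr(resp, name);
    if (p == NULL || p > end) return 0;
    p += sizeof name - 1;

    /* Skip optional whitespace before the value. */
    while (*p == ' ' || *p == '\t') p++;

    n = strlen(expected_accept);
    if (strncmp(p, expected_accept, n) != 0) return 0;
    p += n;

    /* Only trailing whitespace may follow the value on this line. */
    while (*p == ' ' || *p == '\t') p++;
    return p[0] == '\r' && p[1] == '\n';
}
```

If the function returns 1, the handshake state can move on to the connected state; otherwise the connection should be dropped.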
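Sending/reading data
Data handling is best explained with the frame format diagram:
First, the meaning of the numbers: they denote bits, so 0-7 means 8 bits, which is 1 byte.

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

So the header of a frame can be assembled like this:

    char *rev = (char *)malloc(4);
    rev[0] = (char)(0x81 & 0xff);   /* FIN = 1, opcode = 0x1 */
    rev[1] = 126 & 0x7f;            /* MASK = 0, Payload len = 126 */
    rev[2] = 1;                     /* extended payload length, high byte */
    rev[3] = 0;                     /* extended payload length, low byte (0x0100 = 256) */

OK, now that we know what frame data looks like, let's work backwards and map these values to the frame fields.
First, what is 0x81? It is a hexadecimal value; in binary it is 1000 0001, exactly one byte long, and each bit corresponds to a field in this part of the frame: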
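- FIN indicates whether this frame is the final frame of a message: 1 means final, 0 means more frames follow.
- RSV1, RSV2, RSV3 must be 0 unless a negotiated extension defines a non-zero meaning. If a non-zero RSV is received and no extension defines it, the WebSocket connection is failed.
- opcode defines how the Payload data is interpreted. Receiving an unknown opcode also fails the WebSocket connection. The protocol defines the following values:
  - 0x0: continuation frame
  - 0x1: text frame
  - 0x2: binary frame
  - 0x3-0x7: reserved for further non-control frames
  - 0x8: connection close
  - 0x9: ping
  - 0xA: pong
  - 0xB-0xF: reserved for further control frames
The & 0xff simply extracts the bits we need.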
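The bit layout of the first byte described above can be sketched as a small decoder. The type and function names (`ws_byte0`, `ws_decode_byte0`, `ws_opcode_is_known`) are hypothetical and only serve the illustration; the opcode values are the ones defined in RFC 6455.

```c
#include <assert.h>
#include <stdint.h>

/* Opcodes defined by RFC 6455, section 5.2. */
enum ws_opcode {
    WS_OP_CONTINUATION = 0x0,
    WS_OP_TEXT         = 0x1,
    WS_OP_BINARY       = 0x2,
    WS_OP_CLOSE        = 0x8,
    WS_OP_PING         = 0x9,
    WS_OP_PONG         = 0xA
};

/* Fields carried by the first byte of a frame. */
struct ws_byte0 {
    unsigned fin;     /* 1 if this is the final frame of the message */
    unsigned rsv;     /* RSV1..RSV3 packed into a 3-bit value */
    unsigned opcode;  /* low 4 bits */
};

/* Split the first frame byte into FIN, RSV and opcode. */
struct ws_byte0 ws_decode_byte0(uint8_t b)
{
    struct ws_byte0 r;
    r.fin = (b >> 7) & 0x1;
    r.rsv = (b >> 4) & 0x7;
    r.opcode = b & 0x0f;
    return r;
}

/* Returns 1 for opcodes the protocol defines, 0 for reserved ones. */
int ws_opcode_is_known(unsigned opcode)
{
    assert(opcode <= 0x0f);  /* the field is only 4 bits wide */
    switch (opcode) {
    case WS_OP_CONTINUATION:
    case WS_OP_TEXT:
    case WS_OP_BINARY:
    case WS_OP_CLOSE:
    case WS_OP_PING:
    case WS_OP_PONG:
        return 1;
    default:
        return 0;
    }
}
```

For 0x81 this yields FIN = 1, RSV = 0 and opcode 0x1, which is a final text frame.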
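Next, 126. This is the Payload len field, the length of the payload:
- MASK indicates whether the Payload data is masked. If it is set to 1, a Masking-key must be supplied. All frames sent from the client to the server must be masked.
- Payload len is the length of the payload, and there are three cases:
  - if the value is less than 126, the 7 bits themselves are the length
  - if the value is 126, the length is stored in 2 additional bytes, the Extended payload length
  - if the value is 127, the length is stored in 8 additional bytes, the Extended payload length plus the Extended payload length continued; Extended payload length is 2 bytes and Extended payload length continued is 6 bytes
- Payload len is the sum of the lengths of the Extension data and the Application data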
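The three length cases in the list above can be sketched as a decoder that reads the extended length byte by byte, so it does not depend on the host's byte order. The function name `ws_decode_payload_len` and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode the MASK bit and payload length from the start of a frame.
 * Returns the number of header bytes consumed (2, 4 or 10, not counting
 * the masking key), 0 if `avail` bytes are not enough yet, or -1 if the
 * 64-bit length has its most significant bit set (invalid per RFC 6455). */
int ws_decode_payload_len(const uint8_t *buf, size_t avail,
                          int *masked, uint64_t *payload_len)
{
    unsigned len7;
    uint64_t len = 0;
    int ext, i;

    assert(buf != NULL && masked != NULL && payload_len != NULL);
    if (avail < 2) return 0;

    *masked = (buf[1] & 0x80) != 0;
    len7 = buf[1] & 0x7f;

    /* 126 -> 2 extra bytes, 127 -> 8 extra bytes, otherwise none. */
    ext = (len7 == 126) ? 2 : (len7 == 127) ? 8 : 0;
    if (avail < (size_t)(2 + ext)) return 0;

    if (ext == 0) {
        len = len7;
    } else {
        /* Extended lengths are sent in network byte order (big-endian). */
        for (i = 0; i < ext; i++)
            len = (len << 8) | buf[2 + i];
    }
    if (ext == 8 && (len >> 63) != 0) return -1;

    *payload_len = len;
    return 2 + ext;
}
```

For the four bytes assembled earlier (0x81, 126, 1, 0) this returns 4 with a payload length of 256 and the MASK bit clear.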
Sending and reading data therefore comes down to packing and parsing frames.
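As a small, self-contained illustration of packing, the sketch below builds a masked single frame for payloads of at most 125 bytes. The names `ws_apply_mask` and `ws_build_small_frame` are hypothetical. The masking key is passed in by the caller here so the result is reproducible; a real client would pick a fresh random key for every frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XOR every payload byte with key[i % 4]. Applying it twice restores
 * the original data, so the same routine masks and unmasks. */
void ws_apply_mask(uint8_t *data, size_t len, const uint8_t key[4])
{
    size_t i;
    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++)
        data[i] ^= key[i % 4];
}

/* Build a final (FIN = 1), masked frame with a payload of <= 125 bytes.
 * Layout: byte 0 = FIN + opcode, byte 1 = MASK + length,
 * bytes 2..5 = masking key, then the masked payload.
 * Returns the frame size, or 0 if the payload is too long for this
 * sketch or `cap` is too small. */
size_t ws_build_small_frame(uint8_t opcode, const uint8_t *payload, size_t len,
                            const uint8_t key[4], uint8_t *out, size_t cap)
{
    assert(out != NULL && key != NULL);
    if (len > 125 || cap < 6 + len) return 0;

    out[0] = (uint8_t)(0x80 | (opcode & 0x0f));
    out[1] = (uint8_t)(0x80 | len);
    memcpy(out + 2, key, 4);
    if (len > 0) {
        memcpy(out + 6, payload, len);
        ws_apply_mask(out + 6, len, key);
    }
    return 6 + len;
}
```

With opcode 0x1, the payload "Hello" and the key 0x37 0xfa 0x21 0x3d, this produces the masked text frame shown as an example in RFC 6455, section 5.7.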
Sending data:

    void mask_payload(char mask[4], char *payload, unsigned long long payload_size);

    void ws__wrap_packet(_WS_IN websocket_t *ws,
                         _WS_IN const char *payload,
                         _WS_IN unsigned long long payload_size,
                         _WS_IN int flags,
                         _WS_OUT char** out,
                         _WS_OUT uint64_t *out_size) {
        struct timeval tv;
        char mask[4];
        unsigned int mask_int;
        unsigned int payload_len_bits;
        unsigned int payload_bit_offset = 6;
        unsigned int extend_payload_len_bits, i;
        unsigned long long frame_size;
        const int MASK_BIT_LEN = 4;

        /* pick a masking key for this frame */
        gettimeofday(&tv, NULL);
        srand(tv.tv_usec * tv.tv_sec);
        mask_int = rand();
        memcpy(mask, &mask_int, 4);

        /**
         * payload_len bits
         * ref to https://tools.ietf.org/html/rfc6455#section-5.2
         * If 0-125, that is the payload length
         *
         * If payload length is equals 126, the following 2 bytes interpreted as a
         * 16-bit unsigned integer are the payload length
         *
         * If 127, the following 8 bytes interpreted as a 64-bit unsigned integer (the
         * most significant bit MUST be 0) are the payload length.
         */
        if (payload_size <= 125) {
            // consists of ((fin + rsv1/2/3 + opcode) + payload-len bits + mask bit len + payload len)
            extend_payload_len_bits = 0;
            frame_size = 1 + 1 + MASK_BIT_LEN + payload_size;
            payload_len_bits = payload_size;
        } else if (payload_size > 125 && payload_size <= 0xffff) {
            extend_payload_len_bits = 2;
            // consists of ((fin + rsv1/2/3 + opcode) + payload-len bits + extend-payload-len bits + mask bit len + payload len)
            frame_size = 1 + 1 + extend_payload_len_bits + MASK_BIT_LEN + payload_size;
            payload_len_bits = 126;
            payload_bit_offset += extend_payload_len_bits;
        } else if (payload_size > 0xffff && payload_size <= 0x7fffffffffffffffULL) {
            // the most significant bit of the 64-bit length must be 0
            extend_payload_len_bits = 8;
            // consists of ((fin + rsv1/2/3 + opcode) + payload-len bits + extend-payload-len bits + mask bit len + payload len)
            frame_size = 1 + 1 + extend_payload_len_bits + MASK_BIT_LEN + payload_size;
            payload_len_bits = 127;
            payload_bit_offset += extend_payload_len_bits;
        } else {
            if (ws->error_cb) {
                ws_error_t *err = ws_new_error(WS_SEND_DATA_TOO_LARGE_ERR);
                ws->error_cb(ws, err);
                free(err);
            }
            return ;
        }

        *out_size = frame_size;
        char *data = (*out) = (char *)malloc(frame_size);
        char *buf_offset = data;
        bzero(data, frame_size);
        *data = flags & 0xff;
        buf_offset = data + 1;
        // set mask bit = 1
        *(buf_offset) = payload_len_bits | 0x80; // payload length with mask bit on
        buf_offset = data + 2;
        if (payload_len_bits == 126) {
            payload_size &= 0xffff;
        } else if (payload_len_bits == 127) {
            payload_size &= 0xffffffffffffffffLL;
        }
        // write the extended length in network byte order
        // (the byte reversal assumes a little-endian host)
        for (i = 0; i < extend_payload_len_bits; i++) {
            *(buf_offset + i) = *((char *)&payload_size + (extend_payload_len_bits - i - 1));
        }

        /**
         * according to https://tools.ietf.org/html/rfc6455#section-5.3
         *
         * buf_offset is set to mask bit
         */
        buf_offset = data + payload_bit_offset - 4;
        for (i = 0; i < 4; i++) {
            *(buf_offset + i) = mask[i] & 0xff;
        }

        /**
         * mask the payload data
         */
        buf_offset = data + payload_bit_offset;
        memcpy(buf_offset, payload, payload_size);
        mask_payload(mask, buf_offset, payload_size);
    }

    void mask_payload(char mask[4], char *payload, unsigned long long payload_size) {
        unsigned long long i;
        for (i = 0; i < payload_size; i++) {
            *(payload + i) ^= mask[i % 4] & 0xff;
        }
    }

Parsing data:

    int ws_recv(websocket_t *ws) {
        /* NOTE: the state constant compared here was lost from the original
         * listing; WS_STATE_HANDSHAKE_COMPLETED is a placeholder name. */
        if (ws->state < WS_STATE_HANDSHAKE_COMPLETED) {
            return ws_do_handshake(ws);
        }
        int ret;
        while (true) {
            ret = ws__recv(ws);
            if (ret != OK) {
                break;
            }
        }
        return ret;
    }

    int ws__recv(websocket_t *ws) {
        /* same placeholder constant as in ws_recv */
        if (ws->state < WS_STATE_HANDSHAKE_COMPLETED) {
            return ws_do_handshake(ws);
        }
        int ret = OK, i;
        int state = ws->rd_state;
        char *rd_buf;
        switch (state) {
        case WS_READ_IDLE: {
            // wait for the 2 fixed header bytes
            ret = ws__make_up(ws, 2);
            if (ret != OK) {
                return ret;
            }
            ws_frame_t * frame;
            if (ws->c_frame == NULL) {
                ws__append_frame(ws);
            }
            frame = ws->c_frame;
            rd_buf = ws->buf;
            frame->fin = (*(rd_buf) & 0x80) == 0x80 ? 1 : 0;
            frame->op_code = *(rd_buf) & 0x0fu;
            frame->payload_len = *(rd_buf + 1) & 0x7fu;
            if (frame->payload_len < 126) {
                frame->payload_bit_offset = 2;
                ws->rd_state = WS_READ_PAYLOAD;
            } else if (frame->payload_len == 126) {
                frame->payload_bit_offset = 4;
                ws->rd_state = WS_READ_EXTEND_PAYLOAD_2_WORDS;
            } else {
                frame->payload_bit_offset = 8;
                ws->rd_state = WS_READ_EXTEND_PAYLOAD_8_WORDS;
            }
            ws__reset_buf(ws, 2);
            break;
        }
        case WS_READ_EXTEND_PAYLOAD_2_WORDS: {
    #define PAYLOAD_LEN_BITS 2
            ret = ws__make_up(ws, PAYLOAD_LEN_BITS);
            if (ret != OK) {
                return ret;
            }
            rd_buf = ws->buf;
            ws_frame_t * frame = ws->c_frame;
            char *payload_len_bytes = (char *)&frame->payload_len;
            // network byte order -> host order (assumes a little-endian host)
            for (i = 0; i < PAYLOAD_LEN_BITS; i++) {
                *(payload_len_bytes + i) = rd_buf[PAYLOAD_LEN_BITS - 1 - i];
            }
            ws__reset_buf(ws, PAYLOAD_LEN_BITS);
            ws->rd_state = WS_READ_PAYLOAD;
    #undef PAYLOAD_LEN_BITS
            break;
        }
        case WS_READ_EXTEND_PAYLOAD_8_WORDS: {
    #define PAYLOAD_LEN_BITS 8
            ret = ws__make_up(ws, PAYLOAD_LEN_BITS);
            if (ret != OK) {
                return ret;
            }
            rd_buf = ws->buf;
            ws_frame_t * frame = ws->c_frame;
            char *payload_len_bytes = (char *)&frame->payload_len;
            // network byte order -> host order (assumes a little-endian host)
            for (i = 0; i < PAYLOAD_LEN_BITS; i++) {
                *(payload_len_bytes + i) = rd_buf[PAYLOAD_LEN_BITS - 1 - i];
            }
            ws__reset_buf(ws, PAYLOAD_LEN_BITS);
            ws->rd_state = WS_READ_PAYLOAD;
    #undef PAYLOAD_LEN_BITS
            break;
        }
        case WS_READ_PAYLOAD: {
            ws_frame_t * frame = ws->c_frame;
            uint64_t payload_len = frame->payload_len;
            ret = ws__make_up(ws, payload_len);
            if (ret != OK) {
                return ret;
            }
            rd_buf = ws->buf;
            frame->payload = malloc(payload_len);
            memcpy(frame->payload, rd_buf, payload_len);
            ws__reset_buf(ws, payload_len);
            if (frame->fin == 1) {
                // final frame: the message is complete, dispatch it
                ws__dispatch_msg(ws, frame);
                ws__clean_frame(ws);
            } else {
                // more fragments follow
                ws__append_frame(ws);
            }
            ws->rd_state = WS_READ_IDLE;
            break;
        }
        }
        return ret;
    }

Closing the connection
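A connection can be closed in two ways: the server initiates the close, or the client actively closes.
The server and the client handle this in essentially the same way. Taking the server-initiated case as an example:
When the server initiates the close, it sends a close frame to the client. When the client receives a frame, it parses the frame's opcode to determine whether it is a close frame, and then sends a close frame back to the server as a response.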
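The payload of a close frame, when present, starts with a 2-byte status code in network byte order, optionally followed by a reason. A minimal, byte-order-independent sketch of reading it is shown below; the function name `ws_parse_close` is hypothetical. An empty payload is reported as 1005, the value RFC 6455 reserves for "no status code present".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Parse the payload of a close frame.
 * On success returns 0 and fills in the status code and the reason
 * (a pointer into `payload`, not NUL-terminated, possibly empty).
 * Returns -1 for a 1-byte payload, which cannot hold a status code. */
int ws_parse_close(const uint8_t *payload, size_t len,
                   uint16_t *code, const uint8_t **reason, size_t *reason_len)
{
    assert(code != NULL && reason != NULL && reason_len != NULL);

    if (len == 0) {
        *code = 1005;          /* reserved value: no status code present */
        *reason = NULL;
        *reason_len = 0;
        return 0;
    }
    if (len == 1) return -1;

    /* Status code is big-endian: high byte first. */
    *code = (uint16_t)(((uint16_t)payload[0] << 8) | payload[1]);
    *reason = payload + 2;
    *reason_len = len - 2;
    return 0;
}
```

For a payload of 0x03 0xE8 followed by "bye", the status code is 1000 (normal closure) and the reason is the 3 bytes "bye".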
    if (op_code == OP_CLOSE) {
        int status_code = 0;  /* zeroed so the bytes not filled in below are defined */
        char *reason;
        char *status_code_buf = (char *)&status_code;
        /* the status code arrives in network byte order;
         * swap the two bytes (assumes a little-endian host) */
        status_code_buf[0] = payload[1];
        status_code_buf[1] = payload[0];
        reason = payload + 2;
        if (ws->state != WS_STATE_CLOSED) {
            /**
             * should send response to remote server
             */
            ws_send(ws, NULL, 0, OP_CLOSE | FLAG_FIN);
            ws->state = WS_STATE_CLOSED;
        }
        // close connection
        if (ws->close_cb) {
            ws->close_cb(ws, status_code, reason);
        }
    }

Summary
Learning WebSocket is mainly a matter of understanding the protocol. Once you understand the protocol, the complex code above will make sense naturally.