TCP Connection Establishment (Part 2)
Passive Open
SYN cookies
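TCP sets aside a fairly large block of memory, the connection request queue, to hold connection request blocks. When SYN requests keep arriving and the number of pending requests reaches the upper limit, the system starts dropping SYN requests. SYN cookies let the server keep handling new SYN requests even when the half-open connection queue is full.
When the half-open queue is full, SYN cookies do not drop the SYN request. Instead, they use a cryptographic technique to represent the half-open state. In a TCP implementation, when the server receives a SYN from the client, it must reply with a SYN+ACK, and the client then sends an acknowledgment back to the server. Normally the server's initial sequence number is a random number that the server computes according to certain rules. With SYN cookies, the server's initial sequence number is obtained by hashing the client IP address, client port, server IP address, server port, the client's initial sequence number, and some other security values, and then encrypting the result. This value is called the cookie.
When the server is under a SYN attack and the connection request queue is full, it does not reject new SYN requests. Instead, it replies to the client with a SYN+ACK segment whose initial sequence number is the cookie. When the client's ACK segment arrives, the server subtracts 1 from the acknowledgment number in that segment and compares the result with the value obtained by hashing the elements listed above. If the two are equal, the three-way handshake is completed directly. Note: in this case there is no need to check whether the connection belongs to the connection request queue.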
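The cookie check described above can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the kernel's algorithm: the names `conn_tuple`, `toy_mix`, `make_cookie`, and `check_cookie` are hypothetical, and the FNV-1a style mixer stands in for a real keyed cryptographic hash. Real implementations also fold in extra information such as a time counter and an encoded MSS, which is omitted here.

```c
#include <stdint.h>

/* Hypothetical connection identifiers (illustrative names only). */
struct conn_tuple {
    uint32_t client_ip;
    uint32_t server_ip;
    uint16_t client_port;
    uint16_t server_port;
};

/* Toy 32-bit mixer in the style of FNV-1a. NOT cryptographically secure;
 * it only stands in for the keyed hash a real implementation would use. */
static uint32_t toy_mix(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* Derive the server's initial sequence number (the cookie) from the
 * 4-tuple, the client's initial sequence number, and a server secret. */
uint32_t make_cookie(const struct conn_tuple *t, uint32_t client_isn,
                     uint32_t secret)
{
    uint32_t h = 2166136261u;

    h = toy_mix(h, secret);
    h = toy_mix(h, t->client_ip);
    h = toy_mix(h, t->server_ip);
    h = toy_mix(h, ((uint32_t)t->client_port << 16) | t->server_port);
    h = toy_mix(h, client_isn);
    return h;
}

/* Validate the final ACK of the handshake without any stored state.
 * seg_seq is the ACK segment's sequence number (client ISN + 1) and
 * seg_ack is its acknowledgment number (cookie + 1).
 * Returns 1 if the cookie matches, 0 otherwise. */
int check_cookie(const struct conn_tuple *t, uint32_t seg_seq,
                 uint32_t seg_ack, uint32_t secret)
{
    uint32_t client_isn = seg_seq - 1;
    uint32_t cookie = seg_ack - 1;

    return cookie == make_cookie(t, client_isn, secret);
}
```

Because everything needed to recompute the cookie travels in the ACK segment itself, the server keeps no per-connection memory between sending the SYN+ACK and receiving the ACK, which is what lets it survive a full request queue.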
SYN cookies are enabled by running the following command in the startup environment:
echo 1 > /proc/sys/net/ipv4/tcp_syncookies
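First handshake: receiving the SYN segment
Every segment received for a transmission control block is handled by tcp_v4_do_rcv(), which then dispatches to a different function depending on the socket state.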
    /* The socket must have it's spinlock held when we get
     * here.
     *
     * We have a potential double-lock case here, so even when
     * doing backlog processing we use the BH locking scheme.
     * This is because we cannot sleep with the original spinlock
     * held.
     */
    int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
    {
        struct sock *rsk;
    #ifdef CONFIG_TCP_MD5SIG
        /*
         * We really want to reject the packet as early as possible
         * if:
         *  o We're expecting an MD5'd packet and this is no MD5 tcp option
         *  o There is an MD5 option and we're not expecting one
         */
        if (tcp_v4_inbound_md5_hash(sk, skb))
            goto discard;
    #endif

        if (sk->sk_state == TCP_ESTABLISHED) { /* Fast path */
            TCP_CHECK_TIMER(sk);
            if (tcp_rcv_established(sk, skb, tcp_hdr(skb), skb->len)) {
                rsk = sk;
                goto reset;
            }
            TCP_CHECK_TIMER(sk);
            return 0;
        }

        if (skb->len < tcp_hdrlen(skb) || tcp_checksum_complete(skb))
            goto csum_err;

        if (sk->sk_state == TCP_LISTEN) {
            struct sock *nsk = tcp_v4_hnd_req(sk, skb);
            if (!nsk)
                goto discard;

            if (nsk != sk) {
                if (tcp_child_process(sk, nsk, skb)) {
                    rsk = nsk;
                    goto reset;
                }
                return 0;
            }
        }

        TCP_CHECK_TIMER(sk);
        if (tcp_rcv_state_process(sk, skb, tcp_hdr(skb), skb->len)) {
            rsk = sk;
            goto reset;
        }
        TCP_CHECK_TIMER(sk);
        return 0;

    reset:
        tcp_v4_send_reset(rsk, skb);
    discard:
        kfree_skb(skb);
        /* Be careful here. If this function gets more complicated and
         * gcc suffers from register pressure on the x86, sk (in %ebx)
         * might be destroyed here. This current version compiles correctly,
         * but you have been warned.
         */
        return 0;

    csum_err:
        TCP_INC_STATS_BH(sock_net(sk), TCP_MIB_INERRS);
        goto discard;
    }

Second handshake: sending the SYN+ACK segment
On the server side, tcp_v4_send_synack() builds the SYN+ACK segment that answers the client's SYN, wraps it in an IP datagram, and sends it to the client.
    /*
     * Send a SYN-ACK after having received a SYN.
     * This still operates on a request_sock only, not on a big
     * socket.
     */
    static int __tcp_v4_send_synack(struct sock *sk, struct request_sock *req,
                                    struct dst_entry *dst)
    {
        const struct inet_request_sock *ireq = inet_rsk(req);
        int err = -1;
        struct sk_buff * skb;

        /* First, grab a route. */
        if (!dst && (dst = inet_csk_route_req(sk, req)) == NULL)
            return -1;

        skb = tcp_make_synack(sk, dst, req);

        if (skb) {
            struct tcphdr *th = tcp_hdr(skb);

            th->check = tcp_v4_check(skb->len,
                                     ireq->loc_addr,
                                     ireq->rmt_addr,
                                     csum_partial(th, skb->len,
                                                  skb->csum));

            err = ip_build_and_send_pkt(skb, sk, ireq->loc_addr,
                                        ireq->rmt_addr,
                                        ireq->opt);
            err = net_xmit_eval(err);
        }

        dst_release(dst);
        return err;
    }

    static int tcp_v4_send_synack(struct sock *sk, struct request_sock *req)
    {
        return __tcp_v4_send_synack(sk, req, NULL);
    }

Third handshake: receiving the ACK segment
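After receiving the SYN segment, the server creates a connection request block for the connection being established, sends a SYN+ACK segment to the client in reply, then starts the connection-establishment timer and waits for the ACK segment that completes the client's final handshake step.
Implementation of the connect system call
inet_stream_connect() is the socket-layer implementation of the connect system call. It first validates the address family and then the socket state. When the socket state is SS_UNCONNECTED, it calls the transport-layer interface, which for TCP is tcp_v4_connect(). Finally, it waits for the connection to complete or fail.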
    /*
     * Connect to a remote host. There is regrettably still a little
     * TCP 'magic' in here.
     */
    int inet_stream_connect(struct socket *sock, struct sockaddr *uaddr,
                            int addr_len, int flags)
    {
        struct sock *sk = sock->sk;
        int err;
        long timeo;

        lock_sock(sk);

        if (uaddr->sa_family == AF_UNSPEC) {
            err = sk->sk_prot->disconnect(sk, flags);
            sock->state = err ? SS_DISCONNECTING : SS_UNCONNECTED;
            goto out;
        }

        switch (sock->state) {
        default:
            err = -EINVAL;
            goto out;
        case SS_CONNECTED:
            err = -EISCONN;
            goto out;
        case SS_CONNECTING:
            err = -EALREADY;
            /* Fall out of switch with err, set for this state */
            break;
        case SS_UNCONNECTED:
            err = -EISCONN;
            if (sk->sk_state != TCP_CLOSE)
                goto out;

            err = sk->sk_prot->connect(sk, uaddr, addr_len);
            if (err < 0)
                goto out;

            sock->state = SS_CONNECTING;

            /* Just entered SS_CONNECTING state; the only
             * difference is that return value in non-blocking
             * case is EINPROGRESS, rather than EALREADY.
             */
            err = -EINPROGRESS;
            break;
        }

        timeo = sock_sndtimeo(sk, flags & O_NONBLOCK);

        if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) {
            /* Error code is set above */
            if (!timeo || !inet_wait_for_connect(sk, timeo))
                goto out;

            err = sock_intr_errno(timeo);
            if (signal_pending(current))
                goto out;
        }

        /* Connection was closed by RST, timeout, ICMP error
         * or another process disconnected us.
         */
        if (sk->sk_state == TCP_CLOSE)
            goto sock_error;

        /* sk->sk_err may be not zero now, if RECVERR was ordered by user
         * and error was received after socket entered established state.
         * Hence, it is handled normally after connect() return successfully.
         */

        sock->state = SS_CONNECTED;
        err = 0;
    out:
        release_sock(sk);
        return err;

    sock_error:
        err = sock_error(sk) ? : -ECONNABORTED;
        sock->state = SS_UNCONNECTED;
        if (sk->sk_prot->disconnect(sk, flags))
            sock->state = SS_DISCONNECTING;
        goto out;
    }

Establishing a connection takes a three-way handshake, but the call into the transport-layer interface only sends the SYN segment. The remaining two handshake steps are completed by the protocol stack.
Once the SYN segment has been sent successfully, all that remains is to wait for the third handshake step to finish.
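From user space, this split between "SYN sent" and "handshake finished" is visible when connect() is used on a non-blocking socket: the call returns right after the SYN is queued with errno set to EINPROGRESS, and the caller waits for the stack to finish the handshake. The following is a minimal sketch using standard POSIX calls; the helper name `connect_with_timeout` and its parameters are hypothetical and chosen only for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>   /* struct sockaddr_in, for callers */
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Start a non-blocking connect on fd and wait up to timeout_ms for the
 * handshake to finish. Returns 0 once the connection is established,
 * or -1 with errno set on failure or timeout. */
int connect_with_timeout(int fd, const struct sockaddr *addr,
                         socklen_t len, int timeout_ms)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    if (connect(fd, addr, len) == 0)
        return 0;                 /* handshake already completed */
    if (errno != EINPROGRESS)
        return -1;                /* immediate failure */

    /* The SYN is on its way; the socket becomes writable when the
     * handshake succeeds or fails. */
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int n = poll(&pfd, 1, timeout_ms);

    if (n == 0) {
        errno = ETIMEDOUT;
        return -1;
    }
    if (n < 0)
        return -1;

    /* SO_ERROR reports the outcome of the asynchronous connect. */
    int so_err = 0;
    socklen_t so_len = sizeof(so_err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_err, &so_len) < 0)
        return -1;
    if (so_err != 0) {
        errno = so_err;
        return -1;
    }
    return 0;
}
```

A caller would create a socket with socket(AF_INET, SOCK_STREAM, 0), fill in a struct sockaddr_in for the peer, and pass both to this helper. Note that the helper leaves the socket in non-blocking mode.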
Active Open
First handshake: sending the SYN segment
tcp_v4_connect() initializes the client's transmission control block and sends the SYN segment.
    /* This will initiate an outgoing connection. */
    int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
    {
        struct inet_sock *inet = inet_sk(sk);
        struct tcp_sock *tp = tcp_sk(sk);
        struct sockaddr_in *usin = (struct sockaddr_in *)uaddr;
        struct rtable *rt;
        __be32 daddr, nexthop;
        int tmp;
        int err;

        if (addr_len < sizeof(struct sockaddr_in))
            return -EINVAL;

        if (usin->sin_family != AF_INET)
            return -EAFNOSUPPORT;

        nexthop = daddr = usin->sin_addr.s_addr;
        if (inet->opt && inet->opt->srr) {
            if (!daddr)
                return -EINVAL;
            nexthop = inet->opt->faddr;
        }

        tmp = ip_route_connect(&rt, nexthop, inet->saddr,
                               RT_CONN_FLAGS(sk), sk->sk_bound_dev_if,
                               IPPROTO_TCP,
                               inet->sport, usin->sin_port, sk, 1);
        if (tmp < 0) {
            if (tmp == -ENETUNREACH)
                IP_INC_STATS_BH(sock_net(sk), IPSTATS_MIB_OUTNOROUTES);
            return tmp;
        }

        if (rt->rt_flags & (RTCF_MULTICAST | RTCF_BROADCAST)) {
            ip_rt_put(rt);
            return -ENETUNREACH;
        }

        if (!inet->opt || !inet->opt->srr)
            daddr = rt->rt_dst;

        if (!inet->saddr)
            inet->saddr = rt->rt_src;
        inet->rcv_saddr = inet->saddr;

        if (tp->rx_opt.ts_recent_stamp && inet->daddr != daddr) {
            /* Reset inherited state */
            tp->rx_opt.ts_recent = 0;
            tp->rx_opt.ts_recent_stamp = 0;
            tp->write_seq = 0;
        }

        if (tcp_death_row.sysctl_tw_recycle &&
            !tp->rx_opt.ts_recent_stamp && rt->rt_dst == daddr) {
            struct inet_peer *peer = rt_get_peer(rt);
            /*
             * VJ's idea. We save last timestamp seen from
             * the destination in peer table, when entering state
             * TIME-WAIT * and initialize rx_opt.ts_recent from it,
             * when trying new connection.
             */
            if (peer != NULL &&
                peer->tcp_ts_stamp + TCP_PAWS_MSL >= get_seconds()) {
                tp->rx_opt.ts_recent_stamp = peer->tcp_ts_stamp;
                tp->rx_opt.ts_recent = peer->tcp_ts;
            }
        }

        inet->dport = usin->sin_port;
        inet->daddr = daddr;

        inet_csk(sk)->icsk_ext_hdr_len = 0;
        if (inet->opt)
            inet_csk(sk)->icsk_ext_hdr_len = inet->opt->optlen;

        tp->rx_opt.mss_clamp = 536;

        /* Socket identity is still unknown (sport may be zero).
         * However we set state to SYN-SENT and not releasing socket
         * lock select source port, enter ourselves into the hash tables and
         * complete initialization after this.
         */
        tcp_set_state(sk, TCP_SYN_SENT);
        err = inet_hash_connect(&tcp_death_row, sk);
        if (err)
            goto failure;

        err = ip_route_newports(&rt, IPPROTO_TCP,
                                inet->sport, inet->dport, sk);
        if (err)
            goto failure;

        /* OK, now commit destination to socket. */
        sk->sk_gso_type = SKB_GSO_TCPV4;
        sk_setup_caps(sk, &rt->u.dst);

        if (!tp->write_seq)
            tp->write_seq = secure_tcp_sequence_number(inet->saddr,
                                                       inet->daddr,
                                                       inet->sport,
                                                       usin->sin_port);

        inet->id = tp->write_seq ^ jiffies;

        err = tcp_connect(sk);
        rt = NULL;
        if (err)
            goto failure;

        return 0;

    failure:
        /*
         * This unhashes the socket and releases the local port,
         * if necessary.
         */
        tcp_set_state(sk, TCP_CLOSE);
        ip_rt_put(rt);
        sk->sk_route_caps = 0;
        inet->dport = 0;
        return err;
    }

Second handshake: receiving the SYN+ACK segment
Segments arriving for a transmission control block in the SYN_SENT state are handled by tcp_rcv_state_process().
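To make the SYN_SENT step concrete, here is a simplified sketch of the checks that RFC 793 specifies for a segment arriving in the SYN-SENT state. It is not the kernel's tcp_rcv_state_process(); every type and function name (`toy_tcb`, `toy_seg`, `toy_syn_sent_rcv`, and so on) is hypothetical, and options, windows, and timers are left out.

```c
#include <stdint.h>

enum toy_state { TOY_SYN_SENT, TOY_SYN_RECV, TOY_ESTABLISHED, TOY_CLOSED };
enum toy_action { TOY_DROP, TOY_SEND_ACK, TOY_SEND_SYNACK, TOY_SEND_RST };

/* Minimal stand-in for a transmission control block. */
struct toy_tcb {
    enum toy_state state;
    uint32_t iss;      /* our initial send sequence number */
    uint32_t snd_una;  /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;  /* next sequence number to send (iss + 1 after SYN) */
    uint32_t rcv_nxt;  /* next sequence number expected from the peer */
};

/* Minimal stand-in for the fields of an incoming segment. */
struct toy_seg {
    uint32_t seq, ack;
    int syn, ack_flag, rst;
};

/* Signed distance a - b in 32-bit sequence space (handles wraparound). */
static int32_t seq_diff(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b);
}

/* Process one segment in SYN-SENT and report what should be sent back. */
enum toy_action toy_syn_sent_rcv(struct toy_tcb *tcb, const struct toy_seg *seg)
{
    if (seg->ack_flag) {
        /* An acceptable ACK satisfies iss < seg.ack <= snd_nxt. */
        if (seq_diff(seg->ack, tcb->iss) <= 0 ||
            seq_diff(seg->ack, tcb->snd_nxt) > 0)
            return seg->rst ? TOY_DROP : TOY_SEND_RST;
        if (seg->rst) {
            tcb->state = TOY_CLOSED;   /* connection reset by peer */
            return TOY_DROP;
        }
    } else if (seg->rst) {
        return TOY_DROP;               /* RST without ACK is ignored */
    }

    if (!seg->syn)
        return TOY_DROP;

    tcb->rcv_nxt = seg->seq + 1;       /* the peer's SYN consumes one number */
    if (seg->ack_flag) {
        tcb->snd_una = seg->ack;       /* our SYN has been acknowledged */
        tcb->state = TOY_ESTABLISHED;
        return TOY_SEND_ACK;           /* third handshake step */
    }

    tcb->state = TOY_SYN_RECV;         /* simultaneous open */
    return TOY_SEND_SYNACK;
}
```

In the common case, a SYN+ACK that acknowledges iss + 1 moves the block to the established state and triggers the final ACK of the handshake.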
tcp_send_ack() sends an ACK segment and updates the window at the same time.
    /* This routine sends an ack and also updates the window. */
    void tcp_send_ack(struct sock *sk)
    {
        struct sk_buff *buff;

        /* If we have been reset, we may not send again. */
        if (sk->sk_state == TCP_CLOSE)
            return;

        /* We are not putting this on the write queue, so
         * tcp_transmit_skb() will set the ownership to this
         * sock.
         */
        buff = alloc_skb(MAX_TCP_HEADER, GFP_ATOMIC);
        if (buff == NULL) {
            inet_csk_schedule_ack(sk);
            inet_csk(sk)->icsk_ack.ato = TCP_ATO_MIN;
            inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
                                      TCP_DELACK_MAX, TCP_RTO_MAX);
            return;
        }

        /* Reserve space for headers and prepare control bits. */
        skb_reserve(buff, MAX_TCP_HEADER);
        tcp_init_nondata_skb(buff, tcp_acceptable_seq(sk), TCPCB_FLAG_ACK);

        /* Send it off, this clears delayed acks for us. */
        TCP_SKB_CB(buff)->when = tcp_time_stamp;
        tcp_transmit_skb(sk, buff, 0, GFP_ATOMIC);
    }

An ACK segment can only be sent when TCP is not in the CLOSE state.
The function allocates an SKB for the ACK segment. If the allocation fails, it starts the delayed-ACK timer and returns.
This article is reposted from mfrbuaa's blog on cnblogs. Original link: http://www.cnblogs.com/mfrbuaa/p/5126446.html. To republish it, please contact the original author.