[Network Protocols] TCP-Nagle: Re-explaining the Nagle Algorithm Through the Code


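My first task of the new year was porting patches onto the latest kernel.

What I did not expect was that the Nagle algorithm had gone into last year's trash along with everything else I had forgotten.

I found some material online and digested the theory quickly, but when I turned to the kernel implementation I was stuck for a long time. After a full day of sitting with it I finally worked something out, and felt that TCP-Nagle needed an explanation written from the code. Read alongside the usual black-box descriptions of the Nagle algorithm, it gives a complete picture of TCP-Nagle.

Hence this article.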

The Nagle algorithm

The standard description of the Nagle algorithm (originally specified in RFC 896)

Nagle’s algorithm, named after John Nagle, is a means of improving the efficiency of TCP/IP networks
by reducing the number of packets that need to be sent over the network.
Nagle’s algorithm works by combining a number of small outgoing messages, and sending them
all at once. Specifically, as long as there is a sent packet for which the sender has received no
acknowledgment, the sender should keep buffering its output until it has a full packet’s worth of output,
so that output can be sent all at once.
Nagle’s algorithm purposefully delays transmission, increasing bandwidth efficiency at the expense of latency.

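In plain terms: TCP-Nagle avoids sending a large number of small packets (it holds small packets back and coalesces them into larger ones), which reduces the number of transmissions at the cost of added latency.

As for why it works this way, an expert's article already covers that in detail. The link is posted at the end, and you are welcome to read it.

The logic of TCP-Nagle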

if there is new data to send
    if the window size >= MSS and available data is >= MSS
        send complete MSS segment now
    else
        if there is unconfirmed data still in the pipe
            enqueue data in the buffer until an acknowledge is received
        else
            send data immediately
        end if
    end if
end if
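
The pseudocode above can be written as a small decision function. This is a user-space sketch, not kernel code: the names nagle_action and nagle_decide and all parameters are hypothetical, chosen only to mirror the pseudocode line by line.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type for this sketch. */
enum nagle_action {
    NAGLE_IDLE,       /* no new data to send */
    NAGLE_SEND_FULL,  /* send a complete MSS-sized segment now */
    NAGLE_ENQUEUE,    /* buffer the data until an ACK is received */
    NAGLE_SEND_NOW    /* send the (small) data immediately */
};

/*
 * pending - bytes waiting to be sent
 * window  - usable send window in bytes
 * mss     - maximum segment size in bytes
 * unacked - true if previously sent data is still unacknowledged
 */
enum nagle_action nagle_decide(size_t pending, size_t window,
                               size_t mss, bool unacked)
{
    if (pending == 0)
        return NAGLE_IDLE;
    if (window >= mss && pending >= mss)
        return NAGLE_SEND_FULL;
    if (unacked)
        return NAGLE_ENQUEUE;
    return NAGLE_SEND_NOW;
}
```

For example, 100 pending bytes with an MSS of 1460 are enqueued while earlier data is unacknowledged, and sent at once when nothing is outstanding.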

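TCP-Nagle diagram

[Figure: TCP-Nagle diagram (image not available)]

The diagram is based entirely on the latest kernel implementation.

As it shows, the implementation is short and compact. Let's go through it in detail.

TCP-Nagle: the code version

Explicit TCP-Nagle flags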

/* Flags in tp->nonagle */
#define TCP_NAGLE_OFF		1	/* Nagle's algo is disabled */
#define TCP_NAGLE_CORK		2	/* Socket is corked	    */
#define TCP_NAGLE_PUSH		4	/* Cork is overridden for already queued data */

When none of these three flags is set, nonagle = 0.
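
Applications do not set these flags directly. As far as I understand it, they are driven from user space by the TCP_NODELAY and TCP_CORK socket options (TCP_CORK is Linux-specific). The sketch below only shows how those two options are toggled with setsockopt; the helper names set_nodelay, set_cork and get_nodelay are hypothetical, and the mapping onto TCP_NAGLE_OFF and TCP_NAGLE_CORK noted in the comments is an assumption based on the flag comments above.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Toggle TCP_NODELAY (assumed to correspond to TCP_NAGLE_OFF).
 * Returns 0 on success, -1 on error. */
int set_nodelay(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Toggle TCP_CORK (assumed to correspond to TCP_NAGLE_CORK). Linux only.
 * Returns 0 on success, -1 on error. */
int set_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Read TCP_NODELAY back: 1 if set, 0 if clear, -1 on error. */
int get_nodelay(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

A socket that has neither option set is the nonagle = 0 case, where the default Nagle behaviour applies.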

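Explaining TCP-Nagle through the code

TCP-Nagle applies only to new segments, which is why it is implemented only in tcp_write_xmit.
Before calling tcp_transmit_skb, TCP first decides whether this is the right moment to send the segment, running a series of send checks. TCP-Nagle is one of the factors in that decision, and its purpose is to reduce the number of small packets sent.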

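if (tso_segs == 1) {
	/* Single segment: apply the Nagle test. Only the last skb in the
	 * write queue keeps the socket's nonagle value; any other skb is
	 * treated as TCP_NAGLE_PUSH.
	 */
	if (unlikely(!tcp_nagle_test(tp, skb, mss_now,
				     (tcp_skb_is_last(sk, skb) ?
				      nonagle : TCP_NAGLE_PUSH))))
		break;
} else {
	/* Multiple segments (TSO/GSO): decide whether to defer instead. */
	if (!push_one &&
	    tcp_tso_should_defer(sk, skb, &is_cwnd_limited,
				 &is_rwnd_limited, max_segs))
		break;
}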

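In fact, both tcp_nagle_test and tcp_tso_should_defer play the TCP-Nagle role here. tcp_tso_should_defer covers the case where GSO/TSO is involved; apart from more intricate details, its reasoning is the same as that of tcp_nagle_test, so only tcp_nagle_test is covered here.

One detail deserves attention: tcp_skb_is_last(sk, skb) ? nonagle : TCP_NAGLE_PUSH guarantees that Nagle affects only the last skb in the send queue sk->sk_write_queue, because only the last skb can still have new data coalesced into it. The comment inside tcp_nagle_test says the same thing.

Next, let's look at the implementation of tcp_nagle_test, the main battlefield of TCP-Nagle.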

/* Return true if the Nagle test allows this packet to be
 * sent now.
 */
/* In other words: true means Nagle does not hold the packet back;
 * only false makes Nagle hold it for coalescing.
 */
static inline bool tcp_nagle_test(const struct tcp_sock *tp, const struct sk_buff *skb, unsigned int cur_mss, int nonagle)
{
	/* Nagle rule does not apply to frames, which sit in the middle of the
	 * write_queue (they have no chances to get new data).
	 *
	 * This is implemented in the callers, where they modify the 'nonagle'
	 * argument based upon the location of SKB in the send queue.
	 */
	if (nonagle & TCP_NAGLE_PUSH)
		return true;

	/* Don't use the nagle rule for urgent data (or for the final FIN). */
	if (tcp_urg_mode(tp) || (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN))
		return true;

	if (!tcp_nagle_check(skb->len < cur_mss, tp, nonagle))
		return true;

	return false;
}

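The implementation of tcp_nagle_test shows that TCP-Nagle is subject to many conditions.

The first conditions are straightforward: TCP_NAGLE_PUSH, tcp_urg_mode(tp), and (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN) all mean the packet must be sent immediately.

Several common misreadings of TCP-Nagle arise in tcp_nagle_check(skb->len < cur_mss, tp, nonagle).

So tcp_nagle_check is worth a look.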
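
The early exits of tcp_nagle_test, together with the caller's last-skb trick, can be modelled in user space as follows. This is a simplified sketch under stated assumptions: model_nagle_test and its boolean parameters are hypothetical stand-ins for the kernel state, and check_holds represents whatever tcp_nagle_check would return.

```c
#include <stdbool.h>

/* Same values as the kernel flags quoted earlier. */
#define TCP_NAGLE_OFF  1
#define TCP_NAGLE_CORK 2
#define TCP_NAGLE_PUSH 4

/*
 * Returns true if the segment may be sent now, false if Nagle holds it.
 *   is_last       - segment is the last one in the write queue
 *   nonagle       - the socket's nonagle flags
 *   urgent_or_fin - urgent mode is active or the segment carries FIN
 *   check_holds   - stand-in for the result of tcp_nagle_check()
 */
bool model_nagle_test(bool is_last, int nonagle,
                      bool urgent_or_fin, bool check_holds)
{
    /* Caller side: only the last segment keeps the socket's flags. */
    int effective = is_last ? nonagle : TCP_NAGLE_PUSH;

    if (effective & TCP_NAGLE_PUSH)
        return true;            /* pushed, or not the last segment */
    if (urgent_or_fin)
        return true;            /* urgent data and FIN bypass Nagle */
    return !check_holds;        /* otherwise the Nagle check decides */
}
```

With nonagle = 0, a segment in the middle of the queue always passes, while the last segment is held only when it is neither urgent nor a FIN and the check says hold.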

/* Return false, if packet can be sent now without violation Nagle's rules:
 * 1. It is full sized. (provided by caller in %partial bool)
 * 2. Or it contains FIN. (already checked by caller)
 * 3. Or TCP_CORK is not set, and TCP_NODELAY is set.
 * 4. Or TCP_CORK is not set, and all sent packets are ACKed.
 *    With Minshall's modification: all sent small packets are ACKed.
 */
static bool tcp_nagle_check(bool partial, const struct tcp_sock *tp,
			    int nonagle)
{
	return partial &&
		((nonagle & TCP_NAGLE_CORK) ||
		 (!nonagle && tp->packets_out && tcp_minshall_check(tp)));
}

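Yet another pile of conditions. Care is needed here, so let's go through them closely.

The first parameter of tcp_nagle_check is bool partial, and the argument passed in above is skb->len < cur_mss. So the first condition for TCP-Nagle to take effect is that the total data length of the skb is less than the MSS. Once the skb holds at least an MSS worth of data, Nagle no longer holds it back and the packet can be sent right away.

Strictly speaking, then, TCP-Nagle is not a pure stop-and-wait protocol: full-sized segments are never held back by it.

Now consider the remaining conditions that enable TCP-Nagle when the skb's data length is less than the MSS.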

((nonagle & TCP_NAGLE_CORK) ||
 (!nonagle && tp->packets_out && tcp_minshall_check(tp)))

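The first condition is (nonagle & TCP_NAGLE_CORK): the TCP_NAGLE_CORK flag explicitly forces TCP-Nagle on.

Without TCP_NAGLE_CORK, the second condition can also enable TCP-Nagle. (!nonagle && tp->packets_out && tcp_minshall_check(tp)) means:

1. !nonagle: nonagle is not marked TCP_NAGLE_OFF. TCP_NAGLE_PUSH and TCP_NAGLE_CORK have already been ruled out, so the only explicit flag left is TCP_NAGLE_OFF; in every other case nonagle is 0.
2. tp->packets_out: the number of packets in the network, that is, sent but not yet acknowledged. This is the strong condition. When tp->packets_out is 0, every packet in the network has been acknowledged and the pending packet must be sent immediately. When tp->packets_out > 0, packets are still in the network, which is a necessary condition for TCP-Nagle to hold the packet.
3. tcp_minshall_check(tp): a small packet is still in the network. This is the weak condition. As long as no small packet is in the network, the pending packet is sent immediately, whether or not other packets are still unacknowledged. When a small packet is in the network, TCP-Nagle is enabled.

If a small packet is in the network, there must be packets in the network. If no small packet is in the network, the pending data is sent immediately whether or not other packets are outstanding. So condition 2 is the strong form (the stricter rule from item 4 of the code comment: all sent packets are ACKed) and condition 3 is the weak form (Minshall's modification: all sent small packets are ACKed).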
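
The three conditions can be exercised with a small user-space model. This is a sketch under stated assumptions, not the kernel source: struct model_tp and the model_* functions are hypothetical, and the body of model_minshall_check (comparing snd_sml, the end of the last small packet sent, against snd_una and snd_nxt) reflects my reading of the kernel rather than anything shown in the excerpts above.

```c
#include <stdbool.h>
#include <stdint.h>

#define TCP_NAGLE_OFF  1
#define TCP_NAGLE_CORK 2
#define TCP_NAGLE_PUSH 4

/* Hypothetical, minimal stand-in for struct tcp_sock. */
struct model_tp {
    uint32_t snd_una;          /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;          /* next sequence number to be sent */
    uint32_t snd_sml;          /* end sequence of the last small packet sent */
    unsigned int packets_out;  /* packets sent but not yet acknowledged */
};

/* True if sequence number a is after b, with 32-bit wraparound. */
static bool seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Condition 3: a small packet is still unacknowledged (assumed logic). */
bool model_minshall_check(const struct model_tp *tp)
{
    return seq_after(tp->snd_sml, tp->snd_una) &&
           !seq_after(tp->snd_sml, tp->snd_nxt);
}

/* Returns true if Nagle should hold the packet. */
bool model_nagle_check(bool partial, const struct model_tp *tp, int nonagle)
{
    return partial &&
           ((nonagle & TCP_NAGLE_CORK) ||
            (!nonagle && tp->packets_out && model_minshall_check(tp)));
}
```

In this model, a small packet is released while only full-sized packets are outstanding (packets_out > 0 but snd_sml is not ahead of snd_una), and held once a small packet is in flight. That is the difference between condition 2 and condition 3.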

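Most blogs say that delayed ACK combined with the Nagle algorithm increases latency, and that is entirely correct. But a black-box description can mislead readers into thinking that a single delayed ACK is what makes TCP-Nagle hold a packet back.

In fact, conditions 2 and 3 already explain it: once every outstanding segment has been acknowledged, or once no small packet remains in the network, the packet is sent immediately.

You can think of it this way: when conditions 2 and 3 are evaluated, it may take several delayed ACKs before TCP-Nagle stops holding the packet. A single delayed ACK already delays the data; when the network capacity is large, multiple delayed ACKs only make the lag worse, and TCP-Nagle ends up being of little practical value. (TCP-Nagle judges the packets in the network; it does not judge ACKs directly.)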
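
The simplest case of this interaction can be put into numbers. The sketch below is a deliberately simplified model with assumptions of my own: an application writes two small messages in a row, the receiver has no data to piggyback an ACK on and waits its full delayed-ACK timer, and nothing is lost. The function name and parameters are hypothetical.

```c
/*
 * Time in ms, measured from the first write, at which the second small
 * write leaves the sender.
 *   gap_ms    - delay between the first and second write
 *   rtt_ms    - round-trip time
 *   delack_ms - receiver's delayed-ACK timer
 *   nagle_on  - nonzero if Nagle is active on the sender
 */
unsigned int second_small_write_departure_ms(unsigned int gap_ms,
                                             unsigned int rtt_ms,
                                             unsigned int delack_ms,
                                             int nagle_on)
{
    /* The ACK for the first small packet arrives after one round trip
     * plus the time the receiver sat on it. */
    unsigned int ack_arrival_ms = rtt_ms + delack_ms;

    if (!nagle_on || gap_ms >= ack_arrival_ms)
        return gap_ms;          /* nothing small in flight: sent when written */
    return ack_arrival_ms;      /* held until the small packet is ACKed */
}
```

With a 10 ms round trip and a 40 ms delayed-ACK timer (illustrative numbers), a second write issued 1 ms after the first leaves at 50 ms with Nagle on and at 1 ms with it off.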