Network Traffic Control: Network Traffic Monitoring Tools (Part 2)


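A network is like a highway system: sometimes congested, sometimes free-flowing. In networking terms this shows up as RTT jitter. How RTT and RTO are calculated directly affects retransmission, and therefore communication efficiency, and the calculation itself has been refined through a long process of theory and practice.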
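Initially, SRTT (smoothed RTT, a running estimate of the round-trip time) and RTO were computed as shown below. This is the scheme from the original TCP specification (RFC 793); Phil Karn and Craig Partridge later refined it by excluding retransmitted segments from RTT sampling: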
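    SRTT = (α * SRTT) + ((1 - α) * RTT)
    RTO = min[UBOUND, max[LBOUND, (β * SRTT)]]

Here α is 0.8 to 0.9 and β is 1.3 to 2.0; UBOUND and LBOUND are the upper and lower bounds on the timeout. Jacobson and Karels later revised this scheme, and the result was eventually standardized as RFC 6298 (https://datatracker.ietf.org/doc/html/rfc6298). Its formulas for SRTT, DevRTT (the RTT deviation), and RTO are: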
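    SRTT = SRTT + α * (RTT - SRTT)
    DevRTT = (1 - β) * DevRTT + β * |RTT - SRTT|
    RTO = μ * SRTT + δ * DevRTT

Here α = 0.125, β = 0.25, μ = 1, and δ = 4. The |RTT - SRTT| term uses the SRTT value from before the current update. In the Linux 4.x kernel, SRTT is computed by the following code: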
    /* Called to compute a smoothed rtt estimate. The data fed to this
     * routine either comes from timestamps, or from segments that were
     * known _not_ to have been retransmitted [see Karn/Partridge
     * Proceedings SIGCOMM 87]. The algorithm is from the SIGCOMM 88
     * piece by Van Jacobson.
     * NOTE: the next three routines used to be one big routine.
     * To save cycles in the RFC 1323 implementation it was better to break
     * it up into three procedures. -- erics
     */
    static void tcp_rtt_estimator(struct sock *sk, long mrtt_us)
    {
        struct tcp_sock *tp = tcp_sk(sk);
        long m = mrtt_us; /* RTT */
        u32 srtt = tp->srtt_us;

        /* The following amusing code comes from Jacobson's
         * article in SIGCOMM '88. Note that rtt and mdev
         * are scaled versions of rtt and mean deviation.
         * This is designed to be as fast as possible
         * m stands for "measurement".
         *
         * On a 1990 paper the rto value is changed to:
         * RTO = rtt + 4 * mdev
         *
         * Funny. This algorithm seems to be very broken.
         * These formulae increase RTO, when it should be decreased, increase
         * too slowly, when it should be increased quickly, decrease too quickly
         * etc. I guess in BSD RTO takes ONE value, so that it is absolutely
         * does not matter how to _calculate_ it. Seems, it was trap
         * that VJ failed to avoid. 8)
         */
        if (srtt != 0) {
            m -= (srtt >> 3);   /* m is now error in rtt est */
            srtt += m;          /* rtt = 7/8 rtt + 1/8 new */
            if (m < 0) {
                m = -m;         /* m is now abs(error) */
                m -= (tp->mdev_us >> 2);   /* similar update on mdev */
                /* This is similar to one of Eifel findings.
                 * Eifel blocks mdev updates when rtt decreases.
                 * This solution is a bit different: we use finer gain
                 * for mdev in this case (alpha*beta).
                 * Like Eifel it also prevents growth of rto,
                 * but also it limits too fast rto decreases,
                 * happening in pure Eifel.
                 */
                if (m > 0)
                    m >>= 3;
            } else {
                m -= (tp->mdev_us >> 2);   /* similar update on mdev */
            }
            tp->mdev_us += m;   /* mdev = 3/4 mdev + 1/4 new */
            if (tp->mdev_us > tp->mdev_max_us) {
                tp->mdev_max_us = tp->mdev_us;
                if (tp->mdev_max_us > tp->rttvar_us)
                    tp->rttvar_us = tp->mdev_max_us;
            }
            if (after(tp->snd_una, tp->rtt_seq)) {
                if (tp->mdev_max_us < tp->rttvar_us)
                    tp->rttvar_us -= (tp->rttvar_us - tp->mdev_max_us) >> 2;
                tp->rtt_seq = tp->snd_nxt;
                tp->mdev_max_us = tcp_rto_min_us(sk);
            }
        } else {
            /* no previous measure. */
            srtt = m << 3;          /* take the measured time to be rtt */
            tp->mdev_us = m << 1;   /* make sure rto = 3*rtt */
            tp->rttvar_us = max(tp->mdev_us, tcp_rto_min_us(sk));
            tp->mdev_max_us = tp->rttvar_us;
            tp->rtt_seq = tp->snd_nxt;
        }
        tp->srtt_us = max(1U, srtt);
    }

In the Linux 4.x kernel, RTO is computed by the following code:
    /* Calculate rto without backoff. This is the second half of Van Jacobson's
     * routine referred to above.
     */
    void tcp_set_rto(struct sock *sk)
    {
        const struct tcp_sock *tp = tcp_sk(sk);
        /* Old crap is replaced with new one. 8)
         *
         * More seriously:
         * 1. If rtt variance happened to be less 50msec, it is hallucination.
         *    It cannot be less due to utterly erratic ACK generation made
         *    at least by solaris and freebsd. "Erratic ACKs" has _nothing_
         *    to do with delayed acks, because at cwnd>2 true delack timeout
         *    is invisible. Actually, Linux-2.4 also generates erratic
         *    ACKs in some circumstances.
         */
        inet_csk(sk)->icsk_rto = __tcp_set_rto(tp);

        /* 2. Fixups made earlier cannot be right.
         *    If we do not estimate RTO correctly without them,
         *    all the algo is pure shit and should be replaced
         *    with correct one. It is exactly, which we pretend to do.
         */

        /* NOTE: clamping at TCP_RTO_MIN is not required, current algo
         * guarantees that rto is higher.
         */
        tcp_bound_rto(sk);
    }

    static inline u32 __tcp_set_rto(const struct tcp_sock *tp)
    {
        return usecs_to_jiffies((tp->srtt_us >> 3) + tp->rttvar_us);
    }
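
The shift-based arithmetic in tcp_rtt_estimator is easier to follow in isolation. The following is a minimal user-space sketch, not the kernel's code: it keeps only the core update, storing SRTT scaled by 8 and the mean deviation scaled by 4, and leaves out the Eifel-style damping, the mdev_max/rttvar window tracking, the RTO minimum clamp, and the jiffies conversion. The struct rtt_state and the functions rtt_update and rto_us are hypothetical names chosen for illustration.

```c
/* Hypothetical user-space state; this only mirrors the kernel's scaling idea. */
struct rtt_state {
    long srtt8;   /* 8 * SRTT, in microseconds (0 means "no sample yet") */
    long mdev4;   /* 4 * DevRTT, in microseconds */
};

/* Feed one RTT measurement m (microseconds) into the estimator. */
static void rtt_update(struct rtt_state *s, long m)
{
    if (s->srtt8 != 0) {
        m -= s->srtt8 >> 3;   /* m = RTT - SRTT, the estimation error */
        s->srtt8 += m;        /* 8*SRTT += err, i.e. SRTT += err / 8 */
        if (m < 0)
            m = -m;           /* m = |err| */
        m -= s->mdev4 >> 2;   /* m = |err| - DevRTT */
        s->mdev4 += m;        /* 4*DevRTT += m, i.e. DevRTT += m / 4 */
    } else {
        s->srtt8 = m << 3;    /* first sample: SRTT = RTT */
        s->mdev4 = m << 1;    /* DevRTT = RTT / 2, so RTO = 3 * RTT */
    }
}

/* RTO = SRTT + 4 * DevRTT, read directly from the scaled fields. */
static long rto_us(const struct rtt_state *s)
{
    return (s->srtt8 >> 3) + s->mdev4;
}
```

Starting from a zeroed state, a first sample of 100000 us gives an RTO of 300000 us (3 * RTT). A second sample of 120000 us moves SRTT to 102500 us and the deviation to 42500 us, for an RTO of 272500 us. Because the deviation is stored already multiplied by 4, the RTO needs no multiplication at all, which is the point of the scaling.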

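For comparison, the two sets of formulas given earlier in this article can be written directly in floating point. This is a sketch under stated assumptions rather than a reference implementation: the function names classic_rto and rfc6298_rto and the struct rto_estimator are hypothetical, times are in milliseconds, the caller is assumed to seed SRTT for the classic variant, and RFC 6298's clock-granularity term and 1-second minimum RTO are omitted.

```c
/* Classic scheme: SRTT = a*SRTT + (1-a)*RTT, RTO clamped to [lbound, ubound].
 * *srtt must already hold a previous estimate (an assumption of this sketch). */
static double classic_rto(double *srtt, double rtt, double alpha, double beta,
                          double lbound, double ubound)
{
    double rto;

    *srtt = alpha * (*srtt) + (1.0 - alpha) * rtt;
    rto = beta * (*srtt);
    if (rto < lbound)
        rto = lbound;
    if (rto > ubound)
        rto = ubound;
    return rto;
}

/* Hypothetical state for the RFC 6298 style estimator. */
struct rto_estimator {
    double srtt;      /* smoothed RTT */
    double devrtt;    /* smoothed RTT deviation */
    int has_sample;   /* 0 until the first measurement arrives */
};

/* Feed one RTT sample and return RTO = mu * SRTT + delta * DevRTT. */
static double rfc6298_rto(struct rto_estimator *e, double rtt)
{
    const double alpha = 0.125, beta = 0.25, mu = 1.0, delta = 4.0;

    if (!e->has_sample) {
        e->srtt = rtt;            /* first sample: SRTT = RTT */
        e->devrtt = rtt / 2.0;    /* DevRTT = RTT / 2 */
        e->has_sample = 1;
    } else {
        double err = rtt - e->srtt;
        double abs_err = err < 0 ? -err : err;

        /* The deviation is updated first, using the old SRTT. */
        e->devrtt = (1.0 - beta) * e->devrtt + beta * abs_err;
        e->srtt = e->srtt + alpha * err;
    }
    return mu * e->srtt + delta * e->devrtt;
}
```

With samples of 100 ms and then 120 ms, rfc6298_rto returns 300 ms and then 272.5 ms. The classic variant reacts only through SRTT, so with α = 0.9 and β = 2.0 the same step from 100 ms to 120 ms yields roughly 204 ms before clamping; it has no term that grows when the RTT becomes more variable.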