Foundation Topics

The term "congestion avoidance" describes a small set of IOS tools that help queues avoid congestion. Queues fill when the cumulative offered load from the various senders of packets exceeds the line rate of the interface (or the shaping rate if shaping is enabled). When more traffic needs to exit the interface than the interface can support, queues form. Queuing tools help us manage the queues; congestion-avoidance tools help us reduce the level of congestion in the queues by selectively dropping packets.

Once again, a QoS tool gives you the opportunity to make tradeoffs between QoS characteristics; in this case, packet loss versus delay and jitter. However, the tradeoff is not so simple in this case. It turns out that by selectively discarding some packets before the queues get completely full, cumulative packet loss can be reduced, and queuing delay and queuing jitter can also be reduced! When you prune a plant, you kill some of the branches, but the plant gets healthier and more beautiful through the process. Similarly, congestion-avoidance tools discard some packets, but in doing so achieve the overall effect of a healthier network.

This chapter begins by explaining the core concepts that create the need for congestion-avoidance tools. Following this discussion, the underlying algorithms, which are based on RED, are covered. Finally, the chapter includes coverage of configuration and monitoring for two IOS congestion-avoidance tools, WRED and ECN.

Congestion-Avoidance Concepts and Random Early Detection (RED)

Congestion-avoidance tools rely on the behavior of TCP to reduce congestion. A large percentage of Internet traffic consists of TCP traffic, and TCP senders reduce the rate at which they send packets after packet loss. By purposefully discarding a percentage of packets, congestion-avoidance tools cause some TCP connections to slow down, which reduces congestion.
This section begins with a discussion of User Datagram Protocol (UDP) and TCP behavior when packets are lost. By understanding TCP behavior in particular, you can appreciate what happens as a result of tail drop, which is covered next in this section. Finally, to close this section, RED is covered. The two IOS congestion-avoidance tools both use the underlying concepts of RED.

TCP and UDP Reactions to Packet Loss

UDP and TCP behave very differently when packets are lost. UDP, by itself, does not react to packet loss, because UDP does not include any mechanism with which to know whether a packet was lost. TCP senders, however, slow down the rate at which they send after recognizing that a packet was lost. Unlike UDP, TCP includes a field in the TCP header to number each TCP segment (sequence number), and another field used by the receiver to confirm receipt of the packets (acknowledgment number). When a TCP receiver signals that a packet was not received, or if an acknowledgment is not received at all, the TCP sender assumes the packet was lost, and resends the packet. More importantly, the sender also slows down sending data into the network.

TCP uses two separate window sizes that determine the maximum window size of data that can be sent before the sender must stop and wait for an acknowledgment. The first of the two different windowing features of TCP uses the Window field in the TCP header, which is also called the receiver window or the advertised window. The receiver grants the sender the right to send x bytes of data before requiring an acknowledgment, by setting the value x into the Window field of the TCP header. The receiver grants larger and larger windows as time goes on, reaching the point at which the TCP sender never stops sending, with acknowledgments arriving just before a complete window of traffic has been sent.

The second window used by TCP is called the congestion window, or CWND, as defined by RFC 2581.
Unlike the advertised window, the congestion window is not communicated between the receiver and sender using fields in the TCP header. Instead, the TCP sender calculates CWND. CWND varies in size much more quickly than does the advertised window, because it was designed to react to congestion in networks. The TCP sender always uses the lower of the two windows to determine how much data it can send before receiving an acknowledgment. The receiver window is designed to let the receiver prevent the sender from sending data faster than the receiver can process the data. The CWND is designed to let the sender react to network congestion by slowing down its sending rate. It is the variation in the CWND, in reaction to lost packets, which RED relies upon. To appreciate how RED works, you need to understand the processes by which a TCP sender lowers and increases the CWND. CWND is lowered in response to lost segments. CWND is raised based on the logic defined as the TCP slow start and TCP congestion-avoidance algorithms. In fact, most people use the term "slow start" to describe both features together, in part because they work closely together. The process works like this:
Therefore, when a TCP sender fails to receive an acknowledgment, it reduces the CWND to a very low value (one segment size of window). This process is sometimes called slamming the window or slamming the window shut. The sender progressively increases CWND based first on slow start, and then on congestion avoidance. As you go through this text, remember that the TCP windows use a unit of bytes in reality. To make the discussion a little easier, I have listed the windows as a number of segments, which makes the actual numbers more obvious.

Slow start increases CWND by the maximum segment size for every packet for which it receives an acknowledgment. Because TCP receivers may, and typically do, acknowledge segments well before the full window has been sent by the sender, CWND grows at an exponential rate during slow start, a seemingly contradictory concept. Slow start gets its name from the fact that CWND has been set to a very low value at the beginning of the process, meaning it starts slowly, but slow start does cause CWND to grow quickly. Figure 7-1 outlines the process of how the TCP sender grows CWND upon the receipt of each acknowledgment.

Figure 7-1. Growing CWND for Each Received Acknowledgment
By increasing CWND when each acknowledgment is received, CWND actually increases at an exponential rate. So, Slow Start might be better called slow start but fast recovery. Congestion avoidance is the second mechanism that dictates how quickly CWND increases after being lowered. As CWND grows, it begins to approach the original CWND value. If the original packet loss was a result of queue congestion, letting this TCP connection increase back to the original CWND may then induce the same congestion that caused the CWND to be lowered in the first place. Congestion avoidance just reduces the rate of increase for CWND as it approaches the previous CWND value. Once slow start has increased CWND to the value of SSTHRESH, which was set to 50 percent of the original CWND, congestion-avoidance logic replaces the slow start logic for increasing CWND. Congestion avoidance uses a formula that allows CWND to grow more slowly, essentially at a linear rate. Figure 7-2 shows a graph of CWND with just slow start, and with slow start and congestion avoidance, after the sender times out waiting for an acknowledgment. Figure 7-2. Graphs of CWND with Slow Start and Congestion Avoidance
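The growth pattern described above can be sketched in Python. This is a simplified model, not TCP itself: it counts CWND in segments, advances one round trip at a time, and assumes every segment in flight is acknowledged within that round trip. The function name and the sample value of 32 segments are illustrative assumptions.

```python
def cwnd_after_timeout(previous_cwnd, round_trips):
    """Return CWND (in segments) after each round trip following a timeout.

    Simplified model: one step per round trip, all segments acknowledged.
    """
    ssthresh = max(previous_cwnd // 2, 2)   # 50 percent of the original CWND
    cwnd = 1                                # window slammed shut to one segment
    history = [cwnd]
    for _ in range(round_trips):
        if cwnd < ssthresh:
            # Slow start: one extra segment per ACK, so CWND doubles per round trip.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: roughly one extra segment per round trip.
            cwnd += 1
        history.append(cwnd)
    return history


print(cwnd_after_timeout(previous_cwnd=32, round_trips=8))
# [1, 2, 4, 8, 16, 17, 18, 19, 20]
```

The doubling phase stops at SSTHRESH (16 segments here), after which the window grows by one segment per round trip, which is the linear portion of the curve.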
Many people do not realize that the slow start process consists of a combination of the slow start algorithm and the congestion-avoidance algorithm. With slow start, CWND is lowered, but it grows quickly. With congestion avoidance, the CWND value grows more slowly as it approaches the previous CWND value. In summary, UDP and TCP react to packet loss in the following ways:
Note: Depending on the circumstances, TCP sometimes halves CWND in reaction to lost packets, and in some cases it lowers CWND to one segment size, as was described in this first section. The more severe reaction of reducing the window to one segment size was shown in this section for a more complete description of slow start and congestion avoidance. The course upon which the QoS exam is based combines all these concepts under the term Slow Start.

Tail Drop, Global Synchronization, and TCP Starvation

Tail drop occurs when a packet needs to be added to a queue, but the queue is full, so the router must discard the packet. Yes, tail drop is indeed that simple. However, tail drop results in some interesting behavior in real networks, particularly when most traffic is TCP based, but with some UDP traffic. Of course, the Internet today delivers mostly TCP traffic, because web and email traffic use TCP.

The preceding section described the behavior of a single TCP connection after a single packet loss. Now imagine an Internet router, with 100,000 or more TCP connections running their traffic out of a high-speed interface. The amount of traffic in the combined TCP connections finally exceeds the output line rate, causing the output queue on the interface to fill, which in turn causes tail drop.

What happens to those 100,000 TCP connections after many of them have at least one packet dropped? The TCP connections reduce their CWND; the congestion in the queue abates; the various CWND values increase with slow start, and then with congestion avoidance. Eventually, however, as the CWND values of the collective TCP connections approach the previous CWND value, the congestion occurs again, and the process is repeated. When a large number of TCP connections experience near simultaneous packet loss, the lowering and growth of CWND at about the same time causes the TCP connections to synchronize. The result is called global synchronization. The graph in Figure 7-3 shows this behavior.
Figure 7-3. Graph of Global Synchronization
The graph shows the results of global synchronization. The router never fully utilizes the bandwidth on the link because the offered rate keeps dropping as a result of synchronization. Note that the overall rate does not drop to almost nothing because not all TCP connections happen to have packets drop when tail drop occurs, and some traffic uses UDP, which does not slow down in reaction to lost packets. Weighted RED (WRED), when applied to the interface that was tail dropping packets, significantly reduces global synchronization. WRED allows the average output rates to approach line rate, with even more significant throughput improvements, because avoiding congestion and tail drops decreases the overall number of lost packets. Figure 7-4 shows an example graph of the same interface, after WRED was applied. Figure 7-4. Graph of Traffic Rates After the Application of WRED
Another problem can occur if UDP traffic competes with TCP for bandwidth and queue space. Although UDP traffic consumes a much lower percentage of Internet bandwidth than TCP does, UDP can get a disproportionate amount of bandwidth as a result of TCP's reaction to packet loss. Imagine that on the same Internet router, 20 percent of the offered packets were UDP, and 80 percent TCP. Tail drop causes some TCP and UDP packets to be dropped; however, because the TCP senders slow down, and the UDP senders do not, additional UDP streams from the UDP senders can consume more and more bandwidth during congestion.

Taking the same concept a little deeper, imagine that several people crank up some UDP-based audio or video streaming applications, and that traffic also happens to need to exit this same congested interface. The interface output queue on this Internet router could fill with UDP packets. If a few high-bandwidth UDP applications fill the queue, a larger percentage of TCP packets might get tail dropped, resulting in further reduction of TCP windows, and less TCP traffic relative to the amount of UDP traffic.

The term "TCP starvation" describes the phenomenon of the output queue being filled with larger volumes of UDP, causing TCP connections to have packets tail dropped. Tail drop does not distinguish between packets in any way, including whether they are TCP or UDP, or whether the flow uses a lot of bandwidth or just a little bandwidth. TCP connections can be starved for bandwidth because the UDP flows behave poorly in terms of congestion control. Flow-Based WRED (FRED), which is also based on RED, specifically addresses the issues related to TCP starvation. FRED has limited applicability, and is not currently mentioned in the QoS exam topics. However, if you would like to read more about it, you can refer to Appendix B, "Additional QoS Reference Materials," which contains coverage of FRED from the previous edition of this book.
Random Early Detection (RED)

Random Early Detection (RED) reduces the congestion in queues by dropping packets so that some of the TCP connections temporarily send fewer packets into the network. Instead of waiting until a queue fills, causing a large number of tail drops, RED purposefully drops a percentage of packets before a queue fills. This action attempts to make the computers sending the traffic reduce the offered load that is sent into the network.

The name "Random Early Detection" itself describes the overall operation of the algorithm. RED randomly picks the packets that are dropped after the decision to drop some packets has been made. RED detects queue congestion early, before the queue actually fills, thereby avoiding tail drops and synchronization. In short, RED discards some randomly picked packets early, before congestion gets really bad and the queue fills.

Note: IOS supports three RED-based tools: Weighted RED (WRED), Explicit Congestion Notification (ECN), and Flow-Based WRED (FRED). RED itself is not supported in IOS.

RED logic contains two main parts. RED must first detect when congestion occurs; in other words, RED must choose under what conditions it should discard packets. When RED decides to discard packets, it must decide how many to discard.

First, RED measures the average queue depth of the queue in question. RED calculates the average depth, and then decides whether congestion is occurring based on the average depth. RED uses the average depth, and not the actual queue depth, because the actual queue depth will most likely change much more quickly than the average depth. Because RED wants to avoid the effects of synchronization, it needs to act in a balanced fashion, not a jerky, sporadic fashion. Figure 7-5 shows a graph of the actual queue depth for a particular queue, compared with the average queue depth.

Figure 7-5. Graph of Actual Queue Depth Versus Average Queue Depth
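As seen in the graph, the calculated average queue depth changes more slowly than does the actual queue depth. RED uses the following algorithm when calculating the average queue depth:

$$\text{New average} = \text{Old average} \times (1 - 2^{-n}) + \text{Current queue depth} \times 2^{-n}$$

For you test takers out there, do not worry about memorizing the formula, but focus on the idea. WRED uses this algorithm, with a default for n of 9. This makes the equation read as follows:

$$\text{New average} = \text{Old average} \times 0.998 + \text{Current queue depth} \times 0.002$$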
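The averaging that RED performs can be sketched in a few lines of Python. The sketch assumes the usual exponentially weighted form, in which the current queue depth is weighted by 2^-n (1/512 for the default n of 9); the function name, the sample depth of 40, and the choice of 100 updates are illustrative assumptions only.

```python
def new_average(old_average, current_depth, n=9):
    """One update of the average queue depth with weighting constant n."""
    weight = 2 ** -n                      # 1/512 when n is 9, 1/32 when n is 5
    return old_average * (1 - weight) + current_depth * weight


# Feed a queue that suddenly sits at 40 entries into two averages,
# one with the default constant (9) and one with a smaller constant (5).
slow = fast = 0.0
for _ in range(100):
    slow = new_average(slow, 40, n=9)
    fast = new_average(fast, 40, n=5)

print(f"n=9: {slow:.1f}   n=5: {fast:.1f}")
```

After the same number of updates, the n=5 average has moved most of the way toward 40, while the n=9 average has barely left zero, which matches the idea that a smaller constant makes the average react more quickly.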
In other words, the current queue depth only accounts for about 0.2 percent (1/512) of the new average each time it is calculated. Therefore, the average changes slowly, which helps RED prevent overreaction to changes in the queue depth. When configuring WRED, you can change the value of n in this formula by setting the exponential weighting constant parameter. By making the exponential weighting constant smaller, you make the average change more quickly; by making it larger, the average changes more slowly.

RED decides whether to discard packets by comparing the average queue depth to two thresholds, called the minimum threshold and maximum threshold. Table 7-2 describes the overall logic of when RED discards packets, as illustrated in Figure 7-6.
Figure 7-6. RED Discarding Logic Using Average Depth, Minimum Threshold, and Maximum Threshold
When the average queue depth is very low or very high, the actions are somewhat obvious. As seen in Table 7-2 and Figure 7-6, RED does not discard packets when the average queue depth falls below the minimum threshold. When the average depth rises above the maximum threshold, RED discards all packets. While this action might seem like a Tail Drop action, technically it is not, because the actual queue might not be full yet. So, to distinguish between true Tail Drop, and the case when the RED average queue depth exceeds the maximum threshold, RED calls this action category Full Drop. In between the two thresholds, however, RED discards a percentage of packets, with the percentage growing linearly as the average queue depth grows. The core concept behind RED becomes more obvious if you notice that the maximum percentage of packets discarded is still much less than discarding all packets. Once again, RED wants to discard some packets, but not all packets. As congestion increases, RED discards a higher percentage of packets. Eventually, the congestion can increase to the point that RED discards all packets. You can set the maximum percentage of packets discarded by WRED by setting the mark probability denominator (MPD) setting in IOS. IOS calculates the maximum percentage using the formula 1/MPD. For instance, an MPD of 10 yields a calculated value of 1/10, meaning the maximum discard rate is 10 percent. RED discards a larger percentage of packets as the average queue depth approaches the maximum threshold, as shown in the graph of Figure 7-6. RED also randomly picks the packets that will be discarded. Table 7-3 summarizes some of the key terms related to RED.
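The three regions of RED behavior (no drop, random drop, full drop) can be expressed as a short Python function. This is a sketch of the logic as described in the text, not IOS code; the function names are hypothetical, and the thresholds of 20 and 40 are sample values chosen to go with the MPD of 10 used in the text.

```python
import random


def red_drop_probability(avg_depth, min_threshold, max_threshold, mpd):
    """Fraction of arriving packets to discard at a given average depth."""
    if not min_threshold < max_threshold:
        raise ValueError("min_threshold must be below max_threshold")
    if avg_depth < min_threshold:
        return 0.0                      # below minimum threshold: no drops
    if avg_depth > max_threshold:
        return 1.0                      # above maximum threshold: full drop
    # Random drop region: grows linearly from 0 up to 1/MPD.
    fraction = (avg_depth - min_threshold) / (max_threshold - min_threshold)
    return fraction / mpd


def should_drop(avg_depth, min_threshold, max_threshold, mpd, rng=random.random):
    """Randomly decide the fate of one arriving packet."""
    return rng() < red_drop_probability(avg_depth, min_threshold, max_threshold, mpd)


for depth in (10, 30, 40, 41):
    print(depth, red_drop_probability(depth, 20, 40, mpd=10))
```

Note that the decision uses the average depth only; the actual queue depth never appears, which is why full drop can occur before the queue is physically full.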
The next two sections in this chapter cover WRED and ECN, including their respective configurations.

Weighted RED (WRED)

WRED behaves almost identically to RED, as described in the preceding section of this chapter. It calculates the average queue depth, and decides whether to discard packets, and what percentage of packets to discard, based on all the same variables as RED. The difference between RED and WRED lies in the fact that WRED creates a WRED profile for each precedence or DSCP value. A WRED profile is a set of minimum and maximum thresholds plus a packet discard percentage. The minimum and maximum thresholds are defined as a number of entries in the queue. Instead of directly configuring the discard percentage, you configure the Mark Probability Denominator (MPD), with the percentage being 1/MPD. By using a different WRED profile for each IP Precedence or DSCP value, WRED can treat packets differently.

The other major concept that needs to be covered, before diving into WRED configuration, relates to where WRED can be enabled, and how it interoperates with queuing tools. Interestingly, although WRED can be enabled on a physical interface, it cannot be concurrently enabled along with any other queuing tool! When using Modular QoS command-line interface (MQC) to configure queuing, however, WRED can be used for individual class queues. The following sections cover the following:
How WRED Weights Packets

WRED bases its decisions about when to discard packets, and what percentage to discard, on the following four factors:
First, just like RED, WRED calculates the average queue depth. WRED then compares the average queue depth to the minimum and maximum thresholds to decide whether it should discard packets. If the average queue depth is between the two thresholds, WRED discards a percentage of the packets, with the percentage based on the MPD; if the average queue depth exceeds the maximum threshold, WRED discards all new packets. To weight based on precedence or DSCP markings, WRED sets the minimum threshold, maximum threshold, and the MPD to different values per precedence or DSCP value. The average queue depth calculation, however, is not based on the precedence or DSCP value, but is instead calculated for all packets in the queue, regardless of the precedence or DSCP value. An example of how WRED weights packets can help you make more sense out of how WRED behaves differently than RED. First, consider Figure 7-7, which happens to show the default settings for precedence 0; these settings together define the WRED profile for Precedence 0 traffic. Figure 7-7. Default WRED Profile for Precedence 0
WRED calculates the average queue depth just like RED, ignoring precedence, but it decides when to discard packets based on the precedence or DSCP value. Suppose, for instance, that the average queue depth just passed 20. For new precedence 0 packets that need to be placed into the queue, WRED begins discarding some packets. If the average queue depth continues to increase toward 40, WRED continues to discard precedence 0 packets, but more aggressively, up to a rate of 10 percent, when the average queue depth reaches 40. After the average queue depth passes 40, WRED discards all new precedence 0 packets. In fact, if all packets were precedence 0, RED and WRED would behave identically. The real differences between RED and WRED can be seen with more than one IP precedence value. Figure 7-8 shows the WRED profile for both precedence 0 and precedence 3. (The settings in the figure do not match WRED's precedence 3 defaults, which are listed later in this section.) Figure 7-8. Example WRED Profiles for Precedences 0 and 3
Suppose that the queue associated with the interface has a bunch of packets in it, marked with different precedence values, and the average queue depth just passed 20. For new precedence 0 packets that need to be placed into the queue, WRED begins discarding some precedence 0 packets, because the minimum threshold for precedence 0 is 20. WRED does not discard any precedence 3 packets, however, because the precedence 3 minimum threshold is 30. After the average queue depth reaches 30, WRED starts discarding precedence 3 packets as well. As the average queue depth reaches 40, precedence 0 packets are discarded at a rate approaching 10 percent, but precedence 3 packets are only discarded 5 percent of the time, because the MPD is set to 20, and 1/20 is 5 percent. With these two WRED profiles, WRED discards precedence 0 packets earlier, and at a higher rate, as compared to precedence 3 packets. In short, the weighting feature of WRED just determines when WRED begins discarding a percentage of the packets (per-precedence minimum threshold), the maximum percentage discarded (based on per-precedence MPD), and the point at which WRED discards all packets of that precedence (based on the per-precedence maximum threshold). IOS uses logical choices for the default settings for all WRED parameters. However, you can choose to override the parameters with configuration commands. Tables 7-4 and 7-5 list the IOS default values for minimum threshold, maximum threshold, and MPD with precedence-based WRED (Table 7-4) and DSCP-based WRED (Table 7-5).
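The two-profile example can be reproduced with a small Python sketch. The profile values for precedence 0 (20, 40, MPD 10) and precedence 3 (30, 40, MPD 20) come from the example in the text; the class and function names are hypothetical, and the sketch models only the discard percentage, not the random selection of packets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WredProfile:
    min_threshold: int
    max_threshold: int
    mpd: int          # mark probability denominator; max discard rate is 1/mpd

    def drop_probability(self, avg_depth: float) -> float:
        if avg_depth < self.min_threshold:
            return 0.0
        if avg_depth > self.max_threshold:
            return 1.0
        span = self.max_threshold - self.min_threshold
        return (avg_depth - self.min_threshold) / span / self.mpd


# One profile per precedence value; the average depth is shared by all of them.
PROFILES = {
    0: WredProfile(min_threshold=20, max_threshold=40, mpd=10),
    3: WredProfile(min_threshold=30, max_threshold=40, mpd=20),
}


def drop_probability(precedence: int, avg_depth: float) -> float:
    return PROFILES[precedence].drop_probability(avg_depth)


for avg_depth in (25, 35, 40, 41):
    rates = {prec: round(drop_probability(prec, avg_depth), 3) for prec in PROFILES}
    print(avg_depth, rates)
```

At an average depth of 25, only precedence 0 packets are at risk; at 40, precedence 0 is discarded at 10 percent and precedence 3 at 5 percent; beyond 40, both profiles discard everything.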
Cisco IOS Software follows the suggested meaning of all DSCP values. Note, for instance, that the settings for assured forwarding (AF) DSCPs AF11, AF21, AF31, and AF41 are all identical, because these four AF DSCP values should be given equal treatment. The last digit of the name of the AF DSCP value identifies the drop preference, with 3 being most likely to be dropped, and 1 being least likely to be dropped. For the same reason, AF12, AF22, AF32, and AF42 have the same defaults, as do AF13, AF23, AF33, and AF43.

WRED and Queuing

WRED relies on the average queue depth concept, which calculates a rolling average of the queue depth of some queue. But which queue? Well, first consider a serial interface on a router, on which Weighted Fair Queuing (WFQ) is enabled by default. In this case, however, WFQ has been disabled, leaving a single first-in, first-out (FIFO) output queue on the interface. Figure 7-9 shows the basic idea.

Figure 7-9. FIFO Output Queue and WRED Interaction
As was covered in depth in Chapter 5, "Congestion Management," each interface has a TX Queue or TX Ring. If the TX Ring/TX Queue fills, IOS places new packets into the software queue(s) awaiting transmission. In this example, a single FIFO output queue is used, as shown. With WRED also enabled, WRED calculates the average queue depth of the single FIFO output queue. As new packets arrive, before being placed into the FIFO output queue, WRED logic decides whether the packet should be discarded, as described in detail earlier in this chapter.

With WRED enabled directly on a physical interface, IOS supports FIFO Queuing, and FIFO Queuing only! That fact certainly makes the explanation easier, because there is less to cover! So, WRED works just like Figure 7-9 when it is enabled directly on a physical interface, because WRED can only work with a single FIFO queue in that case.

You might recall that of all the queuing tools listed in Chapter 5, CBWFQ and Low Latency Queuing (LLQ, which is merely a variation of CBWFQ) are the only queuing tools that claim to be capable of using WRED. To use WRED with CBWFQ or LLQ, you need to configure CBWFQ or LLQ as you normally would, and then enable WRED inside the individual classes as needed. However, you cannot enable WRED inside a class configured as the low-latency queue (in other words, you cannot use WRED in a class that uses the priority command). Figure 7-10 illustrates an expanded diagram of CBWFQ, with the details that include WRED's part of the process.

Figure 7-10. WRED with CBWFQ
As you recall, CBWFQ classifies traffic into various classes. Each class has a single FIFO queue inside the class, so WRED bases its average queue depth calculation on the actual depth of each per-class FIFO queue, respectively. In other words, a different instance of WRED operates on each of the FIFO queues in each class. WRED might be discarding packets aggressively in one congested class, without discarding any packets in a class that is not congested.

WRED can be enabled for some CBWFQ classes, and not for others. For instance, with LLQ, voice traffic is typically placed into the priority queue. Because voice is drop sensitive and UDP based, it is better not to apply WRED to the voice class. Instead, you can apply WRED to the data classes that serve predominantly TCP flows. This way, WRED can be used to limit the queue congestion for the interface without performing drops on the voice traffic.

Now that you understand the basic operation of WRED, along with the meaning of the parameters that can be tuned, you can configure WRED.

WRED Configuration

WRED requires very little configuration if you want to take the IOS defaults for the various tunable settings, such as per-precedence and per-DSCP thresholds. If you want to change the defaults, the configuration details can become quite large. This section begins with a table of configuration commands (Table 7-6) and show commands (Table 7-7), followed by three separate examples.
In the first example, R3 enables WRED on its S0/0 interface. WRED treats packets differently based on the IP precedence value, which has been marked with CB marking as the packets enter R3's E0/0 interface. The marking logic performed by CB marking is as follows:
To generate traffic in this network, two voice calls will be made between the analog phones attached to R1 and R4. Multiple web browsers will load the standard page (this is the same page we have used in other chapters in this book) with two TCP connections created by each browser: one to get a file with the word "important" in it, and the other getting a file with "not-so" in it. An FTP download of a large file will also be initiated from the Server to Client1.

Example 7-1 shows the basic configuration and show commands output. Only the required commands and parameters have been used, with defaults for all other settings. The example uses the familiar network diagram, as repeated in Figure 7-11.

Figure 7-11. Sample Network for All WRED Examples

Configuration on R3
Example 7-1. WRED Default Configuration, R3, S0/0

R3#show running-config
!
hostname R3
!
no ip domain-lookup
ip host r4 192.168.3.254
ip host r2 192.168.23.252
ip host r1 192.168.1.251
!
ip cef
!
class-map match-all voip-rtp
 match ip rtp 16384 16383
class-map match-all http-impo
 match protocol http url "*important*"
class-map match-all http-not
 match protocol http url "*not-so*"
class-map match-all class-default
 match any
!
!
policy-map laundry-list
 class voip-rtp
  set ip dscp ef
 class http-impo
  set ip dscp af21
 class http-not
  set ip dscp af23
 class class-default
  set ip dscp default
!
call rsvp-sync
!
interface Ethernet0/0
 description connected to SW2, where Server1 is connected
 ip address 192.168.3.253 255.255.255.0
 half-duplex
 service-policy input laundry-list
!
interface Serial0/0
 description connected to FRS port S0. Single PVC to R1.
 no ip address
 encapsulation frame-relay
 load-interval 30
 random-detect
 clockrate 128000
!
interface Serial0/0.1 point-to-point
 description point-point subint global DLCI 103, connected via PVC to DLCI 101 ( R1)
 ip address 192.168.2.253 255.255.255.0
 frame-relay interface-dlci 101
!
! Lines omitted for brevity.
!
R3#show queueing interface serial 0/0
Interface Serial0/0 queueing strategy: random early detection (WRED)
    Exp-weight-constant: 9 (1/512)
    Mean queue depth: 37

  class    Random drop      Tail drop      Minimum  Maximum  Mark
            pkts/bytes       pkts/bytes     thresh   thresh   prob
      0    1776/315688      1012/179987       20       40     1/10
      1       0/0              0/0            22       40     1/10
      2       5/4725          16/17152        24       40     1/10
      3       0/0              0/0            26       40     1/10
      4       0/0              0/0            28       40     1/10
      5       0/0              0/0            31       40     1/10
      6       0/0              0/0            33       40     1/10
      7       0/0              0/0            35       40     1/10
   rsvp       0/0              0/0            37       40     1/10

R3#show queue s 0/0
Output queue for Serial0/0 is 57/0

Packet 1, linktype: ip, length: 64, flags: 0x88
  source: 192.168.3.254, destination: 192.168.2.251, id: 0x053E, ttl: 253,
  TOS: 184 prot: 17, source port 18378, destination port 17260
    data: 0x47CA 0x436C 0x0028 0x0000 0x8012 0x3F73 0x4C7E
          0x8D44 0x18D1 0x03FE 0xFC77 0xA2A7 0x35A2 0x54E7

Packet 2, linktype: ip, length: 64, flags: 0x88
  source: 192.168.3.254, destination: 192.168.2.251, id: 0x0545, ttl: 253,
  TOS: 184 prot: 17, source port 16640, destination port 17178
    data: 0x4100 0x431A 0x0028 0x0000 0x8012 0x6330 0x21B4
          0x82AF 0x05C9 0x03FE 0x1448 0x8706 0xAFD9 0xD364
!
! Output omitted for brevity.
!
R3#show interfaces s 0/0
Serial0/0 is up, line protocol is up
  Hardware is PowerQUICC Serial
  Description: connected to FRS port S0. Single PVC to R1.
  MTU 1500 bytes, BW 1544 Kbit, DLY 20000 usec,
     reliability 255/255, txload 20/255, rxload 4/255
  Encapsulation FRAME-RELAY, loopback not set
  Keepalive set (10 sec)
  LMI enq sent 591, LMI stat recvd 591, LMI upd recvd 0, DTE LMI up
  LMI enq recvd 0, LMI stat sent 0, LMI upd sent 0
  LMI DLCI 1023 LMI type is CISCO frame relay DTE
  FR SVC disabled, LAPF state down
  Broadcast queue 0/64, broadcasts sent/dropped 2726/0, interface broadcasts 252 2
  Last input 00:00:02, output 00:00:00, output hang never
  Last clearing of "show interface" counters 01:38:28
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 4391
  Queueing strategy: random early detection(RED)
  30 second input rate 29000 bits/sec, 58 packets/sec
  30 second output rate 122000 bits/sec, 91 packets/sec
     23863 packets input, 1535433 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     2 input errors, 0 CRC, 2 frame, 0 overrun, 0 ignored, 0 abort
     36688 packets output, 5638653 bytes, 0 underruns
     0 output errors, 0 collisions, 4 interface resets
     0 output buffer failures, 0 output buffers swapped out
     0 carrier transitions
     DCD=up DSR=up DTR=up RTS=up CTS=up

The WRED part of the configuration is quite short. The configuration shows the random-detect interface subcommand under serial 0/0. As you will see later, the command actually disables WFQ if configured. The rest of the highlighted configuration commands show the CB marking configuration, which implements the functions listed before the example. (For more information about CB marking, see Chapter 4, "Classification and Marking.")

After the configuration, the show queueing interface serial 0/0 command output lists the WRED settings, and the statistics for each precedence value. The defaults for the exponential weighting constant, and the per-precedence defaults of minimum threshold, maximum threshold, and MPD are all listed. In addition, the command lists statistics for bytes/packets dropped by WRED, per precedence value.
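The statistics in this output are listed per IP precedence, whereas the CB marking policy sets DSCP values. Because precedence occupies the high-order 3 bits of the ToS byte and DSCP the high-order 6 bits, the relationship can be checked with a short Python sketch. The dictionary and function names are illustrative assumptions; the DSCP decimal values are the standard ones for these code points.

```python
# Standard decimal values for the DSCPs set by the laundry-list policy.
DSCP = {"default": 0, "af21": 18, "af23": 22, "ef": 46}


def precedence_from_dscp(dscp: int) -> int:
    """Precedence is the top 3 bits of the 6-bit DSCP field."""
    return dscp >> 3


def precedence_from_tos(tos_byte: int) -> int:
    """Precedence is the top 3 bits of the 8-bit ToS byte."""
    return tos_byte >> 5


for name, value in DSCP.items():
    print(f"{name:8} DSCP {value:2} -> precedence {precedence_from_dscp(value)}")

# The queued packets in the show queue output carry TOS: 184.
print("TOS 184 -> DSCP", 184 >> 2, "-> precedence", precedence_from_tos(184))
```

A ToS byte of 184 works out to DSCP 46 (EF) and precedence 5, which is consistent with the UDP packets in the queue being the marked voice traffic.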
For those of you who did not memorize the DSCP values, you may not be able to correlate the DSCP values set by CB marking, and the precedence values interpreted by WRED. WRED just looks at the first 3 bits of the IP ToS byte when performing precedence-based WRED. So, DSCP best effort (BE) equates to precedence 0, DSCP AF21 and AF23 both equate to precedence 2, and DSCP expedited forwarding (EF) equates to precedence 5. The show queueing command also lists a column of statistics for tail drop as well as random drops. In this example, WRED has dropped several packets, and the queue has filled, causing tail drops, as shown with the nonzero counters for random drops and tail drops in the show queueing command output. The show queue serial 0/0 command lists the same type of information seen in earlier chapters. However, this command lists one particularly interesting item relating to WRED in the first line of the command output. The actual queue depth for the single FIFO queue used with WRED is listed at 57 entries in this particular example. The earlier show queueing command lists an average queue depth of 37 just instants before. These two numbers just give us a small reminder that WRED decides to drop based on average queue depth, as opposed to actual queue depth. Finally, the show interfaces command at the end of the example reminds us that WRED does not work with any other queuing method directly on the interface. The command uses the statement "Queueing strategy: random early detection (RED)" to remind us of that fact. WRED uses a single FIFO queue, and measures its average queue depth based on the queue depth of the FIFO queue. The second WRED configuration example uses WRED on R3's S0/0 interface again, but this time with DSCP WRED, and a few changes to the defaults. In fact, Example 7-2 just shows the changed configuration, with most of the configuration staying the same. 
For instance, the same CB marking configuration is used to mark the traffic, so the details are not repeated in the example. The example uses the familiar network diagram that was also used in the preceding example. Example 7-12. DSCP-Based WRED on R3 S0/0R3#configure terminal Enter configuration commands, one per line. End with CNTL/Z. R3(config)#interface serial 0/0 R3(config-if)#random-detect dscp-based R3(config-if)#random-detect dscp af21 50 60 R3(config-if)#random-detect dscp af23 20 30 R3(config-if)#random-detect ? dscp parameters for each dscp value dscp-based Enable dscp based WRED on an interface exponential-weighting-constant weight for mean queue depth calculation flow enable flow based WRED prec-based Enable prec based WRED on an interface precedence parameters for each precedence value <cr> R3(config-if)#random-detect exponential-weighting-constant 5 R3(config-if)#^Z R3#show queue serial 0/0 Output queue for Serial0/0 is 37/0 Packet 1, linktype: ip, length: 64, flags: 0x88 source: 192.168.3.254, destination: 192.168.2.251, id: 0x0545, ttl: 253, TOS: 184 prot: 17, source port 16640, destination port 17178 data: 0x4100 0x431A 0x0028 0x0000 0x8012 0xAB15 0x21E1 0x71CF 0x05C9 0x03FE 0x7AA3 0x770B 0x2408 0x8264 Packet 2, linktype: ip, length: 64, flags: 0x88 source: 192.168.3.254, destination: 192.168.2.251, id: 0x053E, ttl: 253, TOS: 184 prot: 17, source port 18378, destination port 17260 data: 0x47CA 0x436C 0x0028 0x0000 0x8012 0x8759 0x4CAB 0x7D04 0x18D1 0x03FE 0xDC15 0x3E4A 0x4E92 0x5447 R3#show queueing interface s 0/0 Interface Serial0/0 queueing strategy: random early detection (WRED) Exp-weight-constant: 5 (1/32) Mean queue depth: 38 dscp Random drop Tail drop Minimum Maximum Mark pkts/bytes pkts/bytes thresh thresh prob af11 0/0 0/0 33 40 1/10 af12 0/0 0/0 28 40 1/10 af13 0/0 0/0 24 40 1/10 af21 8/9904 18/21288 50 60 1/10 af22 0/0 0/0 28 40 1/10 af23 13/18252 33/34083 20 30 1/10 af31 0/0 0/0 33 40 1/10 af32 0/0 0/0 28 40 1/10 af33 0/0 0/0 24 40 
1/10 af41 0/0 0/0 33 40 1/10 af42 0/0 0/0 28 40 1/10 af43 0/0 0/0 24 40 1/10 cs1 0/0 0/0 22 40 1/10 cs2 0/0 0/0 24 40 1/10 cs3 0/0 0/0 26 40 1/10 cs4 0/0 0/0 28 40 1/10 cs5 0/0 0/0 31 40 1/10 cs6 0/0 0/0 33 40 1/10 cs7 0/0 0/0 35 40 1/10 ef 0/0 0/0 37 40 1/10 rsvp 0/0 0/0 37 40 1/10 default 16/16254 20/23216 20 40 1/10

The configuration begins with a change from precedence-based WRED to DSCP-based WRED using the random-detect dscp-based interface subcommand. The random-detect dscp af21 50 60 command changes the default minimum and maximum thresholds for AF21 to 50 and 60, respectively, and the random-detect dscp af23 20 30 command changes these same values for AF23. In addition, although Cisco does not recommend changing the exponential weighting constant, the configuration does offer an example of the syntax with the random-detect exponential-weighting-constant 5 command. With a smaller number than the default (9), the average queue depth calculation tracks the actual queue depth more closely, so WRED reacts more quickly to changes in the queue depth.

The output from the various show commands does not differ much from the output seen when precedence-based WRED is enabled. The format now includes DSCP values rather than precedence values, as you may notice with the counters that point out drops for both AF21 and AF23, which were previously both treated as precedence 2.

WRED suffers from the lack of concurrent queuing tool support on an interface. However, WRED can be enabled inside a CBWFQ class, operating on the queue for the class, effectively enabling WRED concurrently with CBWFQ. The final WRED example shows a configuration for WRED using LLQ.

The last WRED configuration example repeats a base configuration similar to one of the CBWFQ examples from Chapter 5. Voice, HTTP, and FTP traffic compete for the same bandwidth, with WRED applied per class for the two HTTP classes and one FTP class. Note that because voice traffic is drop sensitive, WRED is not enabled for the low-latency queue.
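The effect of the exponential weighting constant discussed above can be sketched in a few lines of Python. This is an illustration only, assuming the commonly documented calculation in which the new average is the old average weighted by (1 - 1/2^n) plus the current depth weighted by 1/2^n; that matches the "9 (1/512)" and "5 (1/32)" values in the show output. The function names and the 40-packet scenario are hypothetical.

```python
def update_average(old_average: float, current_depth: float, n: int) -> float:
    """One update of the weighted average queue depth for constant n."""
    weight = 1 / (2 ** n)  # n=9 gives 1/512, n=5 gives 1/32
    return old_average * (1 - weight) + current_depth * weight


def updates_until(target: float, current_depth: float, n: int) -> int:
    """Count updates needed for the average to climb from 0 to target
    while the actual queue depth is held at current_depth."""
    if target >= current_depth:
        raise ValueError("target must be below the held queue depth")
    average = 0.0
    updates = 0
    while average < target:
        average = update_average(average, current_depth, n)
        updates += 1
    return updates


if __name__ == "__main__":
    # Hypothetical scenario: the actual queue jumps to 40 packets and stays there.
    for n in (9, 5):
        count = updates_until(target=20, current_depth=40, n=n)
        print(f"n={n}: average reaches 20 after {count} updates")
```

A smaller constant gives each new sample more weight, so the average catches up with the actual depth in far fewer updates.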
Because WRED can be enabled per class in conjunction with CBWFQ, WRED calculates average queue depth based on the per-class queue. The criteria for each type of traffic are as follows:
Example 7-3 lists the configuration and show commands used when WRED is enabled in LLQ classes dscp-af21, dscp-af23, and class-default.

Example 7-13. WRED Used in LLQ Classes dscp-af21, dscp-af23, and class-default

R3#show running-config
Building configuration...
!
!Portions omitted for brevity
!
ip cef
!
! The following classes are used in the LLQ configuration applied to S0/0
!
class-map match-all dscp-ef
 match ip dscp ef
class-map match-all dscp-af21
 match ip dscp af21
class-map match-all dscp-af23
 match ip dscp af23
!
! The following classes are used on ingress for CB marking
!
class-map match-all http-impo
 match protocol http url "*important*"
class-map match-all http-not
 match protocol http url "*not-so*"
class-map match-all class-default
 match any
class-map match-all voip-rtp
 match ip rtp 16384 16383
!
! Policy-map laundry-list creates CB marking configuration, used on
! ingress on E0/0
!
policy-map laundry-list
 class voip-rtp
  set ip dscp ef
 class http-impo
  set ip dscp af21
 class http-not
  set ip dscp af23
 class class-default
  set ip dscp default
!
! Policy-map queue-on-dscp creates LLQ configuration, with WRED
! inside three classes
!
policy-map queue-on-dscp
 class dscp-ef
  priority 58
 class dscp-af21
  bandwidth 20
  random-detect dscp-based
 class dscp-af23
  bandwidth 8
  random-detect dscp-based
 class class-default
  fair-queue
  random-detect dscp-based
!
interface Ethernet0/0
 description connected to SW2, where Server1 is connected
 ip address 192.168.3.253 255.255.255.0
 ip nbar protocol-discovery
 half-duplex
 service-policy input laundry-list
!
interface Serial0/0
 description connected to FRS port S0. Single PVC to R1.
 bandwidth 128
 no ip address
 encapsulation frame-relay
 load-interval 30
 max-reserved-bandwidth 85
 service-policy output queue-on-dscp
 clockrate 128000
!
interface Serial0/0.1 point-to-point
 description point-point subint global DLCI 103, connected via PVC to DLCI 101 (R1)
 ip address 192.168.2.253 255.255.255.0
 frame-relay interface-dlci 101
!
R3#show policy-map interface serial 0/0 Serial0/0 Service-policy output: queue-on-dscp Class-map: dscp-ef (match-all) 46437 packets, 2971968 bytes 30 second offered rate 0 bps, drop rate 0 bps Match: ip dscp ef Weighted Fair Queueing Strict Priority Output Queue: Conversation 264 Bandwidth 58 (kbps) Burst 1450 (Bytes) (pkts matched/bytes matched) 42805/2739520 (total drops/bytes drops) 0/0 Class-map: dscp-af21 (match-all) 2878 packets, 3478830 bytes 30 second offered rate 76000 bps, drop rate 0 bps Match: ip dscp af21 Weighted Fair Queueing Output Queue: Conversation 266 Bandwidth 20 (kbps) (pkts matched/bytes matched) 2889/3494718 (depth/total drops/no-buffer drops) 11/26/0 exponential weight: 9 mean queue depth: 5 dscp Transmitted Random drop Tail drop Minimum Maximum Mark pkts/bytes pkts/bytes pkts/bytes thresh thresh prob af11 0/0 0/0 0/0 32 40 1/10 af12 0/0 0/0 0/0 28 40 1/10 af13 0/0 0/0 0/0 24 40 1/10 af21 2889/3494718 8/9904 18/21288 32 40 1/10 af22 0/0 0/0 0/0 28 40 1/10 af23 0/0 0/0 0/0 24 40 1/10 af31 0/0 0/0 0/0 32 40 1/10 af32 0/0 0/0 0/0 28 40 1/10 af33 0/0 0/0 0/0 24 40 1/10 af41 0/0 0/0 0/0 32 40 1/10 af42 0/0 0/0 0/0 28 40 1/10 af43 0/0 0/0 0/0 24 40 1/10 cs1 0/0 0/0 0/0 22 40 1/10 cs2 0/0 0/0 0/0 24 40 1/10 cs3 0/0 0/0 0/0 26 40 1/10 cs4 0/0 0/0 0/0 28 40 1/10 cs5 0/0 0/0 0/0 30 40 1/10 cs6 0/0 0/0 0/0 32 40 1/10 cs7 0/0 0/0 0/0 34 40 1/10 ef 0/0 0/0 0/0 36 40 1/10 rsvp 0/0 0/0 0/0 36 40 1/10 default 0/0 0/0 0/0 20 40 1/10 Class-map: dscp-af23 (match-all) 1034 packets, 1250984 bytes 30 second offered rate 32000 bps, drop rate 0 bps Match: ip dscp af23 Weighted Fair Queueing Output Queue: Conversation 267 Bandwidth 8 (kbps) (pkts matched/bytes matched) 1047/1266140 (depth/total drops/no-buffer drops) 11/46/0 exponential weight: 9 mean queue depth: 5 dscp Transmitted Random drop Tail drop Minimum Maximum Mark pkts/bytes pkts/bytes pkts/bytes thresh thresh prob af11 0/0 0/0 0/0 32 40 1/10 af12 0/0 0/0 0/0 28 40 1/10 af13 0/0 0/0 0/0 24 40 1/10 af21 
0/0 0/0 0/0 32 40 1/10 af22 0/0 0/0 0/0 28 40 1/10 af23 1047/1266140 13/18252 33/34083 24 40 1/10 af31 0/0 0/0 0/0 32 40 1/10 af32 0/0 0/0 0/0 28 40 1/10 af33 0/0 0/0 0/0 24 40 1/10 af41 0/0 0/0 0/0 32 40 1/10 af42 0/0 0/0 0/0 28 40 1/10 af43 0/0 0/0 0/0 24 40 1/10 cs1 0/0 0/0 0/0 22 40 1/10 cs2 0/0 0/0 0/0 24 40 1/10 cs3 0/0 0/0 0/0 26 40 1/10 cs4 0/0 0/0 0/0 28 40 1/10 cs5 0/0 0/0 0/0 30 40 1/10 cs6 0/0 0/0 0/0 32 40 1/10 cs7 0/0 0/0 0/0 34 40 1/10 ef 0/0 0/0 0/0 36 40 1/10 rsvp 0/0 0/0 0/0 36 40 1/10 default 0/0 0/0 0/0 20 40 1/10 Class-map: class-default (match-any) 847 packets, 348716 bytes 30 second offered rate 2000 bps, drop rate 0 bps Match: any Weighted Fair Queueing Flow Based Fair Queueing Maximum Number of Hashed Queues 256 (total queued/total drops/no-buffer drops) 0/0/0 exponential weight: 9 dscp Transmitted Random drop Tail drop Minimum Maximum Mark pkts/bytes pkts/bytes pkts/bytes thresh thresh prob af11 0/0 0/0 0/0 32 40 1/10 af12 0/0 0/0 0/0 28 40 1/10 af13 0/0 0/0 0/0 24 40 1/10 af21 0/0 0/0 0/0 32 40 1/10 af22 0/0 0/0 0/0 28 40 1/10 af23 0/0 0/0 0/0 24 40 1/10 af31 0/0 0/0 0/0 32 40 1/10 af32 0/0 0/0 0/0 28 40 1/10 af33 0/0 0/0 0/0 24 40 1/10 af41 0/0 0/0 0/0 32 40 1/10 af42 0/0 0/0 0/0 28 40 1/10 af43 0/0 0/0 0/0 24 40 1/10 cs1 0/0 0/0 0/0 22 40 1/10 cs2 0/0 0/0 0/0 24 40 1/10 cs3 0/0 0/0 0/0 26 40 1/10 cs4 0/0 0/0 0/0 28 40 1/10 cs5 0/0 0/0 0/0 30 40 1/10 cs6 0/0 0/0 0/0 32 40 1/10 cs7 0/0 0/0 0/0 34 40 1/10 ef 0/0 0/0 0/0 36 40 1/10 rsvp 0/0 0/0 0/0 36 40 1/10 default 59/767 0/0 0/0 20 40 1/10 The example lists a large configuration, but only a small amount pertains to WRED. Two sets of class maps have been configuredone set is used by the CB marking policy called laundry-list, and the other set is used by the LLQ policy map called queue-on-dscp. In policy-map queue-on-dscp, inside classes dscp-af21, dscp-af23, and class-default, the random-detect dscp-based command enables WRED. 
These three random-detect commands are highlighted in the show running-config output in the example. Also note that WRED is not enabled on interface serial 0/0 in this configuration, because WRED applies to the output queues used by each class.

Because WRED is not enabled on the main interface, to see statistics for WRED, you must use the show policy-map interface command. This command in the example lists WRED statistics inside each class in which WRED has been enabled. For the classes in which WRED is not enabled, such as the dscp-ef class, no additional WRED statistical information is listed. The default values for exponential weighting constant, and the per-DSCP defaults for minimum threshold, maximum threshold, and MPD are all listed in the command output.

WRED Summary

WRED provides a valuable tool for managing congestion in queues. Cisco IOS uses defaults that conform to the DiffServ Assured Forwarding conventions, which reduce the likelihood that you will need to configure thresholds for WRED. WRED can be particularly effective when used with MQC-based queuing tools, but when enabled directly on an interface, WRED has the unfortunate side effect of disallowing other queuing tools to be used. Table 7-8 lists some of WRED's key points.
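As a recap of how the minimum threshold, maximum threshold, and MPD columns in the show output combine, here is a minimal Python sketch of the per-packet drop probability. It assumes the usual RED behavior: no drops below the minimum threshold, a linear ramp up to 1/MPD at the maximum threshold, and full drop beyond it. The function name is hypothetical, and treating an average exactly equal to the maximum threshold as full drop is an assumption.

```python
def wred_drop_probability(avg_depth: float, min_th: int, max_th: int,
                          mark_prob_denominator: int) -> float:
    """Probability that WRED discards a packet at the given average depth."""
    if avg_depth < min_th:
        return 0.0  # below the minimum threshold: no drops
    if avg_depth >= max_th:
        return 1.0  # at or above the maximum threshold: full drop (assumed boundary)
    # Linear ramp from 0 at min_th up to 1/MPD just below max_th.
    fraction = (avg_depth - min_th) / (max_th - min_th)
    return fraction / mark_prob_denominator


if __name__ == "__main__":
    # Default per-DSCP profiles taken from the show policy-map interface output:
    # (minimum threshold, maximum threshold, mark probability denominator).
    profiles = {"default": (20, 40, 10), "af23": (24, 40, 10), "af21": (32, 40, 10)}
    for name, (min_th, max_th, mpd) in profiles.items():
        p = wred_drop_probability(30, min_th, max_th, mpd)
        print(f"{name}: drop probability at average depth 30 is {p:.3f}")
```

At the same average depth, the profile with the lowest minimum threshold sees the highest drop probability, which is how the defaults protect the lower drop-preference markings.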
Explicit Congestion Notification

ECN is very much interrelated with WRED. This section begins with a description of how ECN works with WRED, followed by a short section on ECN configuration.

ECN Concepts

WRED's main goal is to get some TCP senders to temporarily slow down the rate at which they send data into the network. By doing so, the temporary congestion may abate, avoiding problems such as tail drop and global synchronization. However, to cause TCP senders to slow down, WRED resorts to an inherently harmful action: the discarding of packets. It's a classic case of doing some harm now, in order to prevent more harm later.

Explicit Congestion Notification (ECN) provides the same benefit as WRED, without discarding packets. In fact, ECN is really just a feature of WRED in which TCP senders are signaled to slow down by setting bits in the packet headers. By signaling TCP senders to slow down, congestion may abate, all the while avoiding the use of packet drop.

When ECN is enabled, a router's WRED logic works almost exactly as before. For instance, WRED profiles are defined for each precedence or DSCP value. Average queue depths are calculated. WRED compares the average queue depth with the thresholds, and decides whether to drop nothing, to randomly drop a percentage of the packets, or to perform full drop. The difference lies in what WRED does once it randomly chooses a packet to be discarded, which happens when the average queue depth is between the minimum and maximum threshold. With ECN enabled, WRED still randomly picks the packet, but instead of discarding it, WRED marks a couple of bits in the packet header, and forwards the packet. Marking these bits begins a process, defined in RFC 3168, which causes the sender of the TCP segment to reduce the congestion window (CWND) by 50 percent.

ECN causes the sender of the randomly-chosen packet to slow down. To do so, the sender of the packet must be told that congestion occurred, and ECN wants it to slow down.
To trigger the process, the router, which notices the congestion, needs some bits to set in order to signal that the packet experienced congestion. Figure 7-12 shows the bits that are set by the router. Figure 7-12. ECN Bits in DSCP Byte
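The bit layout in the figure can also be expressed as a short Python sketch. It assumes the RFC 3168 layout, in which the 6-bit DSCP occupies the high-order bits of the byte and the ECN field occupies the two low-order bits. The function names are hypothetical and are not part of any router API.

```python
ECN_MASK = 0b11  # two low-order bits of the DSCP/ToS byte


def split_tos(tos_byte: int) -> tuple[int, int]:
    """Return (dscp, ecn_field) extracted from the 8-bit ToS byte."""
    return (tos_byte >> 2) & 0x3F, tos_byte & ECN_MASK


def mark_congestion(tos_byte: int) -> int:
    """Return the byte with the ECN field set to 11, leaving the DSCP unchanged."""
    return tos_byte | ECN_MASK


if __name__ == "__main__":
    tos = 0b10111001  # DSCP EF (46) with ECN field 01
    print(split_tos(tos))
    print(split_tos(mark_congestion(tos)))
```

Because the router only touches the two low-order bits, the DSCP marking applied earlier survives the ECN mark.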
You might recall from back in Chapter 2, "QoS Tools and Architectures," that the two low-order bits in the DSCP byte were formerly unused, but they were later defined for use by ECN. With RFC 3168, which defines ECN, the two extra bits have been defined as the ECT and CE bits, together known as the ECN field. To see how these bits are used, Figure 7-13 shows a full example. When looking at the figure, and the explanation that follows, keep in mind these two important points:

Figure 7-13. Example of ECN Signaling to Reduce CWND
While Figure 7-13 holds several details, the general idea is that the router sets some bits in the packet instead of discarding it. In order to get the original sender of the packet to slow down, bits need to be set in the next packet sent back to the original sender. In other words, the router can set bits in the packet flowing left-to-right in the figure, but some other bits must be set in the packet flowing in the opposite direction (right-to-left) in order for PC Client2 to know to slow down. The following steps explain the contents of Figure 7-13, with the text following the circled numbers in the figure:
As you can see, the sender does indeed slow down by reducing its CWND, and the router didn't have to discard any packets. Overall, it's a better solution than WRED without ECN. However, this process depends on whether the TCP implementations on the endpoint hosts support ECN. For instance, if Client2 and Server1 had negotiated ECN when initializing the TCP connection, and one of them didn't support ECN, they would decide not to use ECN. Packets sent for this TCP connection would set ECN = 00. Under these circumstances, even with ECN configured on the router, the router's WRED logic could still discard the packet. That's because the router's ECN logic first checks to see whether ECN is supported for the underlying TCP connection; if not supported, the router uses the same old WRED logic, and discards the packet. In summary, the WRED ECN logic works just like WRED without ECN, until a packet has been randomly chosen for discard (when average queue depth is between the min and max thresholds). At that point:
ECN Configuration

As you can understand from the details, ECN relies on routers that can mark the ECN bits, as well as IP hosts that support ECN with their TCP implementations. Implementation on Cisco routers is relatively easy, with one additional command required for configuration as compared with WRED configuration. Example 7-4 lists a simple WRED configuration, with ECN enabled, along with a few show commands.

Example 7-14. WRED Used in LLQ Classes dscp-af21, dscp-af23, and class-default

R3#show running-config
Building configuration...
!
!Portions omitted for brevity
!
ip cef
!
class-map match-all class1
 match protocol http
!
policy-map ecn-test
 class class1
  bandwidth percent 50
  random-detect dscp-based
  random-detect ecn
!
!
interface Serial0/0
 no ip address
 service-policy output ecn-test
 encapsulation frame-relay
 clockrate 128000
!
interface Serial0/0.1 point-to-point
 bandwidth 128
 ip address 192.168.2.3 255.255.255.0
 frame-relay interface-dlci 143
!
! The rest has been omitted for brevity
!
R3#show policy-map interface s0/0 Serial0/0 Service-policy output: ecn-test Class-map: class1 (match-all) 0 packets, 0 bytes 5 minute offered rate 0 bps, drop rate 0 bps Match: protocol http Queueing Output Queue: Conversation 265 Bandwidth 50 (%) Bandwidth 772 (kbps) (pkts matched/bytes matched) 0/0 (depth/total drops/no-buffer drops) 0/0/0 exponential weight: 9 explicit congestion notification mean queue depth: 0 dscp Transmitted Random drop Tail drop Minimum Maximum Mark pkts/bytes pkts/bytes pkts/bytes thresh thresh prob af11 0/0 0/0 0/0 32 40 1/10 af12 0/0 0/0 0/0 28 40 1/10 af13 0/0 0/0 0/0 24 40 1/10 af21 0/0 0/0 0/0 32 40 1/10 af22 0/0 0/0 0/0 28 40 1/10 af23 0/0 0/0 0/0 24 40 1/10 af31 0/0 0/0 0/0 32 40 1/10 af32 0/0 0/0 0/0 28 40 1/10 af33 0/0 0/0 0/0 24 40 1/10 af41 0/0 0/0 0/0 32 40 1/10 af42 0/0 0/0 0/0 28 40 1/10 af43 0/0 0/0 0/0 24 40 1/10 cs1 0/0 0/0 0/0 22 40 1/10 cs2 0/0 0/0 0/0 24 40 1/10 cs3 0/0 0/0 0/0 26 40 1/10 cs4 0/0 0/0 0/0 28 40 1/10 cs5 0/0 0/0 0/0 30 40 1/10 cs6 0/0 0/0 0/0 32 40 1/10 cs7 0/0 0/0 0/0 34 40 1/10 ef 0/0 0/0 0/0 36 40 1/10 rsvp 0/0 0/0 0/0 36 40 1/10 default 0/0 0/0 0/0 20 40 1/10 ! note this new statistical section that follows: dscp ECN Mark pkts/bytes af11 0/0 af12 0/0 af13 0/0 af21 0/0 af22 0/0 af23 0/0 af31 0/0 af32 0/0 af33 0/0 af41 0/0 af42 0/0 af43 0/0 cs1 0/0 cs2 0/0 cs3 0/0 cs4 0/0 cs5 0/0 cs6 0/0 cs7 0/0 ef 0/0 rsvp 0/0 default 0/0 Class-map: class-default (match-any) 0 packets, 0 bytes 5 minute offered rate 0 bps, drop rate 0 bps Match: any The configuration is indeed quite simple. Notice the inclusion of the random-detect ecn command inside policy-map ecn-test. The rest of the configuration looks like normal CBWFQ configuration, with WRED enabled. Remember, ECN is really just a feature of WRED, with the same details of WRED thresholds and discard percentages, per DSCP. 
ECN simply means that when WRED randomly chooses a packet for discard, and that packet supports ECN (ECN field is 01 or 10), the router marks ECN = 11 and forwards the packet instead of discarding it. Note that the ECN logic also means that randomly chosen packets that have ECN set to 00 will be discarded, just as WRED normally would. So, the WRED section of the show policy-map interface s0/0 command output has the same types of statistics for discarded packets. A separate new section of output, near the end of the example, counts the number of packets marked with ECN = 11.
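The decision described above can be summarized in a short Python sketch. It is an illustration rather than router code: the function name and the returned action strings are hypothetical, and it only covers a packet that WRED has already randomly chosen for discard. The handling of a packet that already carries ECN = 11 (forwarded unchanged) is not stated in this section; it is an assumption based on RFC 3168, where 11 already signals congestion.

```python
def wred_ecn_action(ecn_field: int, ecn_enabled: bool = True) -> str:
    """Action for a packet that WRED has randomly chosen for discard.

    ecn_field is the 2-bit ECN value from the packet header (0b00 to 0b11).
    """
    if ecn_field not in (0b00, 0b01, 0b10, 0b11):
        raise ValueError("ECN field must be a 2-bit value")
    if not ecn_enabled:
        return "drop"  # plain WRED: the chosen packet is discarded
    if ecn_field in (0b01, 0b10):
        return "mark-and-forward"  # endpoints support ECN: set ECN = 11
    if ecn_field == 0b11:
        return "forward"  # assumption: already marked, nothing more to set
    return "drop"  # ECN = 00: endpoints are not using ECN


if __name__ == "__main__":
    for field in (0b00, 0b01, 0b10, 0b11):
        print(f"ECN {field:02b}: {wred_ecn_action(field)}")
```

Only the mark-and-forward case would increment the ECN Mark counters in the show output; the drop case still shows up under the random drop statistics.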