Queue Management

FIFO, Priority Queueing, Weighted Fair Queueing, and active queue management.

1. The Epicenter of Congestion: The Role of Queues

At the heart of every network device that forwards traffic, from a simple home router to a massive carrier-grade switch, lies a fundamental component: a buffer, which is organized into one or more queues. A queue is a temporary holding area for data packets. Queues are necessary because devices cannot always forward packets at the exact moment they arrive.

Consider a router with multiple input ports and a single output port. If packets arrive from several different inputs all destined for the same output link, and the combined arrival rate exceeds the link's capacity, a bottleneck is created. The router cannot magically transmit data faster than the physical link allows. Instead, it must place the excess packets into the queue associated with that output port. Without queues, any packet that could not be transmitted instantly would be dropped, leading to a highly inefficient and unreliable network.

Therefore, queues are the epicenter where network congestion becomes a physical reality. Queue management is the set of algorithms and strategies used by a network device to control which packets are stored in these queues, in what order they are transmitted, and which packets should be dropped when the queue inevitably becomes full. This is one of the most critical mechanisms for implementing Quality of Service (QoS).

2. FIFO (First-In, First-Out) Queueing

FIFO is the simplest and most fundamental queueing discipline. It is the default behavior in a best-effort network where no explicit QoS policies are configured.

How FIFO Works

The concept is identical to a checkout line at a grocery store. The first person to get in line is the first person to be served. In networking terms, a router using FIFO has a single queue for each output interface. Packets arriving at the interface are placed at the back of this queue. The router then serves packets from the front of the queue in the exact order they arrived.
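The discipline is simple enough to capture in a few lines. The following Python sketch is illustrative only; the packet representation and the buffer limit are assumptions, not part of any real router's code:

    from collections import deque

    class FIFOQueue:
        """One FIFO queue for a single output interface."""

        def __init__(self, capacity=100):   # illustrative buffer limit
            self.capacity = capacity
            self.packets = deque()

        def enqueue(self, packet):
            """Arriving packets join the back of the queue; tail-drop if full."""
            if len(self.packets) >= self.capacity:
                return False                # packet dropped
            self.packets.append(packet)
            return True

        def dequeue(self):
            """Packets are served from the front, in exact arrival order."""
            return self.packets.popleft() if self.packets else None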

Strengths of FIFO

  • Simplicity: It requires minimal processing overhead and memory management, making it fast and easy to implement.
  • Order Preservation: For a single flow of data, FIFO guarantees that packets are transmitted in the same order they were received, so the router itself never reorders traffic.

The Critical Weakness: No Differentiation

The major drawback of FIFO is that it treats all traffic with absolute equality. It has no concept of priority or application requirements. This leads to a significant problem known as head-of-line blocking. If a large, non-critical packet (like a segment of a file download) arrives at the front of the queue, it will be transmitted first. Any small, latency-sensitive packets (like a VoIP packet or a DNS request) that arrive immediately after it must wait for the large packet to be fully transmitted. This single large packet can introduce significant delay for all subsequent, more important traffic, severely degrading the performance of real-time applications.

3. Priority Queueing (PQ): The VIP Treatment

Priority Queueing is one of the earliest QoS mechanisms designed to overcome the limitations of FIFO. It introduces the concept of differentiation by classifying traffic into multiple priority levels.

How Priority Queueing Works

With PQ, a router uses multiple queues for a single output interface, typically four: High, Medium, Normal, and Low. A classifier at the input assigns each arriving packet to one of these queues based on a predefined policy (e.g., based on protocol, port number, or DSCP marking). The scheduler then follows a very strict rule, illustrated by the sketch after this list:

  1. It always checks the High-priority queue first. If there are any packets in the High queue, it will transmit them until that queue is completely empty.
  2. Only when the High queue is empty will the scheduler even look at the Medium queue.
  3. Only when both the High and Medium queues are empty will it service the Normal queue, and so on.
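A minimal Python sketch of this scheduling rule, assuming the four queue names used above (a real implementation adds classification logic and buffer limits):

    from collections import deque

    PRIORITY_ORDER = ("high", "medium", "normal", "low")

    class PriorityQueueing:
        def __init__(self):
            self.queues = {level: deque() for level in PRIORITY_ORDER}

        def enqueue(self, packet, level):
            """The classifier has already decided the packet's level."""
            self.queues[level].append(packet)

        def dequeue(self):
            """Strict rule: a lower queue is served only when every
            higher-priority queue is completely empty."""
            for level in PRIORITY_ORDER:
                if self.queues[level]:
                    return self.queues[level].popleft()
            return None

Note that dequeue re-checks the High queue on every call; that is exactly what makes starvation possible, since a steady stream of high-priority arrivals keeps the loop from ever reaching the lower queues.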

Strengths of Priority Queueing

PQ is an extremely effective tool for protecting a small amount of critical, low-latency traffic. By placing VoIP packets, for example, in the High-priority queue, a network administrator can ensure that this traffic will experience the absolute minimum possible delay, as it will always be sent ahead of any other traffic.

The Risk of Starvation

The absolute nature of PQ is also its greatest weakness. If the volume of high-priority traffic is high enough to constantly keep the High queue occupied, the scheduler will never get a chance to service the lower-priority queues. This phenomenon is called starvation. Packets in the lower-priority queues could be delayed indefinitely or eventually dropped due to buffer age-out without ever being transmitted. For this reason, PQ is generally recommended only when the amount of high-priority traffic is well-known and guaranteed to be a small percentage of the total link capacity.

4. Weighted Fair Queueing (WFQ): The Fair Share Model

Weighted Fair Queueing (WFQ) offers a more balanced and sophisticated approach than Priority Queueing. Instead of giving absolute preference to one type of traffic, WFQ aims to provide a "fair" allocation of bandwidth to different traffic flows or classes, while also intelligently giving preferential treatment to low-latency traffic.

How Weighted Fair Queueing Works

WFQ automatically classifies traffic into different "flows" (often based on source/destination IP addresses and ports) and assigns each flow to its own logical queue. The scheduler then serves these queues in an approximately round-robin fashion. The "fairness" and "weighting" come into play in two ways:

  • Fair Bandwidth Sharing: The scheduler attempts to ensure that each flow gets an equal share of the output bandwidth. If there are 10 active flows on a 10 Mbps link, each flow will be allocated approximately 1 Mbps. This prevents any single, aggressive flow (like a large FTP download) from starving other, less aggressive flows (like an interactive Telnet session).
  • Prioritizing Small Packets: WFQ is not purely round-robin. It is designed to prioritize packets with a lower "finish time": the time a packet would finish transmitting if all flows were served simultaneously in a perfect, bit-by-bit manner. Because smaller packets have earlier finish times, this mechanism naturally gives higher priority to interactive traffic with small packet sizes, which dramatically improves response times (see the sketch after this list).
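A simplified sketch of the finish-time idea, assuming equal flow weights by default and a virtual clock that simply advances to each served packet's finish time (real WFQ derives virtual time from the bit-by-bit fluid model):

    import heapq

    class WFQScheduler:
        """Serve packets in order of their computed virtual finish time."""

        def __init__(self):
            self.virtual_time = 0.0
            self.last_finish = {}   # flow_id -> finish time of its last packet
            self.heap = []          # (finish_time, tie_breaker, packet)
            self.counter = 0        # tie-breaker for equal finish times

        def enqueue(self, flow_id, packet, size_bits, weight=1.0):
            # A packet cannot start before the current virtual time, nor
            # before the previous packet of its own flow has finished.
            start = max(self.virtual_time, self.last_finish.get(flow_id, 0.0))
            finish = start + size_bits / weight   # small packets finish sooner
            self.last_finish[flow_id] = finish
            heapq.heappush(self.heap, (finish, self.counter, packet))
            self.counter += 1

        def dequeue(self):
            if not self.heap:
                return None
            finish, _, packet = heapq.heappop(self.heap)
            self.virtual_time = finish            # simplified clock update
            return packet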

Class-Based Weighted Fair Queueing (CBWFQ)

A powerful extension of WFQ is Class-Based WFQ (CBWFQ). In this model, the network administrator explicitly defines traffic classes based on various criteria and assigns each class to a specific queue. The administrator can then assign a weight or guaranteed minimum bandwidth to each class. For example, a policy might state:

  • The "Voice" class is guaranteed 2 Mbps.
  • The "Database" class is guaranteed 3 Mbps.
  • The "Web" class is guaranteed 4 Mbps.
  • All other traffic goes into a default class.

The scheduler then services the queues based on these weights, ensuring that each class receives at least its guaranteed bandwidth during times of congestion. If extra bandwidth is available, it can be shared among the classes.
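The arithmetic behind these guarantees can be sketched as follows, assuming a 10 Mbps link and proportional sharing of any leftover bandwidth (the default class is omitted for brevity; class names and demands are illustrative):

    def cbwfq_allocation(link_mbps, guarantees, demands):
        """Grant each class its guarantee (capped by what it actually wants),
        then split leftover bandwidth in proportion to the guarantees."""
        alloc = {c: min(g, demands[c]) for c, g in guarantees.items()}
        leftover = link_mbps - sum(alloc.values())
        hungry = {c for c in alloc if demands[c] > alloc[c]}
        if hungry and leftover > 0:
            total = sum(guarantees[c] for c in hungry)
            for c in hungry:
                extra = leftover * guarantees[c] / total
                alloc[c] += min(extra, demands[c] - alloc[c])
        return alloc

    # Voice wants only 1 Mbps; Database and Web want more than guaranteed.
    print(cbwfq_allocation(10,
                           {"voice": 2, "database": 3, "web": 4},
                           {"voice": 1, "database": 5, "web": 6}))
    # -> voice keeps 1 Mbps; database and web split the 2 Mbps of slack 3:4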

To combine the best of both worlds, CBWFQ is often used in conjunction with a single strict-priority queue. This is known as Low Latency Queueing (LLQ). The LLQ mechanism creates one strict priority queue for real-time traffic (like VoIP), which is always served first (as in PQ), and then the remaining bandwidth is fairly allocated among the other traffic classes using CBWFQ. This provides minimal latency for voice while preventing starvation for all other data.
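A compact sketch of an LLQ scheduler follows, using a packet-counted weighted round-robin as a crude stand-in for CBWFQ's byte-accurate scheduling (the class names and weights are illustrative):

    from collections import deque
    import itertools

    class LLQ:
        """One strict-priority queue in front of weighted class queues."""

        def __init__(self, weights):          # e.g. {"database": 3, "web": 4}
            self.priority = deque()           # real-time (VoIP) traffic
            self.classes = {c: deque() for c in weights}
            # Each class appears in the service cycle as often as its weight.
            pattern = [c for c, w in weights.items() for _ in range(w)]
            self.cycle = itertools.cycle(pattern)
            self.cycle_len = len(pattern)

        def dequeue(self):
            if self.priority:                 # voice always goes first
                return self.priority.popleft()
            for _ in range(self.cycle_len):   # one full weighted pass
                c = next(self.cycle)
                if self.classes[c]:
                    return self.classes[c].popleft()
            return None

A production LLQ additionally polices the strict-priority queue to a configured rate, which is what prevents the starvation problem that plain PQ suffers from.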

5. Active Queue Management (AQM)

The previously discussed mechanisms are "passive" in that they primarily deal with the order of packet transmission from non-empty queues. Active Queue Management (AQM) takes a more proactive approach, aiming to prevent severe congestion by signaling senders to slow down before queues become completely full.

The Problem: Tail Drop and Global Synchronization

The default behavior when a queue is full is tail drop. The router simply drops any and all newly arriving packets. This has a detrimental side effect called TCP global synchronization. When the queue fills, packets from many different TCP flows are dropped simultaneously. All of these TCP flows detect the loss at roughly the same time, all enter their congestion avoidance phase (cutting their windows in half), and all then try to ramp back up at the same time. This leads to inefficient cycles of network underutilization followed by waves of congestion.

Random Early Detection (RED)

The most famous AQM algorithm is Random Early Detection (RED). It avoids global synchronization by dropping packets proactively and randomly.

  • RED monitors a moving average of the queue depth rather than the instantaneous depth, which smooths out short-term bursts.
  • It uses two thresholds: a minimum threshold (min_th) and a maximum threshold (max_th).
  • When the average queue depth is below min_th, no packets are dropped.
  • When the average queue depth is between min_th and max_th, incoming packets are dropped with a probability that increases linearly from 0 to a pre-configured maximum (max_p) as the average approaches max_th.
  • When the average queue depth exceeds max_th, all incoming packets are dropped (the sketch below shows this logic).
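The drop decision itself is only a few lines. The sketch below omits refinements from the original RED design, such as spacing drops out by counting packets since the last drop; the averaging weight of 0.002 is a commonly cited default and is illustrative here:

    import random

    def update_avg(avg_depth, instant_depth, weight=0.002):
        """Exponentially weighted moving average of the queue depth,
        which is what smooths out short-term bursts."""
        return (1 - weight) * avg_depth + weight * instant_depth

    def red_should_drop(avg_depth, min_th, max_th, max_p):
        """Decide whether an arriving packet is dropped under RED."""
        if avg_depth < min_th:
            return False                      # no drops below min_th
        if avg_depth >= max_th:
            return True                       # drop everything above max_th
        # Probability rises linearly from 0 at min_th to max_p at max_th.
        p = max_p * (avg_depth - min_th) / (max_th - min_th)
        return random.random() < p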

By randomly dropping packets from different flows at different times, RED causes different TCP sessions to back off asynchronously, avoiding global synchronization and keeping the average queue depth lower and more stable. Weighted RED (WRED) is a popular variant that uses packet priorities (like DSCP values) to drop low-priority packets with a higher probability than high-priority packets.
