Quality of Service (QoS)

Quality of Service (QoS) is the ability of a network to provide better or special service to selected network traffic over various technologies.

1. The Internet's Default: The Best-Effort Model

The foundational design of the internet and most IP-based networks is based on a "best-effort" delivery model. This means the network does its absolute best to deliver every packet of data to its destination, but it makes no guarantees about when it will arrive, how long it will take, or if it will arrive at all. In this model, all packets are treated with equal priority, whether they are part of a critical video conference, a massive file download, or a simple email.

For many applications, this model works perfectly fine. For example, when you download a file, the TCP protocol ensures that even if some packets are lost or delayed, they will be retransmitted until the entire file is correctly assembled at the destination. The total time it takes is not as critical as the final integrity of the data. However, as networks have evolved to carry a diverse mix of traffic, the limitations of the best-effort model have become glaringly apparent. For real-time applications like Voice over IP (VoIP), online gaming, or video streaming, the timing and consistency of packet delivery are paramount. A delayed or lost packet in a phone call results in a glitch or dropped audio, and it cannot be simply retransmitted later.

This fundamental conflict between the needs of different applications and the network's one-size-fits-all approach is the problem that Quality of Service (QoS) was created to solve.

2. Defining Quality of Service (QoS)

Quality of Service (QoS) is the ability of a network to provide better or special service to selected network traffic over various technologies. The goal of QoS is to move away from the unpredictable best-effort model and toward a more controlled, predictable network environment. It achieves this by providing mechanisms to manage key network resources such as bandwidth, latency, and packet loss.

Implementing QoS allows a network administrator to define policies that treat different types of traffic differently. For example, they can ensure that a video conference gets the necessary bandwidth and low delay it needs to function smoothly, even if other users on the same network are engaged in heavy file downloading. QoS does not create new bandwidth; rather, it intelligently manages the existing bandwidth to meet the specific requirements of the applications running on the network.

3. The Core Metrics of Network Performance

QoS is fundamentally about managing four key parameters that define the performance of a network.

  • Bandwidth (Throughput)

    Bandwidth refers to the amount of data that can be transmitted over a network link in a given amount of time, typically measured in bits per second (bps). While the physical link has a maximum capacity, QoS mechanisms can be used to guarantee a minimum amount of bandwidth to a critical application or to limit the bandwidth consumed by a non-critical one.

  • Delay (Latency)

    Latency is the total time it takes for a packet to travel from its source to its destination. This total delay is the sum of several components, including:

    • Transmission Delay: The time required to push all of a packet's bits onto the link.
    • Propagation Delay: The time it takes for a bit to travel the physical distance of the link at the speed of light in that medium.
    • Processing Delay: Time taken by routers to process the packet's header.
    • Queueing Delay: The time a packet spends waiting in a buffer (queue) inside a router before it can be transmitted. This is the component most affected by network congestion and the one that QoS primarily aims to control. For interactive traffic such as voice, a common quality threshold is a one-way delay of less than 150 milliseconds (ms).
  • Jitter (Delay Variation)

    Jitter is the variation in the latency of packets. In an ideal network, packets would arrive at precisely regular intervals. In reality, due to varying levels of congestion in routers, the delay for each packet can be different. This variation is jitter. High jitter is extremely detrimental to real-time audio and video, as it causes the playback to sound choppy or the video to appear jerky. For many applications, low and consistent jitter is even more important than low latency. For video transmission, jitter should typically be less than 6.5 ms to ensure smooth playback.

  • Packet Loss (Loss)

    Packet loss occurs when one or more packets of data traveling across a computer network fail to reach their destination. This is most often caused by network congestion, where a router's input buffers overflow and it is forced to drop incoming packets. For applications using TCP (like file transfers), lost packets are detected and retransmitted, resulting in lower throughput but no data loss. For real-time applications using UDP (like VoIP), lost packets are usually gone for good, resulting in audible gaps or video artifacts.
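The delay components and the jitter metric described above can be sketched numerically. The functions below are a minimal illustration, not a standard implementation; all link parameters (packet size, link speed, distance, processing and queueing times) are made-up example values, and the jitter function uses a simple mean of consecutive delay differences.

```python
# Illustrative sketch: total one-way delay as the sum of the four components
# described in the text, plus jitter as the variation in per-packet delay.
# All numeric inputs below are hypothetical example values.

def transmission_delay(packet_bits: float, link_bps: float) -> float:
    """Time to push all of a packet's bits onto the link."""
    return packet_bits / link_bps

def propagation_delay(distance_m: float, speed_mps: float = 2e8) -> float:
    """Time for a bit to cross the medium (~2/3 the speed of light in fiber)."""
    return distance_m / speed_mps

def total_delay(packet_bits, link_bps, distance_m, processing_s, queueing_s):
    """Sum of transmission, propagation, processing, and queueing delay."""
    return (transmission_delay(packet_bits, link_bps)
            + propagation_delay(distance_m)
            + processing_s
            + queueing_s)

def jitter(delays):
    """Mean absolute difference between consecutive packet delays."""
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return sum(diffs) / len(diffs)

# Example: a 12,000-bit (1500-byte) packet over a 100 Mbps, 1,000 km link.
d = total_delay(12_000, 100e6, 1_000_000, processing_s=50e-6, queueing_s=2e-3)
```

Note that in this example the queueing delay (2 ms) dwarfs the transmission delay (0.12 ms), which is why QoS focuses on controlling time spent in router queues.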

4. Application Requirements and Traffic Types

Different applications have vastly different requirements for network performance. A key aspect of QoS is understanding and classifying this traffic.

  • Real-Time Traffic (e.g., VoIP, Video Conferencing): This traffic is extremely sensitive to latency and jitter but can tolerate a small amount of packet loss. It requires consistent, predictable delivery.
  • Streaming Traffic (e.g., Video on Demand): This traffic is highly sensitive to jitter and requires a guaranteed level of bandwidth. It is less sensitive to latency than VoIP, as a buffer on the client side can absorb some of the delay.
  • Critical Data (e.g., Medical Applications, Engineering): This traffic often requires lossless transmission (zero packet loss) and may need guaranteed bandwidth, but it might not be as sensitive to latency as real-time traffic.
  • Transactional Traffic (e.g., E-commerce, Database Queries): This type of traffic is bursty. Responsiveness (low latency) is key for a good user experience, but it does not typically require high sustained bandwidth.
  • Bulk Data Traffic (e.g., FTP, backups): This is often called "scavenger" traffic. Its primary requirement is high bandwidth to complete the transfer as quickly as possible. It is highly tolerant of latency and jitter because protocols like TCP manage reliability. QoS policies often assign this type of traffic the lowest priority.

Flow Characteristics

Traffic can also be classified by its flow characteristics:

  • CBR (Constant Bit Rate): Traffic that generates data at a steady, constant rate, like an uncompressed voice stream.
  • VBR (Variable Bit Rate): Traffic that generates data at a variable rate, like compressed video, which produces more data for complex scenes and less for simple ones.
  • Elastic Traffic: Traffic that can adapt its transmission rate based on network conditions, like a TCP file transfer.
  • Inelastic Traffic: Traffic that cannot easily adapt its rate and has strict performance requirements, like VoIP.

5. The QoS Toolkit: Mechanisms for Managing Traffic

Network devices like routers and switches use a suite of tools to implement QoS policies. The general process involves identifying traffic, classifying it into groups, and then applying specific treatments to each group.

  • Classification and Marking: The first step is to identify different types of traffic. This can be done by looking at various fields in the packet header, such as source/destination IP addresses, port numbers, or the protocol type. Once identified, the traffic is "marked" by setting a specific value in the packet's header. In IP networks, this is done using the Differentiated Services Code Point (DSCP) field in the IP header.
  • Queueing and Scheduling: When multiple packets arrive at a router's interface destined for the same output link, they are placed in a queue. A scheduler then decides the order in which packets are transmitted. This is the primary mechanism for prioritization. Instead of a single "first-in, first-out" (FIFO) queue, routers can use multiple queues with different scheduling algorithms like Priority Queueing (which always services the high-priority queue first) or Weighted Fair Queueing (which allocates a certain percentage of the bandwidth to each queue).
  • Traffic Shaping and Policing: These are two methods for controlling the rate of traffic.
    • Policing: Enforces a strict bandwidth limit. Packets that exceed the configured rate are either dropped or re-marked to a lower priority. It acts like a gatekeeper that discards any excess.
    • Shaping: Also enforces a bandwidth limit, but instead of dropping excess packets, it delays them in a buffer and sends them out later when bandwidth is available. This has the effect of smoothing out traffic bursts.
  • Congestion Avoidance: These are proactive mechanisms that try to prevent congestion before it becomes severe. For example, a technique called Random Early Detection (RED) monitors the average queue depth in a router. As the queue starts to fill up (indicating impending congestion), RED begins to randomly drop a small number of packets from different flows. This signals to TCP senders that they should slow down their transmission rates, thus helping to avoid the complete buffer overflow that would lead to mass packet drops.
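The classification-and-marking step can be sketched as a function that inspects header fields and selects a DSCP value. The port-to-class mapping below is a made-up example policy, not a standard; the DSCP code points themselves (EF = 46 for voice, AF41 = 34, CS1 = 8) are real standardized values, and the DSCP occupies the upper six bits of the IP ToS/Traffic Class byte.

```python
# Sketch of classification and marking. The port ranges are a hypothetical
# example policy; the DSCP code points (EF=46, AF41=34, CS1=8) are standard.

DSCP = {"voice": 46, "video": 34, "bulk": 8, "default": 0}  # EF, AF41, CS1, BE

def classify(protocol: str, dst_port: int) -> str:
    """Map a packet's protocol and destination port to a traffic class."""
    if protocol == "udp" and 16384 <= dst_port < 32768:  # example RTP range
        return "voice"
    if dst_port == 554:          # RTSP streaming (example)
        return "video"
    if dst_port in (20, 21):     # FTP data/control
        return "bulk"
    return "default"

def tos_byte(traffic_class: str) -> int:
    """DSCP occupies the upper 6 bits of the ToS/Traffic Class byte."""
    return DSCP[traffic_class] << 2

tos_byte(classify("udp", 20000))  # EF: 46 << 2 = 184 (0xB8)
```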
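The two scheduling disciplines mentioned above can be sketched as simple generators: strict priority queueing always drains the highest-priority queue first, while weighted round-robin (a common practical approximation of weighted fair queueing) serves each queue in proportion to its weight. The packet names and weights are illustrative.

```python
# Sketch of two scheduling disciplines: strict priority queueing and a
# weighted round-robin approximation of weighted fair queueing.
from collections import deque

def priority_schedule(queues):
    """queues: list of deques ordered high -> low priority."""
    while any(queues):
        for q in queues:
            if q:
                yield q.popleft()
                break  # always restart from the highest-priority queue

def weighted_round_robin(queues, weights):
    """Serve up to `weight` packets from each queue per round."""
    while any(queues):
        for q, w in zip(queues, weights):
            for _ in range(w):
                if not q:
                    break
                yield q.popleft()

high = deque(["v1", "v2"]); low = deque(["d1", "d2", "d3"])
list(priority_schedule([high, low]))  # v1, v2 are sent before any d*
```

A known drawback of strict priority queueing, which weighted schemes avoid, is that a saturated high-priority queue can starve all lower queues indefinitely.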
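Both policing and shaping are commonly built on a token bucket. The sketch below shows the policer's accept/drop decision; a shaper would instead buffer a non-conforming packet and release it once enough tokens have accumulated. The rate and burst parameters are illustrative values, not defaults from any device.

```python
# Sketch of a token bucket used as a policer: tokens refill at a fixed rate,
# a packet conforms if enough tokens remain, and excess traffic is dropped.
# A shaper would queue the excess instead. Parameters are illustrative.
class TokenBucket:
    def __init__(self, rate_bps: float, burst_bits: float):
        self.rate = rate_bps          # token refill rate (bits per second)
        self.capacity = burst_bits    # maximum burst size
        self.tokens = burst_bits      # bucket starts full
        self.last = 0.0               # timestamp of the previous check

    def conforms(self, packet_bits: float, now: float) -> bool:
        """Refill tokens for elapsed time, then accept or reject the packet."""
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True   # forward the packet
        return False      # policer drops it (a shaper would buffer it)

tb = TokenBucket(rate_bps=1_000_000, burst_bits=12_000)
tb.conforms(12_000, now=0.0)    # True: consumes the whole burst allowance
tb.conforms(12_000, now=0.001)  # False: only ~1,000 tokens refilled in 1 ms
```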
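The RED drop decision described above can be sketched as a piecewise function of the average queue depth: no drops below a minimum threshold, certain drop above a maximum threshold, and a linear ramp in between. The thresholds and maximum probability here are illustrative, not recommended settings.

```python
# Sketch of the RED drop-probability curve: 0 below min_th, ramping linearly
# up to max_p at max_th, and 1.0 (forced drop) beyond it. Values illustrative.
def red_drop_probability(avg_queue: float, min_th: float = 20.0,
                         max_th: float = 80.0, max_p: float = 0.1) -> float:
    if avg_queue < min_th:
        return 0.0            # queue comfortably short: never drop
    if avg_queue >= max_th:
        return 1.0            # queue effectively full: drop everything
    # In between, probability grows linearly with average queue depth.
    return max_p * (avg_queue - min_th) / (max_th - min_th)

red_drop_probability(10)   # 0.0
red_drop_probability(50)   # 0.05 (halfway up the ramp)
red_drop_probability(100)  # 1.0
```

Using the *average* queue depth rather than the instantaneous depth lets short bursts pass untouched while sustained congestion steadily raises the drop rate, signaling TCP senders to slow down before the buffer overflows.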