SDH Protection Schemes
1+1, MSP, SNCP, and ring-based protection approaches.
Introduction: The Network's Safety Net
A telecommunications network is like a massive highway system for data. But what happens if a bridge collapses or a road is blocked by an accident? On a real-world highway, traffic grinds to a halt. In a well-designed telecommunications network, traffic is automatically and almost instantaneously rerouted around the failure. This ability to survive faults is called resiliency, and it is one of the most critical aspects of network design.
One of the primary reasons SDH/SONET became the backbone of global communication for decades was its powerful, standardized, and extremely fast protection switching mechanisms. These mechanisms, built directly into the technology's DNA, allow networks to "self-heal" in milliseconds, often without the end-user ever noticing a problem. This page explores the most important protection schemes defined in SDH/SONET.
Fundamental Concepts of Protection
All protection schemes are built on the core principle of redundancy: having a backup resource ready to take over in case the primary one fails.
- Working vs. Protection: In any scheme, the primary path or channel used for traffic under normal conditions is called the working entity. The backup path or channel is called the protection entity. The key requirement is that the working and protection paths must be physically diverse (geographically separated) to be effective against events like a cable cut.
- Automatic Protection Switching (APS): This is the protocol that governs the detection of a failure and the coordination of the switchover from the working to the protection path. In SDH/SONET, APS communication is carried in dedicated bytes within the Multiplex Section Overhead, called the Line Overhead in SONET (specifically, the K1 and K2 bytes). These bytes allow the devices at both ends of a link to signal failures and coordinate a switch in under 50 milliseconds.
- Line vs. Path Protection: Protection can be applied at different layers. Line protection (like MSP) protects the entire high-speed line signal (the whole STM-N). Path protection (like SNCP) protects a specific, individual payload (a Virtual Container) traveling within the line signal.
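To make the role of the K1 and K2 bytes more concrete, the sketch below splits them into fields. It assumes the linear MSP layout (K1: a 4-bit request code followed by a 4-bit channel number; K2: a 4-bit bridged-channel number, one architecture bit, and three status bits that are ignored here). The function name, the dataclass, and the abbreviated request table are illustrative assumptions; the authoritative codes are in ITU-T G.841.

```python
from dataclasses import dataclass

# Abbreviated request-code table, assumed from the linear MSP convention.
# Check ITU-T G.841 for the complete, authoritative list.
REQUESTS = {
    0b1111: "Lockout of Protection",
    0b1110: "Forced Switch",
    0b1100: "Signal Fail (low priority)",
    0b1010: "Signal Degrade (low priority)",
    0b0110: "Wait-to-Restore",
    0b0000: "No Request",
}


@dataclass
class ApsFields:
    request: str            # decoded from K1 bits 1-4 (upper nibble)
    requested_channel: int  # K1 bits 5-8 (lower nibble)
    bridged_channel: int    # K2 bits 1-4 (upper nibble)
    architecture: str       # K2 bit 5: 0 means 1+1, 1 means 1:N


def decode_k1_k2(k1: int, k2: int) -> ApsFields:
    """Split the two APS bytes into their fields (hypothetical helper)."""
    request_code = (k1 >> 4) & 0x0F
    return ApsFields(
        request=REQUESTS.get(request_code, f"code {request_code:04b}"),
        requested_channel=k1 & 0x0F,
        bridged_channel=(k2 >> 4) & 0x0F,
        architecture="1:N" if (k2 >> 3) & 1 else "1+1",
    )
```

For example, `decode_k1_k2(0xC1, 0x18)` would describe a signal-fail request for channel 1 on a 1:N system whose far end has bridged channel 1.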
Linear Protection Schemes (Point-to-Point)
These schemes are used to protect simple point-to-point links, like a single fiber optic connection between two cities.
1+1 (One-Plus-One) Architecture: Dedicated Protection
This is the simplest and fastest form of protection. The name describes its function: for one working path, there is one dedicated protection path.
- Principle: "Bridge and Select." The sending node (head-end) permanently transmits the exact same signal on both the working and protection fibers simultaneously. This is known as a permanent bridge.
- Operation: The receiving node (tail-end) continuously monitors the signal quality on both incoming fibers. It selects the higher-quality signal to pass on to the rest of the network.
- Switching: If the working fiber is cut, the receiver simply continues using the signal from the protection fiber, which it was already receiving. The switch is virtually instantaneous (typically well under 50 ms) because there is no signaling or coordination required (the backup signal is already there).
- Advantages: Extremely fast and simple. Because the backup copy is always live, recovery depends only on the receiver's own decision.
- Disadvantages: Highly inefficient. 50% of the network's line capacity is permanently reserved for backup and cannot be used for any other traffic. It's like paying for two first-class plane tickets but only using one seat.
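The tail-end selector can be sketched in a few lines. This is a minimal illustration rather than a real implementation: the `Feed` fields, the `SD_THRESHOLD` value, and the non-revertive behaviour (stay on the current feed unless it is bad and the other is good) are all assumptions made for the example.

```python
from dataclasses import dataclass

SD_THRESHOLD = 1e-6  # assumed signal-degrade bit error rate threshold


@dataclass
class Feed:
    signal_fail: bool      # e.g. loss of signal detected on this fiber
    bit_error_rate: float  # measured quality of the incoming copy


def is_bad(feed: Feed) -> bool:
    """A feed is unusable if it has failed or degraded past the threshold."""
    return feed.signal_fail or feed.bit_error_rate > SD_THRESHOLD


def select_feed(working: Feed, protection: Feed, current: str = "working") -> str:
    """Return which incoming copy the tail-end should pass on.

    No signaling is involved: both copies are already arriving, so the
    decision is purely local to the receiver.
    """
    feeds = {"working": working, "protection": protection}
    other = "protection" if current == "working" else "working"
    if is_bad(feeds[current]) and not is_bad(feeds[other]):
        return other
    return current
```

With a cut working fiber, `select_feed(Feed(True, 1.0), Feed(False, 1e-12))` returns `"protection"`; with two healthy feeds the selector stays where it is.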
1:1 (One-to-One) Architecture
This architecture improves upon the efficiency of 1+1 by allowing the protection path to be used for other traffic during normal operation.
- Principle: The working path carries high-priority traffic. The protection path is not idle; it can carry low-priority, preemptible traffic.
- Operation: The head-end and tail-end nodes constantly communicate via the K1/K2 APS bytes. Under normal conditions, high-priority traffic flows on the working path while the protection path carries the low-priority traffic.
- Switching: When the tail-end detects a failure on the working path, it sends a switch request to the head-end using the K1/K2 bytes. Both nodes then coordinate to switch the high-priority traffic onto the protection path, dropping (preempting) the low-priority traffic that was previously using it.
- Advantages: More efficient use of bandwidth compared to 1+1, as the protection fiber is not wasted.
- Disadvantages: The switchover is slightly slower due to the required signaling, and the mechanism is more complex.
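The preemption step can be illustrated with a toy model. The class and attribute names below are hypothetical, and the K1/K2 exchange is reduced to a log entry; the point is only that the low-priority traffic is dropped before the high-priority traffic takes over the protection path.

```python
class OneForOneLink:
    """Toy model of a 1:1 protected link (illustrative names only)."""

    def __init__(self):
        self.working = "high-priority"
        self.protection = "low-priority"  # preemptible extra traffic
        self.log = []

    def working_failed(self):
        """Handle a working-path failure and return the traffic that was dropped."""
        self.log.append("failure detected on working path")
        self.log.append("switch request exchanged in K1/K2")
        dropped = self.protection
        self.log.append(f"{dropped} traffic preempted")
        # The high-priority traffic takes over the protection path.
        self.protection = self.working
        self.working = None
        self.log.append("high-priority traffic now on protection path")
        return dropped
```

Calling `working_failed()` on a fresh link returns `"low-priority"`, and the log records the extra signaling step that a 1+1 selector does not need.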
1:N (One-to-N) Architecture
This is the most resource-efficient linear scheme, where one shared protection channel is used to protect N working channels.
- Analogy: This is like having a single spare tire that can be used for any of the four wheels on a car.
- Operation: It requires more complex switching hardware at both ends. When any one of the N working channels fails, the APS protocol coordinates switching its specific traffic to the single shared protection channel.
- Limitation: It can only protect against a single failure at a time within the group of N channels. If two working channels fail simultaneously, only one can be protected.
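The single-failure limitation is easy to see in a small sketch. The class below is hypothetical and simplified: it grants the shared protection channel to the first failed working channel, whereas a real APS controller would arbitrate by request priority.

```python
class OneForNGroup:
    """Toy model of N working channels sharing one protection channel."""

    def __init__(self, n: int):
        self.n = n
        self.failed = set()
        self.protected = None  # working channel currently using protection

    def fail(self, channel: int) -> bool:
        """Record a failure; return True if the channel got the protection channel."""
        if not 1 <= channel <= self.n:
            raise ValueError(f"channel must be in 1..{self.n}")
        self.failed.add(channel)
        if self.protected is None:
            self.protected = channel
            return True
        # Protection channel already in use (simplification: first come, first served).
        return False
```

In a group of four, `fail(2)` is protected, but a second simultaneous `fail(3)` returns `False` and channel 3 stays down until the protection channel is released.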
Ring Protection: Self-Healing Networks
While linear protection is useful, the most common and robust architecture for core networks is the ring. A ring topology naturally provides two physically diverse paths between any two nodes on the ring, making it ideal for protection.
Multiplex Section-Shared Protection Ring (MS-SPRing)
Known in SONET as Bidirectional Line Switched Ring (BLSR), this is the most powerful and widely deployed ring protection mechanism.
In normal operation, each connection takes the shortest path around the ring on the working channels, and the protection channels stay idle:
- Node A sends traffic to Node C clockwise using the working channels.
- Protection capacity remains idle but synchronised around the ring.
- No bandwidth is consumed on the reserved path until a failure occurs.
Architecture (2-Fiber Example)
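A 2-fiber ring uses two optical fibers between each pair of adjacent nodes, one for each direction of transmission. On each fiber, the total capacity (e.g., all timeslots in SDH) is divided in half: 50% is for working traffic, and 50% is reserved for protection traffic. In an STM-16 ring, for example, channels 1-8 on each fiber carry working traffic and channels 9-16 are reserved for protection. Traffic normally travels in both directions around the ring, each direction using the working half of its own fiber.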
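The arithmetic behind the split is simple enough to write down. The helper below is a hypothetical illustration that assumes an STM-N fiber carries N AU-4 channels and that the ring needs an even N so the channels can be halved.

```python
def two_fiber_ring_split(stm_level: int) -> dict:
    """Split one fiber's AU-4 channels into working and protection halves."""
    if stm_level < 2 or stm_level % 2:
        raise ValueError("a 2-fiber shared ring needs an even number of channels")
    half = stm_level // 2
    return {
        "working_per_fiber": half,
        "protection_per_fiber": half,
        # Two fibers per span (one per direction), each with `half` working channels.
        "working_per_span": 2 * half,
    }
```

For an STM-16 ring this gives 8 working and 8 protection channels on every fiber.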
The Loopback Mechanism in Action
The genius of MS-SPRing is the "loopback" switch, which protects against a complete cable cut:
- Failure: A cable is cut between Node B and Node C. Both fibers are severed.
- Detection: Node B detects a Loss of Signal (LOS) on its link towards C. Node C detects a LOS from B.
- Signaling: Nodes B and C immediately exchange switch requests in the K1/K2 bytes, sending them both ways around the ring so that every other node learns of the failure.
- Loopback Switch:
  - Node B, the node just before the break, performs a "loopback". The high-priority working traffic that it would have sent towards C is now looped back onto the protection capacity of the fiber going away from the break (back towards A).
  - Node C, the node just after the break, performs the same action in reverse. Working traffic arriving from D meant for B is looped back onto the protection capacity going back towards D.
- Result: The traffic now travels the "long way around" the ring using the reserved protection bandwidth, completely bypassing the failed segment. The connection is restored, typically in less than 50 milliseconds.
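The detour taken after a loopback can be traced with a short sketch. It models only traffic that normally flows clockwise, treats the ring as a list of node names in clockwise order, and uses a hypothetical function name; real equipment works on channels and K-byte states rather than node lists.

```python
def loopback_path(ring, src, dst, cut):
    """Return the node sequence from src to dst after a span failure.

    ring: node names in clockwise order.
    cut:  (u, v), two adjacent nodes with v immediately clockwise of u.
    """
    n = len(ring)
    i_src, i_dst = ring.index(src), ring.index(dst)
    # Normal clockwise working path from src to dst.
    hops = (i_dst - i_src) % n
    working = [ring[(i_src + k) % n] for k in range(hops + 1)]

    u, v = cut
    i_u = ring.index(u)
    if ring[(i_u + 1) % n] != v:
        raise ValueError("cut must name adjacent nodes in clockwise order")
    if (u, v) not in zip(working, working[1:]):
        return working  # the failed span is not on this path

    k = working.index(u)
    head = working[: k + 1]  # src up to the node before the break
    # Loop back at u and travel counter-clockwise on protection capacity to v.
    detour = [ring[(i_u - j) % n] for j in range(1, n)]
    tail = working[k + 2 :]  # continue clockwise from v towards dst
    return head + detour + tail
```

On a four-node ring `["A", "B", "C", "D"]` with the B-C span cut, traffic from A to C follows `["A", "B", "A", "D", "C"]`: out to B, then the long way around.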
Path Protection: SNCP / UPSR
An alternative to protecting the entire line is to protect individual payloads or paths. In SDH, this is known as Sub-Network Connection Protection (SNCP), and its SONET ring counterpart is called a Unidirectional Path-Switched Ring (UPSR).
- Principle: SNCP operates on the same "bridge and select" principle as 1+1 linear protection, but applied to a specific end-to-end path (a Virtual Container).
- Operation in a Ring: At the source node (e.g., A), a specific VC payload is permanently bridged and sent in both directions around the ring simultaneously (clockwise and counter-clockwise). The destination node (e.g., C) receives two identical copies of the VC from both directions. It continuously monitors them and selects the better one for its output.
- Switching: If a failure occurs on one side of the ring, the destination node simply continues using the copy of the VC that is arriving from the other, unaffected side. The switch is extremely fast.
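- Multiplex-section protection vs. SNCP: Multiplex-section schemes (MSP, MS-SPRing) protect the entire line, so all traffic on the failed section is switched together; a shared scheme such as MS-SPRing is also more efficient with bandwidth. SNCP is less efficient (like 1+1, it dedicates capacity to every protected path) but allows per-service protection: each path is switched independently, so a fault affecting one path leaves the others untouched, and different paths can have different levels of protection.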
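The ring behaviour of SNCP can be checked with a small sketch that works out which of the two copies still reaches the destination after a span failure. The function name and the list-of-nodes ring model are assumptions for illustration, and the ring is assumed to have at least three nodes.

```python
def surviving_copies(ring, src, dst, cut=None):
    """Return the directions whose copy of the VC still reaches dst.

    ring: node names in clockwise order (at least three nodes).
    cut:  pair of adjacent nodes whose span has failed, or None.
    """
    n = len(ring)
    i_src, i_dst = ring.index(src), ring.index(dst)
    paths = {
        "clockwise": [
            ring[(i_src + k) % n] for k in range((i_dst - i_src) % n + 1)
        ],
        "counter-clockwise": [
            ring[(i_src - k) % n] for k in range((i_src - i_dst) % n + 1)
        ],
    }

    def crosses(path):
        # True if the path uses the failed span, in either direction.
        return cut is not None and any(
            {a, b} == set(cut) for a, b in zip(path, path[1:])
        )

    return {name for name, path in paths.items() if not crosses(path)}
```

With `ring = ["A", "B", "C", "D"]`, a cut between B and C leaves only the counter-clockwise copy for a path from A to C, and the destination selector simply keeps using it.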