Network Working Group                                       J. Lang, Ed.
Request for Comments: 4426                           B. Rajagopalan, Ed.
Category: Standards Track                          D. Papadimitriou, Ed.
                                                              March 2006

      Generalized Multi-Protocol Label Switching (GMPLS) Recovery
                       Functional Specification

Status of This Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2006).
Abstract

This document presents a functional description of the protocol extensions needed to support Generalized Multi-Protocol Label Switching (GMPLS)-based recovery (i.e., protection and restoration). Protocol specific formats and mechanisms will be described in companion documents.

Table of Contents

   1. Introduction
      1.1. Conventions Used in This Document
   2. Span Protection
      2.1. Unidirectional 1+1 Dedicated Protection
      2.2. Bi-directional 1+1 Dedicated Protection
      2.3. Dedicated 1:1 Protection with Extra Traffic
      2.4. Shared M:N Protection
      2.5. Messages
           2.5.1. Failure Indication Message
           2.5.2. Switchover Request Message
           2.5.3. Switchover Response Message
      2.6. Preventing Unintended Connections
   3. End-to-End (Path) Protection and Restoration
      3.1. Unidirectional 1+1 Protection
      3.2. Bi-directional 1+1 Protection
           3.2.1. Identifiers
           3.2.2. Nodal Information
           3.2.3. End-to-End Failure Indication Message
           3.2.4. End-to-End Failure Acknowledgement Message
           3.2.5. End-to-End Switchover Request Message
           3.2.6. End-to-End Switchover Response Message
      3.3. Shared Mesh Restoration
           3.3.1. End-to-End Failure Indication and
                  Acknowledgement Message
           3.3.2. End-to-End Switchover Request Message
           3.3.3. End-to-End Switchover Response Message
   4. Reversion and Other Administrative Procedures
   5. Discussion
      5.1. LSP Priorities During Protection
   6. Security Considerations
   7. Contributors
   8. References
      8.1. Normative References
      8.2. Informative References

[RFC4427]). A label-switched path (LSP) may be subject to local (span), segment, and/or end-to-end recovery. Local span protection refers to the protection of the link (and hence all the LSPs marked as required for span protection and routed over the link) between two neighboring switches. Segment protection refers to the recovery of an LSP segment (i.e., an SNC in the ITU-T terminology) between two nodes, i.e., the boundary nodes of the segment. End-to-end protection refers to the protection of an entire LSP from the ingress to the egress port. The end-to-end recovery models discussed in this document apply to segment protection where the source and destination refer to the protected segment rather than the entire LSP.
Multiple recovery levels may be used concurrently by a single LSP for added resiliency; however, the interaction between levels affects any one direction of the LSP results in both directions of the LSP being switched to a new span, segment, or end-to-end path.
Unless otherwise stated, all references to "link" in this document indicate a bi-directional link (which may be realized as a pair of unidirectional links).

Consider the control plane message flow during the establishment of an LSP. This message flow proceeds from an initiating (or source) node to a terminating (or destination) node, via a sequence of intermediate nodes. A node along the LSP is said to be "upstream" from another node if the former occurs first in the sequence. The latter node is said to be "downstream" from the former node. That is, an "upstream" node is closer to the initiating node than a node further "downstream". Unless otherwise stated, all references to "upstream" and "downstream" are in terms of the control plane message flow.

The flow of the data traffic is defined from ingress (source node) to egress (destination node). Note that for bi-directional LSPs, there are two different data plane flows, one for each direction of the LSP.

This document presents a protocol functional description to support Generalized Multi-Protocol Label Switching (GMPLS)-based recovery (i.e., protection and restoration). Protocol-specific formats, encoding, and mechanisms will be described in companion documents.

[RFC2119]. In addition, the reader is assumed to be familiar with the terminology used in [RFC3945], [RFC3471] and referenced as well as [RFC4427].

Section 2.1), each node A and B acts autonomously to select the signal from the working link i or the protection link j. Under bi-directional 1+1 span protection (Section 2.2) the two nodes A and B coordinate the selection function such that they select the signal from the same link, i or j.
Under the second model, a set of N working links are protected by a set of M protection links, usually with M <= N. A failure in any of the N working links results in traffic being switched to one of the M protection links that is available. This is typically a three-step process: first the data plane failure is detected at the egress node and reported (notification), then a protection link is selected, and finally, the LSPs on the failed link are moved to the protection link. If reversion is supported, a fourth step is included, i.e., return of the traffic to the working link (when the working link has recovered from the failure).

In Section 2.3, 1:1 span protection is described. In Section 2.4, M:N span protection is described, where M <= N.

[RFC4205]. Encoding of this information in OSPF is specified in [RFC4203].

o  Signaling: The Link Protection object/TLV SHOULD be used to request "Dedicated 1+1" link protection for that LSP. This object/TLV is defined in [RFC3471]. If the Link Protection object/TLV is not used, link selection is a matter of local policy. No additional signaling is required when a fail-over occurs.
o  Link management: Both nodes MUST have a consistent view of the link protection association for the spans. This can be done using the Link Management Protocol (LMP) [RFC4204], or if LMP is not used, this MUST be configured manually.

[RFC4204]. Note that GMPLS-based mechanisms MAY not be necessary when the underlying span (transport) technology provides such a mechanism.
Note: Although this mechanism implies more traffic dropped than necessary, it is preferred over possible misconnections during the recovery process.

From the description above, it is clear that 1:1 span protection may require up to three signaling messages for each failed span: a failure indication message, an LSP Switchover Request message, and an LSP Switchover Response message. Furthermore, it may be possible to switch multiple LSPs from the working span to the protection span simultaneously.

The following functionality is required for dedicated 1:1 span protection:

o  Pre-emption MUST be supported to accommodate Extra Traffic.

o  Routing: A single TE link encompassing both working and protection links is announced with a Link Protection Type "Dedicated 1:1". If Extra Traffic is supported over the protection link, then the bandwidth parameters for the protection link MUST also be announced. The differentiation between bandwidth for working and protect links is made using priority mechanisms. In other words, the network MUST be configured such that bandwidth at priority X or lower is considered Extra Traffic.

   If there is a failure on the working link, then the normal traffic is switched to the protection link, pre-empting Extra Traffic if necessary. The bandwidth for the protection link MUST be adjusted accordingly.

o  Signaling: To establish an LSP on the working link, the Link Protection object/TLV indicating "Dedicated 1:1" SHOULD be included in the signaling request message for that LSP. To establish an LSP on the protection link, the appropriate priority (indicating Extra Traffic) SHOULD be used for that LSP. These objects/TLVs are defined in [RFC3471]. If the Link Protection object/TLV is not used, link selection is a matter of local policy.

o  Link management: Both nodes MUST have a consistent view of the link protection association for the spans. This can be done using LMP [RFC4204] or via manual configuration.
o When a link failure is detected at the slave, a failure indication message MUST be sent to the master informing the node of the link failure.
failure to simultaneously affect both directions of the bi-directional link. In this case, A and B will concurrently detect failures, in the B-to-A direction and in the A-to-B direction, respectively.

The basic steps in M:N protection (ignoring reversion) are as follows:

1. If the master detects a failure of a working link, it autonomously invokes a process to allocate a protection link to the affected traffic.

2. If the slave detects a failure of a working link, it MUST inform the master of the failure using a failure indication message. The master then invokes the same procedure as above to allocate a protection link. (It is possible that the master has itself detected the same failure, for example, a failure simultaneously affecting both directions of a link.)

3. Once the master has determined the identity of the protection link, it indicates this to the slave and requests the switchover of the traffic (using a "Switchover Request" message). Prior to this, if the protection link is carrying Extra Traffic, the master stops using the link for this traffic (i.e., the traffic is dropped by the master and not forwarded into or out of the protection link).

4. The slave sends a "Switchover Response" message back to the master. Prior to this, if the selected protection link is carrying traffic that could be pre-empted, the slave stops using the link for this traffic (i.e., the traffic is dropped by the slave and not forwarded into or out of the protection link). It then starts sending the normal traffic on the selected protection link.

5. When the master receives the Switchover Response, it starts sending and receiving the traffic that was previously carried on the now-failed link over the new link.

Note: Although this mechanism implies more traffic dropped than necessary, it is preferred over possible misconnections during the recovery process.
From the description above, it is clear that M:N span restoration (involving LSP local recovery) MAY require up to three messages for each working link being switched: a failure indication message, a Switchover Request message, and a Switchover Response message.
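The three-message exchange for M:N span protection can be sketched as follows. This is a minimal illustration under stated assumptions, not a protocol implementation: the class names `Master` and `Slave`, the tuple message shapes, and the `recover` helper are hypothetical, and transport, timers, and reliability are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Hypothetical master node owning the pool of M protection links."""
    free_protection: list                             # idle protection link ids
    extra_traffic: set = field(default_factory=set)   # protection links carrying Extra Traffic
    assigned: dict = field(default_factory=dict)      # failed working link -> protection link
    active: set = field(default_factory=set)          # working links whose traffic has moved

    def on_failure(self, working):
        """Steps 1-3: allocate a protection link and build a Switchover Request."""
        if working in self.assigned or not self.free_protection:
            return None                               # duplicate report, or pool exhausted
        protection = self.free_protection.pop(0)
        self.extra_traffic.discard(protection)        # stop carrying Extra Traffic first
        self.assigned[working] = protection
        return ("SwitchoverRequest", working, protection)

    def on_switchover_response(self, response):
        """Step 5: start using the protection link for the normal traffic."""
        _, working, _protection = response
        self.active.add(working)


class Slave:
    """Hypothetical slave node at the other end of the span."""

    def __init__(self):
        self.sending_on = {}                          # working link -> protection link

    def detect_failure(self, working):
        """Step 2: report a locally detected failure to the master."""
        return ("FailureIndication", working)

    def on_switchover_request(self, request):
        """Step 4: move normal traffic to the selected link, then confirm."""
        _, working, protection = request
        self.sending_on[working] = protection
        return ("SwitchoverResponse", working, protection)


def recover(master, slave, working):
    """Run the full exchange for one failed working link detected by the slave."""
    indication = slave.detect_failure(working)
    request = master.on_failure(indication[1])
    if request is None:
        return None
    response = slave.on_switchover_request(request)
    master.on_switchover_response(response)
    return master.assigned[working]
```

With one protection link shared by two working links (M = 1, N = 2), `recover` returns the protection link for the first failure and `None` for a second concurrent failure, since the pool is exhausted.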
The following functionality is required for M:N span restoration:

o  Pre-emption MUST be supported to accommodate Extra Traffic.

o  Routing: A single TE link encompassing both sets of working and protect links should be announced with a Link Protection Type "Shared M:N". If Extra Traffic is supported over a set of the protection links, then the bandwidth parameters for the set of protection links MUST also be announced. The differentiation between bandwidth for working and protect links is made using priority mechanisms.

   If there is a failure on a working link, then the affected LSP(s) MUST be switched to a protection link, pre-empting Extra Traffic if necessary. The bandwidth for the protection link MUST be adjusted accordingly.

o  Signaling: To establish an LSP on the working link, the Link Protection object/TLV indicating "Shared M:N" SHOULD be included in the signaling request message for that LSP. To establish an LSP on the protection link, the appropriate priority (indicating Extra Traffic) SHOULD be used. These objects/TLVs are defined in [RFC3471]. If the Link Protection object/TLV is not used, link selection is a matter of local policy.

o  For link management, both nodes MUST have a consistent view of the link protection association for the links. This can be done using LMP [RFC4204] or via manual configuration.
The number of links included in the message depends on the number of failures detected within a window of time by the sending node. A node MAY choose to send separate failure indication messages in the interest of completing the recovery for a given link within an implementation-dependent time constraint.

[RFC4201] is used where the working and protect links are mapped to component links, and the labels are the same on the working and protection links, it MAY be possible to change the component links without needing to re-signal each individual LSP. Optionally, the labels MAY need to be explicitly coordinated between the two nodes. In this case, the Switchover Request message SHOULD carry the new label mappings.

The master may not be able to find protection links to accommodate all failed working links. Thus, if this message is generated in response to a Failure Indication message from the slave, then the set of failed links in the message MAY be a subset of the links received in the Failure Indication message. Depending on time constraints, the master may switch the normal traffic from the set of failed links in smaller batches. Thus, a single failure indication message MAY result in the master sending more than one Switchover Request message to the same slave node.
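The subset and batching behaviour described above can be illustrated with a short sketch. The function name `build_switchover_requests`, its parameters, and the first-come pairing of failed links with free protection links are assumptions made for illustration; a real master would apply its own selection policy.

```python
def build_switchover_requests(failed_links, free_protection_links, batch_size):
    """Pair failed working links with free protection links and split the
    pairs into batches, one batch per Switchover Request message.

    Failed links for which no protection link is available are left out,
    so the result may cover only a subset of the reported failures.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # zip() stops at the shorter input, which drops unprotectable links.
    pairs = list(zip(failed_links, free_protection_links))
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]
```

For example, three failed links reported in one Failure Indication message, two free protection links, and a batch size of one yield two Switchover Request messages, and the third failed link remains unprotected.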
master and slave notify the user (operator) of the failed switchover. A notification of the failure MAY also be used as a trigger in an end-to-end recovery.
Note that this requires coordination between the end nodes to switch to the protection path.

The basic steps in bi-directional 1+1 path protection are as follows:

o  Failure detection: There are two possibilities for this.

   1. A node in the working path detects a failure event. Such a node MUST send a Failure Indication message toward the upstream and/or downstream end node of the LSP (node A or B). This message MAY be forwarded along the working path or routed over a different path if the network has general routing intelligence. Mechanisms provided by the data transport plane MAY also be used for this, if available.

   2. The end nodes (A or B) detect the failure themselves (e.g., loss of signal).

o  Switchover: The action taken when an end node detects a failure in the working path is as follows:

   -  Start receiving from the protection path; at the same time, send a Switchover Request message to the other end node to enable switching at the other end.

   The action taken when an end node receives a Switchover Request message is as follows:

   -  Start receiving from the protection path; at the same time, send a Switchover Response message to the other end node.

GMPLS signaling mechanisms MAY be used to (reliably) signal the Failure Indication message, as well as the Switchover Request and Response messages. These messages MAY be forwarded along the protection path if no other routing intelligence is available in the network.
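The end-node behaviour in the switchover step can be sketched as a small state holder. The class `EndNode`, its attribute names, and the tuple message shapes are hypothetical; the sketch assumes reliable delivery and does not model the Failure Indication message sent by intermediate nodes.

```python
class EndNode:
    """Hypothetical end node (A or B) of a bi-directional 1+1 protected LSP."""

    def __init__(self, name):
        self.name = name
        self.receiving_from = "working"   # path currently selected by the receiver
        self.peer_confirmed = False       # set once the peer's response arrives

    def on_failure_detected(self):
        """Local detection or a received Failure Indication: switch and ask the peer."""
        self.receiving_from = "protection"
        return ("SwitchoverRequest", self.name)

    def on_switchover_request(self, request):
        """Peer asked for a switchover: switch the receiver and confirm."""
        self.receiving_from = "protection"
        return ("SwitchoverResponse", self.name)

    def on_switchover_response(self, response):
        """Peer confirmed that it is now receiving from the protection path."""
        self.peer_confirmed = True
```

Because switching to the protection path is idempotent in this sketch, the case where both end nodes detect the failure and send requests concurrently needs no special handling.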
Thus, shared mesh restoration is designed to protect an LSP after a single failure event, i.e., a failure that affects the working path of at most one LSP sharing the protection capacity. It is possible that a protection path may not be successfully activated when multiple, concurrent failure events occur. In this case, shared mesh restoration capacity may be claimed for more than one failed LSP and the protection path can be activated only for one of them (at most).

For implementing shared mesh restoration, the identifier and nodal information related to signaling along the control path are as defined for 1+1 protection in Sections 3.2.1 and 3.2.2. In addition, each node MUST also keep (local) information needed to establish the data plane of the protection path. This information MUST indicate the local resources to be allocated, the fabric cross-connect to be established to activate the path, etc. The precise nature of this information would depend on the type of node and LSP (the GMPLS signaling document describes different types of switches [RFC3471]). It would also depend on whether the information is fine or coarse-grained. For example, fine-grained information would indicate pre-selection of all details pertaining to protection path activation, such as outgoing link, labels, etc. Coarse-grained information, on the other hand, would allow some details to be determined during protection path activation. For example, protection resources may be pre-selected at the level of a TE link, while the selection of the specific component link and label occurs during protection path activation.

While the coarser specification allows some flexibility in the selection of the precise resource to activate, it also adds complexity in decision making and signaling during the time-critical restoration phase.
Furthermore, the procedures for the assignment of bandwidth to protection paths MUST take into account the total resources in a TE link so that single-failure survivability requirements are satisfied.

Sections 3.2.3 and 3.2.4.
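The single-failure survivability requirement on shared protection bandwidth, stated earlier in this section, can be illustrated numerically. The sketch below assumes that a failure event is the failure of one link on a working path and that each LSP is described by a hypothetical dictionary with `bandwidth`, `working`, and `protection` keys; neither the data model nor the function name comes from this document.

```python
from collections import defaultdict


def required_protection_bandwidth(lsps, te_link):
    """Bandwidth that must be reserved on te_link so that any single
    working-link failure can be restored.

    For each possible failed link, sum the bandwidth of the LSPs whose
    working path uses that link and whose protection path uses te_link.
    The reservation is the largest such sum, not the sum over all LSPs.
    """
    demand_per_failure = defaultdict(int)
    for lsp in lsps:
        if te_link not in lsp["protection"]:
            continue
        for failed_link in lsp["working"]:
            demand_per_failure[failed_link] += lsp["bandwidth"]
    return max(demand_per_failure.values(), default=0)


lsps = [
    {"bandwidth": 10, "working": {"a-b"}, "protection": {"c-d"}},
    {"bandwidth": 5, "working": {"e-f"}, "protection": {"c-d"}},
    {"bandwidth": 4, "working": {"a-b", "x-y"}, "protection": {"c-d"}},
]
shared = required_protection_bandwidth(lsps, "c-d")
dedicated = sum(lsp["bandwidth"] for lsp in lsps)
```

In this example the worst single failure is link "a-b", which affects the first and third LSPs, so 14 units suffice on "c-d", whereas dedicated protection for all three LSPs would need 19.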
establishes cross-connects for the path. This would allow shared mesh restoration paths to be efficiently utilized. The End-to-End Switchover message MUST be sent reliably from the source to the destination of the LSP along the protection path.
selects the traffic from the working path. At the same time, it bridges the transmitted traffic onto both the working and protection paths.

3. The destination then sends a Bridge and Switch Response message to the source confirming the completion of the operation.

4. When the source receives this message, it switches to receive from the working path, and stops transmitting traffic on the protection path. The source then sends a Bridge and Switch Completed message to the destination confirming that the LSP has been reverted.

5. Upon receipt of this message, the destination stops transmitting along the protection path and de-activates the LSP along this path. The de-activation procedure should remove the crossed connections along the protection path (and free the resources to be used for restoring other failures).

Administrative procedures other than reversion include the ability to force a switchover (from working to protection or vice versa) and locking out switchover, i.e., preventing an LSP from moving from working to protection administratively. These administrative conditions have to be supported by signaling.

Section 3.3, more than one LSP can claim shared resources under multiple failure scenarios. If such resources are first allocated to a lower-priority LSP, they MAY have to be reclaimed and allocated to a higher-priority LSP.
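The reclaiming of shared resources by priority can be sketched as a single decision function. The function name, the `(name, priority)` tuple shape, and the convention that a numerically lower value means higher priority are assumptions of this sketch rather than requirements stated here.

```python
def claim_shared_resource(holder, claimant):
    """Decide which LSP holds a shared protection resource.

    holder and claimant are (name, priority) tuples, or holder is None
    when the resource is idle. A numerically lower priority value is
    treated as more important (an assumption of this sketch).

    Returns (new_holder, preempted), where preempted is the LSP that
    lost the resource, or None if nothing was reclaimed.
    """
    if holder is None:
        return claimant, None
    if claimant[1] < holder[1]:
        return claimant, holder      # reclaim from the lower-priority LSP
    return holder, None              # claimant does not outrank the holder
```

For instance, if a priority-5 LSP already holds the resource and a priority-2 LSP claims it after a second failure, the function hands the resource to the priority-2 LSP and reports the priority-5 LSP as pre-empted.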
Bala Rajagopalan
Microsoft India Development Center
Hyderabad, India
EMail: email@example.com

Yakov Rekhter (Juniper)
1194 N. Mathilda Avenue
Sunnyvale, CA 94089, USA
EMail: firstname.lastname@example.org

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3471]  Berger, L., "Generalized Multi-Protocol Label Switching (GMPLS) Signaling Functional Description", RFC 3471, January 2003.

[RFC4201]  Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling in MPLS Traffic Engineering (TE)", RFC 4201, October 2005.

[RFC4203]  Kompella, K., Ed. and Y. Rekhter, Ed., "OSPF Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS)", RFC 4203, October 2005.

[RFC4204]  Lang, J., Ed., "Link Management Protocol (LMP)", RFC 4204, October 2005.

[RFC4205]  Kompella, K., Ed. and Y. Rekhter, Ed., "Intermediate System to Intermediate System (IS-IS) Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS)", RFC 4205, October 2005.