Network Working Group                                       J. Rosenberg
Request for Comments: 5411                                         Cisco
Category: Informational                                     January 2009

A Hitchhiker's Guide to the Session Initiation Protocol (SIP)

Status of This Memo

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.
Abstract

The Session Initiation Protocol (SIP) is the subject of numerous specifications that have been produced by the IETF. It can be difficult to locate the right document, or even to determine the set of Request for Comments (RFC) about SIP. This specification serves as a guide to the SIP RFC series. It lists a current snapshot of the specifications under the SIP umbrella, briefly summarizes each, and groups them into categories.

Table of Contents

   1. Introduction
   2. Scope of This Document
   3. Core SIP Specifications
   4. Public Switched Telephone Network (PSTN) Interworking
   5. General Purpose Infrastructure Extensions
   6. NAT Traversal
   7. Call Control Primitives
   8. Event Framework
   9. Event Packages
   10. Quality of Service
   11. Operations and Management
   12. SIP Compression
   13. SIP Service URIs
   14. Minor Extensions
   15. Security Mechanisms
   16. Conferencing
   17. Instant Messaging, Presence, and Multimedia
   18. Emergency Services
   19. Security Considerations
   20. Acknowledgements
   21. Informative References
The Session Initiation Protocol (SIP) [RFC3261] is the subject of numerous specifications that have been produced by the IETF. It can be difficult to locate the right document, or even to determine the set of Request for Comments (RFC) about SIP. "Don't Panic!" [HGTTG] This specification serves as a guide to the SIP RFC series. It is a current snapshot of the specifications under the SIP umbrella at the time of publication. It is anticipated that this document itself will be regularly updated as SIP specifications mature. Furthermore, it references many specifications that, at the time of publication of this document, were not yet finalized and may eventually be completed or abandoned. Therefore, the enumeration of specifications here is a work in progress and subject to change.

For each specification, a description of a paragraph or so summarizes its purpose. Each entry also includes a letter that designates the specification's category in the Standards Track [RFC2026]. These values are:

   S: Standards Track (Proposed Standard, Draft Standard, or Standard)

   E: Experimental

   B: Best Current Practice

   I: Informational

The specifications are grouped together by topic. The topics are:

   Core: The SIP specifications that are expected to be utilized for each session or registration an endpoint participates in.

   Public Switched Telephone Network (PSTN) Interop: Specifications related to interworking with the telephone network.

   General Purpose Infrastructure: General purpose extensions to SIP, SDP (Session Description Protocol), and MIME, but ones that are not expected to always be used.

   NAT Traversal: Specifications to deal with firewall and NAT traversal.

   Call Control Primitives: Specifications for manipulating SIP dialogs and calls.
   Event Framework: Definitions of the core specifications for the SIP event framework, providing for pub/sub capability.

   Event Packages: Packages that utilize the SIP event framework.

   Quality of Service: Specifications related to multimedia quality of service (QoS).

   Operations and Management: Specifications related to configuration and monitoring of SIP deployments.

   SIP Compression: Specifications to facilitate usage of SIP with the Signaling Compression (Sigcomp) framework.

   SIP Service URIs: Specifications on how to use SIP URIs to address multimedia services.

   Minor Extensions: Specifications that solve a narrow problem space or provide an optimization.

   Security Mechanisms: Specifications providing security functionality for SIP.

   Conferencing: Specifications for multimedia conferencing.

   Instant Messaging, Presence, and Multimedia: SIP extensions related to IM, presence, and multimedia. This covers only the SIP extensions related to these topics. See [SIMPLE] for a full treatment of SIP for IM and Presence (SIMPLE).

   Emergency Services: SIP extensions related to emergency services. See [ECRIT-FRAME] for a more complete treatment of additional functionality related to emergency services.

Typically, SIP extensions fit naturally into topic areas, and implementors interested in a particular topic often implement many or all of the specifications in that area. There are some specifications that fall into multiple topic areas, in which case they are listed more than once.

Do not print all the specs cited here at once, as they might share the fate of the rules of Brockian Ultracricket when bound together: collapse under their own gravity and form a black hole [HGTTG].

This document itself is not an update to RFC 3261 or an extension to SIP. It is an informational document, meant to guide newcomers, implementors, and deployers to the many specifications associated with SIP.
   o RFC 3261 and any specification that defines an extension to it, where an extension is a mechanism that changes or updates in some way a behavior specified there.

   o The basic SDP specification [RFC4566] and any specification that defines an extension to SDP whose primary purpose is to support SIP.

   o Any specification that defines a MIME object whose primary purpose is to support SIP.

Excluded from this list are requirements, architectures, registry definitions, non-normative frameworks, and processes. Best Current Practices are included when they normatively define mechanisms for accomplishing a task, or provide significant description of the usage of the normative specifications, such as call flows.

The SIP change process [RFC3427] defines two types of extensions to SIP: normal extensions and the so-called P-headers (where P stands for "preliminary", "private", or "proprietary", and the "P-" prefix is included in the header field name), which are meant to be used in areas of limited applicability. P-headers cannot be defined in the Standards Track. For the most part, P-headers are not included in the listing here, with the exception of those that have seen general usage despite their P-header status.

This document includes specifications that have already been approved by the IETF and granted an RFC number, in addition to Internet Drafts, which are still under development within the IETF and will eventually finish and get an RFC number. Inclusion of Internet Drafts here helps encourage early implementation and demonstrations of interoperability of the protocol, and thus aids in the standards-setting process. Inclusion of these also identifies where the IETF is targeting a solution at a particular problem space. Note that final IANA assignment of codepoints (such as option tags and header field names) does not take place until shortly before publication as an RFC, and thus codepoint assignments may change.
HGTTG].

RFC 3261, The Session Initiation Protocol (S): [RFC3261] is the core SIP protocol itself. RFC 3261 obsoletes [RFC2543]. It is the president of the galaxy [HGTTG] as far as the suite of SIP specifications is concerned.

RFC 3263, Locating SIP Servers (S): [RFC3263] provides DNS procedures for taking a SIP URI and determining a SIP server that is associated with that SIP URI. RFC 3263 is essential for any implementation using SIP with DNS. RFC 3263 makes use of both DNS SRV records [RFC2782] and NAPTR records [RFC3401].

RFC 3264, An Offer/Answer Model with the Session Description Protocol (S): [RFC3264] defines how the Session Description Protocol (SDP) [RFC4566] is used with SIP to negotiate the parameters of a media session. It is in widespread usage and an integral part of the behavior of RFC 3261.

RFC 3265, SIP-Specific Event Notification (S): [RFC3265] defines the SUBSCRIBE and NOTIFY methods. These two methods provide a general event notification framework for SIP. To actually use the framework, extensions need to be defined for specific event packages. An event package defines a schema for the event data and describes other aspects of event processing specific to that schema. An RFC 3265 implementation is required when any event package is used.
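As a rough illustration of the RFC 3263 entry above, the following Python sketch builds the SRV query names a client might look up for a SIP domain and orders a set of already-fetched SRV records by priority. It performs no DNS queries. The record type and function names are hypothetical, NAPTR processing is omitted, and the weight handling is a deterministic simplification of the weighted random selection in [RFC2782].

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    """One SRV answer; the field names here are illustrative."""
    priority: int
    weight: int
    port: int
    target: str


def srv_query_names(domain: str, secure: bool = False) -> List[str]:
    """Return SRV owner names to query for a SIP or SIPS domain."""
    if secure:
        # SIPS needs a TLS-protected transport, so only TCP is tried here.
        return [f"_sips._tcp.{domain}"]
    return [f"_sip._udp.{domain}", f"_sip._tcp.{domain}"]


def order_srv(records: List[SrvRecord]) -> List[SrvRecord]:
    """Order records: lowest priority value first, then highest weight.

    Simplification: RFC 2782 picks randomly among equal priorities in
    proportion to weight; sorting keeps this sketch deterministic.
    """
    return sorted(records, key=lambda r: (r.priority, -r.weight))
```

A caller would try the targets in the returned order, moving to the next one when a server cannot be reached.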
RFC 3325, Private Extensions to SIP for Asserted Identity within Trusted Networks (I): Though its P-header status implies that it has limited applicability, [RFC3325], which defines the P-Asserted-Identity header field, has been widely deployed. It is used as the basic mechanism for providing network-asserted caller ID services. Its intended update, [UPDATE-PAI], clarifies its usage for connected party identification as well.

RFC 3327, SIP Extension Header Field for Registering Non-Adjacent Contacts (S): [RFC3327] defines the Path header field. This field is inserted by proxies between a client and its registrar. It allows inbound requests towards that client to traverse these proxies prior to being delivered to the user agent. It is essential in any SIP deployment that has edge proxies, which are proxies between the client and the home proxy or SIP registrar.

RFC 3581, An Extension to SIP for Symmetric Response Routing (S): [RFC3581] defines the rport parameter of the Via header. It allows SIP responses to traverse NAT. It is one of several specifications that are utilized for NAT traversal (see Section 6).

RFC 3840, Indicating User Agent Capabilities in SIP (S): [RFC3840] defines a mechanism for carrying capability information about a user agent in REGISTER requests and in dialog-forming requests like INVITE. It has found use with conferencing (the isfocus parameter declares that a user agent is a conference server) and with applications like push-to-talk.

RFC 4320, Actions Addressing Issues Identified with the Non-INVITE Transaction in SIP (S): [RFC4320] formally updates RFC 3261 and modifies some of the behaviors associated with non-INVITE transactions. This addresses some problems found in timeout and failure cases.

RFC 4474, Enhancements for Authenticated Identity Management in SIP (S): [RFC4474] defines a mechanism for providing a cryptographically verifiable identity of the calling party in a SIP request. Known as "SIP Identity", this mechanism provides an alternative to RFC 3325. It has seen little deployment so far, but its importance as a key construct for anti-spam techniques and new security mechanisms makes it a core part of the SIP specifications.

GRUU, Obtaining and Using Globally Routable User Agent Identifiers (GRUU) in SIP (S): [GRUU] defines a mechanism for directing requests towards a specific UA instance. GRUU is essential for features like transfer and provides another piece of the SIP NAT traversal story.
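The rport behavior summarized in the RFC 3581 entry above can be sketched as follows. A server that receives a request whose topmost Via carries an empty rport parameter records the source port in it and adds a received parameter with the source address, so that the response can be sent back through the NAT binding. The function name and the simple string handling are assumptions for illustration; a real stack would use a proper SIP header parser.

```python
def stamp_via(via: str, src_ip: str, src_port: int) -> str:
    """Fill in rport and received on a Via header value, RFC 3581 style.

    Hypothetical helper: assumes parameters are separated by ';' and that
    no parameter value itself contains ';'.
    """
    sent_by, *params = [part.strip() for part in via.split(";")]
    if "rport" not in params:
        # The client did not ask for symmetric response routing.
        return via
    out = [sent_by]
    for p in params:
        if p == "rport":
            out.append(f"rport={src_port}")
        elif p.startswith("received="):
            continue  # replaced below with the observed source address
        else:
            out.append(p)
    out.append(f"received={src_ip}")
    return ";".join(out)
```

The response is then sent to the address and port recorded in received and rport, which may differ from the private address the client wrote into the Via.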
OUTBOUND, Managing Client Initiated Connections through SIP (S): [OUTBOUND], also known as SIP outbound, defines important changes to the SIP registration mechanism that enable delivery of SIP messages towards a UA when it is behind a NAT. This specification is the cornerstone of the SIP NAT traversal strategy.

RFC 4566, Session Description Protocol (S): [RFC4566] defines a format for representing multimedia sessions. SDP objects are carried in the body of SIP messages and, based on the offer/answer model, are used to negotiate the media characteristics of a session between users.

SDP-CAP, SDP Capability Negotiation (S): [SDP-CAP] defines a set of extensions to SDP that allows for capability negotiation within SDP. Capability negotiation can be used to select between different profiles of RTP (secure vs. unsecure) or to negotiate codecs such that an agent has to select one amongst a set of supported codecs.

ICE, Interactive Connectivity Establishment (ICE) (S): [ICE] defines a technique for NAT traversal of media sessions for protocols that make use of the offer/answer model. This specification is the IETF-recommended mechanism for NAT traversal for SIP media streams, and is meant to be used even by endpoints that are themselves never behind a NAT. A SIP option tag and media feature tag [OPTION-TAG] (also a core specification) have been defined for use with ICE.

RFC 3605, Real Time Control Protocol (RTCP) Attribute in the Session Description Protocol (SDP) (S): [RFC3605] defines a way to explicitly signal, within an SDP message, the IP address and port for RTCP, rather than using the port+1 rule in the Real Time Transport Protocol (RTP) [RFC3550]. It is needed for devices behind NAT, and the specification is required by ICE.

RFC 4916, Connected Identity in the Session Initiation Protocol (SIP) (S): [RFC4916] formally updates RFC 3261. It defines an extension to SIP that allows a calling user to determine the identity of the final called user (connected party). Due to forwarding and retargeting services, this may not be the same as the user that the caller was originally trying to reach. The mechanism works in tandem with the SIP identity specification [RFC4474] to provide signatures over the connected party identity. It can also be used if a party identity changes mid-call due to third-party call control actions or PSTN behavior.
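To illustrate the RFC 3605 entry above, the sketch below picks the RTCP port for a media stream from a small SDP body. It uses an explicit a=rtcp attribute when one is present and falls back to the RTP port plus one otherwise. This is a minimal sketch that only looks at the first media section and ignores the optional address part of the attribute; the function name and the line-by-line parsing are hypothetical.

```python
from typing import Optional


def rtcp_port(sdp: str) -> Optional[int]:
    """Return the RTCP port for the first media section, or None."""
    rtp_port = None
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            if rtp_port is not None:
                break  # a second media section starts; stop looking
            # m=<media> <port> <proto> <fmt> ...
            rtp_port = int(line.split()[1])
        elif line.startswith("a=rtcp:") and rtp_port is not None:
            # a=rtcp:<port> [<nettype> <addrtype> <address>]
            return int(line[len("a=rtcp:"):].split()[0])
    # No explicit attribute: apply the port+1 rule.
    return None if rtp_port is None else rtp_port + 1
```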
RFC 3311, The SIP UPDATE Method (S): [RFC3311] defines the UPDATE method for SIP. This method is meant as a means for updating session information prior to the completion of the initial INVITE transaction. It can also be used to update other information, such as the identity of the participant [RFC4916], without involving an updated offer/answer exchange. It was developed initially to support [RFC3312], but has found other uses. In particular, its usage with RFC 4916 means it will typically be used as part of every session, to convey a secure, connected identity.

SIPS-URI, The Use of the SIPS URI Scheme in the Session Initiation Protocol (SIP) (S): [SIPS-URI] is intended to update RFC 3261. It revises the processing of the SIPS URI, originally defined in RFC 3261, to fix many errors and problems that have been encountered with that mechanism.

RFC 3665, Session Initiation Protocol (SIP) Basic Call Flow Examples (B): [RFC3665] contains best-practice call flow examples for basic SIP interactions -- call establishment, termination, and registration.

Essential Corrections to SIP: A collection of fixes to SIP that address important bugs and vulnerabilities. These include a fix requiring loop detection in any proxy that forks [LOOP-FIX], a clarification on how record-routing works [RECORD-ROUTE], and a correction to the IPv6 BNF [ABNF-FIX].

RFC 2848, The PINT Service Protocol (S): [RFC2848] is one of the earliest extensions to SIP. It defines procedures for using SIP to invoke services that actually execute on the PSTN. Its main application is for third-party call control, allowing an IP host to set up a call between two PSTN endpoints. PINT (PSTN/Internet Interworking) has a relatively narrow focus and has not seen widespread deployment.

RFC 3910, The SPIRITS Protocol (S): Continuing the trend of naming PSTN-related extensions with alcohol references, SPIRITS (Services in PSTN Requesting Internet Services) [RFC3910] defines the inverse of PINT. It allows a switch in the PSTN to ask an IP element how to proceed with call waiting. It was developed primarily to support Internet Call Waiting (ICW). Perhaps the next specification will be called the Pan Galactic Gargle Blaster [HGTTG].

RFC 3372, SIP for Telephones (SIP-T): Context and Architectures (I): SIP-T [RFC3372] defines a mechanism for using SIP between pairs of PSTN gateways. Its essential idea is to tunnel ISDN User Part (ISUP) signaling between the gateways in the body of SIP messages. SIP-T motivated the development of INFO [RFC2976]. SIP-T has seen widespread implementation for the limited deployment model that it addresses. As ISUP endpoints disappear from the network, the need for this mechanism will decrease.

RFC 3398, ISUP to SIP Mapping (S): [RFC3398] defines how to do protocol mapping from the SS7 ISDN User Part (ISUP) signaling to SIP. It is widely used in SS7 to SIP gateways and is part of the SIP-T framework.

RFC 4497, Interworking between the Session Initiation Protocol (SIP) and QSIG (B): [RFC4497] defines how to do protocol mapping from Q.SIG, used for Private Branch Exchange (PBX) signaling, to SIP.

RFC 3578, Mapping of ISUP Overlap Signaling to SIP (S): [RFC3578] defines a mechanism to map overlap dialing into SIP. This specification is widely regarded as the ugliest SIP specification, as the introduction to the specification itself advises that it has many problems. Overlap signaling (the practice of sending digits into the network as dialed instead of waiting for complete collection of the called party number) is largely incompatible with SIP at some fairly fundamental levels. That said, RFC 3578 is mostly harmless and has seen some usage.

RFC 3960, Early Media and Ringtone Generation in SIP (I): [RFC3960] defines some guidelines for handling early media -- the practice of sending media from the called party or an application server towards the caller prior to acceptance of the call. Early media is often generated from the PSTN. Early media is a complex topic, and this specification does not fully address the problems associated with it.
RFC 3959, Early Session Disposition Type for the Session Initiation Protocol (SIP) (S): [RFC3959] defines a new session disposition type for use with early media. It indicates that the SDP in the body is for a special early media session. This has seen little usage.

RFC 3204, MIME Media Types for ISUP and QSIG Objects (S): [RFC3204] defines MIME objects for representing SS7 and QSIG signaling messages. SS7 signaling messages are carried in the body of SIP messages when SIP-T is used. QSIG signaling messages can be carried in a similar way.
RFC 3666, Session Initiation Protocol (SIP) Public Switched Telephone Network (PSTN) Call Flows (B): [RFC3666] provides best practice call flows around interworking with the PSTN.

RFC 3262, Reliability of Provisional Responses in SIP (S): SIP defines two types of responses to a request: final and provisional. Provisional responses are numbered from 100 to 199. In SIP, these responses are not sent reliably. This choice was made in RFC 2543 since the messages were meant to be purely informational and rendered to the user. However, subsequent work on PSTN interworking demonstrated a need to map provisional responses to PSTN messages that needed to be sent reliably. [RFC3262] was developed to allow reliability of provisional responses. The specification defines the PRACK method, used for indicating that a provisional response was received. Though it provides a generic capability for SIP, RFC 3262 implementations have been most common in PSTN interworking devices. However, PRACK brings a great deal of complication for relatively small benefit. As such, it has seen only moderate levels of deployment.

RFC 3323, A Privacy Mechanism for the Session Initiation Protocol (SIP) (S): [RFC3323] defines the Privacy header field, used by clients to request anonymity for their requests. Though it defines several privacy services, the only one broadly used is the one that supports privacy of the P-Asserted-Identity header field [RFC3325].

UA-PRIVACY, UA-Driven Privacy Mechanism for SIP (S): [UA-PRIVACY] defines a mechanism for achieving anonymous calls in SIP. It is an alternative to [RFC3323], and instead places more intelligence in the endpoint to craft anonymous messages by directly accessing network services.

RFC 2976, The INFO Method (S): [RFC2976] was defined as an extension to RFC 2543. It defines a method, INFO, used to transport mid-dialog information that has no impact on SIP itself. Its driving application was the transport of PSTN-related information when using SIP between a pair of gateways. Though originally conceived for broader use, it only found standardized usage with SIP-T [RFC3372]. It has been used to support numerous proprietary and non-interoperable extensions due to its poorly defined scope.
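The response classes mentioned in the RFC 3262 entry above can be expressed directly. The sketch below is illustrative only: the function names are hypothetical, and the exclusion of 100 (Trying) from reliable delivery is an assumption taken from RFC 3262 itself; this guide does not state it.

```python
def is_provisional(status: int) -> bool:
    """Provisional responses are numbered 100 to 199."""
    return 100 <= status <= 199


def may_be_sent_reliably(status: int) -> bool:
    """Whether a response is a candidate for PRACK-acknowledged delivery.

    Assumption: 100 (Trying) is hop-by-hop and never sent reliably, so
    only 101 to 199 qualify.
    """
    return 101 <= status <= 199
```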
RFC 3326, The Reason Header Field for SIP (S): [RFC3326] defines the Reason header field. It is used in requests, such as BYE, to indicate the reason that the request is being sent.

RFC 3388, Grouping of Media Lines in the Session Description Protocol (S): RFC 3388 [RFC3388] defines a framework for grouping together media streams in an SDP message. Such a grouping allows relationships between these streams, such as which stream is the audio for a particular video feed, to be expressed.

RFC 3420, Internet Media Type message/sipfrag (S): [RFC3420] defines a MIME object that contains a SIP message fragment. Only certain header fields and parts of the SIP message are present. For example, it is used to report back on the responses received to a request sent as a consequence of a REFER.

RFC 3608, SIP Extension Header Field for Service Route Discovery During Registration (S): [RFC3608] allows a client to determine, from a REGISTER response, a path of proxies to use in requests it sends outside of a dialog. It can also be used by proxies to verify the Route header in client-initiated requests. In many respects, it is the inverse of the Path header field, but has seen less usage since default outbound proxies have been sufficient in many deployments.

RFC 3841, Caller Preferences for SIP (S): [RFC3841] defines a set of headers that a client can include in a request to control the way in which the request is routed downstream. It allows a client to direct a request towards a UA with specific capabilities, which a UA indicates using [RFC3840].

RFC 4028, Session Timers in SIP (S): [RFC4028] defines a keepalive mechanism for SIP signaling. It is primarily meant to provide a way to clean up old state in proxies that are holding call state for calls from failed endpoints that were never terminated normally. Despite its name, the session timer is not a mechanism for detecting a network failure mid-call. Session timers introduce a fair bit of complexity for relatively little gain, and have seen moderate deployment.

RFC 4168, SCTP as a Transport for SIP (S): [RFC4168] defines how to carry SIP messages over the Stream Control Transmission Protocol (SCTP) [RFC4960]. SCTP has seen very limited usage for SIP transport.
RFC 4244, An Extension to SIP for Request History Information (S): [RFC4244] defines the History-Info header field, which indicates information on how and why a call came to be routed to a particular destination.

RFC 4145, TCP-Based Media Transport in the Session Description Protocol (SDP) (S): [RFC4145] defines an extension to SDP for setting up TCP-based sessions between user agents. It defines who sets up the connection and how its lifecycle is managed. It has seen relatively little usage due to the small number of media types to date that use TCP.

RFC 4091, The Alternative Network Address Types (ANAT) Semantics for the Session Description Protocol (SDP) Grouping Framework (S): [RFC4091] defines a mechanism for including both IPv4 and IPv6 addresses for a media session as alternates. This mechanism has been deprecated in favor of ICE [ICE].

SDP-MEDIA, SDP Media Capabilities Negotiation (S): [SDP-MEDIA] defines an extension to the SDP capability negotiation framework [SDP-CAP] for negotiating codecs, codec parameters, and media streams.

BODY-HANDLING, Message Body Handling in the Session Initiation Protocol (SIP): [BODY-HANDLING] clarifies handling of bodies in SIP, focusing primarily on multi-part behavior, which was underspecified in SIP.

ICE, Interactive Connectivity Establishment (ICE) (S): [ICE] defines a technique for NAT traversal of media sessions for protocols that make use of the offer/answer model. This specification is the IETF-recommended mechanism for NAT traversal for SIP media streams, and is meant to be used even by endpoints that are themselves never behind a NAT. A SIP option tag and media feature tag [OPTION-TAG] have been defined for use with ICE.

ICE-TCP, TCP Candidates with Interactive Connectivity Establishment (ICE) (S): [ICE-TCP] specifies the usage of ICE for TCP streams. This allows for selection of RTP-based voice on top of TCP only when NAT or firewalls would prevent UDP-based voice from working.
RFC 3605, Real Time Control Protocol (RTCP) Attribute in the Session Description Protocol (SDP) (S): [RFC3605] defines a way to explicitly signal, within an SDP message, the IP address and port for RTCP, rather than using the port+1 rule in the Real Time Transport Protocol (RTP) [RFC3550]. It is needed for devices behind NAT, and the specification is required by ICE.

OUTBOUND, Managing Client Initiated Connections through SIP (S): [OUTBOUND], also known as SIP outbound, defines important changes to the SIP registration mechanism that enable delivery of SIP messages towards a UA when it is behind a NAT.

RFC 3581, An Extension to SIP for Symmetric Response Routing (S): [RFC3581] defines the rport parameter of the Via header. It allows SIP responses to traverse NAT.

GRUU, Obtaining and Using Globally Routable User Agent Identifiers (GRUU) in SIP (S): [GRUU] defines a mechanism for directing requests towards a specific UA instance. GRUU is essential for features like transfer and provides another piece of the SIP NAT traversal story.

RFC 3515, The REFER Method (S): REFER [RFC3515] defines a mechanism for asking a user agent to send a SIP request. It's a form of SIP remote control, and is the primary tool used for call transfer in SIP. Beware that not all potential uses of REFER (neither for all methods nor for all URI schemes) are well defined. Implementors should only use the well-defined ones, and should not second guess or freely assume behavior for the others to avoid unexpected behavior of remote UAs, interoperability issues, and other bad surprises.

RFC 3725, Best Current Practices for Third Party Call Control (3pcc) (B): [RFC3725] defines a number of different call flows that allow one SIP entity, called the controller, to create SIP sessions amongst other SIP user agents.

RFC 3911, The SIP Join Header Field (S): [RFC3911] defines the Join header field. When sent in an INVITE, it causes the recipient to join the resulting dialog into a conference with another dialog in progress.
RFC 3891, The SIP Replaces Header (S): [RFC3891] defines a mechanism that allows a new dialog to replace an existing dialog. It is useful for certain advanced transfer services.

RFC 3892, The SIP Referred-By Mechanism (S): [RFC3892] defines the Referred-By header field. It is used in requests triggered by REFER, and provides the identity of the referring party to the referred-to party.

RFC 4117, Transcoding Services Invocation in SIP Using Third Party Call Control (I): [RFC4117] defines how to use 3pcc for the purposes of invoking transcoding services for a call.

RFC 3265, SIP-Specific Event Notification (S): [RFC3265] defines the SUBSCRIBE and NOTIFY methods. These two methods provide a general event notification framework for SIP. To actually use the framework, extensions need to be defined for specific event packages. An event package defines a schema for the event data and describes other aspects of event processing specific to that schema. An RFC 3265 implementation is required when any event package is used.

RFC 3903, SIP Extension for Event State Publication (S): [RFC3903] defines the PUBLISH method. It is not an event package, but is used by all event packages as a mechanism for pushing an event into the system.

RFC 4662, A Session Initiation Protocol (SIP) Event Notification Extension for Resource Lists (S): [RFC4662] defines an extension to RFC 3265 that allows a client to subscribe to a list of resources using a single subscription. The server, called a Resource List Server (RLS), will "expand" the subscription and subscribe to each individual member of the list. It has found applicability primarily in the area of presence, but can be used with any event package.

SUBNOT-ETAGS, An Extension to Session Initiation Protocol (SIP) Events for Conditional Event Notification (S): [SUBNOT-ETAGS] defines an extension to RFC 3265 to optimize the performance of notifications. When a client subscribes, it can indicate what version of a document it has so that the server can skip sending a notification if the client is up-to-date. It is applicable to any event package.
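The optimization described in the SUBNOT-ETAGS entry above amounts to a version comparison on the server. The following is a minimal sketch of that idea and does not reproduce the mechanism's actual header syntax; the class, its method, and the use of a hash as the entity tag are all assumptions made for illustration.

```python
import hashlib
from typing import Optional


class EventState:
    """Holds the current document for one subscribed resource."""

    def __init__(self, document: str) -> None:
        self.document = document

    @property
    def etag(self) -> str:
        # Hypothetical tag: any opaque value that changes with the state.
        return hashlib.sha256(self.document.encode()).hexdigest()[:8]

    def notification_for(self, client_etag: Optional[str]) -> Optional[str]:
        """Return the document to send, or None if the client is current."""
        if client_etag is not None and client_etag == self.etag:
            return None  # client already holds this version; skip the body
        return self.document
```

A subscriber that presents no tag, or a stale one, receives the full document; one that presents the current tag receives nothing new.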