RFC 3407

Session Description Protocol (SDP) Simple Capability Declaration

Pages: 10
Proposed Standard
→ Errata

Network Working Group                                       F. Andreasen
Request for Comments: 3407                                 Cisco Systems
Category: Standards Track                                   October 2002


   Session Description Protocol (SDP) Simple Capability Declaration

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

Abstract

   This document defines a set of Session Description Protocol (SDP)
   attributes that enables SDP to provide a minimal and backwards
   compatible capability declaration mechanism.  Such capability
   declarations can be used as input to a subsequent session
   negotiation, which is done by means outside the scope of this
   document.  This provides a simple and limited solution to the general
   capability negotiation problem being addressed by the next generation
   of SDP, also known as SDPng.

1. Conventions Used in this Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC-2119 [2].

2. Introduction

   The Session Description Protocol (SDP) [3] describes multimedia
   sessions for the purposes of session announcement, session
   invitation, and other forms of multimedia session initiation.  SDP
   was not intended to provide capability negotiation.  However, as the
   need for this has become increasingly important, work has begun on a
   "next generation SDP" (SDPng) [4,5] that supports both session
   description and capability negotiation.  SDPng is not anticipated to
   be backwards compatible with SDP and work on SDPng is currently in
   the early stages.  However, several other protocols, e.g. SIP [6] and
   Media Gateway Control Protocol (MGCP) [7], use SDP and are likely to

noToC RFC3407 - Page 2

   continue doing so for the foreseeable future.  Nevertheless, in many
   cases these signaling protocols have an urgent need for some limited
   form of capability negotiation.

   For example, an endpoint may support G.711 audio (over RTP) as well
   as T.38 fax relay (over UDP or TCP).  Unless the endpoint is willing
   to support two media streams at the same time, this cannot currently
   be expressed in SDP.  Another example involves support for multiple
   codecs.  An endpoint indicates this by including all the codecs in
   the "m=" line in the session description.  However, the endpoint
   thereby also commits to simultaneous support for each of these
   codecs.  In practice, Digital Signal Processor (DSP) memory and
   processing power limitations may not make this feasible.

   As noted in [4], the problem with SDP is that media descriptions are
   used to describe session parameters as well as capabilities without a
   clear distinction between the two.

   In this document, we define a minimal and backwards compatible
   capability declaration feature in SDP by defining a set of new SDP
   attributes.  Together, these attributes define a capability set,
   which consists of a capability set sequence number followed by one or
   more capability descriptions.  Each capability description in the set
   contains information about supported media formats, but the endpoint
   is not committing to use any of these.  In order to actually use a
   declared capability, session negotiation will have to be done by
   means outside the scope of this document, e.g., using the
   offer/answer model [8].

   It should be noted that the mechanism is not intended to solve the
   general capability negotiation problem targeted by SDPng.  It is
   merely intended as a simple and limited solution to the most urgent
   problems facing current users of SDP.

3. Simple Capability Declaration Attributes

   The SDP Simple Capability Declaration (simcap) is defined by a set of
   SDP attributes.  Together, these attributes form a capability set
   which describes the complete media capabilities of the endpoint.  Any
   previous capability sets issued by the endpoint for the session in
   question no longer apply.  The capability set consists of a sequence
   number and one or more capability descriptions.  Each such capability
   description describes the media type and media formats supported and
   may include one or more capability parameters to further define the
   capability.  A session description MUST NOT contain more than one
   capability set, however the capability set can describe capabilities
   at both the session and media level.  Capability descriptions
   provided at the session level apply to all media streams of the media

noToC RFC3407 - Page 3

   type indicated, whereas capability descriptions provided at the media
   level apply to that particular media stream only.  We refer to these
   respectively as session capabilities and media stream capabilities.
   A media stream capability may or may not be of the same media type as
   the media stream to which it applies.

   The capability set MUST begin with a single sequence number followed
   by one or more capability descriptions listing all media formats the
   endpoint is currently able and willing to support.  More
   specifically, if a media format is included in a media ("m=") line,
   then by definition the media format MUST be included in either a
   session capability or a media stream capability for that media line.
   The endpoint MAY include additional media formats in a capability if
   it is capable of supporting those media formats in a session with its
   peer.  An endpoint MUST NOT include capabilities it knows it cannot
   use in a particular session.  An endpoint receiving a capability set
   from another endpoint MAY use any of the media formats included in
   that capability set in a later attempt to negotiate media streams
   with the other endpoint, e.g., using the offer/answer model [8].  If
   a new capability set is received from the other endpoint, the old
   capability set MUST NOT be used any longer.  Session capabilities can
   be used for any media streams of the indicated media type, whereas
   media stream capabilities can only be used for their associated media
   line.  However, an endpoint receiving a capability set with a given
   media format MUST NOT assume that a subsequent attempt to negotiate a
   media stream using just this media format will succeed.

   The individual capability descriptions in a capability set can be
   provided contiguously or they can be scattered throughout the session
   description.  The first capability description MUST, however, follow
   immediately after the sequence number.

   The sequence number is on the form:

     a=sqn: <sqn-num>

   where <sqn-num> is an integer between 0 and 255 (both included).  The
   initial sequence number MUST be 0 (zero) and it MUST be incremented
   by 1 modulo 256 with each new capability set issued by the endpoint.
   Receivers may not necessarily see all capability sets issued and
   hence MUST NOT reject a capability set due to gaps in sequence
   numbers.  The sequence number MUST either be provided as a session-
   level or media-level attribute, however there MUST NOT be more than
   one occurrence of the sequence number attribute in the session
   description (since there cannot be more than one capability set).

noToC RFC3407 - Page 4

   Each capability description in the capability set is on the form:

     a=cdsc: <cap-num> <media> <transport> <fmt list>

   where <cap-num> is an integer between 1 and 255 (both included) used
   to number the capabilities, and <media>, <transport>, and <fmt list>
   are defined as in the SDP "m=" line.  The capability description
   refers to a send and receive capability by default.  When generating
   a capability set, the capability number MUST start with 1 in the
   first capability description, and be incremented by the number of
   media formats in the <fmt list> for each subsequent capability
   description.  The media formats in the <fmt list> are numbered from
   left to right.  Receivers of a capability set MUST NOT, however,
   reject capability descriptions due to gaps in the capability numbers.
   The capability number provides a convenient handle within the context
   of the capability set (as referenced by the sequence number) which
   may be used to reference a particular capability by means outside of
   this specification.

   A capability description can include one or more capability parameter
   lines on the form:

     a=cpar: <cap-par>
     a=cparmin: <cap-par>
     a=cparmax: <cap-par>

   where <cap-par> is either bandwidth information ("b=") or an
   attribute ("a=") in its full  '<type>=<value>' form (see [3]).  A
   capability parameter line provides additional parameters for the
   preceding "cdsc" attribute line.  Capability parameter lines for a
   capability description SHOULD immediately follow the "cdsc" line they
   refer to.  Nevertheless, the capability description includes all
   capability parameter lines until the next capability description
   ("cdsc") or media ("m=") line in the session description.

   The "cpar" attribute should normally be used when capability
   parameter values are to be specified. When provided, it means that
   the endpoint is declaring that it supports the media formats in the
   preceding "cdsc" line in accordance with the <cap-par> value
   specified.  This can, for example, be used to specify "fmtp"
   parameters.  If a session negotiation is attempted without
   considering the <cap-par> value, it may fail due to lack of endpoint
   support.  A capability description may contain zero, one, or more
   "cpar" attribute lines describing either the same or different
   parameters.  Describing the same parameter more than once can be used
   to specify alternatives.

noToC RFC3407 - Page 5

   Where a minimum numerical value is to be specified, the "cparmin"
   attribute should be used.  There may be zero, one, or more "cparmin"
   attribute lines in a capability description, however a given
   parameter MUST NOT be described by a "cparmin" attribute more than
   once.

   Where a maximum numerical value is to be specified, the "cparmax"
   attribute should be used.  There may be zero, one, or more "cparmax"
   attribute lines in a capability description, however a given
   parameter MUST NOT be described by a "cparmax" attribute more than
   once.

   Ranges of numerical values can be expressed by using a "cparmin" and
   a "cparmax" attribute for a given parameter.  It follows from the
   previous rules, that only a single range can be specified for a given
   parameter.

   Capability descriptions may be provided at both the session-level and
   media-level.  A capability description provided at the session-level
   applies to all the media streams of the indicated media type in the
   session description.  A capability description provided at the
   media-level only applies to that particular media stream (regardless
   of media type).  If a capability description with media type X is
   provided at the session-level, and there are no media streams of type
   X in the session description, then it is undefined which of the media
   streams the capability description applies to (except if there is
   only one media stream).  It is therefore RECOMMENDED, that such
   capabilities are provided at the media-level instead.

   Below we show an example session description using the above simple
   capability declaration mechanism:

     v=0
     o=- 25678 753849 IN IP4 128.96.41.1
     s=
     c=IN IP4 128.96.41.1
     t=0 0
     m=audio 3456 RTP/AVP 18 96
     a=rtpmap:96 telephone-event
     a=fmtp:96 0-15,32-35
     a=sqn: 0
     a=cdsc: 1 audio RTP/AVP 0 18 96
     a=cpar: a=fmtp:96 0-16,32-35
     a=cdsc: 4 image udptl t38
     a=cdsc: 5 image tcp t38

noToC RFC3407 - Page 6

   The sender of this session description is currently prepared to send
   and receive G.729 audio as well as telephone-events 0-15 and 32-35.
   The sender is furthermore capable of supporting:

   *  PCMU encoding for the audio media stream,
   *  telephone events 0-16 and 32-35,
   *  T.38 fax relay using udp or tcp (see [9]).

   Note, that the first capability number specified is 1, whereas the
   next is 4 since three media formats were included in the first
   capability description.  Also note that the rtpmap for payload type
   96 was not included in the capability description, as it was already
   specified for the media ("m=") line.  Conversely, it would of course
   not have been valid to provide the rtpmap in the capability
   description and then omit the "a=rtpmap" line.

   Below, we show another example of the simple capability declaration
   mechanism, this time with multiple media streams:

     v=0
     o=- 25678 753849 IN IP4 128.96.41.1
     s=
     c=IN IP4 128.96.41.1
     t=0 0
     m=audio 3456 RTP/AVP 18
     a=sqn: 0
     a=cdsc: 1 audio RTP/AVP 0 18
     m=video 3458 RTP/AVP 31
     a=cdsc: 3 video RTP/AVP 31 34

   The sender of this session description is currently prepared to send
   and receive G.729 audio and H.261 video.  The sender is furthermore
   capable of supporting:

   *  PCMU encoding for the audio media stream,
   *  H.263 for the video media stream.

   Note that the first capability number specified is 1, whereas the
   next is 3, since two media formats were included in the first
   capability description.  Also note that the sequence number applies
   to the entire capability set, i.e. both audio and video, and hence is
   only supplied once.  Finally, note that the media formats 18 and 31
   are listed in both the media lines and the capability set as
   required.  The above session description could have equally been
   supplied as follows:

noToC RFC3407 - Page 7

     v=0
     o=- 25678 753849 IN IP4 128.96.41.1
     s=
     c=IN IP4 128.96.41.1
     t=0 0
     a=sqn: 0
     a=cdsc: 1 audio RTP/AVP 0 18
     a=cdsc: 3 video RTP/AVP 31 34
     m=audio 3456 RTP/AVP 18
     m=video 3458 RTP/AVP 31

   i.e., with the capability set provided at the session-level.

4. Security Considerations

   Capability negotiation of security-sensitive parameters is a delicate
   process, and should not be done without careful evaluation of the
   design, including the possible susceptibility to downgrade attacks.
   Use of capability re-negotiation may make the session susceptible to
   denial of service, without design care as to authentication.

5. IANA Considerations

   This document defines the following new SDP parameters of type "att-
   field" (attribute names):

   Attribute name:      sqn
   Long form name:      Sequence number.
   Type of attribute:   Session-level and media-level.
   Subject to charset:  No.
   Purpose:             Capability set numbering.
   Appropriate values:  See Section 3.

   Attribute name:      cdsc
   Long form name:      Capability description.
   Type of attribute:   Session-level and media-level.
   Subject to charset:  No.
   Purpose:             Describe capabilities in a capability set.
   Appropriate values:  See Section 3.

   Attribute name:      cpar
   Long form name:      Capability parameter line.
   Type of attribute:   Session-level and media-level.
   Subject to charset:  No.
   Purpose:             Provide capability description parameters.
   Appropriate values:  See Section 3.

noToC RFC3407 - Page 8

   Attribute name:      cparmin
   Long form name:      Minimum capability parameter line.
   Type of attribute:   Session-level and media-level.
   Subject to charset:  No.
   Purpose:             Provide minimum capability description
                        parameters.
   Appropriate values:  See Section 3.

   Attribute name:      cparmax
   Long form name:      Maximum capability parameter line.
   Type of attribute:   Session-level and media-level.
   Subject to charset:  No.
   Purpose:             Provide maximum capability description
                        parameters.
   Appropriate values:  See Section 3.

6. Normative References

   [1] Bradner, S., "The Internet Standards Process -- Revision 3", BCP
       9, RFC 2026, October 1996.

   [2] Bradner, S., "Key words for use in RFCs to Indicate Requirement
       Levels", BCP 14, RFC 2119, March 1997.

   [3] Handley, M. and V. Jacobson, "SDP: session description protocol",
       Request for Comments 2327, April 1998.

7. Informative References

   [4] Kutscher, Ott, Bormann and Curcio, "Requirements for Session
       Description and Capability Negotiation", Work in Progress.

   [5] Kutscher, Ott and Borman, "Session Description and Capability
       Negotiation", Work in Progress.

   [6] Handley, M., Schulzrinne, H., Schooler, E. and J. Rosenberg,
       "SIP: session initiation protocol", RFC 2543, March 1999.

   [7] Arango, M., Dugan, A., Elliott, I., Huitema, C. and S. Pickett,
       "Media Gateway Control Protocol (MGCP) Version 1.0", RFC 2705,
       October 1999.

   [8] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
       SDP", Work in Progress.

   [9] ITU-T Recommendation T.38 Annex D, "SIP/SDP Call Establishment
       Procedures".

noToC RFC3407 - Page 9

8. Acknowledgments

   This work draws upon the ongoing work on SDPng in the IETF MMUSIC
   Working Group; in particular [4].  Furthermore this work was inspired
   by the CableLabs PacketCable project.  The author would like to
   recognize and thank Joerg Ott and Jonathan Rosenberg who provided
   many detailed comments and suggestions to improve this specification.
   Colin Perkins, Orit Levin and Tom Taylor provided valuable feedback
   as well.

9. Author's Address

   Flemming Andreasen
   Cisco Systems
   499 Thornall Street, 8th floor
   Edison, NJ
   EMail: fandreas@cisco.com

noToC RFC3407 - Page 10

10. Full Copyright Statement

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.