
RFC 3261

SIP: Session Initiation Protocol

Pages: 269
Proposed Standard
Obsoletes:  2543
Updated by:  3265, 3853, 4320, 4916, 5393, 5621, 5626, 5630, 5922, 5954, 6026, 6141, 6665, 6878, 7462, 7463, 8217, 8591, 8760, 8898, 8996
Part 2 of 13 – Pages 10 to 34


4 Overview of Operation

   This section introduces the basic operations of SIP using simple
   examples.  This section is tutorial in nature and does not contain
   any normative statements.

   The first example shows the basic functions of SIP: location of an
   end point, signal of a desire to communicate, negotiation of session
   parameters to establish the session, and teardown of the session once
   established.

   Figure 1 shows a typical example of a SIP message exchange between
   two users, Alice and Bob.  (Each message is labeled with the letter
   "F" and a number for reference by the text.)  In this example, Alice
   uses a SIP application on her PC (referred to as a softphone) to call
   Bob on his SIP phone over the Internet.  Also shown are two SIP proxy
   servers that act on behalf of Alice and Bob to facilitate the session
   establishment.  This typical arrangement is often referred to as the
   "SIP trapezoid" as shown by the geometric shape of the dotted lines
   in Figure 1.

   Alice "calls" Bob using his SIP identity, a type of Uniform Resource
   Identifier (URI) called a SIP URI. SIP URIs are defined in Section
   19.1.  It has a similar form to an email address, typically
   containing a username and a host name.  In this case, it is
   sip:bob@biloxi.com, where biloxi.com is the domain of Bob's SIP
   service provider.  Alice has a SIP URI of sip:alice@atlanta.com.
   Alice might have typed in Bob's URI or perhaps clicked on a hyperlink
   or an entry in an address book.  SIP also provides a secure URI,
   called a SIPS URI.  An example would be sips:bob@biloxi.com.  A call
   made to a SIPS URI guarantees that secure, encrypted transport
   (namely TLS) is used to carry all SIP messages from the caller to the
   domain of the callee.  From there, the request is sent securely to
   the callee, but with security mechanisms that depend on the policy of
   the domain of the callee.

   SIP is based on an HTTP-like request/response transaction model.
   Each transaction consists of a request that invokes a particular
   method, or function, on the server and at least one response.  In
   this example, the transaction begins with Alice's softphone sending
   an INVITE request addressed to Bob's SIP URI.  INVITE is an example
   of a SIP method that specifies the action that the requestor (Alice)
   wants the server (Bob) to take.  The INVITE request contains a number
   of header fields.  Header fields are named attributes that provide
   additional information about a message.  The ones present in an
   INVITE include a unique identifier for the call, the destination
   address, Alice's address, and information about the type of session
   that Alice wishes to establish with Bob.  The INVITE (message F1 in
   Figure 1) might look like this:

                     atlanta.com  . . . biloxi.com
                 .      proxy              proxy     .
               .                                       .
       Alice's  . . . . . . . . . . . . . . . . . . . .  Bob's
      softphone                                        SIP Phone
         |                |                |                |
         |    INVITE F1   |                |                |
         |--------------->|    INVITE F2   |                |
         |  100 Trying F3 |--------------->|    INVITE F4   |
         |<---------------|  100 Trying F5 |--------------->|
         |                |<-------------- | 180 Ringing F6 |
         |                | 180 Ringing F7 |<---------------|
         | 180 Ringing F8 |<---------------|     200 OK F9  |
         |<---------------|    200 OK F10  |<---------------|
         |    200 OK F11  |<---------------|                |
         |<---------------|                |                |
         |                       ACK F12                    |
         |------------------------------------------------->|
         |                   Media Session                  |
         |<================================================>|
         |                       BYE F13                    |
         |<-------------------------------------------------|
         |                     200 OK F14                   |
         |------------------------------------------------->|
         |                                                  |

         Figure 1: SIP session setup example with SIP trapezoid

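      INVITE sip:bob@biloxi.com SIP/2.0
      Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds
      Max-Forwards: 70
      To: Bob <sip:bob@biloxi.com>
      From: Alice <sip:alice@atlanta.com>;tag=1928301774
      Call-ID: a84b4c76e66710@pc33.atlanta.com
      CSeq: 314159 INVITE
      Contact: <sip:alice@pc33.atlanta.com>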
      Content-Type: application/sdp
      Content-Length: 142

      (Alice's SDP not shown)

   The first line of the text-encoded message contains the method name
   (INVITE).  The lines that follow are a list of header fields.  This
   example contains a minimum required set.  The header fields are
   briefly described below:

   Via contains the address (pc33.atlanta.com) at which Alice is
   expecting to receive responses to this request.  It also contains a
   branch parameter that identifies this transaction.

   To contains a display name (Bob) and a SIP or SIPS URI
   (sip:bob@biloxi.com) towards which the request was originally
   directed.  Display names are described in RFC 2822 [3].

   From also contains a display name (Alice) and a SIP or SIPS URI
   (sip:alice@atlanta.com) that indicate the originator of the request.
   This header field also has a tag parameter containing a random string
   (1928301774) that was added to the URI by the softphone.  It is used
   for identification purposes.

   Call-ID contains a globally unique identifier for this call,
   generated by the combination of a random string and the softphone's
   host name or IP address.  The combination of the To tag, From tag,
   and Call-ID completely defines a peer-to-peer SIP relationship
   between Alice and Bob and is referred to as a dialog.

   CSeq or Command Sequence contains an integer and a method name.  The
   CSeq number is incremented for each new request within a dialog and
   is a traditional sequence number.

   Contact contains a SIP or SIPS URI that represents a direct route to
   contact Alice, usually composed of a username at a fully qualified
   domain name (FQDN).  While an FQDN is preferred, many end systems do
   not have registered domain names, so IP addresses are permitted.
   While the Via header field tells other elements where to send the
   response, the Contact header field tells other elements where to send
   future requests.

   Max-Forwards serves to limit the number of hops a request can make on
   the way to its destination.  It consists of an integer that is
   decremented by one at each hop.

   Content-Type contains a description of the message body (not shown).

   Content-Length contains an octet (byte) count of the message body.

   The complete set of SIP header fields is defined in Section 20.
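
   The message layout described above (a first line followed by header
   fields, a blank line, and an optional body) can be illustrated with a
   short Python sketch.  This is a minimal sketch and not a conforming
   parser: the function name parse_request is hypothetical, the sample
   text mirrors message F1 with an empty body, and line folding, compact
   header names, and comma-separated values are deliberately ignored.

```python
def parse_request(text):
    """Split a SIP request into (method, request_uri, version, headers)."""
    # The header section ends at the first blank line (CRLF CRLF).
    head, _, _body = text.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, request_uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header field names are case-insensitive, so store them lowercased.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return method, request_uri, version, headers


# Sample request modeled on message F1 (body omitted, so Content-Length is 0).
INVITE = "\r\n".join([
    "INVITE sip:bob@biloxi.com SIP/2.0",
    "Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds",
    "Max-Forwards: 70",
    "To: Bob <sip:bob@biloxi.com>",
    "From: Alice <sip:alice@atlanta.com>;tag=1928301774",
    "Call-ID: a84b4c76e66710@pc33.atlanta.com",
    "CSeq: 314159 INVITE",
    "Content-Length: 0",
    "", "",
])

method, uri, version, headers = parse_request(INVITE)
print(method, uri, headers["cseq"][0])
```

   Each header field name maps to a list because a header field such as
   Via can legitimately appear more than once in a message.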

   The details of the session, such as the type of media, codec, or
   sampling rate, are not described using SIP.  Rather, the body of a
   SIP message contains a description of the session, encoded in some
   other protocol format.  One such format is the Session Description
   Protocol (SDP) (RFC 2327 [1]).  This SDP message (not shown in the
   example) is carried by the SIP message in a way that is analogous to
   a document attachment being carried by an email message, or a web
   page being carried in an HTTP message.

   Since the softphone does not know the location of Bob or the SIP
   server in the biloxi.com domain, the softphone sends the INVITE to
   the SIP server that serves Alice's domain, atlanta.com.  The address
   of the atlanta.com SIP server could have been configured in Alice's
   softphone, or it could have been discovered by DHCP, for example.

   The atlanta.com SIP server is a type of SIP server known as a proxy
   server.  A proxy server receives SIP requests and forwards them on
   behalf of the requestor.  In this example, the proxy server receives
   the INVITE request and sends a 100 (Trying) response back to Alice's
   softphone.  The 100 (Trying) response indicates that the INVITE has
   been received and that the proxy is working on her behalf to route
   the INVITE to the destination.  Responses in SIP use a three-digit
   code followed by a descriptive phrase.  This response contains the
   same To, From, Call-ID, CSeq and branch parameter in the Via as the
   INVITE, which allows Alice's softphone to correlate this response to
   the sent INVITE.  The atlanta.com proxy server locates the proxy
   server at biloxi.com, possibly by performing a particular type of DNS
   (Domain Name System) lookup to find the SIP server that serves the
   biloxi.com domain.  This is described in [4].  As a result, it
   obtains the IP address of the biloxi.com proxy server and forwards,
   or proxies, the INVITE request there.  Before forwarding the request,
   the atlanta.com proxy server adds an additional Via header field
   value that contains its own address (the INVITE already contains
   Alice's address in the first Via).  The biloxi.com proxy server
   receives the INVITE and responds with a 100 (Trying) response back to
   the atlanta.com proxy server to indicate that it has received the
   INVITE and is processing the request.  The proxy server consults a
   database, generically called a location service, that contains the
   current IP address of Bob.  (We shall see in the next section how
   this database can be populated.)  The biloxi.com proxy server adds
   another Via header field value with its own address to the INVITE and
   proxies it to Bob's SIP phone.
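
   The per-hop changes described above can be sketched in Python.  This
   is only an illustration under stated assumptions: forward_request is
   a hypothetical helper, headers are held in a plain dictionary of
   lists, and the proxy host names and branch values are sample data.
   It shows a proxy decrementing Max-Forwards and placing its own Via
   value on top of the existing ones.

```python
def forward_request(headers, proxy_host, branch):
    """Return a copy of the headers as a proxy might forward them."""
    out = {name: list(values) for name, values in headers.items()}
    hops = int(out["max-forwards"][0])
    if hops <= 0:
        # A request that has used up its hops is not forwarded.
        raise ValueError("Max-Forwards exhausted; request is not forwarded")
    out["max-forwards"] = [str(hops - 1)]
    # The new Via value goes on top, ahead of the values already present.
    out["via"].insert(0, f"SIP/2.0/UDP {proxy_host};branch={branch}")
    return out


# F1 as sent by Alice, reduced to the two header fields of interest.
f1 = {
    "via": ["SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds"],
    "max-forwards": ["70"],
}
f2 = forward_request(f1, "bigbox3.site3.atlanta.com", "z9hG4bK77ef4c2312983.1")
f4 = forward_request(f2, "server10.biloxi.com", "z9hG4bKnashds8")
print(f4["max-forwards"][0], len(f4["via"]))
```

   After two proxies, the request carries three Via values, with the
   most recently visited proxy first and Alice's softphone last.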

   Bob's SIP phone receives the INVITE and alerts Bob to the incoming
   call from Alice so that Bob can decide whether to answer the call,
   that is, Bob's phone rings.  Bob's SIP phone indicates this in a 180
   (Ringing) response, which is routed back through the two proxies in
   the reverse direction.  Each proxy uses the Via header field to
   determine where to send the response and removes its own address from
   the top.  As a result, although DNS and location service lookups were
   required to route the initial INVITE, the 180 (Ringing) response can
   be returned to the caller without lookups or without state being
   maintained in the proxies.  This also has the desirable property that
   each proxy that sees the INVITE will also see all responses to the
   INVITE.
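
   The way a response retraces the request path using only the Via
   header field can be sketched as follows.  This is a simplified
   illustration: route_response and sent_by are hypothetical names, the
   Via values are sample data, and a real element would also take the
   received parameter and the port into account.

```python
def sent_by(via_value):
    """Extract the host from a Via value such as 'SIP/2.0/UDP host;branch=x'."""
    return via_value.split(";")[0].split()[1]


def route_response(via_values, own_host):
    """Remove this proxy's Via value and return (next_hop, remaining values)."""
    if sent_by(via_values[0]) != own_host:
        raise ValueError("top Via does not name this proxy")
    remaining = via_values[1:]
    if not remaining:
        raise ValueError("no Via value left to route the response")
    # The next Via value down names where the response goes next.
    return sent_by(remaining[0]), remaining


vias = [
    "SIP/2.0/UDP server10.biloxi.com;branch=z9hG4bKnashds8",
    "SIP/2.0/UDP bigbox3.site3.atlanta.com;branch=z9hG4bK77ef4c2312983.1",
    "SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds",
]
hop1, vias1 = route_response(vias, "server10.biloxi.com")
hop2, vias2 = route_response(vias1, "bigbox3.site3.atlanta.com")
print(hop1, hop2)
```

   No DNS or location service lookup appears anywhere in this sketch,
   which is the point made in the text: the path is already recorded in
   the message itself.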

   When Alice's softphone receives the 180 (Ringing) response, it passes
   this information to Alice, perhaps using an audio ringback tone or by
   displaying a message on Alice's screen.

   In this example, Bob decides to answer the call.  When he picks up
   the handset, his SIP phone sends a 200 (OK) response to indicate that
   the call has been answered.  The 200 (OK) contains a message body
   with the SDP media description of the type of session that Bob is
   willing to establish with Alice.  As a result, there is a two-phase
   exchange of SDP messages: Alice sent one to Bob, and Bob sent one
   back to Alice.  This two-phase exchange provides basic negotiation
   capabilities and is based on a simple offer/answer model of SDP
   exchange.  If Bob did not wish to answer the call or was busy on
   another call, an error response would have been sent instead of the
   200 (OK), which would have resulted in no media session being
   established.  The complete list of SIP response codes is in Section
   21.  The 200 (OK) (message F9 in Figure 1) might look like this as
   Bob sends it out:

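      SIP/2.0 200 OK
      Via: SIP/2.0/UDP server10.biloxi.com
         ;branch=z9hG4bKnashds8;received=192.0.2.3
      Via: SIP/2.0/UDP bigbox3.site3.atlanta.com
         ;branch=z9hG4bK77ef4c2312983.1;received=192.0.2.2
      Via: SIP/2.0/UDP pc33.atlanta.com
         ;branch=z9hG4bK776asdhds ;received=192.0.2.1
      To: Bob <sip:bob@biloxi.com>;tag=a6c85cf
      From: Alice <sip:alice@atlanta.com>;tag=1928301774
      Call-ID: a84b4c76e66710@pc33.atlanta.com
      CSeq: 314159 INVITE
      Contact: <sip:bob@192.0.2.4>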
      Content-Type: application/sdp
      Content-Length: 131

      (Bob's SDP not shown)

   The first line of the response contains the response code (200) and
   the reason phrase (OK).  The remaining lines contain header fields.
   The Via, To, From, Call-ID, and CSeq header fields are copied from
   the INVITE request.  (There are three Via header field values - one
   added by Alice's SIP phone, one added by the atlanta.com proxy, and
   one added by the biloxi.com proxy.)  Bob's SIP phone has added a tag
   parameter to the To header field.  This tag will be incorporated by
   both endpoints into the dialog and will be included in all future
   requests and responses in this call.  The Contact header field
   contains a URI at which Bob can be directly reached at his SIP phone.
   The Content-Type and Content-Length refer to the message body (not
   shown) that contains Bob's SDP media information.
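
   Once Bob's phone has added its tag, both endpoints hold the three
   values that identify the dialog.  The following Python sketch shows
   how each side might record them.  The class and function names are
   hypothetical, and the tag and Call-ID values are sample data.  The
   point illustrated is that the same two tags swap roles: what is
   local for the caller is remote for the callee.

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def dialog_id_for_caller(call_id, from_tag, to_tag):
    # The caller wrote the From tag, so that tag is its local tag.
    return DialogId(call_id, local_tag=from_tag, remote_tag=to_tag)


def dialog_id_for_callee(call_id, from_tag, to_tag):
    # The callee wrote the To tag, so that tag is its local tag.
    return DialogId(call_id, local_tag=to_tag, remote_tag=from_tag)


CALL_ID = "a84b4c76e66710@pc33.atlanta.com"
alice = dialog_id_for_caller(CALL_ID, "1928301774", "a6c85cf")
bob = dialog_id_for_callee(CALL_ID, "1928301774", "a6c85cf")
print(alice)
print(bob)
```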

   In addition to DNS and location service lookups shown in this
   example, proxy servers can make flexible "routing decisions" to
   decide where to send a request.  For example, if Bob's SIP phone
   returned a 486 (Busy Here) response, the biloxi.com proxy server
   could proxy the INVITE to Bob's voicemail server.  A proxy server can
   also send an INVITE to a number of locations at the same time.  This
   type of parallel search is known as forking.
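
   A routing decision of the kind described above, falling back to
   voicemail when the phone is busy, can be sketched in Python.  Note
   that this sketch tries targets one after another, which is a
   sequential search and not forking.  The function route_with_fallback,
   the send callback, and the voicemail URI are all hypothetical and
   exist only for illustration.

```python
def route_with_fallback(targets, send):
    """Try each target in order until one answers with a 2xx response.

    `send` is a caller-supplied function that delivers the request to a
    target and returns the final status code it received.
    """
    last_status = None
    for target in targets:
        last_status = send(target)
        if 200 <= last_status < 300:
            return target, last_status
    # Nobody accepted; report the last failure that was seen.
    return None, last_status


def fake_send(target):
    # Stand-in for the network: Bob's phone is busy, everything else answers.
    return 486 if target == "sip:bob@192.0.2.4" else 200


PHONE = "sip:bob@192.0.2.4"
VOICEMAIL = "sip:bob@voicemail.example.com"  # hypothetical voicemail server
chosen, status = route_with_fallback([PHONE, VOICEMAIL], fake_send)
print(chosen, status)
```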

   In this case, the 200 (OK) is routed back through the two proxies and
   is received by Alice's softphone, which then stops the ringback tone
   and indicates that the call has been answered.  Finally, Alice's
   softphone sends an acknowledgement message, ACK, to Bob's SIP phone
   to confirm the reception of the final response (200 (OK)).  In this
   example, the ACK is sent directly from Alice's softphone to Bob's SIP
   phone, bypassing the two proxies.  This occurs because the endpoints
   have learned each other's address from the Contact header fields
   through the INVITE/200 (OK) exchange, which was not known when the
   initial INVITE was sent.  The lookups performed by the two proxies
   are no longer needed, so the proxies drop out of the call flow.  This
   completes the INVITE/200/ACK three-way handshake used to establish
   SIP sessions.  Full details on session setup are in Section 13.

   Alice and Bob's media session has now begun, and they send media
   packets using the format to which they agreed in the exchange of SDP.
   In general, the end-to-end media packets take a different path from
   the SIP signaling messages.

   During the session, either Alice or Bob may decide to change the
   characteristics of the media session.  This is accomplished by
   sending a re-INVITE containing a new media description.  This re-
   INVITE references the existing dialog so that the other party knows
   that it is to modify an existing session instead of establishing a
   new session.  The other party sends a 200 (OK) to accept the change.
   The requestor responds to the 200 (OK) with an ACK.  If the other
   party does not accept the change, he sends an error response such as
   488 (Not Acceptable Here), which also receives an ACK.  However, the
   failure of the re-INVITE does not cause the existing call to fail -
   the session continues using the previously negotiated
   characteristics.  Full details on session modification are in Section
   14.

   At the end of the call, Bob disconnects (hangs up) first and
   generates a BYE message.  This BYE is routed directly to Alice's
   softphone, again bypassing the proxies.  Alice confirms receipt of
   the BYE with a 200 (OK) response, which terminates the session and
   the BYE transaction.  No ACK is sent - an ACK is only sent in
   response to a response to an INVITE request.  The reasons for this
   special handling for INVITE will be discussed later, but relate to
   the reliability mechanisms in SIP, the length of time it can take for
   a ringing phone to be answered, and forking.  For this reason,
   request handling in SIP is often classified as either INVITE or non-
   INVITE, referring to all other methods besides INVITE.  Full details
   on session termination are in Section 15.
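
   The INVITE versus non-INVITE distinction can be condensed into a few
   lines of Python.  The function names below are hypothetical.  The
   sketch only encodes what the text states: 1xx responses are
   provisional, all other classes are final, and an ACK follows a final
   response to an INVITE and nothing else.

```python
def is_provisional(status):
    return 100 <= status <= 199


def is_final(status):
    return 200 <= status <= 699


def needs_ack(method, status):
    """An ACK is sent only for a final response to an INVITE."""
    return method == "INVITE" and is_final(status)


for method, status in [("INVITE", 180), ("INVITE", 200),
                       ("INVITE", 488), ("BYE", 200)]:
    print(method, status, needs_ack(method, status))
```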

   Section 24.2 describes the messages shown in Figure 1 in full.

   In some cases, it may be useful for proxies in the SIP signaling path
   to see all the messaging between the endpoints for the duration of
   the session.  For example, if the biloxi.com proxy server wished to
   remain in the SIP messaging path beyond the initial INVITE, it would
   add to the INVITE a required routing header field known as Record-
   Route that contained a URI resolving to the hostname or IP address of
   the proxy.  This information would be received by both Bob's SIP
   phone and (due to the Record-Route header field being passed back in
   the 200 (OK)) Alice's softphone and stored for the duration of the
   dialog.  The biloxi.com proxy server would then receive and proxy the
   ACK, BYE, and 200 (OK) to the BYE.  Each proxy can independently
   decide to receive subsequent messages, and those messages will pass
   through all proxies that elect to receive them.  This capability is
   frequently used for proxies that are providing mid-call features.
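
   How the endpoints might store the Record-Route information can be
   sketched as follows.  The sketch relies on the dialog rules given
   later in this specification (Section 12): the callee keeps the
   Record-Route values in the order received, while the caller reverses
   them.  The function names and the two proxy URIs are hypothetical,
   and the example assumes that both proxies chose to record-route.

```python
def route_set_for_callee(record_route):
    """Callee side: keep the Record-Route values in the order received."""
    return list(record_route)


def route_set_for_caller(record_route):
    """Caller side: take the same values in reverse order."""
    return list(reversed(record_route))


# Each proxy adds itself on top, so the proxy nearest the callee is first.
record_route = ["<sip:p2.example.com;lr>", "<sip:p1.example.com;lr>"]

print(route_set_for_caller(record_route))
print(route_set_for_callee(record_route))
```

   With these route sets, a later request from either side visits the
   proxy nearest to the sender first, so it passes through the same
   proxies regardless of who sends it.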

   Registration is another common operation in SIP.  Registration is one
   way that the biloxi.com server can learn the current location of Bob.
   Upon initialization, and at periodic intervals, Bob's SIP phone sends
   REGISTER messages to a server in the biloxi.com domain known as a SIP
   registrar.  The REGISTER messages associate Bob's SIP or SIPS URI
   (sip:bob@biloxi.com) with the machine into which he is currently
   logged (conveyed as a SIP or SIPS URI in the Contact header field).
   The registrar writes this association, also called a binding, to a
   database, called the location service, where it can be used by the
   proxy in the biloxi.com domain.  Often, a registrar server for a
   domain is co-located with the proxy for that domain.  It is an
   important concept that the distinction between types of SIP servers
   is logical, not physical.

   Bob is not limited to registering from a single device.  For example,
   both his SIP phone at home and the one in the office could send
   registrations.  This information is stored together in the location
   service and allows a proxy to perform various types of searches to
   locate Bob.  Similarly, more than one user can be registered on a
   single device at the same time.

   The location service is just an abstract concept.  It generally
   contains information that allows a proxy to input a URI and receive a
   set of zero or more URIs that tell the proxy where to send the
   request.  Registrations are one way to create this information, but
   not the only way.  Arbitrary mapping functions can be configured at
   the discretion of the administrator.
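
   Because the location service is abstract, almost any mapping from
   one URI to a set of URIs can play the role.  The following Python
   sketch uses an in-memory dictionary.  The class and method names are
   hypothetical, the contact addresses are sample data, and binding
   expiration is left out to keep the example short.

```python
class LocationService:
    """Maps an address-of-record to zero or more contact URIs."""

    def __init__(self):
        self._bindings = {}

    def register(self, aor, contact):
        # What a registrar would do on receiving a REGISTER request.
        self._bindings.setdefault(aor, set()).add(contact)

    def unregister(self, aor, contact):
        self._bindings.get(aor, set()).discard(contact)

    def lookup(self, aor):
        # What a proxy would do when routing an incoming request.
        return sorted(self._bindings.get(aor, ()))


service = LocationService()
service.register("sip:bob@biloxi.com", "sip:bob@192.0.2.4")   # office phone
service.register("sip:bob@biloxi.com", "sip:bob@192.0.2.77")  # home phone

print(service.lookup("sip:bob@biloxi.com"))
print(service.lookup("sip:carol@biloxi.com"))
```

   A lookup for an unknown address-of-record returns an empty list,
   matching the "zero or more URIs" wording above.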

   Finally, it is important to note that in SIP, registration is used
   for routing incoming SIP requests and has no role in authorizing
   outgoing requests.  Authorization and authentication are handled in
   SIP either on a request-by-request basis with a challenge/response
   mechanism, or by using a lower layer scheme as discussed in Section
   26.

   The complete set of SIP message details for this registration example
   is in Section 24.1.

   Additional operations in SIP, such as querying for the capabilities
   of a SIP server or client using OPTIONS, or canceling a pending
   request using CANCEL, will be introduced in later sections.

5 Structure of the Protocol

   SIP is structured as a layered protocol, which means that its
   behavior is described in terms of a set of fairly independent
   processing stages with only a loose coupling between each stage.  The
   protocol behavior is described as layers for the purpose of
   presentation, allowing the description of functions common across
   elements in a single section.  It does not dictate an implementation
   in any way.  When we say that an element "contains" a layer, we mean
   it is compliant to the set of rules defined by that layer.

   Not every element specified by the protocol contains every layer.
   Furthermore, the elements specified by SIP are logical elements, not
   physical ones.  A physical realization can choose to act as different
   logical elements, perhaps even on a transaction-by-transaction basis.

   The lowest layer of SIP is its syntax and encoding.  Its encoding is
   specified using an augmented Backus-Naur Form grammar (BNF).  The
   complete BNF is specified in Section 25; an overview of a SIP
   message's structure can be found in Section 7.

   The second layer is the transport layer.  It defines how a client
   sends requests and receives responses and how a server receives
   requests and sends responses over the network.  All SIP elements
   contain a transport layer.  The transport layer is described in
   Section 18.

   The third layer is the transaction layer.  Transactions are a
   fundamental component of SIP.  A transaction is a request sent by a
   client transaction (using the transport layer) to a server
   transaction, along with all responses to that request sent from the
   server transaction back to the client.  The transaction layer handles
   application-layer retransmissions, matching of responses to requests,
   and application-layer timeouts.  Any task that a user agent client
   (UAC) accomplishes takes place using a series of transactions.
   Discussion of transactions can be found in Section 17.  User agents
   contain a transaction layer, as do stateful proxies.  Stateless
   proxies do not contain a transaction layer.  The transaction layer
   has a client component (referred to as a client transaction) and a
   server component (referred to as a server transaction), each of which
   is represented by a finite state machine that is constructed to
   process a particular request.
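
   The "matching of responses to requests" task of the transaction
   layer can be sketched in Python.  The sketch keys each client
   transaction on the branch parameter of the top Via value together
   with the method in CSeq, which is the matching rule given in Section
   17.  The class name and callback interface are hypothetical, and the
   state machines and timers of a real transaction are omitted: here a
   transaction is simply dropped when a final response arrives.

```python
class ClientTransactions:
    """Tracks pending client transactions and hands responses to them."""

    def __init__(self):
        self._pending = {}

    def start(self, branch, method, on_response):
        self._pending[(branch, method)] = on_response

    def dispatch(self, top_via_branch, cseq_method, status):
        """Deliver a response; return False if no transaction matches."""
        key = (top_via_branch, cseq_method)
        handler = self._pending.get(key)
        if handler is None:
            return False
        handler(status)
        if status >= 200:
            # Simplification: a final response ends the transaction at once.
            del self._pending[key]
        return True


seen = []
transactions = ClientTransactions()
transactions.start("z9hG4bK776asdhds", "INVITE", seen.append)

results = [
    transactions.dispatch("z9hG4bK776asdhds", "INVITE", 100),
    transactions.dispatch("z9hG4bK776asdhds", "INVITE", 180),
    transactions.dispatch("z9hG4bKunknown", "INVITE", 200),
    transactions.dispatch("z9hG4bK776asdhds", "INVITE", 200),
]
print(seen, results)
```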

   The layer above the transaction layer is called the transaction user
   (TU).  Each of the SIP entities, except the stateless proxy, is a
   transaction user.  When a TU wishes to send a request, it creates a
   client transaction instance and passes it the request along with the
   destination IP address, port, and transport to which to send the
   request.  A TU that creates a client transaction can also cancel it.
   When a client cancels a transaction, it requests that the server stop
   further processing, revert to the state that existed before the
   transaction was initiated, and generate a specific error response to
   that transaction.  This is done with a CANCEL request, which
   constitutes its own transaction, but references the transaction to be
   cancelled (Section 9).

   The SIP elements, that is, user agent clients and servers, stateless
   and stateful proxies and registrars, contain a core that
   distinguishes them from each other.  Cores, except for the stateless
   proxy, are transaction users.  While the behavior of the UAC and UAS
   cores depends on the method, there are some common rules for all
   methods (Section 8).  For a UAC, these rules govern the construction
   of a request; for a UAS, they govern the processing of a request and
   generating a response.  Since registrations play an important role in
   SIP, a UAS that handles a REGISTER is given the special name
   registrar.  Section 10 describes UAC and UAS core behavior for the
   REGISTER method.  Section 11 describes UAC and UAS core behavior for
   the OPTIONS method, used for determining the capabilities of a UA.

   Certain other requests are sent within a dialog.  A dialog is a
   peer-to-peer SIP relationship between two user agents that persists
   for some time.  The dialog facilitates sequencing of messages and
   proper routing of requests between the user agents.  The INVITE
   method is the only way defined in this specification to establish a
   dialog.  When a UAC sends a request that is within the context of a
   dialog, it follows the common UAC rules as discussed in Section 8 but
   also the rules for mid-dialog requests.  Section 12 discusses dialogs
   and presents the procedures for their construction and maintenance,
   in addition to construction of requests within a dialog.

   The most important method in SIP is the INVITE method, which is used
   to establish a session between participants.  A session is a
   collection of participants, and streams of media between them, for
   the purposes of communication.  Section 13 discusses how sessions are
   initiated, resulting in one or more SIP dialogs.  Section 14
   discusses how characteristics of that session are modified through
   the use of an INVITE request within a dialog.  Finally, section 15
   discusses how a session is terminated.

   The procedures of Sections 8, 10, 11, 12, 13, 14, and 15 deal
   entirely with the UA core (Section 9 describes cancellation, which
   applies to both UA core and proxy core).  Section 16 discusses the
   proxy element, which facilitates routing of messages between user
   agents.

6 Definitions

   The following terms have special significance for SIP.

      Address-of-Record: An address-of-record (AOR) is a SIP or SIPS URI
         that points to a domain with a location service that can map
         the URI to another URI where the user might be available.
         Typically, the location service is populated through
         registrations.  An AOR is frequently thought of as the "public
         address" of the user.

      Back-to-Back User Agent: A back-to-back user agent (B2BUA) is a
         logical entity that receives a request and processes it as a
         user agent server (UAS).  In order to determine how the request
         should be answered, it acts as a user agent client (UAC) and
         generates requests.  Unlike a proxy server, it maintains dialog
         state and must participate in all requests sent on the dialogs
         it has established.  Since it is a concatenation of a UAC and
         UAS, no explicit definitions are needed for its behavior.

      Call: A call is an informal term that refers to some communication
         between peers, generally set up for the purposes of a
         multimedia conversation.

      Call Leg: Another name for a dialog [31]; no longer used in this
         specification.

      Call Stateful: A proxy is call stateful if it retains state for a
         dialog from the initiating INVITE to the terminating BYE
         request.  A call stateful proxy is always transaction stateful,
         but the converse is not necessarily true.

      Client: A client is any network element that sends SIP requests
         and receives SIP responses.  Clients may or may not interact
         directly with a human user.  User agent clients and proxies are
         clients.

      Conference: A multimedia session (see below) that contains
         multiple participants.

      Core: Core designates the functions specific to a particular type
         of SIP entity, i.e., specific to either a stateful or stateless
         proxy, a user agent or registrar.  All cores, except those for
         the stateless proxy, are transaction users.

      Dialog: A dialog is a peer-to-peer SIP relationship between two
         UAs that persists for some time.  A dialog is established by
         SIP messages, such as a 2xx response to an INVITE request.  A
         dialog is identified by a call identifier, local tag, and a
         remote tag.  A dialog was formerly known as a call leg in RFC
         2543.

      Downstream: A direction of message forwarding within a transaction
         that refers to the direction that requests flow from the user
         agent client to user agent server.

      Final Response: A response that terminates a SIP transaction, as
         opposed to a provisional response that does not.  All 2xx, 3xx,
         4xx, 5xx and 6xx responses are final.

      Header: A header is a component of a SIP message that conveys
         information about the message.  It is structured as a sequence
         of header fields.

      Header Field: A header field is a component of the SIP message
         header.  A header field can appear as one or more header field
         rows. Header field rows consist of a header field name and zero
         or more header field values. Multiple header field values on a
         given header field row are separated by commas. Some header
         fields can only have a single header field value, and as a
         result, always appear as a single header field row.

      Header Field Value: A header field value is a single value; a
         header field consists of zero or more header field values.

      Home Domain: The domain providing service to a SIP user.
         Typically, this is the domain present in the URI in the
         address-of-record of a registration.

      Informational Response: Same as a provisional response.

      Initiator, Calling Party, Caller: The party initiating a session
         (and dialog) with an INVITE request.  A caller retains this
         role from the time it sends the initial INVITE that established
         a dialog until the termination of that dialog.

      Invitation: An INVITE request.

      Invitee, Invited User, Called Party, Callee: The party that
         receives an INVITE request for the purpose of establishing a
         new session.  A callee retains this role from the time it
         receives the INVITE until the termination of the dialog
         established by that INVITE.

      Location Service: A location service is used by a SIP redirect or
         proxy server to obtain information about a callee's possible
         location(s).  It contains a list of bindings of address-of-
         record keys to zero or more contact addresses.  The bindings
         can be created and removed in many ways; this specification
         defines a REGISTER method that updates the bindings.

      Loop: A request that arrives at a proxy, is forwarded, and later
         arrives back at the same proxy.  When it arrives the second
         time, its Request-URI is identical to the first time, and other
         header fields that affect proxy operation are unchanged, so
         that the proxy would make the same processing decision on the
         request it made the first time.  Looped requests are errors,
         and the procedures for detecting them and handling them are
         described by the protocol.

      Loose Routing: A proxy is said to be loose routing if it follows
         the procedures defined in this specification for processing of
         the Route header field.  These procedures separate the
         destination of the request (present in the Request-URI) from
         the set of proxies that need to be visited along the way
         (present in the Route header field).  A proxy compliant to
         these mechanisms is also known as a loose router.

      Message: Data sent between SIP elements as part of the protocol.
         SIP messages are either requests or responses.

      Method: The method is the primary function that a request is meant
         to invoke on a server.  The method is carried in the request
         message itself.  Example methods are INVITE and BYE.

      Outbound Proxy: A proxy that receives requests from a client, even
         though it may not be the server resolved by the Request-URI.
         Typically, a UA is manually configured with an outbound proxy,
         or can learn about one through auto-configuration protocols.

      Parallel Search: In a parallel search, a proxy issues several
         requests to possible user locations upon receiving an incoming
         request.  Rather than issuing one request and then waiting for
         the final response before issuing the next request as in a
         sequential search, a parallel search issues requests without
         waiting for the result of previous requests.

      Provisional Response: A response used by the server to indicate
         progress, but that does not terminate a SIP transaction.  1xx
         responses are provisional, other responses are considered
         final.

      Proxy, Proxy Server: An intermediary entity that acts as both a
         server and a client for the purpose of making requests on
         behalf of other clients.  A proxy server primarily plays the
         role of routing, which means its job is to ensure that a
         request is sent to another entity "closer" to the targeted
         user.  Proxies are also useful for enforcing policy (for
         example, making sure a user is allowed to make a call).  A
         proxy interprets, and, if necessary, rewrites specific parts of
         a request message before forwarding it.

      Recursion: A client recurses on a 3xx response when it generates a
         new request to one or more of the URIs in the Contact header
         field in the response.

      Redirect Server: A redirect server is a user agent server that
         generates 3xx responses to requests it receives, directing the
         client to contact an alternate set of URIs.

      Registrar: A registrar is a server that accepts REGISTER requests
         and places the information it receives in those requests into
         the location service for the domain it handles.

      Regular Transaction: A regular transaction is any transaction with
         a method other than INVITE, ACK, or CANCEL.

      Request: A SIP message sent from a client to a server, for the
         purpose of invoking a particular operation.

      Response: A SIP message sent from a server to a client, for
         indicating the status of a request sent from the client to the
         server.

      Ringback: Ringback is the signaling tone produced by the calling
         party's application indicating that a called party is being
         alerted (ringing).

      Route Set: A route set is an ordered collection of SIP or SIPS
         URIs that represent a list of proxies that must be traversed
         when sending a particular request.  A route set can be learned,
         through headers like Record-Route, or it can be configured.

      Server: A server is a network element that receives requests in
         order to service them and sends back responses to those
         requests.  Examples of servers are proxies, user agent servers,
         redirect servers, and registrars.

      Sequential Search: In a sequential search, a proxy server attempts
         each contact address in sequence, proceeding to the next one
         only after the previous has generated a final response.  A 2xx
         or 6xx class final response always terminates a sequential
         search.

      Session: From the SDP specification: "A multimedia session is a
         set of multimedia senders and receivers and the data streams
         flowing from senders to receivers.  A multimedia conference is
         an example of a multimedia session." (RFC 2327 [1]) (A session
         as defined for SDP can comprise one or more RTP sessions.)  As
         defined, a callee can be invited several times, by different
         calls, to the same session.  If SDP is used, a session is
         defined by the concatenation of the SDP user name, session id,
         network type, address type, and address elements in the origin
         field.

      SIP Transaction: A SIP transaction occurs between a client and a
         server and comprises all messages from the first request sent
         from the client to the server up to a final (non-1xx) response
         sent from the server to the client.  If the request is INVITE
         and the final response is a non-2xx, the transaction also
         includes an ACK to the response.  The ACK for a 2xx response to
         an INVITE request is a separate transaction.

      Spiral: A spiral is a SIP request that is routed to a proxy,
         forwarded onwards, and arrives once again at that proxy, but
         this time differs in a way that will result in a different
         processing decision than the original request.  Typically, this
         means that the request's Request-URI differs from its previous
         arrival.  A spiral is not an error condition, unlike a loop.  A
         typical cause for this is call forwarding.  A user calls Joe.
         The proxy forwards the request to Joe's PC, which in turn
         forwards it to a different user's address.  This request is
         proxied back to the same proxy.  However, this is not a loop.
         Since the request is targeted at a different user, it is
         considered a spiral, and is a valid condition.

      Stateful Proxy: A logical entity that maintains the client and
         server transaction state machines defined by this specification
         during the processing of a request, also known as a transaction
         stateful proxy.  The behavior of a stateful proxy is further
         defined in Section 16.  A (transaction) stateful proxy is not
         the same as a call stateful proxy.

      Stateless Proxy: A logical entity that does not maintain the
         client or server transaction state machines defined in this
         specification when it processes requests.  A stateless proxy
         forwards every request it receives downstream and every
         response it receives upstream.

      Strict Routing: A proxy is said to be strict routing if it follows
         the Route processing rules of RFC 2543 and many prior work in
         progress versions of this RFC.  That rule caused proxies to
         destroy the contents of the Request-URI when a Route header
         field was present.  Strict routing behavior is not used in this
         specification, in favor of a loose routing behavior.  Proxies
         that perform strict routing are also known as strict routers.

      Target Refresh Request: A target refresh request sent within a
         dialog is defined as a request that can modify the remote
         target of the dialog.

      Transaction User (TU): The layer of protocol processing that
         resides above the transaction layer.  Transaction users include
         the UAC core, UAS core, and proxy core.

      Upstream: A direction of message forwarding within a transaction
         that refers to the direction that responses flow from the user
         agent server back to the user agent client.

      URL-encoded: A character string encoded according to RFC 2396,
         Section 2.4 [5].

      User Agent Client (UAC): A user agent client is a logical entity
         that creates a new request, and then uses the client
         transaction state machinery to send it.  The role of UAC lasts
         only for the duration of that transaction.  In other words, if
         a piece of software initiates a request, it acts as a UAC for
         the duration of that transaction.  If it receives a request
         later, it assumes the role of a user agent server for the
         processing of that transaction.

      UAC Core: The set of processing functions required of a UAC that
         reside above the transaction and transport layers.

      User Agent Server (UAS): A user agent server is a logical entity
         that generates a response to a SIP request.  The response
         accepts, rejects, or redirects the request.  This role lasts
         only for the duration of that transaction.  In other words, if
         a piece of software responds to a request, it acts as a UAS for
         the duration of that transaction.  If it generates a request
         later, it assumes the role of a user agent client for the
         processing of that transaction.

      UAS Core: The set of processing functions required of a UAS that
         reside above the transaction and transport layers.

      User Agent (UA): A logical entity that can act as both a user
         agent client and user agent server.

   The roles of UAC and UAS, as well as proxy and redirect servers, are
   defined on a transaction-by-transaction basis.  For example, the user
   agent initiating a call acts as a UAC when sending the initial INVITE
   request and as a UAS when receiving a BYE request from the callee.
   Similarly, the same software can act as a proxy server for one
   request and as a redirect server for the next request.

   Proxy, location, and registrar servers defined above are logical
   entities; implementations MAY combine them into a single application.

7 SIP Messages

   SIP is a text-based protocol and uses the UTF-8 charset (RFC 2279
   [7]).

   A SIP message is either a request from a client to a server, or a
   response from a server to a client.

   Both Request (section 7.1) and Response (section 7.2) messages use
   the basic format of RFC 2822 [3], even though the syntax differs in
   character set and syntax specifics.  (SIP allows header fields that
   would not be valid RFC 2822 header fields, for example.)  Both types
   of messages consist of a start-line, one or more header fields, an
   empty line indicating the end of the header fields, and an optional
   message-body.

         generic-message  =  start-line
                             *message-header
                             CRLF
                             [ message-body ]
         start-line       =  Request-Line / Status-Line

   The start-line, each message-header line, and the empty line MUST be
   terminated by a carriage-return line-feed sequence (CRLF).  Note that
   the empty line MUST be present even if the message-body is not.
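
   As an illustration of the generic-message structure above, the
   following Python sketch splits a raw message into its start-line,
   header lines, and body.  It is not part of this specification.  The
   function name split_message and the example.invalid address are
   assumptions made for the example, and folded header lines are left
   as separate raw lines.

```python
def split_message(raw: bytes):
    """Split a raw SIP message into (start_line, header_lines, body).

    Hypothetical helper for illustration only; it does not validate
    the start-line or the individual header fields.
    """
    # The empty line (CRLF CRLF) marks the end of the header fields.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("missing empty line after header fields")
    # SIP is text-based and uses UTF-8.
    lines = head.decode("utf-8").split("\r\n")
    start_line, header_lines = lines[0], lines[1:]
    return start_line, header_lines, body


# Example message with an assumed placeholder address.
msg = (b"OPTIONS sip:user@example.invalid SIP/2.0\r\n"
       b"Max-Forwards: 70\r\n"
       b"Content-Length: 0\r\n"
       b"\r\n")
start, headers, body = split_message(msg)
```

   Note that the empty line is required even when there is no body, so
   the sketch rejects input that lacks it.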

   Except for the above difference in character sets, much of SIP's
   message and header field syntax is identical to HTTP/1.1.  Rather
   than repeating the syntax and semantics here, we use [HX.Y] to refer
   to Section X.Y of the current HTTP/1.1 specification (RFC 2616 [8]).

   However, SIP is not an extension of HTTP.

7.1 Requests

   SIP requests are distinguished by having a Request-Line for a
   start-line.  A Request-Line contains a method name, a Request-URI,
   and the protocol version separated by a single space (SP) character.

   The Request-Line ends with CRLF.  No CR or LF are allowed except in
   the end-of-line CRLF sequence.  No linear whitespace (LWS) is allowed
   in any of the elements.

         Request-Line  =  Method SP Request-URI SP SIP-Version CRLF

      Method: This specification defines six methods: REGISTER for
           registering contact information, INVITE, ACK, and CANCEL for
           setting up sessions, BYE for terminating sessions, and
           OPTIONS for querying servers about their capabilities.  SIP
           extensions, documented in standards track RFCs, may define
           additional methods.

      Request-URI: The Request-URI is a SIP or SIPS URI as described in
           Section 19.1 or a general URI (RFC 2396 [5]).  It indicates
           the user or service to which this request is being addressed.
           The Request-URI MUST NOT contain unescaped spaces or control
           characters and MUST NOT be enclosed in "<>".

           SIP elements MAY support Request-URIs with schemes other than
           "sip" and "sips", for example the "tel" URI scheme of RFC
           2806 [9].  SIP elements MAY translate non-SIP URIs using any
           mechanism at their disposal, resulting in SIP URI, SIPS URI,
           or some other scheme.

      SIP-Version: Both request and response messages include the
           version of SIP in use, and follow [H3.1] (with HTTP replaced
           by SIP, and HTTP/1.1 replaced by SIP/2.0) regarding version
           ordering, compliance requirements, and upgrading of version
           numbers.  To be compliant with this specification,
           applications sending SIP messages MUST include a SIP-Version
           of "SIP/2.0".  The SIP-Version string is case-insensitive,
           but implementations MUST send upper-case.

           Unlike HTTP/1.1, SIP treats the version number as a literal
           string.  In practice, this should make no difference.
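
   The Request-Line rules in this section can be sketched in Python as
   follows.  This is a minimal illustration, not a complete parser: the
   function name parse_request_line and the example.invalid address are
   assumptions, the line is expected with its CRLF already removed, and
   the Request-URI itself is not validated.

```python
import re

# The six methods defined by this specification; extensions may add more.
CORE_METHODS = {"REGISTER", "INVITE", "ACK", "CANCEL", "BYE", "OPTIONS"}


def parse_request_line(line: str):
    """Return (method, request_uri, sip_version) for a Request-Line."""
    # Elements are separated by exactly one SP, so a plain split on " "
    # yields empty strings when extra spaces are present.
    parts = line.split(" ")
    if len(parts) != 3 or any(not p or re.search(r"\s", p) for p in parts):
        raise ValueError("malformed Request-Line")
    method, request_uri, version = parts
    # The Request-URI MUST NOT be enclosed in "<>".
    if request_uri.startswith("<") and request_uri.endswith(">"):
        raise ValueError("Request-URI must not be enclosed in <>")
    # The version string is compared case-insensitively.
    if version.upper() != "SIP/2.0":
        raise ValueError("unsupported SIP-Version")
    return method, request_uri, version


parsed = parse_request_line("INVITE sip:bob@example.invalid SIP/2.0")
is_core_method = parsed[0] in CORE_METHODS
```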

7.2 Responses

   SIP responses are distinguished from requests by having a Status-Line
   as their start-line.  A Status-Line consists of the protocol version
   followed by a numeric Status-Code and its associated textual phrase,
   with each element separated by a single SP character.

   No CR or LF is allowed except in the final CRLF sequence.

         Status-Line  =  SIP-Version SP Status-Code SP Reason-Phrase CRLF

   The Status-Code is a 3-digit integer result code that indicates the
   outcome of an attempt to understand and satisfy a request.  The
   Reason-Phrase is intended to give a short textual description of the
   Status-Code.  The Status-Code is intended for use by automata,
   whereas the Reason-Phrase is intended for the human user.  A client
   is not required to examine or display the Reason-Phrase.

   While this specification suggests specific wording for the reason
   phrase, implementations MAY choose other text, for example, in the
   language indicated in the Accept-Language header field of the
   request.

   The first digit of the Status-Code defines the class of response.
   The last two digits do not have any categorization role.  For this
   reason, any response with a status code between 100 and 199 is
   referred to as a "1xx response", any response with a status code
   between 200 and 299 as a "2xx response", and so on.  SIP/2.0 allows
   six values for the first digit:

      1xx: Provisional -- request received, continuing to process the
           request;

      2xx: Success -- the action was successfully received, understood,
           and accepted;

      3xx: Redirection -- further action needs to be taken in order to
           complete the request;

      4xx: Client Error -- the request contains bad syntax or cannot be
           fulfilled at this server;

      5xx: Server Error -- the server failed to fulfill an apparently
           valid request;

      6xx: Global Failure -- the request cannot be fulfilled at any
           server.

   Section 21 defines these classes and describes the individual codes.
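
   The Status-Line format and the response classes above can be
   sketched in Python as follows.  The names STATUS_LINE, CLASSES, and
   parse_status_line are assumptions made for the example.  The line is
   expected with its CRLF already removed, and the Reason-Phrase is
   returned untouched because clients are not required to interpret it.

```python
import re

# SIP-Version SP Status-Code SP Reason-Phrase
STATUS_LINE = re.compile(r"^(SIP/\d+\.\d+) (\d{3}) (.*)$")

# The first digit of the Status-Code selects the class.
CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def parse_status_line(line: str):
    """Return (version, code, reason, response_class, is_final)."""
    m = STATUS_LINE.match(line)
    if m is None:
        raise ValueError("malformed Status-Line")
    version, code, reason = m.group(1), int(m.group(2)), m.group(3)
    response_class = CLASSES.get(code // 100)
    if response_class is None:
        raise ValueError("Status-Code outside the six defined classes")
    # Only 1xx responses are provisional; all others are final.
    is_final = code >= 200
    return version, code, reason, response_class, is_final


ringing = parse_status_line("SIP/2.0 180 Ringing")
busy = parse_status_line("SIP/2.0 486 Busy Here")
```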

7.3 Header Fields

   SIP header fields are similar to HTTP header fields in both syntax
   and semantics.  In particular, SIP header fields follow the [H4.2]
   definitions of syntax for the message-header and the rules for
   extending header fields over multiple lines.  However, the latter is
   specified in HTTP with implicit whitespace and folding.  This
   specification conforms to RFC 2234 [10] and uses only explicit
   whitespace and folding as an integral part of the grammar.

   [H4.2] also specifies that multiple header fields of the same field
   name whose value is a comma-separated list can be combined into one
   header field.  That applies to SIP as well, but the specific rule is
   different because of the different grammars.  Specifically, any SIP
   header whose grammar is of the form

      header  =  "header-name" HCOLON header-value *(COMMA header-value)

   allows for combining header fields of the same name into a
   comma-separated list.  The Contact header field allows a
   comma-separated list unless the header field value is "*".

7.3.1 Header Field Format

   Header fields follow the same generic header format as that given in
   Section 2.2 of RFC 2822 [3].  Each header field consists of a field
   name followed by a colon (":") and the field value.

      field-name: field-value

   The formal grammar for a message-header specified in Section 25
   allows for an arbitrary amount of whitespace on either side of the
   colon; however, implementations should avoid spaces between the field
   name and the colon and use a single space (SP) between the colon and
   the field-value.

      Subject:            lunch
      Subject      :      lunch
      Subject            :lunch
      Subject: lunch

   Thus, the above are all valid and equivalent, but the last is the
   preferred form.

   Header fields can be extended over multiple lines by preceding each
   extra line with at least one SP or horizontal tab (HT).  The line
   break and the whitespace at the beginning of the next line are
   treated as a single SP character.  Thus, the following are
   equivalent:

      Subject: I know you're there, pick up the phone and talk to me!
      Subject: I know you're there,
               pick up the phone
               and talk to me!

   The relative order of header fields with different field names is not
   significant.  However, it is RECOMMENDED that header fields which are
   needed for proxy processing (Via, Route, Record-Route, Proxy-Require,
   Max-Forwards, and Proxy-Authorization, for example) appear towards
   the top of the message to facilitate rapid parsing.  The relative
   order of header field rows with the same field name is important.
   Multiple header field rows with the same field-name MAY be present in
   a message if and only if the entire field-value for that header field
   is defined as a comma-separated list (that is, it follows the grammar
   defined in Section 7.3).  It MUST be possible to combine the multiple
   header field rows into one "field-name: field-value" pair, without
   changing the semantics of the message, by appending each subsequent
   field-value to the first, each separated by a comma.  The exceptions
   to this rule are the WWW-Authenticate, Authorization,
   Proxy-Authenticate, and Proxy-Authorization header fields.  Multiple
   header field rows with these names MAY be present in a message, but
   since their grammar does not follow the general form listed in
   Section 7.3, they MUST NOT be combined into a single header field
   row.
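
   The whitespace and folding rules of this section can be illustrated
   with a short Python sketch.  The function names unfold and
   split_field are assumptions made for the example, and the input is
   assumed to be a list of header lines already split on CRLF.

```python
def unfold(header_lines):
    """Join continuation lines onto the header field row they extend."""
    rows = []
    for line in header_lines:
        if rows and line[:1] in (" ", "\t"):
            # The line break plus the leading whitespace of the
            # continuation line is treated as a single SP.
            rows[-1] = rows[-1] + " " + line.lstrip(" \t")
        else:
            rows.append(line)
    return rows


def split_field(row):
    """Split 'field-name: field-value', tolerating whitespace near ':'."""
    name, sep, value = row.partition(":")
    if not sep:
        raise ValueError("header field row has no colon")
    return name.strip(" \t"), value.strip(" \t")


folded = [
    "Subject: I know you're there,",
    "         pick up the phone",
    "\tand talk to me!",
]
unfolded = unfold(folded)
variants = [
    "Subject:            lunch",
    "Subject      :      lunch",
    "Subject            :lunch",
    "Subject: lunch",
]
parsed_variants = {split_field(v) for v in variants}
```

   All four spacing variants reduce to the same name and value pair,
   which is why the set above ends up with a single element.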

   Implementations MUST be able to process multiple header field rows
   with the same name in any combination of the single-value-per-line or
   comma-separated value forms.

   The following groups of header field rows are valid and equivalent:

      Route: <>
      Subject: Lunch
      Route: <>
      Route: <>

      Route: <>, <>
      Route: <>
      Subject: Lunch

      Subject: Lunch
      Route: <>, <>,

   Each of the following blocks is valid but not equivalent to the
   others:

      Route: <>
      Route: <>
      Route: <>

      Route: <>
      Route: <>
      Route: <>

      Route: <>,<>,
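
   The equivalence rule illustrated above can be sketched in Python as
   follows.  The function name values_in_order and the example.invalid
   URIs are hypothetical and chosen only for this illustration.  The
   comma split is deliberately naive: it does not account for commas
   inside quoted strings or inside URIs, which a real parser driven by
   the Section 25 grammar would have to handle.

```python
# Header fields whose rows MUST NOT be combined or split on commas.
NEVER_COMBINE = {
    "www-authenticate",
    "authorization",
    "proxy-authenticate",
    "proxy-authorization",
}


def values_in_order(rows, name):
    """Return the individual values of one header field, in order.

    rows is a list of (field_name, field_value) pairs in message order.
    """
    wanted = name.lower()
    values = []
    for field_name, field_value in rows:
        if field_name.lower() != wanted:
            continue
        if wanted in NEVER_COMBINE:
            values.append(field_value.strip())
        else:
            # Naive split; see the caveat in the text above.
            values.extend(v.strip() for v in field_value.split(","))
    return values


P1, P2, P3 = ("<sip:p1.example.invalid>",
              "<sip:p2.example.invalid>",
              "<sip:p3.example.invalid>")

one_per_row = [("Route", P1), ("Subject", "Lunch"),
               ("Route", P2), ("Route", P3)]
mixed = [("Route", P1 + ", " + P2), ("Route", P3), ("Subject", "Lunch")]
reordered = [("Route", P3), ("Route", P1), ("Route", P2)]

same = values_in_order(one_per_row, "Route") == values_in_order(mixed, "route")
different = values_in_order(one_per_row, "Route") != values_in_order(reordered, "Route")
```

   The position of the Subject row does not matter, but changing the
   relative order of the Route values produces a different message.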

   The format of a header field-value is defined per header-name.  It
   will always be either an opaque sequence of TEXT-UTF8 octets, or a
   combination of whitespace, tokens, separators, and quoted strings.
   Many existing header fields will adhere to the general form of a
   value followed by a semi-colon separated sequence of parameter-name,
   parameter-value pairs:

         field-name: field-value *(;parameter-name=parameter-value)

   Even though an arbitrary number of parameter pairs may be attached to
   a header field value, any given parameter-name MUST NOT appear more
   than once.

   When comparing header fields, field names are always case-
   insensitive.  Unless otherwise stated in the definition of a
   particular header field, field values, parameter names, and parameter
   values are case-insensitive.  Tokens are always case-insensitive.
   Unless specified otherwise, values expressed as quoted strings are
   case-sensitive.  For example,

      Contact: <>;expires=3600

   is equivalent to

      CONTACT: <>;ExPiReS=3600

   and

      Content-Disposition: session;handling=optional

   is equivalent to

      content-disposition: Session;HANDLING=OPTIONAL

   The following two header fields are not equivalent:

      Warning: 370 devnull "Choose a bigger pipe"
      Warning: 370 devnull "CHOOSE A BIGGER PIPE"
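
   The default comparison rules can be sketched in Python as follows.
   The function name normalize is an assumption made for the example.
   The sketch applies only the defaults stated above: it ignores
   per-header exceptions, escaped quotes inside quoted strings, and the
   separate comparison rules that apply to URIs.

```python
def normalize(field_name, field_value):
    """Lower-case everything except the contents of quoted strings."""
    out = []
    in_quotes = False
    for ch in field_value:
        if ch == '"':
            in_quotes = not in_quotes
        # Quoted strings are case-sensitive, so they are kept verbatim.
        out.append(ch if in_quotes else ch.lower())
    return field_name.lower(), "".join(out)


same_disposition = (
    normalize("Content-Disposition", "session;handling=optional")
    == normalize("content-disposition", "Session;HANDLING=OPTIONAL")
)
same_warning = (
    normalize("Warning", '370 devnull "Choose a bigger pipe"')
    == normalize("Warning", '370 devnull "CHOOSE A BIGGER PIPE"')
)
```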

7.3.2 Header Field Classification

   Some header fields only make sense in requests or responses.  These
   are called request header fields and response header fields,
   respectively.  If a header field appears in a message not matching
   its category (such as a request header field in a response), it MUST
   be ignored.  Section 20 defines the classification of each header
   field.

7.3.3 Compact Form

   SIP provides a mechanism to represent common header field names in an
   abbreviated form.  This may be useful when messages would otherwise
   become too large to be carried on the transport available to it
   (exceeding the maximum transmission unit (MTU) when using UDP, for
   example).  These compact forms are defined in Section 20.  A compact
   form MAY be substituted for the longer form of a header field name at
   any time without changing the semantics of the message.  A header
   field name MAY appear in both long and short forms within the same
   message.  Implementations MUST accept both the long and short forms
   of each header name.
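
   One way to accept both forms is to map compact names to their long
   equivalents before any further processing.  The Python sketch below
   assumes the compact forms listed in Section 20; the names
   COMPACT_FORMS and long_name are hypothetical, and the table should
   be checked against Section 20 before being relied upon.

```python
# Compact form -> long form, as assumed from Section 20.
COMPACT_FORMS = {
    "i": "Call-ID",
    "m": "Contact",
    "e": "Content-Encoding",
    "l": "Content-Length",
    "c": "Content-Type",
    "f": "From",
    "s": "Subject",
    "k": "Supported",
    "t": "To",
    "v": "Via",
}


def long_name(field_name):
    """Return the long form of a header field name, if it is compact."""
    stripped = field_name.strip()
    # Field names are case-insensitive, so "V" and "v" both mean Via.
    return COMPACT_FORMS.get(stripped.lower(), stripped)


names = [long_name(n) for n in ("v", "L", "Via", "Max-Forwards")]
```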

7.4 Bodies

   Requests, including new requests defined in extensions to this
   specification, MAY contain message bodies unless otherwise noted.
   The interpretation of the body depends on the request method.

   For response messages, the request method and the response status
   code determine the type and interpretation of any message body.  All
   responses MAY include a body.

7.4.1 Message Body Type

   The Internet media type of the message body MUST be given by the
   Content-Type header field.  If the body has undergone any encoding
   such as compression, then this MUST be indicated by the
   Content-Encoding header field; otherwise, Content-Encoding MUST be
   omitted.  If applicable, the character set of the message body is
   indicated as part of the Content-Type header-field value.

   The "multipart" MIME type defined in RFC 2046 [11] MAY be used within
   the body of the message.  Implementations that send requests
   containing multipart message bodies MUST send a session description
   as a non-multipart message body if the remote implementation requests
   this through an Accept header field that does not contain multipart.

   SIP messages MAY contain binary bodies or body parts.  When no
   explicit charset parameter is provided by the sender, media subtypes
   of the "text" type are defined to have a default charset value of
   "UTF-8".

7.4.2 Message Body Length

   The body length in bytes is provided by the Content-Length header
   field.  Section 20.14 describes the necessary contents of this header
   field in detail.

   The "chunked" transfer encoding of HTTP/1.1 MUST NOT be used for SIP.
   (Note: The chunked encoding modifies the body of a message in order
   to transfer it as a series of chunks, each with its own size
   indicator.)

7.5 Framing SIP Messages

   Unlike HTTP, SIP implementations can use UDP or other unreliable
   datagram protocols.  Each such datagram carries one request or
   response.  See Section 18 on constraints on usage of unreliable
   transports.

   Implementations processing SIP messages over stream-oriented
   transports MUST ignore any CRLF appearing before the start-line
   [H4.1].

      The Content-Length header field value is used to locate the end of
      each SIP message in a stream.  It will always be present when SIP
      messages are sent over stream-oriented transports.
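
   Framing over a stream-oriented transport can be sketched in Python
   as follows.  The function name extract_message and the
   example.invalid address are assumptions made for the example.  The
   sketch does not handle folded header lines and treats a missing
   Content-Length as an error, on the assumption stated above that it
   is always present on stream transports.

```python
def extract_message(buffer: bytes):
    """Return (message, rest) once a full message is buffered.

    Returns (None, buffer) while more bytes are still needed.
    """
    # Ignore any CRLF appearing before the start-line.
    while buffer.startswith(b"\r\n"):
        buffer = buffer[2:]
    head, sep, tail = buffer.partition(b"\r\n\r\n")
    if not sep:
        return None, buffer  # header section not complete yet
    length = None
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        # "l" is the compact form of Content-Length.
        if name.strip().lower() in (b"content-length", b"l"):
            length = int(value.strip())
    if length is None:
        raise ValueError("Content-Length missing on a stream transport")
    if len(tail) < length:
        return None, buffer  # body not complete yet
    end = len(head) + 4 + length
    return buffer[:end], buffer[end:]


first = (b"INVITE sip:bob@example.invalid SIP/2.0\r\n"
         b"Content-Type: application/sdp\r\n"
         b"Content-Length: 5\r\n"
         b"\r\n"
         b"v=0\r\n")
second = b"SIP/2.0 200 OK\r\nContent-Length: 0\r\n\r\n"

message1, rest = extract_message(b"\r\n" + first + second)
message2, leftover = extract_message(rest)
partial, pending = extract_message(first[:-2])
```

   A receiver would call such a function in a loop, appending newly
   read bytes to the pending buffer until a complete message is
   returned.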
