Other-Mime-header = (Content-ID
/ Content-Description
/ Content-Disposition
/ mime-extension-field)
; Content-ID and Content-Description are defined in RFC2045.
; Content-Disposition is defined in RFC2183
; MIME-extension-field indicates additional MIME extension
; header fields as described in RFC2045
data = *OCTET
end-line = "-------" transact-id continuation-flag CRLF
continuation-flag = "+" / "$" / "#"
ext-header = hname ":" SP hval CRLF
hname = ALPHA *token
hval = utf8text
utf8text = *(HTAB / %x20-7E / UTF8-NONASCII)
UTF8-NONASCII = %xC0-DF 1UTF8-CONT
/ %xE0-EF 2UTF8-CONT
/ %xF0-F7 3UTF8-CONT
/ %xF8-FB 4UTF8-CONT
/ %xFC-FD 5UTF8-CONT
UTF8-CONT = %x80-BF
Figure 11: MSRP ABNF
10. Response Code Descriptions
This section summarizes the semantics of various response codes that
may be used in MSRP transaction responses. These codes may also be
used in the Status header field in REPORT requests.
The 200 response code indicates a successful transaction.
A 400 response indicates that a request was unintelligible. The
sender may retry the request after correcting the error.
A 403 response indicates that the attempted action is not allowed.
The sender should not try the request again.
A 408 response indicates that a downstream transaction did not
complete in the allotted time. It is never sent by any elements
described in this specification. However, 408 is used in the MSRP
relay extension; therefore, MSRP endpoints may receive it. An
endpoint MUST treat a 408 response in the same manner as it would
treat a local timeout.
A 413 response indicates that the receiver wishes the sender to stop
sending the particular message. Typically, a 413 is sent in response
to a chunk of an undesired message.
If a message sender receives a 413 in a response, or in a REPORT
request, it MUST NOT send any further chunks in the message, that is,
any further chunks with the same Message-ID value. If the sender
receives the 413 while in the process of sending a chunk, and the
chunk is interruptible, the sender MUST interrupt it.
A 415 response indicates that the SEND request contained a media type
that is not understood by the receiver. The sender should not send
any further messages with the same content-type for the duration of
the session.
A 423 response indicates that one of the requested parameters is out
of bounds. It is used by the relay extensions to this document.
A 481 response indicates that the indicated session does not exist.
The sender should terminate the session.
A 501 response indicates that the recipient does not understand the
request method.
The 501 response code exists to allow some degree of method
extensibility. It is not intended as a license to ignore methods
defined in this document; rather, it is a mechanism to report lack
of support of extension methods.
A 506 response indicates that a request arrived on a session that is
already bound to another network connection. The sender should cease
sending messages for that session on this connection.
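
As a non-normative illustration, the sender-side reactions described in
this section can be collected into a lookup table. The names Action,
ACTIONS, action_for, and Sender are hypothetical and exist only for this
sketch. The fallback for unlisted codes is an assumption, since this
section does not say how to treat them.

```python
from enum import Enum


class Action(Enum):
    OK = "transaction succeeded"
    RETRY_AFTER_FIX = "may retry after correcting the request"
    DO_NOT_RETRY = "do not retry the request"
    LOCAL_TIMEOUT = "treat as a local timeout"
    STOP_MESSAGE = "send no further chunks of this message"
    AVOID_MEDIA_TYPE = "stop using this media type"
    END_SESSION = "terminate the session"
    STOP_ON_CONNECTION = "stop sending for this session on this connection"
    REPORT_ONLY = "no sender action is prescribed in this section"


ACTIONS = {
    200: Action.OK,
    400: Action.RETRY_AFTER_FIX,
    403: Action.DO_NOT_RETRY,
    408: Action.LOCAL_TIMEOUT,
    413: Action.STOP_MESSAGE,
    415: Action.AVOID_MEDIA_TYPE,
    423: Action.REPORT_ONLY,
    481: Action.END_SESSION,
    501: Action.REPORT_ONLY,
    506: Action.STOP_ON_CONNECTION,
}


def action_for(code):
    # Assumption: codes not listed in this section map to REPORT_ONLY.
    return ACTIONS.get(code, Action.REPORT_ONLY)


class Sender:
    """Remembers which Message-ID values the peer rejected with a 413."""

    def __init__(self):
        self.stopped = set()

    def on_response(self, code, message_id):
        action = action_for(code)
        if action is Action.STOP_MESSAGE:
            # No further chunks with this Message-ID may be sent.
            self.stopped.add(message_id)
        return action

    def may_send_chunk(self, message_id):
        return message_id not in self.stopped
```

The same on_response method could be called for a Status value carried
in a REPORT request, since the codes have the same meaning there.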
11.1. Basic IM Session
This section shows an example flow for the most common scenario. The
example assumes SIP is used to transport the SDP exchange. Details
of the SIP messages and SIP proxy infrastructure are omitted for the
sake of brevity. In the example, assume that the offerer is
sip:email@example.com and the answerer is sip:firstname.lastname@example.org.
|(1) (SIP) INVITE |
|(2) (SIP) 200 OK |
|(3) (SIP) ACK |
|(4) (MSRP) SEND |
|(5) (MSRP) 200 OK |
|(6) (MSRP) SEND |
|(7) (MSRP) 200 OK |
|(8) (SIP) BYE |
|(9) (SIP) 200 OK |
Figure 12: Basic IM Session Example
<?xml version="1.0" encoding="UTF-8"?>
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<p>See the results at <a
Figure 13: Example Message with XHTML
11.3. Chunked Message
For an example of a chunked message, see the example in Section 5.1.
11.4. Chunked Message with Message/CPIM Payload
This example shows a chunked message containing a CPIM message that
wraps a text/plain payload. It is worth noting that MSRP considers
the complete CPIM message before chunking the message; thus, the CPIM
headers are included in only the first chunk. The MSRP Content-Type
and Byte-Range headers, present in both chunks, refer to the whole
CPIM message.
MSRP d93kswow SEND
To: Bob <sip:email@example.com>
From: Alice <sip:firstname.lastname@example.org>
Figure 14: First Chunk
<p>Here is that important link...
Figure 16: Initial SEND Request
Bob->Alice (MSRP):
MSRP dkei38sd REPORT
Status: 000 200 OK
Figure 17: Success Report
11.7. Forked IM
Traditional IM systems generally do a poor job of handling multiple
simultaneous IM clients online for the same person. While some
systems do better than others, handling of multiple clients is
generally crude. This becomes a much more significant issue when
always-on mobile devices are available, but it is desirable to use
them only if another IM client is not available.
Using SIP makes rendezvous decisions explicit, deterministic, and
very flexible. In contrast, "page-mode" IM systems use implicit
implementation-specific decisions that IM clients cannot influence.
With SIP session-mode messaging, rendezvous decisions can be under
control of the client in a predictable, interoperable way for any
host that implements callee capabilities. As a result,
rendezvous policy is managed consistently for each address of record.
The following example shows Juliet with several IM clients where she
can be reached. Each of these has a unique SIP contact and MSRP
session. The example takes advantage of SIP's capability to "fork"
an invitation to several contacts in parallel, in sequence, or in
combination. Juliet has registered from her chamber, the balcony,
her PDA, and as a last resort, you can leave a message with her
nurse. Juliet's contacts are listed below. The q-values express
relative preference (q=1.0 is the highest preference).
When Romeo opens his IM program, he selects Juliet and types the
message "art thou hither?" (instead of "you there?"). His client
sends a SIP invitation to sip:email@example.com. The
proxy there tries first the balcony and the chamber simultaneously.
A client is running on each of those systems, both of which set up
early sessions of MSRP with Romeo's client. The client automatically
sends the message over MSRP to the two MSRP URIs involved. After a
delay of several seconds with no reply or activity from Juliet, the
proxy cancels the invitation at her first two contacts, and forwards
the invitation on to Juliet's PDA. Since her father is talking to
her about her wedding, she selects "Do Not Disturb" on her PDA, which
sends a "Busy Here" response. The proxy then tries the nurse, who
answers and tells Romeo what is going on.
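
The forking order in this scenario can be sketched as follows. This is
one possible proxy policy, not a requirement of SIP or MSRP: contacts
that share a q-value are tried in parallel, and groups are tried in
order of descending q-value. The contact URIs, the q-values, and the
function name fork_plan are hypothetical and chosen only to mirror the
narrative above.

```python
from itertools import groupby

# Hypothetical contacts and q-values, for illustration only.
CONTACTS = [
    ("sip:juliet@balcony.example.com", 0.9),
    ("sip:juliet@chamber.example.com", 0.9),
    ("sip:juliet@pda.example.com", 0.5),
    ("sip:nurse@example.com", 0.1),
]


def fork_plan(contacts):
    """Return batches of URIs. Each batch is tried in parallel, and
    batches are tried in order of descending q-value."""
    ordered = sorted(contacts, key=lambda c: c[1], reverse=True)
    return [[uri for uri, _ in group]
            for _, group in groupby(ordered, key=lambda c: c[1])]
```

With the contacts above, the first batch holds the balcony and the
chamber, followed by the PDA alone, and finally the nurse.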
12. Extensibility
MSRP was designed to be only minimally extensible. New MSRP methods,
header fields, and status codes can be defined in standards-track
RFCs. MSRP does not contain a version number or any negotiation
mechanism to require or discover new features. If an extension is
specified in the future that requires negotiation, the specification
will need to describe how the extension is to be negotiated in the
encapsulating signaling protocol. If a non-interoperable update or
extension occurs in the future, it will be treated as a new protocol,
and MUST describe how its use will be signaled.
In order to allow extension header fields without breaking
interoperability, if an MSRP device receives a request or response
containing a header field that it does not understand, it MUST ignore
the header field and process the request or response as if the header
field was not present. If an MSRP device receives a request with an
unknown method, it MUST return a 501 response.
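
A minimal, non-normative sketch of these two rules follows. The sets
KNOWN_METHODS and KNOWN_HEADERS and the function handle_request are
hypothetical; the header set lists only names mentioned in this
document and is not meant to be complete. Header names are compared
exactly, which is a simplification of this sketch.

```python
KNOWN_METHODS = {"SEND", "REPORT"}
KNOWN_HEADERS = {"To-Path", "From-Path", "Message-ID", "Byte-Range",
                 "Status", "Content-Type"}


def handle_request(method, headers):
    """Return (status_code, understood_headers) for a parsed request.

    headers is a dict mapping header field names to values.
    """
    if method not in KNOWN_METHODS:
        # Unknown method: answer 501 rather than guessing at semantics.
        return 501, {}
    # Unknown header fields are dropped, so the request is processed
    # as if they were not present.
    understood = {k: v for k, v in headers.items() if k in KNOWN_HEADERS}
    return 200, understood
```

For example, a SEND request carrying an unrecognized extension header
is still accepted, while a request with an unrecognized method is
answered with a 501.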
MSRP was designed to use lists of URIs instead of a single URI in the
To-Path and From-Path header fields in anticipation of relay or
gateway functionality being added. In addition, "msrp" and "msrps"
URIs can contain parameters that are extensible.
13. CPIM Compatibility
MSRP sessions may go to a gateway to other Common Profile for Instant
Messaging (CPIM) compatible protocols. If this occurs, the
gateway MUST maintain session state, and MUST translate between the
MSRP session semantics and CPIM semantics, which do not include a
concept of sessions. Furthermore, when one endpoint of the session
is a CPIM gateway, instant messages SHOULD be wrapped in
"message/cpim" bodies. Such a gateway MUST include
"message/cpim" as the first entry in its SDP accept-types attribute.
MSRP endpoints sending instant messages to a peer that has included
"message/cpim" as the first entry in the accept-types attribute
SHOULD encapsulate all instant message bodies in "message/cpim"
wrappers. All MSRP endpoints MUST support the message/cpim type, and
SHOULD support the S/MIME features of that format.
If a message is to be wrapped in a message/cpim envelope, the
wrapping MUST be done prior to breaking the message into chunks, if
needed.
All MSRP endpoints MUST recognize the From, To, DateTime, and Require
header fields as defined in RFC 3862. Such applications SHOULD
recognize the CC header field, and MAY recognize the Subject header
field. Any MSRP application that recognizes any message/cpim header
field MUST understand the NS (name space) header field.
All message/cpim body parts sent by an MSRP endpoint MUST include the
From and To header fields. If the message/cpim body part is
protected using S/MIME, then it MUST also include the DateTime header
field.
The NS, To, and CC header fields may occur multiple times. Other
header fields defined in RFC 3862 MUST NOT occur more than once in a
given message/cpim body part in an MSRP message. The Require header
field MAY include multiple values. The NS header field MAY occur
zero or more times, depending on how many name spaces are being
referenced.
Extension header fields MAY occur more than once, depending on the
definition of such header fields.
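
The occurrence rules in this section can be checked mechanically. The
following non-normative sketch assumes the message/cpim header fields
have already been parsed into (name, value) pairs; the function name
cpim_header_errors and the error strings are hypothetical.

```python
from collections import Counter

# Header fields that MUST NOT occur more than once. NS, To, CC, and
# extension header fields are deliberately left unrestricted.
SINGLE = {"From", "DateTime", "Subject", "Require"}


def cpim_header_errors(headers, smime_protected=False):
    """Return a list of rule violations for one message/cpim body part.

    headers is a list of (name, value) pairs in message order.
    """
    counts = Counter(name for name, _ in headers)
    required = {"From", "To"}
    if smime_protected:
        # DateTime becomes mandatory when S/MIME protection is used.
        required.add("DateTime")
    errors = []
    for name in sorted(required):
        if counts[name] == 0:
            errors.append(f"missing {name}")
    for name in sorted(SINGLE):
        if counts[name] > 1:
            errors.append(f"{name} occurs more than once")
    return errors
```

An empty result means the body part satisfies the rules sketched here;
it does not validate header values or name space references.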
Using message/cpim envelopes is also useful if an MSRP device
wishes to send a message on behalf of some other identity. The
device may add a message/cpim envelope with the appropriate From
header field value.
14. Security Considerations
Instant messaging systems are used to exchange a variety of sensitive
information ranging from personal conversations, to corporate
confidential information, to account numbers and other financial
trading information. IM is used by individuals, corporations, and
governments for communicating important information. IM systems need
to provide the properties of integrity and confidentiality for the
exchanged information, and the knowledge that you are communicating
with the correct party, and they need to allow the possibility of
anonymous communication. MSRP pushes many of the hard problems to
SIP when SIP sets up the session, but some of the problems remain.
Spam and Denial of Service (DoS) attacks are also very relevant to IM
systems.
MSRP needs to provide confidentiality and integrity for the messages
it transfers. It also needs to provide assurances that the connected
host is the host that it meant to connect to and that the connection
has not been hijacked.
14.1. Secrecy of the MSRP URI
When an endpoint sends an MSRP URI to its peer in a rendezvous
protocol, that URI is effectively a secret shared between the peers.
If an attacker learns or guesses the URI prior to the completion of
session setup, it may be able to impersonate one of the peers.
Assuming the URI exchange in the rendezvous protocol is sufficiently
protected, it is critical that the URI remain difficult to "guess"
via brute force methods. Most components of the URI, such as the
scheme and the authority components, are common knowledge. The
secrecy is entirely provided by the session-id component.
Therefore, when an MSRP device generates an MSRP URI to be used in
the initiation of an MSRP session, the session-id component MUST
contain at least 80 bits of randomness.
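
One way to meet this requirement, shown here only as a sketch, is to
draw the session-id from a cryptographically secure random source. The
function name new_session_id and the 128-bit default are choices made
for this example; hexadecimal output is used for simplicity.

```python
import secrets


def new_session_id(nbytes=16):
    """Return a hex session-id carrying nbytes * 8 bits of randomness."""
    if nbytes * 8 < 80:
        raise ValueError("session-id needs at least 80 bits of randomness")
    # token_hex draws from the operating system's secure random source.
    return secrets.token_hex(nbytes)
```

A general-purpose pseudo-random generator, such as the one in Python's
random module, would not be suitable here because its output can be
predicted.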
14.2. Transport Level Protection
When using only TCP connections, MSRP security is fairly weak. If
host A is contacting host B, B passes its hostname and a secret to A
using a rendezvous protocol. Although MSRP requires the use of a
rendezvous protocol with the ability to protect this exchange, there
is no guarantee that the protection will be used all the time. If
such protection is not used, anyone can see this secret. Host A then
connects to the provided hostname and passes the secret in the clear
across the connection to B. Host A assumes that it is talking to B
based on where it sent the SYN packet and then delivers the secret in
plain text across the connections. Host B assumes it is talking to A
because the host on the other end of the connection delivered the
secret. An attacker that could ACK the SYN packet could insert
itself as a man-in-the-middle in the connection.
When using TLS connections, the security is significantly improved.
We assume that the host accepting the connection has a certificate
from a well-known certification authority. Furthermore, we assume
that the signaling to set up the session is protected by the
rendezvous protocol. In this case, when host A contacts host B, the
secret is passed through a confidential channel to A. A connects
with TLS to B. B presents a valid certificate, so A knows it really
is connected to B. A then delivers the secret provided by B, so that
B can verify it is connected to A. In this case, a rogue SIP Proxy
can see the secret in the SIP signaling traffic and could potentially
insert itself as a man-in-the-middle.
Realistically, using TLS with certificates from well-known
certification authorities is difficult for peer-to-peer connections,
as the types of hosts that end clients use for sending instant
messages are unlikely to have long-term stable IP addresses or DNS
names that the certificates can bind to. In addition, the cost of
server certificates from well-known certification authorities is
currently high enough to discourage their use for each client.
Using TLS in a peer-to-peer mode without well-known certificates is
discussed in Section 14.4.
TLS becomes much more practical when some form of relay is
introduced. Clients can then form TLS connections to relays, which
are much more likely to have TLS certificates. While this
specification does not address such relays, they are described by a
companion document. That document makes extensive use of TLS to
protect traffic between clients and relays, and between one relay and
another.
TLS is used to authenticate devices and to provide integrity and
confidentiality for the header fields being transported. MSRP
elements MUST implement TLS and MUST also implement the TLS
ClientExtendedHello extended hello information for server name
indication. A TLS cipher-suite of TLS_RSA_WITH_AES_128_CBC_SHA MUST
be supported (other cipher-suites MAY also be supported).
14.3. S/MIME
The only strong security for non-TLS connections is achieved using
S/MIME.
Since MSRP carries arbitrary MIME content, it can trivially carry
S/MIME protected messages as well. All MSRP implementations MUST
support the multipart/signed media-type even if they do not support
S/MIME. Since SIP can carry a session key, S/MIME messages in the
context of a session could also be protected using a key-wrapped
shared secret provided in the session setup. MSRP can carry
unencoded binary payloads. Therefore, MIME bodies MUST be
transferred with a transfer encoding of binary. If a message is both
signed and encrypted, it SHOULD be signed first, then encrypted. If
S/MIME is supported, SHA-1, SHA-256, RSA, and AES-128 MUST be
supported. For RSA, implementations MUST support key sizes of at
least 1024 bits and SHOULD support key sizes of 2048 bits or more.
This does not actually require the endpoint to have certificates from
a well-known certification authority. When MSRP is used with SIP,
the Identity and Certificates mechanisms provide S/MIME-based
delivery of a secret between A and B. No SIP intermediary
except the explicitly trusted authentication service (one per user)
can see the secret. The S/MIME encryption of the SDP can also be
used by SIP to exchange keying material that can be used in MSRP.
The MSRP session can then use S/MIME with this keying material to
sign and encrypt messages sent over MSRP. The connection can still
be hijacked since the secret is sent in clear text to the other end
of the TCP connection, but the consequences are mitigated if all the
MSRP content is signed and encrypted with S/MIME. Although out of
scope for this document, the SIP negotiation of an MSRP session can
negotiate symmetric keying material to be used with S/MIME for
integrity and privacy.
14.4. Using TLS in Peer-to-Peer Mode
TLS can be used with a self-signed certificate as long as there is a
mechanism for both sides to ascertain that the other side used the
correct certificate. When used with SDP and SIP, the correct
certificate can be verified by passing a fingerprint of the
certificate in the SDP and ensuring that the SDP has suitable
integrity protection. When SIP is used to transport the SDP, the
integrity can be provided by the SIP Identity mechanism. The
rest of this section describes the details of this approach.
If self-signed certificates are used, the content of the
subjectAltName attribute inside the certificate MAY use the URI of
the user. In SIP, this URI of the user is the User's Address of
Record (AOR). This is useful for debugging purposes only and is not
required to bind the certificate to one of the communication
endpoints. Unlike normal TLS operations in this protocol, when doing
peer-to-peer TLS, the subjectAltName is not an important component of
the certificate verification. If the endpoint is also able to make
anonymous sessions, a distinct, unique certificate MUST be used for
this purpose. For a client that works with multiple users, each user
SHOULD have its own certificate. Because the generation of
public/private key pairs is relatively expensive, endpoints are not
required to generate certificates for each session.
A certificate fingerprint is the output of a one-way hash function
computed over the Distinguished Encoding Rules (DER) form of the
certificate. The endpoint MUST use the certificate fingerprint
attribute and MUST include it in the SDP. The
certificate presented during the TLS handshake needs to match the
fingerprint exchanged via the SDP, and if the fingerprint does not
match the hashed certificate then the endpoint MUST tear down the
media session immediately.
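
The fingerprint comparison can be sketched as follows. This is a
non-normative illustration: it assumes the certificate presented in the
TLS handshake is available as DER bytes, and that the signaled
fingerprint is a colon-separated string of hexadecimal byte values.
That textual form and the function names are assumptions of this
sketch, not definitions taken from this section.

```python
import hashlib
import hmac


def fingerprint(der_cert, hash_name="sha256"):
    """Colon-separated uppercase hex digest of a DER-encoded certificate."""
    digest = hashlib.new(hash_name, der_cert).digest()
    return ":".join(f"{b:02X}" for b in digest)


def fingerprint_matches(der_cert, signaled, hash_name="sha256"):
    """True if the presented certificate hashes to the signaled value."""
    actual = fingerprint(der_cert, hash_name)
    # Compare case-insensitively and in constant time.
    return hmac.compare_digest(actual, signaled.strip().upper())
```

If fingerprint_matches returns False, the endpoint would tear down the
media session as required above.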
When using SIP, the integrity of the fingerprint can be ensured
through the SIP Identity mechanism. When a client wishes to use
SIP to set up a secure MSRP session with another endpoint, it sends
an SDP offer in a SIP message to the other endpoint. This offer
includes, as part of the SDP payload, the fingerprint of the
certificate that the endpoint wants to use. The SIP message
containing the offer is sent to the offerer's SIP proxy, which will
add an Identity header according to the SIP Identity procedures.
When the far endpoint receives the SIP message, it can verify the
identity of the sender using the Identity header. Since the Identity
header is a digital signature across several SIP headers, in addition
to the body or bodies of the SIP message, the receiver can also be
certain that the message has not been tampered with after the digital
signature was added to the SIP message.
An example of SDP with a fingerprint attribute is shown in the
following figure. Note the fingerprint is shown spread over two
lines due to formatting consideration but should all be on one line.
c=IN IP4 atlanta.example.com
m=message 7654 TCP/TLS/MSRP *
Figure 19: SDP with Fingerprint Attribute
14.5. Other Security Concerns
MSRP cannot be used as an amplifier for DoS attacks, but it can be
used to form a distributed attack to consume TCP connection resources
on servers. The attacker, Mallory, sends a SIP INVITE with no offer
to Alice. Alice returns a 200 with an offer and Mallory returns an
answer with SDP indicating that his MSRP address is the address of
Tom. Since Alice sent the offer, Alice will initiate a connection to
Tom using up resources on Tom's server. Given the huge number of IM
clients, and the relatively few TCP connections that most servers
support, this is a fairly straightforward attack.
SIP is attempting to address issues in dealing with spam. The spam
issue is probably best dealt with at the SIP level when an MSRP
session is initiated and not at the MSRP level.
If a sender chooses to employ S/MIME to protect a message, all S/MIME
operations apply to the complete message, prior to any breaking of
the message into chunks.
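
The ordering can be illustrated with a short, non-normative sketch:
protection is applied to the whole message first, and only the
protected result is split into chunks. The protect argument stands in
for whatever S/MIME operation the sender uses and is hypothetical, as
are the function names. The sketch assumes byte ranges are written as
start-end/total with a one-based start, and that "+" marks a chunk with
more to follow while "$" marks the final chunk.

```python
def chunk_message(body, chunk_size):
    """Yield (byte_range, flag, data) for each chunk of a complete body."""
    total = len(body)
    for start in range(0, total, chunk_size):
        data = body[start:start + chunk_size]
        end = start + len(data)
        # "$" closes the last chunk; "+" means more chunks follow.
        flag = "$" if end == total else "+"
        yield f"{start + 1}-{end}/{total}", flag, data


def send_protected(message, protect, chunk_size=2048):
    # Protect the complete message first, then chunk the result.
    return list(chunk_message(protect(message), chunk_size))
```

Because every byte range refers to the protected body, the receiver can
reassemble all chunks before attempting any S/MIME processing. An empty
body yields no chunks in this sketch.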
The signaling will have set up the session to or from some specific
URIs that will often have "im:" or "sip:" URI schemes. When the
signaling has been set up to a specific end user, and S/MIME is
implemented, then the client needs to verify that the name in the
subjectAltName of the certificate contains an entry that matches the
URI that was used for the other end in the signaling. There are some
cases, such as IM conferencing, where the S/MIME certificate name and
the signaled identity will not match. In these cases, the client
should ensure that the user is informed that the message came from
the user identified in the certificate and does not assume that the
message came from the party they signaled.
In some cases, a sending device may need to attribute a message to
some other identity, and may use different identities for different
messages in the same session. For example, a conference server may
send messages on behalf of multiple users on the same session.
Rather than add header fields to MSRP for this purpose, MSRP relies
on the message/cpim format. The sender
may envelop such a message in a message/cpim body, and place the
actual sender identity in the From field. The trustworthiness of
such an attribution is affected by the security properties of the
session in the same way that the trustworthiness of the identity of
the actual peer is affected, with the additional issue of determining
whether the recipient trusts the sender to assert the identity.
This approach can result in nesting of message/cpim envelopes. For
example, a message originates from a CPIM gateway, and is then
forwarded by a conference server onto a new session. Both the
gateway and the conference server introduce envelopes. In this case,
the recipient client SHOULD indicate the chain of identity assertions
to the user, rather than allow the user to assume that either the
gateway or the conference server originated the message.
It is possible that a recipient might receive messages that are
attributed to the same sender via different MSRP sessions. For
example, Alice might be in a conversation with Bob via an MSRP
session over a TLS protected channel. Alice might then receive a
different message from Bob over a different session, perhaps with a
conference server that asserts Bob's identity in a message/cpim
envelope signed by the server.
MSRP does not prohibit multiple simultaneous sessions between the
same pair of identities. Nor does it prohibit an endpoint sending a
message on behalf of another identity, such as may be the case for a
conference server. The recipient's endpoint should determine its
level of trust of the authenticity of the sender independently for
each session. The fact that an endpoint trusts the authenticity of
the sender on any given session should not affect the level of trust
it assigns for apparently the same sender on a different session.
When MSRP clients form or acquire a certificate, they SHOULD ensure
that the subjectAltName has a GeneralName entry of type
uniformResourceIdentifier for each URI corresponding to this client
and should always include an "im:" URI. It is fine if the
certificate contains other URIs such as "sip:" or "xmpp:" URIs.
MSRP implementors should be aware of a potential attack on MSRP
devices that involves placing very large values in the byte-range
header field, potentially causing the device to allocate very large
memory buffers to hold the message. Implementations SHOULD apply
some degree of sanity checking on byte-range values before allocating