Network Working Group B. Campbell, Ed. Request for Comments: 3428 J. Rosenberg Category: Standards Track dynamicsoft H. Schulzrinne Columbia University C. Huitema D. Gurle Microsoft Corporation December 2002 Session Initiation Protocol (SIP) Extension for Instant Messaging Status of this Memo This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited. Copyright Notice Copyright (C) The Internet Society (2002). All Rights Reserved.
AbstractInstant Messaging (IM) refers to the transfer of messages between users in near real-time. These messages are usually, but not required to be, short. IMs are often used in a conversational mode, that is, the transfer of messages back and forth is fast enough for participants to maintain an interactive conversation. This document proposes the MESSAGE method, an extension to the Session Initiation Protocol (SIP) that allows the transfer of Instant Messages. Since the MESSAGE request is an extension to SIP, it inherits all the request routing and security features of that protocol. MESSAGE requests carry the content in the form of MIME body parts. MESSAGE requests do not themselves initiate a SIP dialog; under normal usage each Instant Message stands alone, much like pager messages. MESSAGE requests may be sent in the context of a dialog initiated by some other SIP request.
Terminology In this document, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY" and "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119  and indicate requirement levels for compliant SIP implementations. 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 2. Scope of Applicability . . . . . . . . . . . . . . . . . . . 3 3. Overview of Operation . . . . . . . . . . . . . . . . . . . 4 4. UAC Processing . . . . . . . . . . . . . . . . . . . . . . . 5 5. Use of Instant Message URIs . . . . . . . . . . . . . . . . 6 6. Proxy Processing . . . . . . . . . . . . . . . . . . . . . . 6 7. UAS Processing . . . . . . . . . . . . . . . . . . . . . . . 7 8. Congestion Control . . . . . . . . . . . . . . . . . . . . . 8 9. Method Definition . . . . . . . . . . . . . . . . . . . . . 9 10. Example Messages . . . . . . . . . . . . . . . . . . . . . . 11 11. Security Considerations . . . . . . . . . . . . . . . . . . 13 11.1 Outbound authentication . . . . . . . . . . . . . . . . . . 13 11.2 SIPS URIs . . . . . . . . . . . . . . . . . . . . . . . . . 14 11.3 End-to-End Protection . . . . . . . . . . . . . . . . . . . 14 11.4 Replay Prevention . . . . . . . . . . . . . . . . . . . . . 14 11.5 Using message/cpim bodies . . . . . . . . . . . . . . . . . 15 12. IANA Considerations . . . . . . . . . . . . . . . . . . . . 15 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 15 14. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . 15 15. Normative References . . . . . . . . . . . . . . . . . . . . 16 16. Informational References . . . . . . . . . . . . . . . . . . 16 17. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . 17 18. Full Copyright Statement . . . . . . . . . . . . . . . . . . 18 11], the UNIX talk application, and IRC. More recently, IM
has been used as a service coupled with presence and buddy lists; that is, when a friend comes online, a user can be made aware of this and have the option of sending the friend an instant message. The protocols for accomplishing this are all proprietary, which has seriously hampered interoperability. The integration of instant messaging, presence, and session-oriented communications is very powerful. The Session Initiation Protocol (SIP)  provides mechanisms that are useful for presence applications, and for session-oriented communication applications, but not for instant messages. This document proposes an extension method for SIP called the MESSAGE method. MESSAGE requests normally carry the instant message content in the request body. RFC 2778  and RFC 2779  give a model and requirements for presence and instant messaging protocols. Implementations of the MESSAGE method SHALL support all the instant message requirements in RFC 2779 relevant to its scope of applicability.
There may be a temptation to simulate a session of IMs by initiating a dialog, then sending MESSAGE requests in the context of that dialog. This is not an adequate solution for IM sessions, in that this approach forces the MESSAGE requests to follow the same network path as any other SIP requests, even though the MESSAGE requests arguably carry media rather than signaling. IM applications are typically high volume, and we expect the IM volume in sessions to be even higher. This will likely cause congestion problems if sent over a transport without congestion control, and there is no clear mechanism in SIP to prevent some hop from forwarding a MESSAGE request over UDP. Additionally, MESSAGE requests sent over an existing dialog must, by the nature of SIP, go to the same destination as any other request sent in that dialog. This prevents any separation between the IM endpoint and the signaling endpoint. This is not an acceptable limitation for the session-model of instant messaging. The authors recognize that there may be valid reasons to send MESSAGE requests in the context of a dialog. For example, one participant in a voice session may wish to send an IM to another participant, and associate that IM with the session. But implementations SHOULD NOT create dialogs for the primary purpose of associating MESSAGE requests with one another. Note that this statement does not prohibit using SIP to initiate a media session made up of IMs, just like any other session. Indeed, we expect the solution for IM sessions to use that metaphor. The reader should avoid confusing the concepts of a SIP dialog and a media session. 7]. Since the message/cpim format is expected to be supported by other instant message protocols, endpoints using different IM protocols, but otherwise supporting message/cpim body types, should be able to
exchange messages without modification of the content by a gateway or other intermediary. This helps to enable end-to-end security between endpoints that use different IM protocols. The request may traverse a set of SIP proxies, using a variety of transports, before reaching its destination. The destination for each hop is located using the address resolution rules detailed in the Common Profile for Instant Messaging (CPIM)  and SIP specifications. During traversal, each proxy may rewrite the request URI based on available routing information. Provisional and final responses to the request will be returned to the sender as with any other SIP request. Normally, a 200 OK response will be generated by the user agent of the request's final recipient. Note that this indicates that the user agent accepted the message, not that the user has seen it. MESSAGE requests do not establish dialogs. section 8.1 of the SIP specification . All UACs which support the MESSAGE method MUST be prepared to send MESSAGE requests with a body of type text/plain. They may send bodies of type message/cpim . MESSAGE requests do not initiate dialogs. User Agents MUST NOT insert Contact header fields into MESSAGE requests. A UAC MAY associate a MESSAGE request with an existing dialog. If a MESSAGE request is sent within a dialog, it is "associated" with any media session or sessions associated with that dialog. If the UAC receives a 200 OK response to a MESSAGE request, it may assume the message has been delivered to the final destination. It MUST NOT assume that the recipient has actually read the instant message. If the UAC receives a 202 Accepted response, the message has been delivered to a gateway, store and forward server, or some other service that may eventually deliver the message. In this case, the UAC MUST NOT assume the message has been delivered to the final destination. If confirmation of delivery is required for a message that has been responded to with a 202 Accepted, that confirmation must be delivered via some other mechanism, which is beyond the scope of this specification.
Note that a downstream proxy could fork a MESSAGE request. If this occurs, the forking proxy will forward one final response upstream, even though it may receive multiple final responses. The UAC will have no way to detect whether or not a fork occurs. Therefore the UAC MUST NOT assume that a given final response represents the only UAS that receives the request. For example, multiple branches of a fork could have resulted in 2xx responses. Even though the UAC only sees one of those responses, the request has in fact been received by the second device as well. The UAC MAY add an Expires header field to limit the validity of the message content. If the UAC adds an Expires header field with a non-zero value, it SHOULD also add a Date header field containing the time the message is sent. 8] in the form of "im:user@domain". IM URIs are abstract, and will eventually be translated to concrete, protocol- dependent URI. If a UA is presented with an IM URI as the address for an instant message, it SHOULD resolve it to a SIP URI, and place the resulting URI in the Request-URI of the MESSAGE request before sending. If the UA is unable to resolve the IM URI, it MAY place the IM URI in the Request-URI, thus delegating the resolution to a downstream device such as proxy or gateway. Performing this translation as early as possible allows SIP proxies, which may be unaware of the im: namespace, to route the requests normally. MESSAGE requests also contain logical identifiers of the sender and intended recipient, in the form of the From and To header fields. These identifiers SHOULD contain SIP (or SIPS) URIs, but MAY include IM URIs if the SIP URIs are not known at the time of request construction. Record-Route and Route header fields MUST NOT contain IM URIs. These header fields contain concrete SIP or SIPS URIs according to the rules of SIP . 1]. Note that the MESSAGE request MAY fork; this allows for delivery of the message to several possible terminals where the user might be. A proxy forking a MESSAGE request follows the normal SIP rules for forking a non-INVITE request. In particular, even if the fork
results in multiple successful deliveries, the forking proxy will only forward one final response upstream. 1]. A UAS receiving a MESSAGE request SHOULD respond with a final response immediately. Note, however, that the UAS is not obliged to display the message to the user either before or after responding with a 200 OK. That is, a 200 OK response does not necessarily mean the user has read the message. A 2xx response to a MESSAGE request MUST NOT contain a body. A UAS MUST NOT insert a Contact header field into a 2xx response. A UAS which is, in fact, a message relay, storing the message and forwarding it later on, or forwarding it into a non-SIP domain, SHOULD return a 202 (Accepted)  response indicating that the message was accepted, but end to end delivery has not been guaranteed. A 4xx or 5xx response indicates that the message was not delivered successfully. A 6xx response means it was delivered successfully, but refused. A UAS that supports the MESSAGE method MUST be prepared to receive and render bodies of type "text/plain", and may support reception and rendering of bodies of type "message/cpim" . A MESSAGE request is said to be expired if its expiration time has passed. The expiration time is determined by examining the Expires header field, if present. MESSAGE requests without an Expires header field do not expire. If the MESSAGE request containing an Expires header field also contains a Date header field, the UAS SHOULD interpret the Expires header field value as delta time from the Date header field value. If the request does not contain a Date header field, the UAS SHOULD interpret the Expires header value as delta time from the time the UAS received the request. If the MESSAGE expires before the UAS is able to present the message to the user, the UAS SHOULD handle the message based on local policy. This policy could mean: the message is deleted undisplayed,
the message is still displayed to the user, or some other policy may be invoked. If the message is displayed, the UAS SHOULD clearly indicate to the user that the message has expired. If the UAS is acting as a message relay, and is unable to deliver the message before expiration, it chooses an action based on local policy. This action could involve deleting the message undelivered, delivering it as is, logging the expired message, or any other local policy.
There have been discussions on making the 1300 byte limit based on the path MTU to the next hop SIP device. The SIP specification does exactly that, choosing the limit 200 bytes less than the path MTU, or 1300 bytes if the device does not know the path MTU. Transport decisions are made on a per-hop basis. However, the point of this limit is to make sure that no upstream proxy chooses to send a MESSAGE request with large content over UDP. Since, except in trivial circumstances, a MESSAGE client is very unlikely to know the MTU for upstream devices beyond the next hop, an MTU based limit is not very useful. A UAC MUST NOT initiate a new out-of-dialog MESSAGE transaction to a given URI if there is a previous out-of-dialog transaction pending for the same URI. Similarly, A UAC SHOULD NOT initiate overlapping MESSAGE transactions inside a dialog, and MUST NOT do so unless the route set for that dialog uses a congestion-controlled transport at every hop. The prohibition against overlapping MESSAGE request provides some degree of congestion-safe behavior. A request and its associated response must each cross the full path between the UAC and the UAS. The time required for this will increase as networks become congested. This provides an adaptive mechanism to slow the introduction of new MESSAGE requests to the same destination. It has been suggested that provisional responses should not be allowed for pager-model MESSAGE requests. However, it is not possible to require special treatment for MESSAGE, since many proxy servers will not be aware of the MESSAGE method. Therefore MESSAGE requests will receive the same provisional response treatment as any other non-INVITE method, as described in the SIP specification. 1] by adding an additional column, defining the header fields that can be used in MESSAGE requests and responses.
Header Field where proxy MESSAGE __________________________________________ Accept R - Accept 2xx - Accept 415 m* Accept-Encoding R - Accept-Encoding 2xx - Accept-Encoding 415 m* Accept-Language R - Accept-Language 2xx - Accept-Language 415 m* Alert-Info R - Alert-Info 180 - Allow R o Allow 2xx o Allow r o Allow 405 m Authentication-Info 2xx o Authorization R o Call-ID c r m Call-Info ar o Contact R - Contact 1xx - Contact 2xx - Contact 3xx o Contact 485 o Content-Disposition o Content-Encoding o Content-Language o Content-Length ar t Content-Type * CSeq c r m Date a o Error-Info 300-699 a o Expires o From c r m In-Reply-To R o Max-Forwards R amr m Organization ar o Table 1: Summary of header fields, A--O
Header Field where proxy MESSAGE __________________________________________ Priority R ar o Proxy-Authenticate 407 ar m Proxy-Authenticate 401 ar o Proxy-Authorization R dr o Proxy-Require R ar o Record-Route ar - Reply-To o Require ar c Retry-After 404,413,480,486 o 500,503 o 600,603 o Route R adr o Server r o Subject R o Timestamp o To c(1) r m Unsupported 420 o User-Agent o Via R amr m Via rc dr m Warning r o WWW-Authenticate 401 ar m WWW-Authenticate 407 ar o (1): copied with possible addition of tag Table 2: Summary of header fields, P--Z A MESSAGE request MAY contain a body, using the standard MIME header fields to identify the content. Figure 1. The message flow shows an initial IM sent from User 1 to User 2, both users in the same domain, "domain", through a single proxy.
| F1 MESSAGE | | |--------------------> | F2 MESSAGE | | | ----------------------->| | | | | | F3 200 OK | | | <-----------------------| | F4 200 OK | | |<-------------------- | | | | | | | | | | | User 1 Proxy User 2 Figure 1: Example Message Flow Message F1 looks like: MESSAGE sip:firstname.lastname@example.org SIP/2.0 Via: SIP/2.0/TCP user1pc.domain.com;branch=z9hG4bK776sgdkse Max-Forwards: 70 From: sip:email@example.com;tag=49583 To: sip:firstname.lastname@example.org Call-ID: email@example.com CSeq: 1 MESSAGE Content-Type: text/plain Content-Length: 18 Watson, come here. User1 forwards this message to the server for domain.com. The proxy receives this request, and recognizes that it is the server for domain.com. It looks up user2 in its database (built up through registrations), and finds a binding from sip:firstname.lastname@example.org to sip:email@example.com. It forwards the request to user2. The resulting message, F2, looks like: MESSAGE sip:firstname.lastname@example.org SIP/2.0 Via: SIP/2.0/TCP proxy.domain.com;branch=z9hG4bK123dsghds Via: SIP/2.0/TCP user1pc.domain.com;branch=z9hG4bK776sgdkse; received=126.96.36.199 Max-Forwards: 69 From: sip:email@example.com;tag=49394 To: sip:firstname.lastname@example.org Call-ID: email@example.com CSeq: 1 MESSAGE Content-Type: text/plain Content-Length: 18
Watson, come here. The message is received by user2, displayed, and a response is generated, message F3, and sent to the proxy: SIP/2.0 200 OK Via: SIP/2.0/TCP proxy.domain.com;branch=z9hG4bK123dsghds; received=192.0.2.1 Via: SIP/2.0/TCP user1pc.domain.com;;branch=z9hG4bK776sgdkse; received=188.8.131.52 From: sip:firstname.lastname@example.org;tag=49394 To: sip:email@example.com;tag=ab8asdasd9 Call-ID: firstname.lastname@example.org CSeq: 1 MESSAGE Content-Length: 0 Note that most of the header fields are simply reflected in the response. The proxy receives this response, strips off the top Via, and forwards to the address in the next Via, user1pc.domain.com, the result being message F4: SIP/2.0 200 OK Via: SIP/2.0/TCP user1pc.domain.com;branch=z9hG4bK776sgdkse; received=184.108.40.206 From: sip:email@example.com;;tag=49394 To: sip:firstname.lastname@example.org;tag=ab8asdasd9 Call-ID: email@example.com CSeq: 1 MESSAGE Content-Length: 0 RFC 3261, is RECOMMENDED. This is useful to verify the identity of the originator, and prevent spoofing and spamming at the originating network.
1] allows a UA to assert that every hop must occur over a secure connection. This provides some level of integrity and privacy protection. However, this requires the users to trust that each proxy in the request path is well-behaved, that is, they do not violate the rules for routing SIPS URIs. Also, any unencrypted bodies are fully exposed to the proxies. Additionally, the possibility of a forking proxy allows a MESSAGE request to be delivered to additional endpoints without the knowledge of the UAC. If only hop-by-hop protection is used, the users must trust all proxies in the chain to not fork requests to unauthorized destinations. RFC 3369 . The AES  algorithm should be preferred, as it is expected that AES best suits the capabilities of many platforms. However, an IETF specification for this is still incomplete as of the time of this writing.
If a UAS receives a stale message that can be confirmed to have come from a known store and forward server (perhaps over a TLS connection), it makes sense for it to accept the message normally. Also, if one or more stale messages arrive shortly after an offline period, the UAS MAY accept the message, but SHOULD warn the user that there is a risk the message has been replayed. 7] allows for the S/MIME protection of metadata in addition to the message payload itself. In many cases, this metadata is redundant with SIP header fields. Still, message/cpim adds value in that the protection of metadata can extend across protocol boundaries. For example, a signed message/cpim body can provide sender authentication using the message/cpim From header field, even if the message crosses a gateway to another CPIM compliant instant message service that does not understand SIP header fields. http://www.iana.org/assignments/sip-parameters/Method registry, according to the following information: MESSAGE [RFC3428]
Stuart Barkley UUNet Mauricio Arango SUN Richard Shockey NeuStar Jorgen Bjorker Hotsip Henry Sinnreich MCI Worldcom Ronald Akers Motorola Torrey Searle Indigo Software Rohan Mahy Cisco Christian Groves Ericsson Allison Mankin ISI Tan Ya-Ching Siemens  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.  Day, M., Aggarwal, S. and J. Vincent, "Instant Messaging / Presence Protocol Requirements", RFC 2779, February 2000.  Day, M., Rosenberg, J. and H. Sugano, "A Model for Presence and Instant Messaging", RFC 2778, February 2000.  Housley, R., "Cryptographic Message Syntax (CMS)", RFC 3369, August 2002.  Roach, A., "Session Initiation Protocol (SIP)-Specific Event Notification", RFC 3265, June 2002.  Bradner, S., "Keywords for Use in RFC's to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.  Atkins, D. and G. Klyne, "Common Presence and Instant Messaging Message Format", Work in Progress.  Crocker, D., Diacakis, A., Mazzoldi, F., Huitema, C., Klyne, G., Rose, M., Rosenberg, J., Sparks, R., Sugano, H. and J. Peterson, "Address Resolution for Instant Messaging and Presence", Work in Progress.  Rosenberg, J. and H. Schulzrinne, "SIP Caller Preferences and Callee Capabilities", Work in Progress.  Schaad, J. and R. Housley, "Use of the AES Encryption Algorithm and RSA-OAEP Key Transport in CMS", Work in Progress.
 DellaFera, C., Eichin, M., French, R., Jedlinski, D., Kohl, J. and W. Sommerfeld, "The Zephyr notification service", in USENIX Winter Conference (Dallas, Texas), Feb. 1988.
Acknowledgement Funding for the RFC Editor function is currently provided by the Internet Society.