
RFC 5322

Internet Message Format

Pages: 57
Draft Standard
Obsoletes:  2822
Updates:  4021
Updated by:  6854

Top   ToC   RFC5322 - Page 1
Network Working Group                                    P. Resnick, Ed.
Request for Comments: 5322                         Qualcomm Incorporated
Obsoletes: 2822                                             October 2008
Updates: 4021
Category: Standards Track


                        Internet Message Format

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Abstract

This document specifies the Internet Message Format (IMF), a syntax for text messages that are sent between computer users, within the framework of "electronic mail" messages. This specification is a revision of Request For Comments (RFC) 2822, which itself superseded Request For Comments (RFC) 822, "Standard for the Format of ARPA Internet Text Messages", updating it to reflect current practice and incorporating incremental changes that were specified in other RFCs.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.1.  Scope  . . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.2.  Notational Conventions . . . . . . . . . . . . . . . . . .  5
       1.2.1.  Requirements Notation  . . . . . . . . . . . . . . . .  5
       1.2.2.  Syntactic Notation . . . . . . . . . . . . . . . . . .  5
       1.2.3.  Structure of This Document . . . . . . . . . . . . . .  5
   2.  Lexical Analysis of Messages . . . . . . . . . . . . . . . . .  6
     2.1.  General Description  . . . . . . . . . . . . . . . . . . .  6
       2.1.1.  Line Length Limits . . . . . . . . . . . . . . . . . .  7
     2.2.  Header Fields  . . . . . . . . . . . . . . . . . . . . . .  8
       2.2.1.  Unstructured Header Field Bodies . . . . . . . . . . .  8
       2.2.2.  Structured Header Field Bodies . . . . . . . . . . . .  8
       2.2.3.  Long Header Fields . . . . . . . . . . . . . . . . . .  8
     2.3.  Body . . . . . . . . . . . . . . . . . . . . . . . . . . .  9
   3.  Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
     3.1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . 10
     3.2.  Lexical Tokens . . . . . . . . . . . . . . . . . . . . . . 10
       3.2.1.  Quoted characters  . . . . . . . . . . . . . . . . . . 10
       3.2.2.  Folding White Space and Comments . . . . . . . . . . . 11
       3.2.3.  Atom . . . . . . . . . . . . . . . . . . . . . . . . . 12
       3.2.4.  Quoted Strings . . . . . . . . . . . . . . . . . . . . 13
       3.2.5.  Miscellaneous Tokens . . . . . . . . . . . . . . . . . 14
     3.3.  Date and Time Specification  . . . . . . . . . . . . . . . 14
     3.4.  Address Specification  . . . . . . . . . . . . . . . . . . 16
       3.4.1.  Addr-Spec Specification  . . . . . . . . . . . . . . . 17
     3.5.  Overall Message Syntax . . . . . . . . . . . . . . . . . . 18
     3.6.  Field Definitions  . . . . . . . . . . . . . . . . . . . . 19
       3.6.1.  The Origination Date Field . . . . . . . . . . . . . . 22
       3.6.2.  Originator Fields  . . . . . . . . . . . . . . . . . . 22
       3.6.3.  Destination Address Fields . . . . . . . . . . . . . . 23
       3.6.4.  Identification Fields  . . . . . . . . . . . . . . . . 25
       3.6.5.  Informational Fields . . . . . . . . . . . . . . . . . 27
       3.6.6.  Resent Fields  . . . . . . . . . . . . . . . . . . . . 28
       3.6.7.  Trace Fields . . . . . . . . . . . . . . . . . . . . . 30
       3.6.8.  Optional Fields  . . . . . . . . . . . . . . . . . . . 30
   4.  Obsolete Syntax  . . . . . . . . . . . . . . . . . . . . . . . 31
     4.1.  Miscellaneous Obsolete Tokens  . . . . . . . . . . . . . . 32
     4.2.  Obsolete Folding White Space . . . . . . . . . . . . . . . 33
     4.3.  Obsolete Date and Time . . . . . . . . . . . . . . . . . . 33
     4.4.  Obsolete Addressing  . . . . . . . . . . . . . . . . . . . 35
     4.5.  Obsolete Header Fields . . . . . . . . . . . . . . . . . . 35
       4.5.1.  Obsolete Origination Date Field  . . . . . . . . . . . 36
       4.5.2.  Obsolete Originator Fields . . . . . . . . . . . . . . 36
       4.5.3.  Obsolete Destination Address Fields  . . . . . . . . . 37
       4.5.4.  Obsolete Identification Fields . . . . . . . . . . . . 37
       4.5.5.  Obsolete Informational Fields  . . . . . . . . . . . . 37
       4.5.6.  Obsolete Resent Fields . . . . . . . . . . . . . . . . 38
       4.5.7.  Obsolete Trace Fields  . . . . . . . . . . . . . . . . 38
       4.5.8.  Obsolete optional fields . . . . . . . . . . . . . . . 38
   5.  Security Considerations  . . . . . . . . . . . . . . . . . . . 38
   6.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 39
   Appendix A.     Example Messages . . . . . . . . . . . . . . . . . 43
   Appendix A.1.   Addressing Examples  . . . . . . . . . . . . . . . 44
   Appendix A.1.1. A Message from One Person to Another with
                   Simple Addressing  . . . . . . . . . . . . . . . . 44
   Appendix A.1.2. Different Types of Mailboxes . . . . . . . . . . . 45
   Appendix A.1.3. Group Addresses  . . . . . . . . . . . . . . . . . 45
   Appendix A.2.   Reply Messages . . . . . . . . . . . . . . . . . . 46
   Appendix A.3.   Resent Messages  . . . . . . . . . . . . . . . . . 47
   Appendix A.4.   Messages with Trace Fields . . . . . . . . . . . . 48
   Appendix A.5.   White Space, Comments, and Other Oddities  . . . . 49
   Appendix A.6.   Obsoleted Forms  . . . . . . . . . . . . . . . . . 50
   Appendix A.6.1. Obsolete Addressing  . . . . . . . . . . . . . . . 50
   Appendix A.6.2. Obsolete Dates . . . . . . . . . . . . . . . . . . 50
   Appendix A.6.3. Obsolete White Space and Comments  . . . . . . . . 51
   Appendix B.     Differences from Earlier Specifications  . . . . . 52
   Appendix C.     Acknowledgements . . . . . . . . . . . . . . . . . 53
   7.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 55
     7.1.  Normative References . . . . . . . . . . . . . . . . . . . 55
     7.2.  Informative References . . . . . . . . . . . . . . . . . . 55

1. Introduction

1.1. Scope

This document specifies the Internet Message Format (IMF), a syntax for text messages that are sent between computer users, within the framework of "electronic mail" messages. This specification is an update to [RFC2822], which itself superseded [RFC0822], updating it to reflect current practice and incorporating incremental changes that were specified in other RFCs such as [RFC1123].

This document specifies a syntax only for text messages. In particular, it makes no provision for the transmission of images, audio, or other sorts of structured data in electronic mail messages. There are several extensions published, such as the MIME document series ([RFC2045], [RFC2046], [RFC2049]), which describe mechanisms for the transmission of such data through electronic mail, either by extending the syntax provided here or by structuring such messages to conform to this syntax. Those mechanisms are outside of the scope of this specification.

In the context of electronic mail, messages are viewed as having an envelope and contents. The envelope contains whatever information is needed to accomplish transmission and delivery. (See [RFC5321] for a discussion of the envelope.) The contents comprise the object to be delivered to the recipient. This specification applies only to the format and some of the semantics of message contents. It contains no specification of the information in the envelope. However, some message systems may use information from the contents to create the envelope. It is intended that this specification facilitate the acquisition of such information by programs.

This specification is intended as a definition of what message content format is to be passed between systems. Though some message systems locally store messages in this format (which eliminates the need for translation between formats) and others use formats that differ from the one specified in this specification, local storage is outside of the scope of this specification.

Note: This specification is not intended to dictate the internal formats used by sites, the specific message system features that they are expected to support, or any of the characteristics of user interface programs that create or read messages. In addition, this document does not specify an encoding of the characters for either transport or storage; that is, it does not specify the number of bits used or how those bits are specifically transferred over the wire or stored on disk.

1.2. Notational Conventions

1.2.1. Requirements Notation

This document occasionally uses terms that appear in capital letters. When the terms "MUST", "SHOULD", "RECOMMENDED", "MUST NOT", "SHOULD NOT", and "MAY" appear capitalized, they are being used to indicate particular requirements of this specification. A discussion of the meanings of these terms appears in [RFC2119].

1.2.2. Syntactic Notation

This specification uses the Augmented Backus-Naur Form (ABNF) [RFC5234] notation for the formal definitions of the syntax of messages. Characters will be specified either by a decimal value (e.g., the value %d65 for uppercase A and %d97 for lowercase A) or by a case-insensitive literal value enclosed in quotation marks (e.g., "A" for either uppercase or lowercase A).
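
The difference between the two ways of specifying a character can be illustrated with a small sketch. The function names below are hypothetical and are not part of this specification or of [RFC5234]; the sketch only assumes that a decimal value such as %d65 matches exactly one character, while a quoted literal such as "A" matches regardless of case.

```python
def matches_decimal(ch: str, value: int) -> bool:
    """True if ch is exactly the character with the given decimal value.

    Models an ABNF terminal such as %d65 (hypothetical helper).
    """
    return len(ch) == 1 and ord(ch) == value


def matches_literal(ch: str, literal: str) -> bool:
    """True if ch matches a quoted ABNF literal, compared case-insensitively.

    Models an ABNF terminal such as "A" (hypothetical helper).
    """
    return ch.lower() == literal.lower()


# %d65 matches only uppercase A; "A" matches either case.
print(matches_decimal("A", 65), matches_decimal("a", 65))
print(matches_literal("A", "A"), matches_literal("a", "A"))
```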

1.2.3. Structure of This Document

This document is divided into several sections.

This section, section 1, is a short introduction to the document.

Section 2 lays out the general description of a message and its constituent parts. This is an overview to help the reader understand some of the general principles used in the later portions of this document. Any examples in this section MUST NOT be taken as specification of the formal syntax of any part of a message.

Section 3 specifies formal ABNF rules for the structure of each part of a message (the syntax) and describes the relationship between those parts and their meaning in the context of a message (the semantics). That is, it lays out the actual rules for the structure of each part of a message (the syntax) as well as a description of the parts and instructions for their interpretation (the semantics). This includes analysis of the syntax and semantics of subparts of messages that have specific structure. The syntax included in section 3 represents messages as they MUST be created. There are also notes in section 3 to indicate if any of the options specified in the syntax SHOULD be used over any of the others. Both sections 2 and 3 describe messages that are legal to generate for purposes of this specification.
   Section 4 of this document specifies an "obsolete" syntax.  There are
   references in section 3 to these obsolete syntactic elements.  The
   rules of the obsolete syntax are elements that have appeared in
   earlier versions of this specification or have previously been widely
   used in Internet messages.  As such, these elements MUST be
   interpreted by parsers of messages in order to be conformant to this
   specification.  However, since items in this syntax have been
   determined to be non-interoperable or to cause significant problems
   for recipients of messages, they MUST NOT be generated by creators of
   conformant messages.

   Section 5 details security considerations to take into account when
   implementing this specification.

   Appendix A lists examples of different sorts of messages.  These
   examples are not exhaustive of the types of messages that appear on
   the Internet, but give a broad overview of certain syntactic forms.

   Appendix B lists the differences between this specification and
   earlier specifications for Internet messages.

   Appendix C contains acknowledgements.

2. Lexical Analysis of Messages

2.1. General Description

At the most basic level, a message is a series of characters. A message that is conformant with this specification is composed of characters with values in the range of 1 through 127 and interpreted as US-ASCII [ANSI.X3-4.1986] characters. For brevity, this document sometimes refers to this range of characters as simply "US-ASCII characters".

Note: This document specifies that messages are made up of characters in the US-ASCII range of 1 through 127. There are other documents, specifically the MIME document series ([RFC2045], [RFC2046], [RFC2047], [RFC2049], [RFC4288], [RFC4289]), that extend this specification to allow for values outside of that range. Discussion of those mechanisms is not within the scope of this specification.

Messages are divided into lines of characters. A line is a series of characters that is delimited with the two characters carriage-return and line-feed; that is, the carriage return (CR) character (ASCII value 13) followed immediately by the line feed (LF) character (ASCII value 10). (The carriage return/line feed pair is usually written in this document as "CRLF".)
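
As an illustration only, the character range and the CRLF line delimiter described above might be checked as in the following sketch. The function names are hypothetical, and the sketch assumes the message is available as a sequence of octets, one per character; this specification itself does not define an encoding.

```python
from typing import List


def is_us_ascii_message(data: bytes) -> bool:
    """True if every value is in the range 1 through 127 (NUL is excluded)."""
    return all(1 <= b <= 127 for b in data)


def split_lines(data: bytes) -> List[bytes]:
    """Split on CRLF only; a bare CR or LF does not end a line."""
    lines = data.split(b"\r\n")
    # Data that ends in CRLF yields a trailing empty element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines


print(is_us_ascii_message(b"Subject: Hi\r\n"))
print(split_lines(b"Subject: Hi\r\nFrom: a@example.com\r\n"))
```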
   A message consists of header fields (collectively called "the header
   section of the message") followed, optionally, by a body.  The header
   section is a sequence of lines of characters with special syntax as
   defined in this specification.  The body is simply a sequence of
   characters that follows the header section and is separated from the
   header section by an empty line (i.e., a line with nothing preceding
   the CRLF).

      Note: Common parlance and earlier versions of this specification
      use the term "header" to either refer to the entire header section
      or to refer to an individual header field.  To avoid ambiguity,
      this document does not use the terms "header" or "headers" in
      isolation, but instead always uses "header field" to refer to the
      individual field and "header section" to refer to the entire
      collection.
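
A minimal sketch of separating the header section from the body, assuming the whole message is held in memory as octets. The name split_message is hypothetical. The sketch locates the first empty line, that is, a CRLF that immediately follows another CRLF or that starts the message, and returns None for the body when no empty line is present.

```python
from typing import Optional, Tuple


def split_message(data: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Return (header_section, body); body is None if there is no empty line."""
    # An empty line at the very start means the header section is empty.
    if data.startswith(b"\r\n"):
        return b"", data[2:]
    index = data.find(b"\r\n\r\n")
    if index == -1:
        return data, None
    # Keep the CRLF that ends the last header line; skip the empty line.
    return data[: index + 2], data[index + 4 :]


message = b"Subject: Hi\r\nFrom: a@example.com\r\n\r\nHello\r\n"
header_section, body = split_message(message)
print(header_section)
print(body)
```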

2.1.1. Line Length Limits

There are two limits that this specification places on the number of characters in a line. Each line of characters MUST be no more than 998 characters, and SHOULD be no more than 78 characters, excluding the CRLF.

The 998 character limit is due to limitations in many implementations that send, receive, or store IMF messages which simply cannot handle more than 998 characters on a line. Receiving implementations would do well to handle an arbitrarily large number of characters in a line for robustness sake. However, there are so many implementations that (in compliance with the transport requirements of [RFC5321]) do not accept messages containing more than 1000 characters including the CR and LF per line, it is important for implementations not to create such messages.

The more conservative 78 character recommendation is to accommodate the many implementations of user interfaces that display these messages which may truncate, or disastrously wrap, the display of more than 78 characters per line, in spite of the fact that such implementations are non-conformant to the intent of this specification (and that of [RFC5321] if they actually cause information to be lost). Again, even though this limitation is put on messages, it is incumbent upon implementations that display messages to handle an arbitrarily large number of characters in a line (certainly at least up to the 998 character limit) for the sake of robustness.
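
The two limits could be checked by a generating implementation along the following lines. This is a sketch, not a required procedure: the constant and function names are hypothetical, and the message is assumed to be a sequence of octets with CRLF line delimiters. Lengths are counted excluding the CRLF.

```python
from typing import List, Tuple

MAX_LINE = 998   # MUST limit, excluding the CRLF
SOFT_LINE = 78   # SHOULD limit, excluding the CRLF


def check_line_lengths(data: bytes) -> Tuple[List[int], List[int]]:
    """Return (violations, warnings) as lists of 1-based line numbers.

    violations: lines longer than 998 characters.
    warnings:   lines longer than 78 but not longer than 998 characters.
    """
    violations: List[int] = []
    warnings: List[int] = []
    for number, line in enumerate(data.split(b"\r\n"), start=1):
        if len(line) > MAX_LINE:
            violations.append(number)
        elif len(line) > SOFT_LINE:
            warnings.append(number)
    return violations, warnings


sample = b"a" * 78 + b"\r\n" + b"b" * 79 + b"\r\n" + b"c" * 999 + b"\r\n"
print(check_line_lengths(sample))
```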

2.2. Header Fields

Header fields are lines beginning with a field name, followed by a colon (":"), followed by a field body, and terminated by CRLF. A field name MUST be composed of printable US-ASCII characters (i.e., characters that have values between 33 and 126, inclusive), except colon. A field body may be composed of printable US-ASCII characters as well as the space (SP, ASCII value 32) and horizontal tab (HTAB, ASCII value 9) characters (together known as the white space characters, WSP). A field body MUST NOT include CR and LF except when used in "folding" and "unfolding", as described in section 2.2.3. All field bodies MUST conform to the syntax described in sections 3 and 4 of this specification.
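
The description above can be sketched as a small parser for a single, already unfolded header field with its terminating CRLF removed. The name parse_header_field is hypothetical, and the sketch covers only the character rules stated in this section; it does not implement the formal syntax of section 3 or the obsolete syntax of section 4.

```python
from typing import Tuple


def parse_header_field(line: str) -> Tuple[str, str]:
    """Split an unfolded header field into (field_name, field_body)."""
    name, colon, body = line.partition(":")
    if not colon:
        raise ValueError("header field has no colon")
    # Field name: printable US-ASCII (33 through 126); the colon is excluded
    # automatically because the split happens at the first colon.
    if not name or not all(33 <= ord(c) <= 126 for c in name):
        raise ValueError("invalid field name")
    # Field body: printable US-ASCII plus SP (32) and HTAB (9).
    if not all(33 <= ord(c) <= 126 or c in " \t" for c in body):
        raise ValueError("invalid character in field body")
    return name, body


print(parse_header_field("Subject: This is a test"))
```

Note that the field body is returned exactly as it appears after the colon, including any leading white space.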

2.2.1. Unstructured Header Field Bodies

Some field bodies in this specification are defined simply as "unstructured" (which is specified in section 3.2.5 as any printable US-ASCII characters plus white space characters) with no further restrictions. These are referred to as unstructured field bodies. Semantically, unstructured field bodies are simply to be treated as a single line of characters with no further processing (except for "folding" and "unfolding" as described in section 2.2.3).

2.2.2. Structured Header Field Bodies

Some field bodies in this specification have a syntax that is more restrictive than the unstructured field bodies described above. These are referred to as "structured" field bodies. Structured field bodies are sequences of specific lexical tokens as described in sections 3 and 4 of this specification. Many of these tokens are allowed (according to their syntax) to be introduced or end with comments (as described in section 3.2.2) as well as the white space characters, and those white space characters are subject to "folding" and "unfolding" as described in section 2.2.3. Semantic analysis of structured field bodies is given along with their syntax.

2.2.3. Long Header Fields

Each header field is logically a single line of characters comprising the field name, the colon, and the field body. For convenience however, and to deal with the 998/78 character limitations per line, the field body portion of a header field can be split into a multiple-line representation; this is called "folding". The general rule is that wherever this specification allows for folding white space (not simply WSP characters), a CRLF may be inserted before any WSP.
   For example, the header field:

   Subject: This is a test

   can be represented as:

   Subject: This
    is a test

      Note: Though structured field bodies are defined in such a way
      that folding can take place between many of the lexical tokens
      (and even within some of the lexical tokens), folding SHOULD be
      limited to placing the CRLF at higher-level syntactic breaks.  For
      instance, if a field body is defined as comma-separated values, it
      is recommended that folding occur after the comma separating the
      structured items in preference to other places where the field
      could be folded, even if it is allowed elsewhere.

   The process of moving from this folded multiple-line representation
   of a header field to its single line representation is called
   "unfolding".  Unfolding is accomplished by simply removing any CRLF
   that is immediately followed by WSP.  Each header field should be
   treated in its unfolded form for further syntactic and semantic
   evaluation.  An unfolded header field has no length restriction and
   therefore may be indeterminately long.
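
Folding and unfolding can be sketched as follows. The function names are hypothetical. The unfold function removes each CRLF that is immediately followed by WSP, as described above. The fold function is a deliberately naive illustration: it inserts a CRLF before the last WSP that keeps a line within the limit, and it does not attempt to prefer higher-level syntactic breaks as the note above recommends.

```python
import re

# A CRLF that is immediately followed by SP or HTAB.
_FOLD_POINT = re.compile(r"\r\n(?=[ \t])")


def unfold(field: str) -> str:
    """Remove every CRLF that is immediately followed by WSP."""
    return _FOLD_POINT.sub("", field)


def fold(field: str, limit: int = 78) -> str:
    """Insert CRLF before WSP so that lines stay within limit where possible."""
    lines = []
    rest = field
    while len(rest) > limit:
        # Last WSP at index 1..limit; index 0 is skipped so that no line
        # is left empty.
        cut = max(rest.rfind(" ", 1, limit + 1), rest.rfind("\t", 1, limit + 1))
        if cut == -1:
            break  # no WSP available; leave the remainder unfolded
        lines.append(rest[:cut])
        rest = rest[cut:]
    lines.append(rest)
    return "\r\n".join(lines)


folded = fold("Subject: This is a test", limit=13)
print(repr(folded))
print(unfold(folded))
```

Because fold only ever inserts a CRLF directly before existing WSP, unfolding its output yields the original field.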

2.3. Body

The body of a message is simply lines of US-ASCII characters. The only two limitations on the body are as follows:

o  CR and LF MUST only occur together as CRLF; they MUST NOT appear independently in the body.

o  Lines of characters in the body MUST be limited to 998 characters, and SHOULD be limited to 78 characters, excluding the CRLF.

Note: As was stated earlier, there are other documents, specifically the MIME documents ([RFC2045], [RFC2046], [RFC2049], [RFC4288], [RFC4289]), that extend (and limit) this specification to allow for different sorts of message bodies. Again, these mechanisms are beyond the scope of this document.
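
The first limitation can be checked with a short sketch such as the one below. The function name is hypothetical, and the body is assumed to be available as octets. It reports the offset of every CR that is not followed by LF and every LF that is not preceded by CR.

```python
import re
from typing import List

# A CR not followed by LF, or an LF not preceded by CR.
_BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def find_bare_cr_lf(body: bytes) -> List[int]:
    """Return the offsets of CR or LF characters that are not part of a CRLF."""
    return [match.start() for match in _BARE_CR_OR_LF.finditer(body)]


print(find_bare_cr_lf(b"Hello\r\nWorld\r\n"))
print(find_bare_cr_lf(b"a\nb\r\nc\rd"))
```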

