RFC 5322

Internet Message Format

Pages: 57
Draft Standard
Obsoletes: 2822
Updates: 4021
Updated by: 6854

Network Working Group                                    P. Resnick, Ed.
Request for Comments: 5322                         Qualcomm Incorporated
Obsoletes: 2822                                             October 2008
Updates: 4021
Category: Standards Track


                        Internet Message Format

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Abstract

   This document specifies the Internet Message Format (IMF), a syntax
   for text messages that are sent between computer users, within the
   framework of "electronic mail" messages.  This specification is a
   revision of Request For Comments (RFC) 2822, which itself superseded
   Request For Comments (RFC) 822, "Standard for the Format of ARPA
   Internet Text Messages", updating it to reflect current practice and
   incorporating incremental changes that were specified in other RFCs.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.1.  Scope  . . . . . . . . . . . . . . . . . . . . . . . . . .  4
     1.2.  Notational Conventions . . . . . . . . . . . . . . . . . .  5
       1.2.1.  Requirements Notation  . . . . . . . . . . . . . . . .  5
       1.2.2.  Syntactic Notation . . . . . . . . . . . . . . . . . .  5
       1.2.3.  Structure of This Document . . . . . . . . . . . . . .  5
   2.  Lexical Analysis of Messages . . . . . . . . . . . . . . . . .  6
     2.1.  General Description  . . . . . . . . . . . . . . . . . . .  6
       2.1.1.  Line Length Limits . . . . . . . . . . . . . . . . . .  7
     2.2.  Header Fields  . . . . . . . . . . . . . . . . . . . . . .  8
       2.2.1.  Unstructured Header Field Bodies . . . . . . . . . . .  8
       2.2.2.  Structured Header Field Bodies . . . . . . . . . . . .  8
       2.2.3.  Long Header Fields . . . . . . . . . . . . . . . . . .  8
     2.3.  Body . . . . . . . . . . . . . . . . . . . . . . . . . . .  9
   3.  Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
     3.1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . 10
     3.2.  Lexical Tokens . . . . . . . . . . . . . . . . . . . . . . 10
       3.2.1.  Quoted characters  . . . . . . . . . . . . . . . . . . 10
       3.2.2.  Folding White Space and Comments . . . . . . . . . . . 11
       3.2.3.  Atom . . . . . . . . . . . . . . . . . . . . . . . . . 12
       3.2.4.  Quoted Strings . . . . . . . . . . . . . . . . . . . . 13
       3.2.5.  Miscellaneous Tokens . . . . . . . . . . . . . . . . . 14
     3.3.  Date and Time Specification  . . . . . . . . . . . . . . . 14
     3.4.  Address Specification  . . . . . . . . . . . . . . . . . . 16
       3.4.1.  Addr-Spec Specification  . . . . . . . . . . . . . . . 17
     3.5.  Overall Message Syntax . . . . . . . . . . . . . . . . . . 18
     3.6.  Field Definitions  . . . . . . . . . . . . . . . . . . . . 19
       3.6.1.  The Origination Date Field . . . . . . . . . . . . . . 22
       3.6.2.  Originator Fields  . . . . . . . . . . . . . . . . . . 22
       3.6.3.  Destination Address Fields . . . . . . . . . . . . . . 23
       3.6.4.  Identification Fields  . . . . . . . . . . . . . . . . 25
       3.6.5.  Informational Fields . . . . . . . . . . . . . . . . . 27
       3.6.6.  Resent Fields  . . . . . . . . . . . . . . . . . . . . 28
       3.6.7.  Trace Fields . . . . . . . . . . . . . . . . . . . . . 30
       3.6.8.  Optional Fields  . . . . . . . . . . . . . . . . . . . 30
   4.  Obsolete Syntax  . . . . . . . . . . . . . . . . . . . . . . . 31
     4.1.  Miscellaneous Obsolete Tokens  . . . . . . . . . . . . . . 32
     4.2.  Obsolete Folding White Space . . . . . . . . . . . . . . . 33
     4.3.  Obsolete Date and Time . . . . . . . . . . . . . . . . . . 33
     4.4.  Obsolete Addressing  . . . . . . . . . . . . . . . . . . . 35
     4.5.  Obsolete Header Fields . . . . . . . . . . . . . . . . . . 35
       4.5.1.  Obsolete Origination Date Field  . . . . . . . . . . . 36
       4.5.2.  Obsolete Originator Fields . . . . . . . . . . . . . . 36
       4.5.3.  Obsolete Destination Address Fields  . . . . . . . . . 37
       4.5.4.  Obsolete Identification Fields . . . . . . . . . . . . 37
       4.5.5.  Obsolete Informational Fields  . . . . . . . . . . . . 37
       4.5.6.  Obsolete Resent Fields . . . . . . . . . . . . . . . . 38
       4.5.7.  Obsolete Trace Fields  . . . . . . . . . . . . . . . . 38
       4.5.8.  Obsolete optional fields . . . . . . . . . . . . . . . 38
   5.  Security Considerations  . . . . . . . . . . . . . . . . . . . 38
   6.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 39
   Appendix A.     Example Messages . . . . . . . . . . . . . . . . . 43
   Appendix A.1.   Addressing Examples  . . . . . . . . . . . . . . . 44
   Appendix A.1.1. A Message from One Person to Another with
                   Simple Addressing  . . . . . . . . . . . . . . . . 44
   Appendix A.1.2. Different Types of Mailboxes . . . . . . . . . . . 45
   Appendix A.1.3. Group Addresses  . . . . . . . . . . . . . . . . . 45
   Appendix A.2.   Reply Messages . . . . . . . . . . . . . . . . . . 46
   Appendix A.3.   Resent Messages  . . . . . . . . . . . . . . . . . 47
   Appendix A.4.   Messages with Trace Fields . . . . . . . . . . . . 48
   Appendix A.5.   White Space, Comments, and Other Oddities  . . . . 49
   Appendix A.6.   Obsoleted Forms  . . . . . . . . . . . . . . . . . 50
   Appendix A.6.1. Obsolete Addressing  . . . . . . . . . . . . . . . 50
   Appendix A.6.2. Obsolete Dates . . . . . . . . . . . . . . . . . . 50
   Appendix A.6.3. Obsolete White Space and Comments  . . . . . . . . 51
   Appendix B.     Differences from Earlier Specifications  . . . . . 52
   Appendix C.     Acknowledgements . . . . . . . . . . . . . . . . . 53
   7.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 55
     7.1.  Normative References . . . . . . . . . . . . . . . . . . . 55
     7.2.  Informative References . . . . . . . . . . . . . . . . . . 55

1.  Introduction

1.1.  Scope

   This document specifies the Internet Message Format (IMF), a syntax
   for text messages that are sent between computer users, within the
   framework of "electronic mail" messages.  This specification is an
   update to [RFC2822], which itself superseded [RFC0822], updating it
   to reflect current practice and incorporating incremental changes
   that were specified in other RFCs such as [RFC1123].

   This document specifies a syntax only for text messages.  In
   particular, it makes no provision for the transmission of images,
   audio, or other sorts of structured data in electronic mail messages.
   There are several extensions published, such as the MIME document
   series ([RFC2045], [RFC2046], [RFC2049]), which describe mechanisms
   for the transmission of such data through electronic mail, either by
   extending the syntax provided here or by structuring such messages to
   conform to this syntax.  Those mechanisms are outside of the scope of
   this specification.

   In the context of electronic mail, messages are viewed as having an
   envelope and contents.  The envelope contains whatever information is
   needed to accomplish transmission and delivery.  (See [RFC5321] for a
   discussion of the envelope.)  The contents comprise the object to be
   delivered to the recipient.  This specification applies only to the
   format and some of the semantics of message contents.  It contains no
   specification of the information in the envelope.

   However, some message systems may use information from the contents
   to create the envelope.  It is intended that this specification
   facilitate the acquisition of such information by programs.

   This specification is intended as a definition of what message
   content format is to be passed between systems.  Though some message
   systems locally store messages in this format (which eliminates the
   need for translation between formats) and others use formats that
   differ from the one specified in this specification, local storage is
   outside of the scope of this specification.

      Note: This specification is not intended to dictate the internal
      formats used by sites, the specific message system features that
      they are expected to support, or any of the characteristics of
      user interface programs that create or read messages.  In
      addition, this document does not specify an encoding of the
      characters for either transport or storage; that is, it does not
      specify the number of bits used or how those bits are specifically
      transferred over the wire or stored on disk.

1.2.  Notational Conventions

1.2.1.  Requirements Notation

   This document occasionally uses terms that appear in capital letters.
   When the terms "MUST", "SHOULD", "RECOMMENDED", "MUST NOT", "SHOULD
   NOT", and "MAY" appear capitalized, they are being used to indicate
   particular requirements of this specification.  A discussion of the
   meanings of these terms appears in [RFC2119].

1.2.2.  Syntactic Notation

   This specification uses the Augmented Backus-Naur Form (ABNF)
   [RFC5234] notation for the formal definitions of the syntax of
   messages.  Characters will be specified either by a decimal value
   (e.g., the value %d65 for uppercase A and %d97 for lowercase A) or by
   a case-insensitive literal value enclosed in quotation marks (e.g.,
   "A" for either uppercase or lowercase A).

1.2.3.  Structure of This Document

   This document is divided into several sections.

   This section, section 1, is a short introduction to the document.

   Section 2 lays out the general description of a message and its
   constituent parts.  This is an overview to help the reader understand
   some of the general principles used in the later portions of this
   document.  Any examples in this section MUST NOT be taken as
   specification of the formal syntax of any part of a message.

   Section 3 specifies formal ABNF rules for the structure of each part
   of a message (the syntax) and describes the relationship between
   those parts and their meaning in the context of a message (the
   semantics), including instructions for their interpretation.
   This includes analysis of the syntax and semantics of subparts of
   messages that have specific structure.  The syntax included in
   section 3 represents messages as they MUST be created.  There are
   also notes in section 3 to indicate if any of the options specified
   in the syntax SHOULD be used over any of the others.

   Both sections 2 and 3 describe messages that are legal to generate
   for purposes of this specification.

   Section 4 of this document specifies an "obsolete" syntax.  There are
   references in section 3 to these obsolete syntactic elements.  The
   rules of the obsolete syntax are elements that have appeared in
   earlier versions of this specification or have previously been widely
   used in Internet messages.  As such, these elements MUST be
   interpreted by parsers of messages in order to be conformant to this
   specification.  However, since items in this syntax have been
   determined to be non-interoperable or to cause significant problems
   for recipients of messages, they MUST NOT be generated by creators of
   conformant messages.

   Section 5 details security considerations to take into account when
   implementing this specification.

   Appendix A lists examples of different sorts of messages.  These
   examples are not exhaustive of the types of messages that appear on
   the Internet, but give a broad overview of certain syntactic forms.

   Appendix B lists the differences between this specification and
   earlier specifications for Internet messages.

   Appendix C contains acknowledgements.

2.  Lexical Analysis of Messages

2.1.  General Description

   At the most basic level, a message is a series of characters.  A
   message that is conformant with this specification is composed of
   characters with values in the range of 1 through 127 and interpreted
   as US-ASCII [ANSI.X3-4.1986] characters.  For brevity, this document
   sometimes refers to this range of characters as simply "US-ASCII
   characters".

      Note: This document specifies that messages are made up of
      characters in the US-ASCII range of 1 through 127.  There are
      other documents, specifically the MIME document series ([RFC2045],
      [RFC2046], [RFC2047], [RFC2049], [RFC4288], [RFC4289]), that
      extend this specification to allow for values outside of that
      range.  Discussion of those mechanisms is not within the scope of
      this specification.

   Messages are divided into lines of characters.  A line is a series of
   characters that is delimited with the two characters carriage-return
   and line-feed; that is, the carriage return (CR) character (ASCII
   value 13) followed immediately by the line feed (LF) character (ASCII
   value 10).  (The carriage return/line feed pair is usually written in
   this document as "CRLF".)
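
   As an illustration only, and not as part of this specification, the
   two rules above (character values 1 through 127, and lines delimited
   by CRLF) can be sketched in Python.  The function names are
   hypothetical, and the sketch assumes one octet per character, an
   encoding choice that this document deliberately leaves open.

```python
def is_us_ascii_message(data: bytes) -> bool:
    """Return True if every octet has a value in the range 1 through 127."""
    return all(1 <= octet <= 127 for octet in data)


def split_lines(data: bytes) -> list:
    """Split a message into lines, treating only CRLF as the delimiter.

    A trailing CRLF terminates the last line; it does not start a new,
    empty one.  A bare CR or LF is left inside the line it occurs in.
    """
    lines = data.split(b"\r\n")
    if lines and lines[-1] == b"":
        lines.pop()
    return lines
```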

   A message consists of header fields (collectively called "the header
   section of the message") followed, optionally, by a body.  The header
   section is a sequence of lines of characters with special syntax as
   defined in this specification.  The body is simply a sequence of
   characters that follows the header section and is separated from the
   header section by an empty line (i.e., a line with nothing preceding
   the CRLF).
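
   The separation described above can be sketched as follows.  This is
   a minimal, non-normative illustration: the name split_message is
   hypothetical, the message is assumed to be held as octets, and no
   validation of the header section itself is attempted.

```python
from typing import Optional, Tuple


def split_message(message: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Split a message into (header_section, body).

    The header section is returned including the CRLF of its last line.
    The body is None when no empty line is present, that is, when the
    message has no body at all.
    """
    if message.startswith(b"\r\n"):
        # Empty header section, immediately followed by the empty line.
        return b"", message[2:]
    head, separator, body = message.partition(b"\r\n\r\n")
    if not separator:
        return message, None
    return head + b"\r\n", body
```

   Note that the sketch distinguishes a missing body (None) from an
   empty one (zero characters after the empty line).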

      Note: Common parlance and earlier versions of this specification
      use the term "header" to refer either to the entire header section
      or to an individual header field.  To avoid ambiguity,
      this document does not use the terms "header" or "headers" in
      isolation, but instead always uses "header field" to refer to the
      individual field and "header section" to refer to the entire
      collection.

2.1.1.  Line Length Limits

   There are two limits that this specification places on the number of
   characters in a line.  Each line of characters MUST be no more than
   998 characters, and SHOULD be no more than 78 characters, excluding
   the CRLF.

   The 998 character limit is due to limitations in many implementations
   that send, receive, or store IMF messages and simply cannot handle
   more than 998 characters on a line.  Receiving implementations would
   do well to handle an arbitrarily large number of characters in a line
   for the sake of robustness.  However, so many implementations (in
   compliance with the transport requirements of [RFC5321]) do not
   accept messages containing more than 1000 characters per line,
   including the CR and LF, that it is important for implementations
   not to create such messages.

   The more conservative 78 character recommendation is to accommodate
   the many implementations of user interfaces that display these
   messages which may truncate, or disastrously wrap, the display of
   more than 78 characters per line, in spite of the fact that such
   implementations are non-conformant to the intent of this
   specification (and that of [RFC5321] if they actually cause
   information to be lost).  Again, even though this limitation is put
   on messages, it is incumbent upon implementations that display
   messages to handle an arbitrarily large number of characters in a
   line (certainly at least up to the 998 character limit) for the sake
   of robustness.
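
   A generator might check the two limits with something like the
   following non-normative sketch.  The names are hypothetical, and
   octets are counted as characters, which holds for the US-ASCII
   messages described in this document.

```python
MAX_LINE_LENGTH = 998          # MUST limit, excluding the CRLF
RECOMMENDED_LINE_LENGTH = 78   # SHOULD limit, excluding the CRLF


def check_line_lengths(message: bytes):
    """Return (too_long, over_recommended) as lists of 1-based line numbers.

    too_long holds lines that exceed 998 characters; over_recommended
    holds lines that exceed 78 characters but stay within 998.
    """
    too_long = []
    over_recommended = []
    for number, line in enumerate(message.split(b"\r\n"), start=1):
        if len(line) > MAX_LINE_LENGTH:
            too_long.append(number)
        elif len(line) > RECOMMENDED_LINE_LENGTH:
            over_recommended.append(number)
    return too_long, over_recommended
```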

2.2.  Header Fields

   Header fields are lines beginning with a field name, followed by a
   colon (":"), followed by a field body, and terminated by CRLF.  A
   field name MUST be composed of printable US-ASCII characters (i.e.,
   characters that have values between 33 and 126, inclusive), except
   colon.  A field body may be composed of printable US-ASCII characters
   as well as the space (SP, ASCII value 32) and horizontal tab (HTAB,
   ASCII value 9) characters (together known as the white space
   characters, WSP).  A field body MUST NOT include CR and LF except
   when used in "folding" and "unfolding", as described in section
   2.2.3.  All field bodies MUST conform to the syntax described in
   sections 3 and 4 of this specification.
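
   The lexical description above can be sketched as a small parser.
   This is an illustration under stated assumptions, not a conformant
   implementation: the name parse_header_field is hypothetical, the
   input is a single unfolded header field without its CRLF, and
   neither the per-field syntax of section 3 nor the obsolete syntax of
   section 4 is checked.

```python
def parse_header_field(line: str):
    """Split an unfolded header field into (field_name, field_body).

    Raises ValueError if there is no colon, if the field name is empty
    or contains characters outside the printable US-ASCII range 33..126,
    or if the field body contains anything other than printable
    US-ASCII, SP, or HTAB.
    """
    # The first colon ends the field name, so the name cannot contain one.
    name, colon, body = line.partition(":")
    if not colon:
        raise ValueError("no colon found")
    if not name or not all(33 <= ord(ch) <= 126 for ch in name):
        raise ValueError("invalid field name")
    if not all(33 <= ord(ch) <= 126 or ch in " \t" for ch in body):
        raise ValueError("invalid field body")
    return name, body
```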

2.2.1.  Unstructured Header Field Bodies

   Some field bodies in this specification are defined simply as
   "unstructured" (which is specified in section 3.2.5 as any printable
   US-ASCII characters plus white space characters) with no further
   restrictions.  These are referred to as unstructured field bodies.
   Semantically, unstructured field bodies are simply to be treated as a
   single line of characters with no further processing (except for
   "folding" and "unfolding" as described in section 2.2.3).

2.2.2.  Structured Header Field Bodies

   Some field bodies in this specification have a syntax that is more
   restrictive than the unstructured field bodies described above.
   These are referred to as "structured" field bodies.  Structured field
   bodies are sequences of specific lexical tokens as described in
   sections 3 and 4 of this specification.  Many of these tokens are
   allowed (according to their syntax) to begin or end with
   comments (as described in section 3.2.2) as well as the white space
   characters, and those white space characters are subject to "folding"
   and "unfolding" as described in section 2.2.3.  Semantic analysis of
   structured field bodies is given along with their syntax.

2.2.3.  Long Header Fields

   Each header field is logically a single line of characters comprising
   the field name, the colon, and the field body.  For convenience,
   however, and to deal with the 998/78 character limitations per line,
   the field body portion of a header field can be split into a
   multiple-line representation; this is called "folding".  The general
   rule is that wherever this specification allows for folding white
   space (not simply WSP characters), a CRLF may be inserted before any
   WSP.

   For example, the header field:

   Subject: This is a test

   can be represented as:

   Subject: This
    is a test

      Note: Though structured field bodies are defined in such a way
      that folding can take place between many of the lexical tokens
      (and even within some of the lexical tokens), folding SHOULD be
      limited to placing the CRLF at higher-level syntactic breaks.  For
      instance, if a field body is defined as comma-separated values, it
      is recommended that folding occur after the comma separating the
      structured items in preference to other places where the field
      could be folded, even if it is allowed elsewhere.
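
   A simple folding routine can be sketched as follows.  It is a
   non-normative illustration with a hypothetical name, and it assumes
   an unstructured field body, where every WSP is a permitted folding
   point.  It makes no attempt to prefer higher-level syntactic breaks
   in structured field bodies, as the note above recommends.

```python
def fold(field: str, limit: int = 78) -> str:
    """Fold an unfolded header field by inserting CRLF before WSP.

    Each break goes before the last SP or HTAB that keeps the line
    within `limit` characters.  If there is none, the first later WSP
    is used, so a line without any usable WSP can still exceed `limit`.
    """
    lines = []
    rest = field
    while len(rest) > limit:
        # Candidate break points: WSP preceded by at least one non-WSP
        # character, so the line before a break is never empty or all
        # white space.
        breaks = [i for i, ch in enumerate(rest)
                  if ch in " \t" and rest[:i].strip(" \t")]
        within = [i for i in breaks if i <= limit]
        beyond = [i for i in breaks if i > limit]
        if within:
            cut = within[-1]
        elif beyond:
            cut = beyond[0]
        else:
            break
        lines.append(rest[:cut])
        rest = rest[cut:]  # the WSP stays at the start of the next line
    lines.append(rest)
    return "\r\n".join(lines)
```

   With an artificially small limit of 13, this reproduces the example
   given earlier in this section: "Subject: This is a test" becomes
   "Subject: This", a CRLF, and " is a test".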

   The process of moving from this folded multiple-line representation
   of a header field to its single line representation is called
   "unfolding".  Unfolding is accomplished by simply removing any CRLF
   that is immediately followed by WSP.  Each header field should be
   treated in its unfolded form for further syntactic and semantic
   evaluation.  An unfolded header field has no length restriction and
   therefore may be indeterminately long.
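
   Unfolding as described above amounts to a single substitution.  The
   following is a minimal sketch; the name unfold is hypothetical, and
   the input is assumed to be one header field in its folded form.

```python
import re

# A CRLF that is immediately followed by SP or HTAB.  The lookahead
# leaves the white space character itself in place.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(field: str) -> str:
    """Remove every CRLF that is immediately followed by WSP."""
    return _FOLD.sub("", field)
```

   A CRLF that is not followed by WSP is left untouched, since it ends
   the header field rather than folding it.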

2.3.  Body

   The body of a message is simply lines of US-ASCII characters.  The
   only two limitations on the body are as follows:

   o  CR and LF MUST only occur together as CRLF; they MUST NOT appear
      independently in the body.
   o  Lines of characters in the body MUST be limited to 998 characters,
      and SHOULD be limited to 78 characters, excluding the CRLF.
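
   The two MUST-level limitations above can be checked with a sketch
   such as the following.  It is illustrative only: the name
   body_problems and the wording of the returned strings are
   hypothetical, and octets are counted as characters.

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
_BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def body_problems(body: bytes) -> list:
    """Return a list of descriptions of MUST-level violations in a body."""
    problems = []
    if _BARE_CR_OR_LF.search(body):
        problems.append("bare CR or LF")
    if any(len(line) > 998 for line in body.split(b"\r\n")):
        problems.append("line longer than 998 characters")
    return problems
```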

      Note: As was stated earlier, there are other documents,
      specifically the MIME documents ([RFC2045], [RFC2046], [RFC2049],
      [RFC4288], [RFC4289]), that extend (and limit) this specification
      to allow for different sorts of message bodies.  Again, these
      mechanisms are beyond the scope of this document.
