6. Additional Content-Header Fields
6.1. Optional Content-ID Header Field
In constructing a high-level user agent, it may be desirable to allow
one body to make reference to another. Accordingly, bodies may be
labeled using the "Content-ID" header field, which is syntactically
identical to the "Message-ID" header field:
id := "Content-ID" ":" msg-id
Like the Message-ID values, Content-ID values must be generated to be
world-unique.
The Content-ID value may be used for uniquely identifying MIME
entities in several contexts, particularly for cacheing data
referenced by the message/external-body mechanism. Although the
Content-ID header is generally optional, its use is mandatory in
implementations which generate data of the optional MIME Content-type
"message/external-body". That is, each message/external-body entity
must have a Content-ID field to permit cacheing of such data.
It is also worth noting that the Content-ID value has special
semantics in the case of the multipart/alternative content-type. This
is explained in the section of this document dealing with
multipart/alternative.
6.2. Optional Content-Description Header Field
The ability to associate some descriptive information with a given
body is often desirable. For example, it may be useful to mark an
"image" body as "a picture of the Space Shuttle Endeavor." Such text
may be placed in the Content-Description header field.
description := "Content-Description" ":" *text
The description is presumed to be given in the US-ASCII character
set, although the mechanism specified in [RFC-1522] may be used for
non-US-ASCII Content-Description values.
7. The Predefined Content-Type Values
This document defines seven initial Content-Type values and an
extension mechanism for private or experimental types. Further
standard types must be defined by new published specifications. It is
expected that most innovation in new types of mail will take place as
subtypes of the seven types defined here. The most essential
characteristics of the seven content-types are summarized in
Appendix F.
7.1. The Text Content-Type
The text Content-Type is intended for sending material which is
principally textual in form. It is the default Content-Type. A
"charset" parameter may be used to indicate the character set of the
body text for some text subtypes, notably including the primary
subtype, "text/plain", which indicates plain (unformatted) text. The
default Content-Type for Internet mail is "text/plain;
charset=us-ascii".
Beyond plain text, there are many formats for representing what might
be known as "extended text" -- text with embedded formatting and
presentation information.
An interesting characteristic of many such representations is that
they are to some extent readable even without the software that
interprets them. It is useful, then, to distinguish them, at the
highest level, from such unreadable data as images, audio, or text
represented in an unreadable form. In the absence of appropriate
interpretation software, it is reasonable to show subtypes of text to
the user, while it is not reasonable to do so with most nontextual
data.
Such formatted textual data should be represented using subtypes of
text. Plausible subtypes of text are typically given by the common
name of the representation format, e.g., "text/richtext" [RFC-1341].
7.1.1. The charset parameter
A critical parameter that may be specified in the Content-Type field
for text/plain data is the character set. This is specified with a
"charset" parameter, as in:
Content-type: text/plain; charset=us-ascii
Unlike some other parameter values, the values of the charset
parameter are NOT case sensitive. The default character set, which
must be assumed in the absence of a charset parameter, is US-ASCII.
The specification for any future subtypes of "text" must specify
whether or not they will also utilize a "charset" parameter, and may
possibly restrict its values as well. When used with a particular
body, the semantics of the "charset" parameter should be identical to
those specified here for "text/plain", i.e., the body consists
entirely of characters in the given charset. In particular, definers
of future text subtypes should pay close attention to the
implications of multibyte character sets for their subtype
definitions.
This RFC specifies the definition of the charset parameter for the
purposes of MIME to be a unique mapping of a byte stream to glyphs, a
mapping which does not require external profiling information.
An initial list of predefined character set names can be found at the
end of this section. Additional character sets may be registered with
IANA, although the standardization of their use requires the usual
IESG [RFC-1340] review and approval.
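As an illustration of the two rules above (charset values are not
case sensitive, and US-ASCII must be assumed when the parameter is
absent), here is a minimal Python sketch. It relies on the standard
library's email.message.Message class; the helper name charset_of is
hypothetical and is not part of this specification.

```python
from email.message import Message


def charset_of(content_type_value):
    """Return the lower-cased charset of a Content-Type field value.

    Falls back to "us-ascii" when no charset parameter is present,
    mirroring the default described in the text.
    """
    msg = Message()
    msg["Content-Type"] = content_type_value
    # get_content_charset() lower-cases the value, which is safe because
    # charset names are not case sensitive.
    return msg.get_content_charset("us-ascii")


print(charset_of("text/plain; charset=ISO-8859-1"))
print(charset_of("text/plain"))
```

A real user agent would normally read the field from a parsed
message rather than from a bare string; the bare string is used here
only to keep the sketch short.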
Note that if the specified character set includes 8-bit data, a
Content-Transfer-Encoding header field and a corresponding encoding
on the data are required in order to transmit the body via some mail
transfer protocols, such as SMTP.
The default character set, US-ASCII, has been the subject of some
confusion and ambiguity in the past. Not only were there some
ambiguities in the definition, there have been wide variations in
practice. In order to eliminate such ambiguity and variations in the
future, it is strongly recommended that new user agents explicitly
specify a character set via the Content-Type header field. "US-ASCII"
does not indicate an arbitrary seven-bit character code, but
specifies that the body uses character coding that uses the exact
correspondence of codes to characters specified in ASCII. National
use variations of ISO 646 [ISO-646] are NOT ASCII and their use in
Internet mail is explicitly discouraged. The omission of the ISO 646
character set is deliberate in this regard. The character set name of
"US-ASCII" explicitly refers to ANSI X3.4-1986 [US-ASCII] only. The
character set name "ASCII" is reserved and must not be used for any
purpose.
NOTE: RFC 821 explicitly specifies "ASCII", and references an earlier
version of the American Standard. Insofar as one of the purposes of
specifying a Content-Type and character set is to permit the receiver
to unambiguously determine how the sender intended the coded message
to be interpreted, assuming anything other than "strict ASCII" as the
default would risk unintentional and incompatible changes to the
semantics of messages now being transmitted. This also implies that
messages containing characters coded according to national variations
on ISO 646, or using code-switching procedures (e.g., those of ISO
2022), as well as 8-bit or multiple octet character encodings MUST
use an appropriate character set specification to be consistent with
this specification.
The complete US-ASCII character set is listed in [US-ASCII]. Note
that the control characters including DEL (0-31, 127) have no defined
meaning apart from the combination CRLF (ASCII values 13 and 10)
indicating a new line. Two of the characters have de facto meanings
in wide use: FF (12) often means "start subsequent text on the
beginning of a new page"; and TAB or HT (9) often (though not always)
means "move the cursor to the next available column after the current
position where the column number is a multiple of 8 (counting the
first column as column 0)." Apart from this, any use of the control
characters or DEL in a body must be part of a private agreement
between the sender and recipient. Such private agreements are
discouraged and should be replaced by the other capabilities of this
document.
NOTE: Beyond US-ASCII, an enormous proliferation of character sets is
possible. It is the opinion of the IETF working group that a large
number of character sets is NOT a good thing. We would prefer to
specify a single character set that can be used universally for
representing all of the world's languages in electronic mail.
Unfortunately, existing practice in several communities seems to
point to the continued use of multiple character sets in the near
future. For this reason, we define
names for a small number of character sets for which a strong
constituent base exists.
The defined charset values are:
US-ASCII -- as defined in [US-ASCII].
ISO-8859-X -- where "X" is to be replaced, as necessary, for the
parts of ISO-8859 [ISO-8859]. Note that the ISO 646
character sets have deliberately been omitted in favor of
their 8859 replacements, which are the designated character
sets for Internet mail. As of the publication of this
document, the legitimate values for "X" are the digits 1
through 9.
The character sets specified above are the ones that were relatively
uncontroversial during the drafting of MIME. This document does not
endorse the use of any particular character set other than US-ASCII,
and recognizes that the future evolution of world character sets
remains unclear. It is expected that in the future, additional
character sets will be registered for use in MIME.
Note that the character set used, if anything other than US-ASCII,
must always be explicitly specified in the Content-Type field.
No other character set name may be used in Internet mail without the
publication of a formal specification and its registration with IANA,
or by private agreement, in which case the character set name must
begin with "X-".
Implementors are discouraged from defining new character sets for
mail use unless absolutely necessary.
The "charset" parameter has been defined primarily for the purpose of
textual data, and is described in this section for that reason.
However, it is conceivable that non-textual data might also wish to
specify a charset value for some purpose, in which case the same
syntax and values should be used.
In general, mail-sending software must always use the "lowest common
denominator" character set possible. For example, if a body contains
only US-ASCII characters, it must be marked as being in the US-ASCII
character set, not ISO-8859-1, which, like all the ISO-8859 family of
character sets, is a superset of US-ASCII. More generally, if a
widely-used character set is a subset of another character set, and a
body contains only characters in the widely-used subset, it must be
labeled as being in that subset. This will increase the chances that
the recipient will be able to view the mail correctly.
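The "lowest common denominator" rule can be sketched in Python as
follows. This is only an illustration under stated assumptions: the
function name pick_charset and the default candidate list are
hypothetical, and the candidates are tried from the narrowest set to
the widest using Python's built-in codecs of the same names.

```python
def pick_charset(text, candidates=("us-ascii", "iso-8859-1")):
    """Return the first (narrowest) candidate charset able to encode text.

    The candidates must be ordered from most widely supported to least,
    so that pure US-ASCII bodies are never labeled with a superset.
    """
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # this charset cannot represent the body; try the next
        return name
    raise ValueError("no candidate charset can represent this text")


print(pick_charset("plain old text"))
print(pick_charset("caf\u00e9"))
```

A composing agent that supports other parts of ISO-8859 would extend
the candidate tuple accordingly.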
7.1.2. The Text/plain subtype
The primary subtype of text is "plain". This indicates plain
(unformatted) text. The default Content-Type for Internet mail,
"text/plain; charset=us-ascii", describes existing Internet practice.
That is, it is the type of body defined by RFC 822.
No other text subtype is defined by this document.
The formal grammar for the content-type header field for text is as
follows:
text-type := "text" "/" text-subtype [";" "charset" "=" charset]
text-subtype := "plain" / extension-token
charset := "us-ascii" / "iso-8859-1" / "iso-8859-2" / "iso-8859-3"
/ "iso-8859-4" / "iso-8859-5" / "iso-8859-6" / "iso-8859-7"
/ "iso-8859-8" / "iso-8859-9" / extension-token
; case insensitive
7.2. The Multipart Content-Type
In the case of multiple part entities, in which one or more different
sets of data are combined in a single body, a "multipart"
Content-Type field must appear in the entity's header. The body must
then contain one or more "body parts," each preceded by an
encapsulation boundary, and the last one followed by a closing
boundary. Each part starts with an encapsulation boundary, and then
contains a body part consisting of header area, a blank line, and a
body area. Thus a body part is similar to an RFC 822 message in
syntax, but different in meaning.
A body part is NOT to be interpreted as actually being an RFC 822
message. To begin with, NO header fields are actually required in
body parts. A body part that starts with a blank line, therefore, is
allowed and is a body part for which all default values are to be
assumed. In such a case, the absence of a Content-Type header field
implies that the corresponding body is plain US-ASCII text. The only
header fields that have defined meaning for body parts are those the
names of which begin with "Content-". All other header fields are
generally to be ignored in body parts. Although they should generally
be retained in mail processing, they may be discarded by gateways if
necessary. Such other fields are permitted to appear in body parts
but must not be depended on. "X-" fields may be created for
experimental or private purposes, with the recognition that the
information they contain may be lost at some gateways.
NOTE: The distinction between an RFC 822 message and a body part
is subtle, but important. A gateway between Internet and X.400
mail, for example, must be able to tell the difference between a
body part that contains an image and a body part that contains an
encapsulated message, the body of which is an image. In order to
represent the latter, the body part must have "Content-Type:
message", and its body (after the blank line) must be the
encapsulated message, with its own "Content-Type: image" header
field. The use of similar syntax facilitates the conversion of
messages to body parts, and vice versa, but the distinction
between the two must be understood by implementors. (For the
special case in which all parts actually are messages, a "digest"
subtype is also defined.)
As stated previously, each body part is preceded by an encapsulation
boundary. The encapsulation boundary MUST NOT appear inside any of
the encapsulated parts. Thus, it is crucial that the composing agent
be able to choose and specify the unique boundary that will separate
the parts.
All present and future subtypes of the "multipart" type must use an
identical syntax. Subtypes may differ in their semantics, and may
impose additional restrictions on syntax, but must conform to the
required syntax for the multipart type. This requirement ensures
that all conformant user agents will at least be able to recognize
and separate the parts of any multipart entity, even of an
unrecognized subtype.
As stated in the definition of the Content-Transfer-Encoding field,
no encoding other than "7bit", "8bit", or "binary" is permitted for
entities of type "multipart". The multipart delimiters and header
fields are always represented as 7-bit ASCII in any case (though the
header fields may encode non-ASCII header text as per [RFC-1522]),
and data within the body parts can be encoded on a part-by-part
basis, with Content-Transfer-Encoding fields for each appropriate
body part.
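The encoding restriction above lends itself to a simple check. The
following Python sketch is illustrative only; the function name
encoding_allowed is hypothetical, and it deliberately checks nothing
beyond the rule stated here for the "multipart" type.

```python
# Encodings permitted on an entity whose top-level type is "multipart".
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}


def encoding_allowed(content_type, transfer_encoding="7bit"):
    """Return False if a multipart entity carries a forbidden encoding."""
    major_type = content_type.split("/", 1)[0].strip().lower()
    encoding = transfer_encoding.strip().lower()
    if major_type == "multipart":
        return encoding in IDENTITY_ENCODINGS
    return True  # other types are outside the scope of this sketch


print(encoding_allowed("multipart/mixed", "base64"))
print(encoding_allowed("multipart/mixed", "8bit"))
```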
Mail gateways, relays, and other mail handling agents are commonly
known to alter the top-level header of an RFC 822 message. In
particular, they frequently add, remove, or reorder header fields.
Such alterations are explicitly forbidden for the body part headers
embedded in the bodies of messages of type "multipart."
7.2.1. Multipart: The common syntax
All subtypes of "multipart" share a common syntax, defined in this
section. A simple example of a multipart message also appears in
this section. An example of a more complex multipart message is
given in Appendix C.
The Content-Type field for multipart entities requires one parameter,
"boundary", which is used to specify the encapsulation boundary. The
encapsulation boundary is defined as a line consisting entirely of
two hyphen characters ("-", decimal code 45) followed by the boundary
parameter value from the Content-Type header field.
NOTE: The hyphens are for rough compatibility with the earlier RFC
934 method of message encapsulation, and for ease of searching for
the boundaries in some implementations. However, it should be
noted that multipart messages are NOT completely compatible with
RFC 934 encapsulations; in particular, they do not obey RFC 934
quoting conventions for embedded lines that begin with hyphens.
This mechanism was chosen over the RFC 934 mechanism because the
latter causes lines to grow with each level of quoting. The
combination of this growth with the fact that SMTP implementations
sometimes wrap long lines made the RFC 934 mechanism unsuitable
for use in the event that deeply-nested multipart structuring is
ever desired.
WARNING TO IMPLEMENTORS: The grammar for parameters on the Content-
type field is such that it is often necessary to enclose the
boundaries in quotes on the Content-type line. This is not always
necessary, but never hurts. Implementors should be sure to study the
grammar carefully in order to avoid producing illegal Content-type
fields. Thus, a typical multipart Content-Type header field might
look like this:
Content-Type: multipart/mixed;
boundary=gc0p4Jq0M2Yt08jU534c0p
But the following is illegal:
Content-Type: multipart/mixed;
boundary=gc0p4Jq0M:2Yt08jU534c0p
(because of the colon) and must instead be represented as
Content-Type: multipart/mixed;
boundary="gc0p4Jq0M:2Yt08jU534c0p"
This indicates that the entity consists of several parts, each itself
with a structure that is syntactically identical to an RFC 822
message, except that the header area might be completely empty, and
that the parts are each preceded by the line
--gc0p4Jq0M:2Yt08jU534c0p
Note that the encapsulation boundary must occur at the beginning of a
line, i.e., following a CRLF, and that the initial CRLF is considered
to be attached to the encapsulation boundary rather than part of the
preceding part. The boundary must be followed immediately either by
another CRLF and the header fields for the next part, or by two
CRLFs, in which case there are no header fields for the next part
(and it is therefore assumed to be of Content-Type text/plain).
NOTE: The CRLF preceding the encapsulation line is conceptually
attached to the boundary so that it is possible to have a part
that does not end with a CRLF (line break). Body parts that must
be considered to end with line breaks, therefore, must have two
CRLFs preceding the encapsulation line, the first of which is part
of the preceding body part, and the second of which is part of the
encapsulation boundary.
Encapsulation boundaries must not appear within the encapsulations,
and must be no longer than 70 characters, not counting the two
leading hyphens.
The encapsulation boundary following the last body part is a
distinguished delimiter that indicates that no further body parts
will follow. Such a delimiter is identical to the previous
delimiters, with the addition of two more hyphens at the end of the
line:
--gc0p4Jq0M:2Yt08jU534c0p--
There appears to be room for additional information prior to the
first encapsulation boundary and following the final boundary. These
areas should generally be left blank, and implementations must ignore
anything that appears before the first boundary or after the last
one.
NOTE: These "preamble" and "epilogue" areas are generally not used
because of the lack of proper typing of these parts and the lack
of clear semantics for handling these areas at gateways,
particularly X.400 gateways. However, rather than leaving the
preamble area blank, many MIME implementations have found this to
be a convenient place to insert an explanatory note for recipients
who read the message with pre-MIME software, since such notes will
be ignored by MIME-compliant software.
NOTE: Because encapsulation boundaries must not appear in the body
parts being encapsulated, a user agent must exercise care to
choose a unique boundary. The boundary in the example above could
have been the result of an algorithm designed to produce
boundaries with a very low probability of already existing in the
data to be encapsulated without having to prescan the data.
Alternate algorithms might result in more 'readable' boundaries
for a recipient with an old user agent, but would require more
attention to the possibility that the boundary might appear in the
encapsulated part. The simplest boundary possible is something
like "---", with a closing boundary of "-----".
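One possible boundary-selection strategy is sketched below in Python.
It is an assumption-laden illustration, not a prescribed algorithm:
the name make_boundary and the default length of 24 are hypothetical.
It draws random letters and digits (which never require quoting on
the Content-Type line) and, as an extra precaution, also prescans the
parts and retries on a collision.

```python
import secrets
import string

_ALPHABET = string.ascii_letters + string.digits


def make_boundary(parts, length=24):
    """Return a random boundary that does not occur in any of the parts."""
    if not 1 <= length <= 70:
        raise ValueError("boundary length must be between 1 and 70")
    while True:
        candidate = "".join(secrets.choice(_ALPHABET) for _ in range(length))
        # The text forbids the delimiter from appearing inside any part,
        # so reject the candidate if "--candidate" shows up anywhere.
        if not any(("--" + candidate) in part for part in parts):
            return candidate


boundary = make_boundary(["first part", "second part"])
print(len(boundary))
```

With 24 random alphanumeric characters a collision is extremely
unlikely, so the prescan could be dropped when the data is too large
to scan cheaply.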
As a very simple example, the following multipart message has two
parts, both of them plain text, one of them explicitly typed and one
of them implicitly typed:
From: Nathaniel Borenstein <nsb@bellcore.com>
To: Ned Freed <ned@innosoft.com>
Subject: Sample message
MIME-Version: 1.0
Content-type: multipart/mixed; boundary="simple boundary"

This is the preamble. It is to be ignored, though it
is a handy place for mail composers to include an
explanatory note to non-MIME conformant readers.
--simple boundary

This is implicitly typed plain ASCII text.
It does NOT end with a linebreak.
--simple boundary
Content-type: text/plain; charset=us-ascii

This is explicitly typed plain ASCII text.
It DOES end with a linebreak.

--simple boundary--
This is the epilogue. It is also to be ignored.
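The splitting rules described so far (preamble and epilogue ignored,
the CRLF before a boundary belonging to the boundary, a part that
starts with a blank line having no header fields) can be sketched in
Python as follows. The function names split_multipart and split_part
are hypothetical, the sketch assumes the body uses CRLF line endings,
and it makes no attempt to handle nested multipart entities.

```python
def split_multipart(body, boundary):
    """Split a multipart body (CRLF line endings) into raw body parts."""
    delimiter = "--" + boundary
    closing = delimiter + "--"
    parts = []
    current = None  # None while still inside the preamble
    for line in body.split("\r\n"):
        candidate = line.rstrip(" \t")  # trailing white space is gateway noise
        if candidate in (delimiter, closing):
            if current is not None:
                # Joining drops the CRLF that belongs to the boundary line.
                parts.append("\r\n".join(current))
            if candidate == closing:
                return parts  # everything after this point is epilogue
            current = []
        elif current is not None:
            current.append(line)
    raise ValueError("closing delimiter not found")


def split_part(part):
    """Return (header_text, body_text) for one raw body part."""
    if part.startswith("\r\n"):
        return "", part[2:]  # the part has no header fields at all
    header, _, content = part.partition("\r\n\r\n")
    return header, content


sample = (
    "This is the preamble.\r\n"
    "--simple boundary\r\n"
    "\r\n"
    "Implicitly typed text.\r\n"
    "--simple boundary\r\n"
    "Content-type: text/plain; charset=us-ascii\r\n"
    "\r\n"
    "Explicitly typed text.\r\n"
    "\r\n"
    "--simple boundary--\r\n"
    "This is the epilogue.\r\n"
)
parsed = [split_part(p) for p in split_multipart(sample, "simple boundary")]
print(parsed)
```

In this sample the first part has an empty header area and a body
without a trailing line break, while the second part's body keeps
its own final CRLF because two CRLFs precede the closing delimiter.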
The use of a Content-Type of multipart in a body part within another
multipart entity is explicitly allowed. In such cases, for obvious
reasons, care must be taken to ensure that each nested multipart
entity uses a different boundary delimiter. See Appendix C for an
example of nested multipart entities.
The use of the multipart Content-Type with only a single body part
may be useful in certain contexts, and is explicitly permitted.
The only mandatory parameter for the multipart Content-Type is the
boundary parameter, which consists of 1 to 70 characters from a set
of characters known to be very robust through email gateways, and NOT
ending with white space. (If a boundary appears to end with white
space, the white space must be presumed to have been added by a
gateway, and must be deleted.) It is formally specified by the
following BNF:
boundary := 0*69<bchars> bcharsnospace
bchars := bcharsnospace / " "
bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" /"_"
/ "," / "-" / "." / "/" / ":" / "=" / "?"
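The boundary BNF translates directly into a regular expression. The
Python sketch below is one way to check a candidate value; the names
is_valid_boundary and normalize_boundary are hypothetical. The
normalizing helper reflects the rule that trailing white space is
presumed to have been added by a gateway and must be deleted.

```python
import re

# Character class for bcharsnospace; the hyphen is escaped so that it is
# taken literally rather than as a range operator.
_NOSPACE = r"0-9A-Za-z'()+_,\-./:=?"
# 0 to 69 bchars (which may include space) followed by one bcharsnospace.
_BOUNDARY_RE = re.compile("[%s ]{0,69}[%s]" % (_NOSPACE, _NOSPACE))


def is_valid_boundary(value):
    """Return True if value matches the boundary production."""
    return _BOUNDARY_RE.fullmatch(value) is not None


def normalize_boundary(value):
    """Delete trailing white space, then validate the result."""
    cleaned = value.rstrip(" \t")
    if not is_valid_boundary(cleaned):
        raise ValueError("not a legal boundary: %r" % value)
    return cleaned


print(is_valid_boundary("gc0p4Jq0M:2Yt08jU534c0p"))
print(normalize_boundary("simple boundary  "))
```

Note that passing this check says nothing about whether the value
needs quotes on the Content-Type line; that depends on the parameter
grammar, not on the boundary grammar.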
Overall, the body of a multipart entity may be specified as
follows:
multipart-body := preamble 1*encapsulation
close-delimiter epilogue
encapsulation := delimiter body-part CRLF
delimiter := "--" boundary CRLF ; taken from Content-Type field.
; There must be no space
; between "--" and boundary.
close-delimiter := "--" boundary "--" CRLF ; Again, no space
; next to either "--".
preamble := discard-text ; to be ignored upon receipt.
epilogue := discard-text ; to be ignored upon receipt.
discard-text := *(*text CRLF)
body-part := <"message" as defined in RFC 822,
with all header fields optional, and with the
specified delimiter not occurring anywhere in
the message body, either on a line by itself
or as a substring anywhere. Note that the
semantics of a part differ from the semantics
of a message, as described in the text.>
NOTE: In certain transport enclaves, RFC 822 restrictions such as
the one that limits bodies to printable ASCII characters may not
be in force. (That is, the transport domains may resemble
standard Internet mail transport as specified in RFC821 and
assumed by RFC822, but without certain restrictions.) The
relaxation of these restrictions should be construed as locally
extending the definition of bodies, for example to include octets
outside of the ASCII range, as long as these extensions are
supported by the transport and adequately documented in the
Content-Transfer-Encoding header field. However, in no event are
headers (either message headers or body-part headers) allowed to
contain anything other than ASCII characters.
NOTE: Conspicuously missing from the multipart type is a notion of
structured, related body parts. In general, it seems premature to
try to standardize interpart structure yet. It is recommended
that those wishing to provide a more structured or integrated
multipart messaging facility should define a subtype of multipart
that is syntactically identical, but that always expects the
inclusion of a distinguished part that can be used to specify the
structure and integration of the other parts, probably referring
to them by their Content-ID field. If this approach is used,
other implementations will not recognize the new subtype, but will
treat it as the primary subtype (multipart/mixed) and will thus be
able to show the user the parts that are recognized.
7.2.2. The Multipart/mixed (primary) subtype
The primary subtype for multipart, "mixed", is intended for use when
the body parts are independent and need to be bundled in a particular
order. Any multipart subtypes that an implementation does not
recognize must be treated as being of subtype "mixed".
7.2.3. The Multipart/alternative subtype
The multipart/alternative type is syntactically identical to
multipart/mixed, but the semantics are different. In particular,
each of the parts is an "alternative" version of the same
information.
Systems should recognize that the content of the various parts is
interchangeable. Systems should choose the "best" type based on the
local environment and preferences, in some cases even through user
interaction. As with multipart/mixed, the order of body parts is
significant. In this case, the alternatives appear in an order of
increasing faithfulness to the original content. In general, the best
choice is the LAST part of a type supported by the recipient system's
local environment.
Multipart/alternative may be used, for example, to send mail in a
fancy text format in such a way that it can easily be displayed
anywhere:
From: Nathaniel Borenstein <nsb@bellcore.com>
To: Ned Freed <ned@innosoft.com>
Subject: Formatted text mail
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=boundary42

--boundary42
Content-Type: text/plain; charset=us-ascii

...plain text version of message goes here....
--boundary42
Content-Type: text/richtext

.... RFC 1341 richtext version of same message goes here ...
--boundary42
Content-Type: text/x-whatever

.... fanciest formatted version of same message goes here ...
--boundary42--
In this example, users whose mail system understood the
"text/x-whatever" format would see only the fancy version, while
other users would see only the richtext or plain text version,
depending on the capabilities of their system.
In general, user agents that compose multipart/alternative entities
must place the body parts in increasing order of preference, that is,
with the preferred format last. For fancy text, the sending user
agent should put the plainest format first and the richest format
last. Receiving user agents should pick and display the last format
they are capable of displaying. In the case where one of the
alternatives is itself of type "multipart" and contains unrecognized
sub-parts, the user agent may choose either to show that alternative,
an earlier alternative, or both.
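The receiving rule (display the last format the user agent can
handle) can be sketched in Python as below. The function name
choose_alternative is hypothetical, and the comparison is done in
lower case on the assumption that type and subtype names are matched
without regard to case.

```python
def choose_alternative(part_types, supported):
    """Return the index of the last supported alternative, or None.

    part_types lists the Content-Type of each alternative in message
    order (plainest first); supported is the set of types the local
    user agent can display.
    """
    supported_lower = {name.lower() for name in supported}
    chosen = None
    for index, content_type in enumerate(part_types):
        if content_type.lower() in supported_lower:
            chosen = index  # later matches override earlier ones
    return chosen


alternatives = ["text/plain", "text/richtext", "text/x-whatever"]
print(choose_alternative(alternatives, {"text/plain", "text/richtext"}))
```

A user agent that prefers to let the user decide could instead
collect every matching index and present the list.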
NOTE: From an implementor's perspective, it might seem more
sensible to reverse this ordering, and have the plainest
alternative last. However, placing the plainest alternative first
is the friendliest possible option when multipart/alternative
entities are viewed using a non-MIME-conformant mail reader.
While this approach does impose some burden on conformant mail
readers, interoperability with older mail readers was deemed to be
more important in this case.
It may be the case that some user agents, if they can recognize more
than one of the formats, will prefer to offer the user the choice of
which format to view. This makes sense, for example, if mail includes
both a nicely-formatted image version and an easily-edited text
version. What is most critical, however, is that the user not
automatically be shown multiple versions of the same data. Either the
user should be shown the last recognized version or should be given
the choice.
NOTE ON THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE: Each
part of a multipart/alternative entity represents the same data, but
the mappings between the two are not necessarily without information
loss. For example, information is lost when translating ODA to
PostScript or plain text. It is recommended that each part should
have a different Content-ID value in the case where the information
content of the two parts is not identical. However, where the
information content is identical -- for example, where several parts
of type "message/external-body" specify alternate ways to access the
identical data -- the same Content-ID field value should be used, to
optimize any cacheing mechanisms that might be present on the
recipient's end. However, it is recommended that the Content-ID
values used by the parts should not be the same Content-ID value that
describes the multipart/alternative as a whole, if there is any such
Content-ID field. That is, one Content-ID value will refer to the
multipart/alternative entity, while one or more other Content-ID
values will refer to the parts inside it.
7.2.4. The Multipart/digest subtype
This document defines a "digest" subtype of the multipart
Content-Type. This type is syntactically identical to
multipart/mixed, but the semantics are different. In particular, in a
digest, the default Content-Type value for a body part is changed
from "text/plain" to "message/rfc822". This is done to allow a more
readable digest format that is largely compatible (except for the
quoting convention) with RFC 934.
A digest in this format might, then, look something like this:
From: Moderator-Address
To: Recipient-List
MIME-Version: 1.0
Subject: Internet Digest, volume 42
Content-Type: multipart/digest;
boundary="---- next message ----"

------ next message ----

From: someone-else
Subject: my opinion

...body goes here ...

------ next message ----

From: someone-else-again
Subject: my different opinion

... another body goes here...

------ next message ------
7.2.5. The Multipart/parallel subtype
This document defines a "parallel" subtype of the multipart Content-
Type. This type is syntactically identical to multipart/mixed, but
the semantics are different. In particular, in a parallel entity,
the order of body parts is not significant.
A common presentation of this type is to display all of the parts
simultaneously on hardware and software that are capable of doing so.
However, composing agents should be aware that many mail readers will
lack this capability and will show the parts serially in any event.
7.2.6. Other Multipart subtypes
Other multipart subtypes are expected in the future. MIME
implementations must in general treat unrecognized subtypes of
multipart as being equivalent to "multipart/mixed".
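Two of the subtype rules given in this section (unrecognized subtypes
are handled as "mixed", and body parts of a digest default to
"message/rfc822" rather than "text/plain") can be sketched in Python
as follows. Both function names are hypothetical, and the set of
known subtypes lists only those defined in this document.

```python
KNOWN_MULTIPART_SUBTYPES = {"mixed", "alternative", "digest", "parallel"}


def effective_subtype(content_type_value):
    """Return the multipart subtype to act on, falling back to "mixed"."""
    type_part = content_type_value.split(";", 1)[0]  # drop any parameters
    major, _, minor = type_part.partition("/")
    if major.strip().lower() != "multipart":
        raise ValueError("not a multipart Content-Type")
    minor = minor.strip().lower()
    return minor if minor in KNOWN_MULTIPART_SUBTYPES else "mixed"


def default_part_type(multipart_subtype):
    """Return the Content-Type assumed for a part with no such field."""
    if multipart_subtype.lower() == "digest":
        return "message/rfc822"
    return "text/plain"


subtype = effective_subtype("multipart/x-fancy; boundary=abc")
print(subtype, default_part_type(subtype))
```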
The formal grammar for content-type header fields for multipart data
is given by:
multipart-type := "multipart" "/" multipart-subtype
";" "boundary" "=" boundary
multipart-subtype := "mixed" / "parallel" / "digest"
/ "alternative" / extension-token
7.3. The Message Content-Type
It is frequently desirable, in sending mail, to encapsulate another
mail message. For this common operation, a special Content-Type,
"message", is defined. The primary subtype, message/rfc822, has no
required parameters in the Content-Type field. Additional subtypes,
"partial" and "external-body", do have required parameters. These
subtypes are explained below.
NOTE: It has been suggested that subtypes of message might be
defined for forwarded or rejected messages. However, forwarded
and rejected messages can be handled as multipart messages in
which the first part contains any control or descriptive
information, and a second part, of type message/rfc822, is the
forwarded or rejected message. Composing rejection and forwarding
messages in this manner will preserve the type information on the
original message and allow it to be correctly presented to the
recipient, and hence is strongly encouraged.
As stated in the definition of the Content-Transfer-Encoding field,
no encoding other than "7bit", "8bit", or "binary" is permitted for
messages or parts of type "message". Even stronger restrictions
apply to the subtypes "message/partial" and "message/external-body",
as specified below. The message header fields are always US-ASCII in
any case, and data within the body can still be encoded, in which
case the Content-Transfer-Encoding header field in the encapsulated
message will reflect this. Non-ASCII text in the headers of an
encapsulated message can be specified using the mechanisms described
in [RFC-1522].
Mail gateways, relays, and other mail handling agents are commonly
known to alter the top-level header of an RFC 822 message. In
particular, they frequently add, remove, or reorder header fields.
Such alterations are explicitly forbidden for the encapsulated
headers embedded in the bodies of messages of type "message."
7.3.1. The Message/rfc822 (primary) subtype
A Content-Type of "message/rfc822" indicates that the body contains
an encapsulated message, with the syntax of an RFC 822 message.
However, unlike a top-level RFC 822 message, a message/rfc822 body
is not required to include a "From", "Subject", and at least one
destination header.
It should be noted that, despite the use of the numbers "822", a
message/rfc822 entity can include enhanced information as defined in
this document. In other words, a message/rfc822 message may be a MIME
message.
7.3.2. The Message/Partial subtype
A subtype of message, "partial", is defined in order to allow large
objects to be delivered as several separate pieces of mail and
automatically reassembled by the receiving user agent. (The concept
is similar to IP fragmentation/reassembly in the basic Internet
Protocols.) This mechanism can be used when intermediate transport
agents limit the size of individual messages that can be sent.
Content-Type "message/partial" thus indicates that the body contains
a fragment of a larger message.
Three parameters must be specified in the Content-Type field of type
message/partial: The first, "id", is a unique identifier, as close to
a world-unique identifier as possible, to be used to match the parts
together. (In general, the identifier is essentially a message-id;
if placed in double quotes, it can be any message-id, in accordance
with the BNF for "parameter" given earlier in this specification.)
The second, "number", an integer, is the part number, which indicates
where this part fits into the sequence of fragments. The third,
"total", another integer, is the total number of parts. This third
subfield is required on the final part, and is optional (though
encouraged) on the earlier parts. Note also that these parameters
may be given in any order.
Thus, part 2 of a 3-part message may have either of the following
header fields:
Content-Type: Message/Partial;
     number=2; total=3;
     id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
Content-Type: Message/Partial;
     id="oc=jpbe0M2Yt4s@thumper.bellcore.com";
     number=2
But part 3 MUST specify the total number of parts:
Content-Type: Message/Partial;
     number=3; total=3;
     id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
Note that part numbering begins with 1, not 0.
When the parts of a message broken up in this manner are put
together, the result is a complete MIME entity, which may have its
own Content-Type header field, and thus may contain any other data
type.
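The "id", "number", and "total" parameters of message/partial can be read and checked roughly as follows. This is a sketch only: the function name and the regular expression are assumptions made for illustration, and backslash escapes inside quoted values are not unescaped. The requirement that "total" appear on the final part cannot be verified from a single header field, so it is left to the reassembly step.

```python
import re

# Matches ;name=value where value is a quoted string or a bare token.
_PARAM = re.compile(r';\s*([^\s=;]+)\s*=\s*("(?:[^"\\]|\\.)*"|[^\s;]+)')


def parse_partial_params(header_value: str) -> dict:
    """Parse the value of a Content-Type field of type message/partial."""
    media_type = header_value.split(";", 1)[0].strip().lower()
    if media_type != "message/partial":
        raise ValueError("not message/partial: " + media_type)
    params = {}
    for name, value in _PARAM.findall(header_value):
        if value.startswith('"'):
            value = value[1:-1]  # strip the surrounding double quotes
        params[name.lower()] = value  # parameter names are case-insensitive
    if "id" not in params or "number" not in params:
        raise ValueError("id and number are required")
    number = int(params["number"])
    total = int(params["total"]) if "total" in params else None
    if number < 1:
        raise ValueError("part numbering begins with 1")
    if total is not None and number > total:
        raise ValueError("number exceeds total")
    return {"id": params["id"], "number": number, "total": total}
```

Because the parameters are collected into a dictionary, the order in which they appear in the header field does not matter.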
Message fragmentation and reassembly: The semantics of a reassembled
partial message must be those of the "inner" message, rather than of
a message containing the inner message. This makes it possible, for
example, to send a large audio message as several partial messages,
and still have it appear to the recipient as a simple audio message
rather than as an encapsulated message containing an audio message.
That is, the encapsulation of the message is considered to be
"transparent".
When generating and reassembling the parts of a message/partial
message, the headers of the encapsulated message must be merged with
the headers of the enclosing entities. In this process the following
rules must be observed:
(1) All of the header fields from the initial enclosing entity
(part one), except those that start with "Content-" and the
specific header fields "Message-ID", "Encrypted", and "MIME-
Version", must be copied, in order, to the new message.
(2) Only those header fields in the enclosed message which start
with "Content-" and "Message-ID", "Encrypted", and "MIME-Version"
must be appended, in order, to the header fields of the new
message. Any header fields in the enclosed message which do not
start with "Content-" (except for "Message-ID", "Encrypted", and
"MIME-Version") will be ignored.
(3) All of the header fields from the second and any subsequent
messages will be ignored.
For example, if an audio message is broken into two parts, the first
part might look something like this:
X-Weird-Header-1: Foo
From: Bill@host.com
To: joe@otherhost.com
Subject: Audio mail
Message-ID: <id1@host.com>
MIME-Version: 1.0
Content-type: message/partial;
     id="ABC@host.com";
     number=1; total=2

X-Weird-Header-1: Bar
X-Weird-Header-2: Hello
Message-ID: <anotherid@foo.com>
MIME-Version: 1.0
Content-type: audio/basic
Content-transfer-encoding: base64

... first half of encoded audio data goes here...
and the second half might look something like this:
From: Bill@host.com
To: joe@otherhost.com
Subject: Audio mail
MIME-Version: 1.0
Message-ID: <id2@host.com>
Content-type: message/partial;
     id="ABC@host.com"; number=2; total=2

... second half of encoded audio data goes here...
Then, when the fragmented message is reassembled, the resulting
message to be displayed to the user should look something like this:
X-Weird-Header-1: Foo
From: Bill@host.com
To: joe@otherhost.com
Subject: Audio mail
Message-ID: <anotherid@foo.com>
MIME-Version: 1.0
Content-type: audio/basic
Content-transfer-encoding: base64

... first half of encoded audio data goes here...
... second half of encoded audio data goes here...
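The header merging rules (1) through (3) and the example above can be sketched as follows. Header fields are modeled as lists of (name, value) pairs; that representation and the function names are assumptions made for illustration, not something this document prescribes.

```python
# Fields that always come from the enclosed message, never the enclosure.
_SPECIAL = {"message-id", "encrypted", "mime-version"}


def _belongs_to_inner(name: str) -> bool:
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in _SPECIAL


def merge_partial_headers(outer_first, inner):
    """Merge header fields when reassembling a message/partial message.

    outer_first: fields of the enclosing entity of part one.
    inner: fields of the message enclosed in part one.
    Fields of the second and later parts are ignored (rule 3), so they
    are not passed in at all.
    """
    # Rule 1: copy enclosing fields except Content-* and the special three.
    merged = [(n, v) for n, v in outer_first if not _belongs_to_inner(n)]
    # Rule 2: append only Content-* and the special three from the inner
    # message; its other fields are dropped.
    merged += [(n, v) for n, v in inner if _belongs_to_inner(n)]
    return merged


def join_partial_bodies(parts):
    """Concatenate fragment bodies in part-number order.

    parts: iterable of (number, body) pairs, numbering from 1.
    """
    ordered = sorted(parts, key=lambda part: part[0])
    return "".join(body for _, body in ordered)
```

Applied to the two-part audio example, this yields the eight header fields shown in the reassembled message, with "X-Weird-Header-1: Bar" and "X-Weird-Header-2: Hello" dropped.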
Note on encoding of MIME entities encapsulated inside message/partial
entities: Because data of type "message" may never be encoded in
base64 or quoted-printable, a problem might arise if message/partial
entities are constructed in an environment that supports binary or
8-bit transport. The problem is that the binary data would be split
into multiple message/partial objects, each of them requiring binary
transport. If such objects were encountered at a gateway into a 7-
bit transport environment, there would be no way to properly encode
them for the 7-bit world, aside from waiting for all of the parts,
reassembling the message, and then encoding the reassembled data in
base64 or quoted-printable. Since it is possible that different
parts might go through different gateways, even this is not an
acceptable solution. For this reason, it is specified that MIME
entities of type message/partial must always have a
content-transfer-encoding of 7-bit (the default). In particular,
even in environments that support binary or 8-bit transport, the use
of a content-transfer-encoding of "8bit" or "binary" is explicitly
prohibited for entities of type message/partial.
It should be noted that, because some message transfer agents may
choose to automatically fragment large messages, and because such
agents may use different fragmentation thresholds, it is possible
that the pieces of a partial message, upon reassembly, may prove
themselves to comprise a partial message. This is explicitly
permitted.
It should also be noted that the inclusion of a "References" field in
the headers of the second and subsequent pieces of a fragmented
message that references the Message-Id on the previous piece may be
of benefit to mail readers that understand and track references.
However, the generation of such "References" fields is entirely
optional.
Finally, it should be noted that the "Encrypted" header field has
been made obsolete by Privacy Enhanced Messaging (PEM), but the rules
above are believed to describe the correct way to treat it if it is
encountered in the context of conversion to and from message/partial
fragments.
7.3.3. The Message/External-Body subtype
The external-body subtype indicates that the actual body data are not
included, but merely referenced. In this case, the parameters
describe a mechanism for accessing the external data.
When an entity is of type "message/external-body", it consists of a
header, two consecutive CRLFs, and the message header for the
encapsulated message. If another pair of consecutive CRLFs appears,
this of course ends the message header for the encapsulated message.
However, since the encapsulated message's body is itself external, it
does NOT appear in the area that follows.
For example, consider the following message:
Content-type: message/external-body;
     access-type=local-file;
     name="/u/nsb/Me.gif"

Content-type: image/gif
Content-ID: <id42@guppylake.bellcore.com>
Content-Transfer-Encoding: binary

THIS IS NOT REALLY THE BODY!
The area at the end, which might be called the "phantom body", is
ignored for most external-body messages. However, it may be used to
contain auxiliary information for some such messages, as indeed it is
when the access-type is "mail-server". Of the access-types defined
by this document, the phantom body is used only when the access-type
is "mail-server". In all other cases, the phantom body is ignored.
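The split between the encapsulated header and the phantom body can be sketched as below. The function names are illustrative assumptions; the sketch accepts either CRLF or bare LF line endings and unfolds continuation lines, which goes slightly beyond what the text above requires.

```python
def split_external_body(body: str):
    """Split the body of a message/external-body entity.

    Returns (fields, phantom): the encapsulated header fields as a list
    of (name, value) pairs, and whatever follows the first blank line.
    """
    normalized = body.replace("\r\n", "\n")
    header_text, _, phantom = normalized.partition("\n\n")
    fields = []
    for line in header_text.split("\n"):
        if line[:1] in (" ", "\t") and fields:
            # Continuation line: fold it into the previous field.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        elif ":" in line:
            name, _, value = line.partition(":")
            fields.append((name.strip(), value.strip()))
    return fields, phantom


def phantom_body_is_used(access_type: str) -> bool:
    """Of the access-types defined here, only mail-server uses it."""
    return access_type.lower() == "mail-server"
```

For the local-file example above, the phantom body is returned but a reader would simply discard it.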
The only always-mandatory parameter for message/external-body is
"access-type"; all of the other parameters may be mandatory or
optional depending on the value of access-type.
ACCESS-TYPE -- A case-insensitive word, indicating the supported
access mechanism by which the file or data may be obtained.
Values include, but are not limited to, "FTP", "ANON-FTP", "TFTP",
"AFS", "LOCAL-FILE", and "MAIL-SERVER". Future values, except for
experimental values beginning with "X-", must be registered with
IANA, as described in Appendix E.
In addition, the following three parameters are optional for ALL
access-types:
EXPIRATION -- The date (in the RFC 822 "date-time" syntax, as
extended by RFC 1123 to permit 4 digits in the year field) after
which the existence of the external data is not guaranteed.
SIZE -- The size (in octets) of the data. The intent of this
parameter is to help the recipient decide whether or not to expend
the necessary resources to retrieve the external data. Note that
this describes the size of the data in its canonical form, that
is, before any Content-Transfer-Encoding has been applied or
after the data have been decoded.
PERMISSION -- A case-insensitive field that indicates whether or
not it is expected that clients might also attempt to overwrite
the data. By default, or if permission is "read", the assumption
is that they are not, and that if the data is retrieved once, it
is never needed again. If PERMISSION is "read-write", this
assumption is invalid, and any local copy must be considered no
more than a cache. "Read" and "Read-write" are the only defined
values of permission.
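A receiving agent might interpret the three optional parameters along these lines. This is a sketch under stated assumptions: the parameters are assumed to be already parsed into a dictionary with lowercase names and unquoted values, the function and result key names are invented for illustration, and the caller is assumed to pass a timezone-aware "now". Python's standard email.utils.parsedate_to_datetime is used to read the RFC 822 date-time.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def interpret_common_params(params: dict, now: datetime) -> dict:
    """Interpret EXPIRATION, SIZE, and PERMISSION for any access-type."""
    expiration = params.get("expiration")
    expires_at = parsedate_to_datetime(expiration) if expiration else None
    size = int(params["size"]) if "size" in params else None
    # PERMISSION is case-insensitive and defaults to "read".
    permission = params.get("permission", "read").lower()
    if permission not in ("read", "read-write"):
        raise ValueError("undefined permission value: " + permission)
    return {
        # After this date the existence of the data is not guaranteed.
        "possibly_expired": expires_at is not None and now > expires_at,
        # Size in octets of the data in canonical (decoded) form.
        "size_octets": size,
        # With read-write, a local copy is no more than a cache.
        "cache_only": permission == "read-write",
    }
```

A user agent could, for example, consult "size_octets" before deciding whether to fetch the data, and skip the fetch or warn the user when "possibly_expired" is true.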
The precise semantics of the access-types defined here are described
in the sections that follow.
The encapsulated headers in ALL message/external-body entities MUST
include a Content-ID header field to give a unique identifier by
which to reference the data. This identifier may be used for caching
mechanisms, and for recognizing the receipt of the data when the
access-type is "mail-server".
Note that, as specified here, the tokens that describe external-body
data, such as file names and mail server commands, are required to be
in the US-ASCII character set. If this proves problematic in
practice, a new mechanism may be required as a future extension to
MIME, either as newly defined access-types for message/external-body
or by some other mechanism.
As with message/partial, it is specified that MIME entities of type
message/external-body must always have a content-transfer-encoding of
7-bit (the default). In particular, even in environments that
support binary or 8-bit transport, the use of a
content-transfer-encoding of "8bit" or "binary" is explicitly
prohibited for entities of type message/external-body.
7.3.3.1. The "ftp" and "tftp" access-types
An access-type of FTP or TFTP indicates that the message body is
accessible as a file using the FTP [RFC-959] or TFTP [RFC-783]
protocols, respectively. For these access-types, the following
additional parameters are mandatory:
NAME -- The name of the file that contains the actual body data.
SITE -- A machine from which the file may be obtained, using the
given protocol. This must be a fully qualified domain name, not a
nickname.
Before any data are retrieved, using FTP, the user will generally
need to be asked to provide a login id and a password for the machine
named by the site parameter. For security reasons, such an id and
password are not specified as content-type parameters, but must be
obtained from the user.
In addition, the following parameters are optional:
DIRECTORY -- A directory from which the data named by NAME should
be retrieved.
MODE -- A case-insensitive string indicating the mode to be used
when retrieving the information. The legal values for access-type
"TFTP" are "NETASCII", "OCTET", and "MAIL", as specified by the
TFTP protocol [RFC-783]. The legal values for access-type "FTP"
are "ASCII", "EBCDIC", "IMAGE", and "LOCALn" where "n" is a decimal
integer, typically 8. These correspond to the
representation types "A" "E" "I" and "L n" as specified by the FTP
protocol [RFC-959]. Note that "BINARY" and "TENEX" are not valid
values for MODE, but that "OCTET" or "IMAGE" or "LOCAL8" should be
used instead. If MODE is not specified, the default value is
"NETASCII" for TFTP and "ASCII" otherwise.
7.3.3.2. The "anon-ftp" access-type
The "anon-ftp" access-type is identical to the "ftp" access type,
except that the user need not be asked to provide a name and password
for the specified site. Instead, the ftp protocol will be used with
login "anonymous" and a password that corresponds to the user's email
address.
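The MODE values and defaults for the "ftp", "tftp", and "anon-ftp" access-types can be checked with a small sketch such as the following. The function name is an assumption made for illustration, and "anon-ftp" is treated exactly like "ftp" on the basis that it is described as identical apart from the login step.

```python
import re

TFTP_MODES = {"NETASCII", "OCTET", "MAIL"}
FTP_FIXED_MODES = {"ASCII", "EBCDIC", "IMAGE"}


def resolve_mode(access_type, mode=None):
    """Return the transfer mode to use, in upper case.

    Raises ValueError for an access-type that has no MODE parameter or
    for a MODE value that is not legal for the access-type.
    """
    atype = access_type.lower()
    if atype not in ("ftp", "anon-ftp", "tftp"):
        raise ValueError("MODE does not apply to " + access_type)
    if mode is None:
        # Defaults: NETASCII for TFTP, ASCII otherwise.
        return "NETASCII" if atype == "tftp" else "ASCII"
    value = mode.upper()  # MODE is case-insensitive
    if atype == "tftp":
        legal = value in TFTP_MODES
    else:
        # "LOCALn" where n is a decimal integer, typically 8.
        legal = (value in FTP_FIXED_MODES
                 or re.fullmatch(r"LOCAL[0-9]+", value) is not None)
    if not legal:
        raise ValueError("illegal MODE for " + access_type + ": " + mode)
    return value
```

Note that "BINARY" and "TENEX" are rejected by this check, in line with the text above.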
7.3.3.3. The "local-file" and "afs" access-types
An access-type of "local-file" indicates that the actual body is
accessible as a file on the local machine. An access-type of "afs"
indicates that the file is accessible via the global AFS file system.
In both cases, only a single parameter is required:
NAME -- The name of the file that contains the actual body data.
The following optional parameter may be used to describe the locality
of reference for the data, that is, the site or sites at which the
file is expected to be visible:
SITE -- A domain specifier for a machine or set of machines that
are known to have access to the data file. Asterisks may be used
for wildcard matching to a part of a domain name, such as
"*.bellcore.com", to indicate a set of machines on which the data
should be directly visible, while a single asterisk may be used to
indicate a file that is expected to be universally available,
e.g., via a global file system.
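Matching a host name against a SITE value containing asterisks might look like the sketch below. The function name is an illustrative assumption, and so is the choice to let an asterisk match any run of characters, including dots; the text above does not say whether a wildcard may span several domain labels.

```python
import re


def site_matches(site_pattern: str, hostname: str) -> bool:
    """Return True if hostname falls within the SITE pattern.

    Only "*" is treated as a wildcard; every other character is matched
    literally and case-insensitively.
    """
    pieces = site_pattern.lower().split("*")
    regex = ".*".join(re.escape(piece) for piece in pieces)
    return re.fullmatch(regex, hostname.lower()) is not None
```

Under this sketch a SITE of "*.bellcore.com" covers "thumper.bellcore.com" but not "bellcore.com" itself, and a single asterisk covers every host.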
7.3.3.4. The "mail-server" access-type
The "mail-server" access-type indicates that the actual body is
available from a mail server. The mandatory parameter for this
access-type is:
SERVER -- The email address of the mail server from which the
actual body data can be obtained.
Because mail servers accept a variety of syntaxes, some of which are
multiline, the full command to be sent to a mail server is not
included as a parameter on the content-type line. Instead, it is
provided as the "phantom body" when the content-type is
message/external-body and the access-type is mail-server.
An optional parameter for this access-type is:
SUBJECT -- The subject that is to be used in the mail that is sent
to obtain the data. Note that keying mail servers on Subject lines
is NOT recommended, but such mail servers are known to exist.
Note that MIME does not define a mail server syntax. Rather, it
allows the inclusion of arbitrary mail server commands in the phantom
body. Implementations must include the phantom body in the body of
the message they send to the mail server address to retrieve the
relevant data.
It is worth noting that, unlike other access-types, mail-server
access is asynchronous and will happen at an unpredictable time in
the future. For this reason, it is important that there be a
mechanism by which the returned data can be matched up with the
original message/external-body entity. MIME mailservers must use the
same Content-ID field on the returned message that was used in the
original message/external-body entity, to facilitate such matching.
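The request sent to a mail server, and the later matching of the reply by Content-ID, can be sketched as follows. Everything here is illustrative: the request is modeled as a plain dictionary rather than a real mail message, and the class and function names are assumptions, not part of MIME.

```python
def build_mail_server_request(params: dict, phantom_body: str) -> dict:
    """Describe the mail to send for access-type mail-server.

    params: parsed Content-Type parameters with lowercase names.
    phantom_body: the commands found after the encapsulated header.
    """
    server = params.get("server")
    if not server:
        raise ValueError("SERVER is mandatory for mail-server access")
    request = {"to": server, "body": phantom_body}
    if "subject" in params:
        request["subject"] = params["subject"]  # optional parameter
    return request


class PendingExternalBodies:
    """Remember outstanding requests so replies can be matched later."""

    def __init__(self):
        self._pending = {}

    def remember(self, content_id: str, entity) -> None:
        self._pending[content_id.strip()] = entity

    def match(self, returned_content_id: str):
        """Return the original entity for a reply, or None if unknown."""
        return self._pending.pop(returned_content_id.strip(), None)
```

Because the reply arrives at an unpredictable time, the pending table would normally need to persist between sessions; that detail is outside this sketch.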
7.3.3.5. Examples and Further Explanations
With the emerging possibility of very wide-area file systems, it
becomes very hard to know in advance the set of machines where a file
will and will not be accessible directly from the file system.
Therefore it may make sense to provide both a file name, to be tried
directly, and the name of one or more sites from which the file is
known to be accessible. An implementation can try to retrieve remote
files using FTP or any other protocol, using anonymous file retrieval
or prompting the user for the necessary name and password. If an
external body is accessible via multiple mechanisms, the sender may
include multiple parts of type message/external-body within an entity
of type multipart/alternative.
However, the external-body mechanism is not intended to be limited to
file retrieval, as shown by the mail-server access-type. Beyond
this, one can imagine, for example, using a video server for external
references to video clips.
If an entity is of type "message/external-body", then the body of the
entity will contain the header fields of the encapsulated message.
The body itself is to be found in the external location. This means
that if the body of the "message/external-body" message contains two
consecutive CRLFs, everything after those pairs is NOT part of the
message itself. For most message/external-body messages, this
trailing area must simply be ignored. However, it is a convenient
place for additional data that cannot be included in the content-type
header field. In particular, if the "access-type" value is "mail-
server", then the trailing area must contain commands to be sent to
the mail server at the address given by the value of the SERVER
parameter.
The embedded message header fields which appear in the body of the
message/external-body data must be used to declare the Content-type
of the external body if it is anything other than plain ASCII text,
since the external body does not have a header section to declare its
type. Similarly, any Content-transfer-encoding other than "7bit"
must also be declared here. Thus a complete message/external-body
message, referring to a document in PostScript format, might look
like this:
From: Whomever
To: Someone
Subject: whatever
MIME-Version: 1.0
Message-ID: <id1@host.com>
Content-Type: multipart/alternative; boundary=42
Content-ID: <id001@guppylake.bellcore.com>

--42
Content-Type: message/external-body;
     name="BodyFormats.ps";
     site="thumper.bellcore.com";
     access-type=ANON-FTP;
     directory="pub";
     mode="image";
     expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

Content-type: application/postscript
Content-ID: <id42@guppylake.bellcore.com>

--42
Content-Type: message/external-body;
     name="/u/nsb/writing/rfcs/RFC-MIME.ps";
     site="thumper.bellcore.com";
     access-type=AFS;
     expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

Content-type: application/postscript
Content-ID: <id42@guppylake.bellcore.com>

--42
Content-Type: message/external-body;
     access-type=mail-server;
     server="listserv@bogus.bitnet";
     expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

Content-type: application/postscript
Content-ID: <id42@guppylake.bellcore.com>

get RFC-MIME.DOC

--42--
Note that in the above examples, the default Content-transfer-
encoding of "7bit" is assumed for the external postscript data.
Like the message/partial type, the message/external-body type is
intended to be transparent, that is, to convey the data type in the
external body rather than to convey a message with a body of that
type. Thus the headers on the outer and inner parts must be merged
using the same rules as for message/partial. In particular, this
means that the Content-type header is overridden, but the From and
Subject headers are preserved.
Note that since the external bodies are not transported as mail, they
need not conform to the 7-bit and line length requirements, but might
in fact be binary files. Thus a Content-Transfer-Encoding is not
generally necessary, though it is permitted.
Note that the body of a message of type "message/external-body" is
governed by the basic syntax for an RFC 822 message. In particular,
anything before the first consecutive pair of CRLFs is header
information, while anything after it is body information, which is
ignored for most access-types.
The formal grammar for content-type header fields for data of type
message is given by:
message-type := "message" "/" message-subtype
message-subtype := "rfc822"
/ "partial" 2#3partial-param
/ "external-body" 1*external-param
/ extension-token
partial-param := (";" "id" "=" value)
/ (";" "number" "=" 1*DIGIT)
/ (";" "total" "=" 1*DIGIT)
; id & number required; total required for last part
external-param := (";" "access-type" "=" atype)
/ (";" "expiration" "=" date-time)
; Note that date-time is quoted
/ (";" "size" "=" 1*DIGIT)
/ (";" "permission" "=" ("read" / "read-write"))
; Permission is case-insensitive
/ (";" "name" "=" value)
/ (";" "site" "=" value)
/ (";" "directory" "=" value)
/ (";" "mode" "=" value)
/ (";" "server" "=" value)
/ (";" "subject" "=" value)
; access-type required;others required based on access-type
atype := "ftp" / "anon-ftp" / "tftp" / "local-file"
/ "afs" / "mail-server" / extension-token
; Case-insensitive
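The comment "access-type required; others required based on access-type" can be made concrete with a small table-driven check. The table restates the mandatory parameters given in the access-type sections of this document; the function name is an assumption, and unknown (extension) access-types are assumed to impose no requirements the checker can know about.

```python
# Mandatory parameters per access-type, beyond access-type itself.
REQUIRED_PARAMS = {
    "ftp": {"name", "site"},
    "tftp": {"name", "site"},
    "anon-ftp": {"name", "site"},
    "local-file": {"name"},
    "afs": {"name"},
    "mail-server": {"server"},
}


def missing_external_params(params: dict) -> set:
    """Return the set of mandatory parameters that are absent."""
    lowered = {name.lower(): value for name, value in params.items()}
    if "access-type" not in lowered:
        return {"access-type"}
    atype = lowered["access-type"].lower()  # atype is case-insensitive
    required = REQUIRED_PARAMS.get(atype, set())
    return required - set(lowered)
```

An empty result means the Content-Type field carries every parameter that this document makes mandatory for its access-type.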
7.4. The Application Content-Type
The "application" Content-Type is to be used for data which do not
fit in any of the other categories, and particularly for data to be
processed by mail-based uses of application programs. This is
information which must be processed by an application before it is
viewable or usable to a user. Expected uses for Content-Type
application include mail-based file transfer, spreadsheets, data for
mail-based scheduling systems, and languages for "active"
(computational) email. (The latter, in particular, can pose security
problems which must be understood by implementors, and are considered
in detail in the discussion of the application/PostScript content-
type.)
For example, a meeting scheduler might define a standard
representation for information about proposed meeting dates. An
intelligent user agent would use this information to conduct a dialog
with the user, and might then send further mail based on that dialog.
More generally, there have been several "active" messaging languages
developed in which programs in a suitably specialized language are
sent through the mail and automatically run in the recipient's
environment.
Such applications may be defined as subtypes of the "application"
Content-Type. This document defines two subtypes: octet-stream, and
PostScript.
In general, the subtype of application will often be the name of the
application for which the data are intended. This does not mean,
however, that any application program name may be used freely as a
subtype of application. Such usages (other than subtypes beginning
with "x-") must be registered with IANA, as described in Appendix E.
7.4.1. The Application/Octet-Stream (primary) subtype
The primary subtype of application, "octet-stream", may be used to
indicate that a body contains binary data. The set of possible
parameters includes, but is not limited to:
TYPE -- the general type or category of binary data. This is
intended as information for the human recipient rather than for
any automatic processing.
PADDING -- the number of bits of padding that were appended to the
bit-stream comprising the actual contents to produce the enclosed
byte-oriented data. This is useful for enclosing a bit-stream in a
body when the total number of bits is not a multiple of the byte
size.
An additional parameter, "conversions", was defined in [RFC-1341] but
has been removed.
RFC 1341 also defined the use of a "NAME" parameter which gave a
suggested file name to be used if the data were to be written to a
file. This has been deprecated in anticipation of a separate
Content-Disposition header field, to be defined in a subsequent RFC.
The recommended action for an implementation that receives
application/octet-stream mail is to simply offer to put the data in a
file, with any Content-Transfer-Encoding undone, or perhaps to use it
as input to a user-specified process.
To reduce the danger of transmitting rogue programs through the mail,
it is strongly recommended that implementations NOT implement a
path-search mechanism whereby an arbitrary program named in the
Content-Type parameter (e.g., an "interpreter=" parameter) is found
and executed using the mail body as input.
7.4.2. The Application/PostScript subtype
A Content-Type of "application/postscript" indicates a PostScript
program. Currently two variants of the PostScript language are
allowed; the original level 1 variant is described in [POSTSCRIPT]
and the more recent level 2 variant is described in [POSTSCRIPT2].
PostScript is a registered trademark of Adobe Systems, Inc. Use of
the MIME content-type "application/postscript" implies recognition of
that trademark and all the rights it entails.
The PostScript language definition provides facilities for internal
labeling of the specific language features a given program uses.
This labeling, called the PostScript document structuring
conventions, is very general and provides substantially more
information than just the language level.
The use of document structuring conventions, while not required, is
strongly recommended as an aid to interoperability. Documents which
lack proper structuring conventions cannot be tested to see whether
or not they will work in a given environment. As such, some systems
may assume the worst and refuse to process unstructured documents.
The execution of general-purpose PostScript interpreters entails
serious security risks, and implementors are discouraged from simply
sending PostScript email bodies to "off-the-shelf" interpreters.
While it is usually safe to send PostScript to a printer, where the
potential for harm is greatly constrained, implementors should
consider all of the following before they add interactive display of
PostScript bodies to their mail readers.
The remainder of this section outlines some, though probably not all,
of the possible problems with sending PostScript through the mail.
Dangerous operations in the PostScript language include, but may not
be limited to, the PostScript operators deletefile, renamefile,
filenameforall, and file. File is only dangerous when applied to
something other than standard input or output. Implementations may
also define additional nonstandard file operators; these may also
pose a threat to security. Filenameforall, the wildcard file search
operator, may appear at first glance to be harmless. Note, however,
that this operator has the potential to reveal information about what
files the recipient has access to, and this information may itself be
sensitive.
Message senders should avoid the use of potentially dangerous file
operators, since these operators are quite likely to be unavailable
in secure PostScript implementations. Message-receiving and
-displaying software should either completely disable all potentially
dangerous file operators or take special care not to delegate any
special authority to their operation. These operators should be
viewed as being done by an outside agency when interpreting
PostScript documents. Such disabling and/or checking should be done
completely outside of the reach of the PostScript language itself;
care should be taken to ensure that no method exists for re-enabling
full-function versions of these operators.
The PostScript language provides facilities for exiting the normal
interpreter, or server, loop. Changes made in this "outer"
environment are customarily retained across documents, and may in
some cases be retained semipermanently in nonvolatile memory. The
operators associated with exiting the interpreter loop have the
potential to interfere with subsequent document processing. As such,
their unrestrained use constitutes a threat of service denial.
PostScript operators that exit the interpreter loop include, but may
not be limited to, the exitserver and startjob operators.
Message-sending software should not generate PostScript that depends
on exiting the interpreter loop to operate. The ability to exit will
probably be unavailable in secure PostScript implementations.
Message-receiving and -displaying software should, if possible,
disable the ability to make retained changes to the PostScript
environment, and eliminate the startjob and exitserver commands. If
these commands cannot be eliminated, the password associated with
them should at least be set to a hard-to-guess value.
PostScript provides operators for setting system-wide and
device-specific parameters. These parameter settings may be retained
across jobs and may potentially pose a threat to the correct
operation of the interpreter. The PostScript operators that set
system and device parameters include, but may not be limited to, the
setsystemparams and setdevparams operators. Message-sending software
should not generate PostScript that depends on the setting of system
or device parameters to operate correctly. The ability to set these
parameters will probably be unavailable in secure PostScript
implementations. Message-receiving and -displaying software should,
if possible, disable the ability to change system and device
parameters. If these operators cannot be disabled, the password
associated with them should at least be set to a hard-to-guess value.
Some PostScript implementations provide nonstandard facilities for
the direct loading and execution of machine code. Such facilities
are quite obviously open to substantial abuse. Message-sending
software should not make use of such features. Besides being totally
hardware-specific, they are also likely to be unavailable in secure
implementations of PostScript. Message-receiving and -displaying
software should not allow such operators to be used if they exist.
PostScript is an extensible language, and many, if not most,
implementations of it provide a number of their own extensions. This
document does not deal with such extensions explicitly since they
constitute an unknown factor. Message-sending software should not
make use of nonstandard extensions; they are likely to be missing
from some implementations. Message-receiving and -displaying
software should make sure that any nonstandard PostScript operators
are secure and don't present any kind of threat.
It is possible to write PostScript that consumes huge amounts of
various system resources. It is also possible to write PostScript
programs that loop infinitely. Both types of programs have the
potential to cause damage if sent to unsuspecting recipients.
Message-sending software should avoid the construction and
dissemination of such programs, which is antisocial.
Message-receiving and -displaying software should provide appropriate
mechanisms to abort processing of a document after a reasonable
amount of time has elapsed. In addition, PostScript interpreters
should be limited to the consumption of only a reasonable amount of
any given system resource.
Finally, bugs may exist in some PostScript interpreters which could
possibly be exploited to gain unauthorized access to a recipient's
system. Beyond noting this possibility, there is no specific action
to take to prevent this, apart from the timely correction of such
bugs if any are found.
7.4.3. Other Application subtypes
It is expected that many other subtypes of application will be
defined in the future. MIME implementations must generally treat any
unrecognized subtypes as being equivalent to
application/octet-stream.
The formal grammar for content-type header fields for application
data is given by:
application-type := "application" "/" application-subtype
application-subtype := ("octet-stream" *stream-param)
/ "postscript" / extension-token
stream-param := (";" "type" "=" value)
/ (";" "padding" "=" padding)
padding := "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7"
7.5. The Image Content-Type
A Content-Type of "image" indicates that the body contains an image.
The subtype names the specific image format. These names are case
insensitive. Two initial subtypes are "jpeg" for the JPEG format,
JFIF encoding, and "gif" for GIF format [GIF].
The list of image subtypes given here is neither exclusive nor
exhaustive, and is expected to grow as more types are registered with
IANA, as described in Appendix E.
The formal grammar for the content-type header field for data of type
image is given by:
image-type := "image" "/" ("gif" / "jpeg" / extension-token)
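
As an illustration of the application and image rules above, the
following Python sketch lowercases a type/subtype pair, maps any
unrecognized application subtype to application/octet-stream, and
checks the "padding" parameter against the values "0" through "7".
The function names and the small table of known subtypes are
assumptions made for this example and are not defined by this
document; a real implementation would also need a full header parser.

```python
# Hypothetical table: only the subtypes named in this document.
KNOWN_APPLICATION_SUBTYPES = {"octet-stream", "postscript"}
INITIAL_IMAGE_SUBTYPES = {"gif", "jpeg"}


def effective_type(content_type):
    """Return a lowercase "type/subtype" with the application fallback applied."""
    # Ignore any parameters after the first ";" and compare without case.
    main = content_type.split(";", 1)[0].strip().lower()
    top, _, sub = main.partition("/")
    if top == "application" and sub not in KNOWN_APPLICATION_SUBTYPES:
        # Unrecognized application subtypes are treated as octet-stream.
        return "application/octet-stream"
    return top + "/" + sub


def is_initial_image_subtype(subtype):
    """True for the two image subtypes named here ("gif" and "jpeg")."""
    return subtype.lower() in INITIAL_IMAGE_SUBTYPES


def valid_padding(value):
    """True if value is one of the padding values "0" through "7"."""
    return value in {"0", "1", "2", "3", "4", "5", "6", "7"}
```

A receiver built this way would hand "Application/X-Foo; type=bar" to
its octet-stream handler while still dispatching "IMAGE/GIF" to an
image viewer.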
7.6. The Audio Content-Type
A Content-Type of "audio" indicates that the body contains audio
data. Although there is not yet a consensus on an "ideal" audio
format for use with computers, there is a pressing need for a format
capable of providing interoperable behavior.
The initial subtype of "basic" is specified to meet this requirement
by providing an absolutely minimal lowest common denominator audio
format. It is expected that richer formats for higher quality and/or
lower bandwidth audio will be defined by a later document.
The content of the "audio/basic" subtype is audio encoded using 8-bit
ISDN mu-law [PCM]. When this subtype is present, a sample rate of
8000 Hz and a single channel are assumed.
The formal grammar for the content-type header field for data of type
audio is given by:
audio-type := "audio" "/" ("basic" / extension-token)
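
To make the "audio/basic" parameters concrete, the following Python
sketch expands 8-bit mu-law bytes to signed linear samples and
computes playback duration from the assumed 8000 Hz, single-channel
format. It assumes the conventional mu-law expansion (bias of 0x84,
inverted bits on the wire); the function and constant names are
illustrative only and are not defined by this document.

```python
SAMPLE_RATE_HZ = 8000  # sample rate assumed for audio/basic
CHANNELS = 1           # single channel assumed for audio/basic
BIAS = 0x84            # bias used by the conventional mu-law expansion


def mulaw_to_linear(byte):
    """Expand one 8-bit mu-law code to a signed linear sample."""
    u = ~byte & 0xFF               # mu-law codes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07     # 3-bit segment number
    mantissa = u & 0x0F            # 4-bit step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def decode_basic(body):
    """Decode an audio/basic body (bytes) into a list of linear samples."""
    return [mulaw_to_linear(b) for b in body]


def duration_seconds(body):
    """Playback time: one byte per sample, 8000 samples per second."""
    return len(body) / (SAMPLE_RATE_HZ * CHANNELS)
```

Because each sample occupies exactly one octet, an audio/basic body
of 8000 octets plays for one second.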
7.7. The Video Content-Type
A Content-Type of "video" indicates that the body contains a time-
varying-picture image, possibly with color and coordinated sound.
The term "video" is used extremely generically, rather than with
reference to any particular technology or format, and is not meant to
preclude subtypes such as animated drawings encoded compactly. The
subtype "mpeg" refers to video coded according to the MPEG standard
[MPEG].
Note that although in general this document strongly discourages the
mixing of multiple media in a single body, it is recognized that many
so-called "video" formats include a representation for synchronized
audio, and this is explicitly permitted for subtypes of "video".
The formal grammar for the content-type header field for data of type
video is given by:
video-type := "video" "/" ("mpeg" / extension-token)
7.8. Experimental Content-Type Values
A Content-Type value beginning with the characters "X-" is a private
value, to be used by consenting mail systems by mutual agreement.
Any format without a rigorous and public definition must be named
with an "X-" prefix, and publicly specified values shall never begin
with "X-". (Older versions of the widely-used Andrew system use the
"X-BE2" name, so new systems should probably choose a different name.)

In general, the use of "X-" top-level types is strongly discouraged.
Implementors should invent subtypes of the existing types whenever
possible. The invention of new types is intended to be restricted
primarily to the development of new media types for email, such as
digital odors or holography, and not for new data formats in general.
In many cases, a subtype of application will be more appropriate than
a new top-level type.
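
The naming rule above can be sketched in Python as a simple prefix
check. The sketch assumes that the "X-" prefix is matched without
regard to case and must be followed by at least one more character;
the function names and the returned labels are hypothetical and serve
only to illustrate the distinction between a private top-level type
and a private subtype of an existing type.

```python
def is_private_value(token):
    """True if token begins with "X-" (assumed case-insensitive) plus a name."""
    return len(token) > 2 and token[:2].lower() == "x-"


def classify(content_type):
    """Label a Content-Type value according to where an "X-" prefix appears."""
    # Parameters after the first ";" play no part in the naming rule.
    main = content_type.split(";", 1)[0].strip()
    top, _, sub = main.partition("/")
    if is_private_value(top):
        return "private top-level type (strongly discouraged)"
    if is_private_value(sub):
        return "private subtype of an existing type"
    return "no X- prefix"
```

Under this sketch, "application/x-foo" is reported as a private
subtype, which is the form implementors are encouraged to prefer over
a new "X-" top-level type.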