
RFC 1341

MIME (Multipurpose Internet Mail Extensions): Mechanisms for Specifying and Describing the Format of Internet Message Bodies

Pages: 80
Obsoleted by:  1521
Part 3 of 3 – Pages 47 to 80

ToP   noToC   RFC1341 - Page 47   prevText
7.4  The Application Content-Type

            The "application" Content-Type is to be used for data  which
            do  not fit in any of the other categories, and particularly
            for data to be processed by mail-based uses  of  application
            programs.  This is information which must be processed by an
            application before it is  viewable  or  usable  to  a  user.
            Expected  uses  for  Content-Type  application include mail-
            based  file  transfer,  spreadsheets,  data  for  mail-based
            scheduling    systems,    and    languages    for   "active"
            (computational) email.  (The latter, in particular, can pose
            security    problems   which   should   be   understood   by
            implementors, and are considered in detail in the discussion
            of the application/PostScript content-type.)

            For example, a meeting scheduler  might  define  a  standard
            representation for information about proposed meeting dates.
            An intelligent user agent  would  use  this  information  to
            conduct  a dialog with the user, and might then send further
            mail based on that dialog. More generally, there  have  been
            several  "active"  messaging  languages  developed  in which
            programs in a suitably specialized language are sent through
            the   mail   and   automatically   run  in  the  recipient's
            environment.

            Such  applications  may  be  defined  as  subtypes  of   the
            "application"  Content-Type.   This  document  defines three
            subtypes: octet-stream, ODA, and PostScript.

            In general, the subtype of application  will  often  be  the
            name  of  the  application  for which the data are intended.
            This does not mean, however, that  any  application  program
            name  may  be used freely as a subtype of application.  Such
            usages  must  be  registered  with  IANA,  as  described  in
            Appendix F.

7.4.1     The Application/Octet-Stream (primary) subtype

            The primary subtype of application, "octet-stream",  may  be
            used  to indicate that a body contains binary data.  The set
            of possible parameters includes, but is not limited to:

                 NAME -- a suggested name for the  binary  data  if
                 stored as a file.

                 TYPE -- the general type  or  category  of  binary
                 data.   This  is  intended  as information for the
                 human recipient  rather  than  for  any  automatic
                 processing.

                 CONVERSIONS -- the set  of  operations  that  have
                 been  performed  on  the data before putting it in
                 the mail (and before any Content-Transfer-Encoding
                 that   might   have  been  applied).  If  multiple
                 conversions have occurred, they must be  separated
                 by  commas  and  specified  in the order they were
                 applied -- that is, the leftmost conversion   must
                 have  occurred  first,  and conversions are undone
                 from right  to  left.   Note  that  NO  conversion
                 values   are   defined   by  this  document.   Any
                 conversion values  that  do  not  begin  with  "X-"
                 must  be preceded by a published specification and
                 by  registration  with  IANA,  as   described   in
                 Appendix F.

                 PADDING -- the number of bits of padding that were
                 appended  to  the  bitstream comprising the actual
                 contents to  produce  the  enclosed  byte-oriented
                 data.  This is useful for enclosing a bitstream in
                 a body when the total number  of  bits  is  not  a
                 multiple of the byte size.

            The values  for  these  attributes  are  left  undefined  at
            present,  but  may  require specification in the future.  An
            example of a common (though UNIX-specific) usage might be:

                 Content-Type:  application/octet-stream;
                      name=foo.tar.Z; type=tar;

            However, it should be noted that the use of such conversions
            is  explicitly  discouraged due to a lack of portability and
            standardization.   The  use  of  uuencode  is   particularly
            discouraged,   in  favor  of  the  Content-Transfer-Encoding
            mechanism, which is both more standardized and more portable
            across mail boundaries.
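
            As an illustration only, the parameters described above can
            be read with Python's standard email package.  The sketch
            below is not part of this specification.  The function name
            octet_stream_info and the conversion values "x-encrypt" and
            "x-compress" are hypothetical (this document defines no
            conversion values).  It shows that conversions listed left
            to right must be undone right to left.

```python
from email.message import Message


def octet_stream_info(header_value):
    """Parse a Content-Type value and summarise its octet-stream parameters."""
    msg = Message()
    msg["Content-Type"] = header_value
    raw = msg.get_param("conversions")  # None when the parameter is absent
    # Conversions are comma-separated, listed in the order they were applied.
    applied = [c.strip() for c in raw.split(",")] if raw else []
    return {
        "content_type": msg.get_content_type(),
        "name": msg.get_param("name"),
        "type": msg.get_param("type"),
        "applied": applied,
        "undo_order": applied[::-1],  # undone from right to left
    }


# Hypothetical header; the "x-" conversion names are illustrative only.
info = octet_stream_info(
    'application/octet-stream; name=foo.tar.Z; type=tar; '
    'conversions="x-encrypt,x-compress"'
)
print(info["undo_order"])
```

            A receiver would treat "name" and "type" as hints for the
            human recipient, not as instructions to act on.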

            The recommended action for an implementation  that  receives
            application/octet-stream  mail is to simply offer to put the
            data in a file, with any  Content-Transfer-Encoding  undone,
            or perhaps to use it as input to a user-specified process.

            To reduce the danger of transmitting rogue programs  through
            the  mail,  it  is strongly recommended that implementations
            NOT implement a path-search mechanism whereby  an  arbitrary
            program  named  in  the  Content-Type  parameter  (e.g.,  an
            "interpreter=" parameter) is found and  executed  using  the
            mail body as input.
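
            The recommended handling can be sketched as follows, under
            the assumption that the receiving agent is written in Python
            and uses the standard email package.  The helper name
            save_octet_stream, the fallback file name, and the sample
            message are all hypothetical.  The sketch undoes the
            Content-Transfer-Encoding, writes the data to a file, and
            never executes anything named in the header.

```python
import os
import tempfile
from email import message_from_string


def save_octet_stream(part, directory):
    """Write the decoded body to a file in `directory`; never execute it."""
    suggested = part.get_param("name") or "attachment.bin"
    # Keep only the last path component so a hostile name such as
    # "../evil.bin" cannot escape the chosen directory.
    safe_name = os.path.basename(suggested.replace("\\", "/"))
    if safe_name in ("", ".", ".."):
        safe_name = "attachment.bin"
    data = part.get_payload(decode=True)  # undoes base64 or quoted-printable
    path = os.path.join(directory, safe_name)
    with open(path, "xb") as handle:  # "x" refuses to overwrite existing files
        handle.write(data)
    return path


raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: application/octet-stream; name="../evil.bin"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "AAECAw==\n"
)
message = message_from_string(raw)
target = tempfile.mkdtemp()
saved = save_octet_stream(message, target)
print(saved)
```

            A real user agent would ask the user before writing and
            would let the user choose the directory; those steps are
            omitted here.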

7.4.2     The Application/PostScript subtype

            A  Content-Type  of  "application/postscript"  indicates   a
            PostScript    program.    The   language   is   defined   in
            [POSTSCRIPT].  It is recommended  that  PostScript  sent
            through email use the PostScript document structuring
            conventions, applied correctly, whenever possible.
            The execution  of  general-purpose  PostScript  interpreters
            entails   serious   security  risks,  and  implementors  are
            discouraged from simply sending PostScript email  bodies  to
            "off-the-shelf"  interpreters.   While it is usually safe to
            send PostScript to a printer, where the potential  for  harm
            is  greatly constrained, implementors should consider all of
            the  following  before  they  add  interactive  display   of
            PostScript bodies to their mail readers.

            The remainder of this section outlines some, though probably
            not  all,  of  the possible problems with sending PostScript
            through the mail.

            Dangerous operations in the PostScript language include, but
            may  not be limited to, the PostScript operators deletefile,
            renamefile,  filenameforall,  and  file.    File   is   only
            dangerous  when  applied  to  something  other than standard
            input or output. Implementations may also define  additional
            nonstandard  file operators; these may also pose a threat to
            security.     Filenameforall,  the  wildcard   file   search
            operator,  may  appear at first glance to be harmless. Note,
            however, that this operator  has  the  potential  to  reveal
            information  about  what  files the recipient has access to,
            and this  information  may  itself  be  sensitive.   Message
            senders  should  avoid the use of potentially dangerous file
            operators, since these operators  are  quite  likely  to  be
            unavailable  in secure PostScript implementations.  Message-
            receiving and -displaying software should either  completely
            disable  all  potentially  dangerous  file operators or take
            special care not to delegate any special authority to  their
            operation. These operators should be viewed as being done by
            an outside agency when  interpreting  PostScript  documents.
            Such  disabling  and/or  checking  should be done completely
            outside of the reach of the PostScript language itself; care
            should  be  taken  to  ensure  that  no  method  exists  for
            reenabling full-function versions of these operators.

            The PostScript language provides facilities for exiting  the
            normal  interpreter,  or  server, loop. Changes made in this
            "outer"  environment   are   customarily   retained   across
            documents, and may in some cases be retained semipermanently
            in nonvolatile memory. The operators associated with exiting
            the  interpreter  loop  have the potential to interfere with
            subsequent document processing. As such, their  unrestrained
            use  constitutes  a  threat  of  service denial.  PostScript
            operators that exit the interpreter loop  include,  but  may
            not  be  limited  to, the exitserver and startjob operators.
            Message-sending software should not generate PostScript that
            depends  on  exiting  the  interpreter  loop to operate. The
            ability to exit  will  probably  be  unavailable  in  secure
            PostScript     implementations.     Message-receiving    and
            -displaying  software  should,  if  possible,  disable   the
            ability   to   make   retained  changes  to  the  PostScript
            environment. Eliminate the startjob and exitserver commands.
            If  these  commands  cannot  be eliminated, at least set the
            password associated with them to a hard-to-guess value.

            PostScript provides operators for  setting  system-wide  and
            device-specific  parameters. These parameter settings may be
            retained across jobs and may potentially pose  a  threat  to
            the  correct  operation  of the interpreter.  The PostScript
            operators that set system and device parameters include, but
            may  not be limited to, the setsystemparams and setdevparams
            operators.  Message-sending  software  should  not  generate
            PostScript  that  depends on the setting of system or device
            parameters to operate correctly. The ability  to  set  these
            parameters will probably be unavailable in secure PostScript
            implementations. Message-receiving and -displaying  software
            should,  if  possible,  disable the ability to change system
            and  device  parameters.  If  these  operators   cannot   be
            disabled,  at least set the password associated with them to
            a hard-to-guess value.
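
            As a rough illustration, a sending agent could warn its user
            when a PostScript body mentions any of the operators named so
            far.  The sketch below is an assumption-laden heuristic, not a
            security mechanism: the names FLAGGED and flagged_operators
            are hypothetical, comment stripping is naive, and PostScript
            can build operator names at run time, so a clean scan proves
            nothing.  Real protection has to live in the interpreter, as
            described above.

```python
import re

# Operators this section names as dangerous or state-changing.
FLAGGED = frozenset({
    "deletefile", "renamefile", "filenameforall", "file",
    "exitserver", "startjob", "setsystemparams", "setdevparams",
})

# PostScript tokens end at whitespace or at one of ( ) < > [ ] { } / %
_TOKEN = re.compile(r"[^\s()<>\[\]{}/%]+")


def flagged_operators(ps_source):
    """Return the sorted flagged operator names that appear as tokens."""
    found = set()
    for line in ps_source.splitlines():
        # Naive comment strip: also cuts at a "%" inside a string literal.
        code = line.split("%", 1)[0]
        found.update(tok for tok in _TOKEN.findall(code) if tok in FLAGGED)
    return sorted(found)


sample = (
    "%!PS\n"
    "(/tmp/x) deletefile\n"
    "/Helvetica findfont 12 scalefont setfont % file\n"
    "serverdict begin 0 exitserver\n"
    "showpage\n"
)
print(flagged_operators(sample))
```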

            Some   PostScript   implementations   provide    nonstandard
            facilities  for  the direct loading and execution of machine
            code.  Such  facilities  are  quite    obviously   open   to
            substantial  abuse.    Message-sending  software  should not
            make use of such features. Besides being  totally  hardware-
            specific,  they  are also likely to be unavailable in secure
            implementations  of  PostScript.     Message-receiving   and
            -displaying  software  should not allow such operators to be
            used if they exist.

            PostScript is an extensible language, and many, if not most,
            implementations   of  it  provide  a  number  of  their  own
            extensions. This document does not deal with such extensions
            explicitly   since   they   constitute  an  unknown  factor.
            Message-sending software should not make use of  nonstandard
            extensions;   they  are  likely  to  be  missing  from  some
            implementations. Message-receiving and -displaying  software
            should  make  sure that any nonstandard PostScript operators
            are secure and don't present any kind of threat.

            It is  possible  to  write  PostScript  that  consumes  huge
            amounts  of various system resources. It is also possible to
            write PostScript programs that loop infinitely.  Both  types
            of  programs  have  the potential to cause damage if sent to
            unsuspecting recipients.   Message-sending  software  should
            avoid  the  construction and dissemination of such programs,
            which  is  antisocial.   Message-receiving  and  -displaying
            software  should  provide  appropriate  mechanisms  to abort
            processing of a document after a reasonable amount  of  time
            has  elapsed. In addition, PostScript interpreters should be
            limited to the consumption of only a  reasonable  amount  of
            any given system resource.
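
            One possible way to provide the abort mechanism mentioned
            above is to run the interpreter as a child process with a
            time limit.  The sketch below assumes a Python-based agent;
            run_with_time_limit is a hypothetical helper, and the Python
            executable stands in for a PostScript interpreter so that the
            example is self-contained.  It covers elapsed time only;
            limits on memory or other resources would need additional,
            platform-specific measures.

```python
import subprocess
import sys


def run_with_time_limit(argv, data, seconds):
    """Run argv with `data` on stdin; return its exit code, or None on timeout."""
    try:
        done = subprocess.run(
            argv, input=data, capture_output=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return None
    return done.returncode


# Stand-in commands: one loops forever, one exits immediately.
looping = [sys.executable, "-c", "while True: pass"]
quick = [sys.executable, "-c", "pass"]

timed_out = run_with_time_limit(looping, b"", 1)
finished = run_with_time_limit(quick, b"", 30)
print(timed_out, finished)
```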

            Finally, bugs may  exist  in  some  PostScript  interpreters
            which  could  possibly  be  exploited  to  gain unauthorized
            access to a recipient's system.  Beyond noting this
            possibility, there is no specific action to take to prevent
            it other than the timely correction of any such bugs that
            are found.

7.4.3     The Application/ODA subtype

            The "ODA" subtype of application is used to indicate that  a
            body  contains  information  encoded according to the Office
            Document  Architecture  [ODA]   standards,  using  the  ODIF
            representation  format.   For  application/oda, the Content-
            Type line should also specify an attribute/value  pair  that
            indicates  the document application profile (DAP), using the
            key word "profile".  Thus an appropriate header field  might
            look like this:

            Content-Type:  application/oda; profile=Q112

            Consult the ODA standard [ODA] for further information.

7.5  The Image Content-Type

            A Content-Type of "image" indicates that the body contains an
            image.   The subtype names the specific image format.  These
            names are case insensitive.  Two initial subtypes are "jpeg"
            for the JPEG format, JFIF encoding, and "gif" for GIF format.

            The list of image subtypes given here is  neither  exclusive
            nor  exhaustive,  and  is expected to grow as more types are
            registered with IANA, as described in Appendix F.

7.6  The Audio Content-Type

            A Content-Type of "audio" indicates that the  body  contains
            audio  data.   Although  there  is not yet a consensus on an
            "ideal" audio format for use  with  computers,  there  is  a
            pressing   need   for   a   format   capable   of  providing
            interoperable behavior.

            The initial subtype of "basic" is  specified  to  meet  this
            requirement by providing an absolutely minimal lowest common
            denominator  audio  format.   It  is  expected  that  richer
            formats for higher quality and/or lower bandwidth audio will
            be defined by a later document.

            The content of the "audio/basic" subtype  is  audio  encoded
            using  8-bit ISDN u-law [PCM]. When this subtype is present,
            a sample rate of 8000 Hz and a single channel is assumed.
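
            To make the format concrete, the sketch below decodes u-law
            bytes to signed 16-bit linear samples and computes playing
            time from the fixed 8000 Hz, single-channel assumption.  The
            decoding follows the commonly used G.711 expansion and is
            offered as an illustration rather than a normative
            definition; the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000  # fixed by the audio/basic definition
BIAS = 0x84            # bias used by the usual G.711 u-law expansion


def ulaw_to_linear(byte):
    """Expand one 8-bit u-law code to a signed 16-bit linear sample."""
    code = ~byte & 0xFF          # u-law codes are stored bit-inverted
    sign = code & 0x80
    exponent = (code >> 4) & 0x07
    mantissa = code & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def duration_seconds(body):
    """One byte is one sample of one channel, so length maps to time."""
    return len(body) / SAMPLE_RATE_HZ


silence = b"\xff" * 16000
print(duration_seconds(silence), ulaw_to_linear(0xFF))
```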

7.7  The Video Content-Type

            A Content-Type of "video" indicates that the body contains a
            time-varying-picture   image,   possibly   with   color  and
            coordinated sound.   The  term  "video"  is  used  extremely
            generically,  rather  than  with reference to any particular
            technology or format, and is not meant to preclude  subtypes
            such  as animated drawings encoded compactly.    The subtype
            "mpeg" refers to video coded according to the MPEG standard.

            Note  that  although  in  general  this  document   strongly
            discourages  the  mixing of multiple media in a single body,
            it is recognized that many so-called "video" formats include
            a   representation  for  synchronized  audio,  and  this  is
            explicitly permitted for subtypes of "video".

7.8  Experimental Content-Type Values

            A Content-Type value beginning with the characters "X-" is a
            private  value,  to  be  used  by consenting mail systems by
            mutual agreement.  Any format without a rigorous and  public
            definition  must  be named with an "X-" prefix, and publicly
            specified  values  shall  never  begin  with  "X-".   (Older
            versions  of  the  widely-used Andrew system use the "X-BE2"
            name, so new systems  should  probably  choose  a  different
            name.)

            In general, the use of  "X-"  top-level  types  is  strongly
            discouraged.   Implementors  should  invent  subtypes of the
            existing types whenever  possible.   The  invention  of  new
            types   is  intended  to  be  restricted  primarily  to  the
            development of new media types for email,  such  as  digital
            odors  or  holography,  and  not  for  new  data  formats in
            general. In many cases, a subtype  of  application  will  be
            more appropriate than a new top-level type.

            Using the MIME-Version, Content-Type, and  Content-Transfer-
            Encoding  header  fields,  it  is  possible to include, in a
            standardized way, arbitrary types of data objects  with  RFC
            822  conformant  mail  messages.  No restrictions imposed by
            either RFC 821 or RFC 822 are violated, and  care  has  been
            taken  to  avoid  problems caused by additional restrictions
            imposed  by  the  characteristics  of  some  Internet   mail
            transport  mechanisms  (see Appendix B). The "multipart" and
            "message"  Content-Types  allow  mixing   and   hierarchical
            structuring  of  objects  of  different  types  in  a single
            message.  Further  Content-Types  provide   a   standardized
            mechanism  for  tagging  messages  or  body  parts as audio,
            image, or several other  kinds  of  data.   A  distinguished
            parameter syntax allows further specification of data format
            details,  particularly  the   specification   of   alternate
            character  sets.  Additional  optional header fields provide
            mechanisms for certain extensions deemed desirable  by  many
            implementors.  Finally, a number of useful Content-Types are
            defined for general use by consenting user  agents,  notably
            text/richtext, message/partial, and message/external-body.

            This document is the result of the collective  effort  of  a
            large  number  of  people,  at several IETF meetings, on the
            IETF-SMTP  and  IETF-822  mailing  lists,   and   elsewhere.
            Although   any  enumeration  seems  doomed  to  suffer  from
            egregious  omissions,  the  following  are  among  the  many
            contributors to this effort:

            Harald Tveit Alvestrand       Timo Lehtinen
            Randall Atkinson              John R. MacMillan
            Philippe Brandon              Rick McGowan
            Kevin Carosso                 Leo Mclaughlin
            Uhhyung Choi                  Goli Montaser-Kohsari
            Cristian Constantinof         Keith Moore
            Mark Crispin                  Tom Moore
            Dave Crocker                  Erik Naggum
            Terry Crowley                 Mark Needleman
            Walt Daniels                  John Noerenberg
            Frank Dawson                  Mats Ohrman
            Hitoshi Doi                   Julian Onions
            Kevin Donnelly                Michael Patton
            Keith Edwards                 David J. Pepper
            Chris Eich                    Blake C. Ramsdell
            Johnny Eriksson               Luc Rooijakkers
            Craig Everhart                Marshall T. Rose
            Patrik Faeltstroem            Jonathan Rosenberg
            Erik E. Fair                  Jan Rynning
            Roger Fajman                  Harri Salminen
            Alain Fontaine                Michael Sanderson
            James M. Galvin               Masahiro Sekiguchi
            Philip Gladstone              Mark Sherman
            Thomas Gordon                 Keld Simonsen
            Phill Gross                   Bob Smart
            James Hamilton                Peter Speck
            Steve Hardcastle-Kille        Henry Spencer
            David Herron                  Einar Stefferud
            Bruce Howard                  Michael Stein
            Bill Janssen                  Klaus Steinberger
            Olle Jaernefors               Peter Svanberg
            Risto Kankkunen               James Thompson
            Phil Karn                     Steve Uhler
            Alan Katz                     Stuart Vance
            Tim Kehres                    Erik van der Poel
            Neil Katin                    Guido van Rossum
            Kyuho Kim                     Peter Vanderbilt
            Anders Klemets                Greg Vaudreuil
            John Klensin                  Ed Vielmetti
            Valdis Kletniek               Ryan Waldron
            Jim Knowles                   Wally Wedel
            Stev Knowles                  Sven-Ove Westberg
            Bob Kummerfeld                Brian Wideen
            Pekka Kytolaakso              John Wobus
            Stellan Lagerstr.m            Glenn Wright
            Vincent Lau                   Rayan Zachariassen
            Donald Lindsay                David Zimmerman

            The authors apologize for  any  omissions  from  this  list,
            which are certainly unintentional.

Appendix A -- Minimal MIME-Conformance

            The mechanisms described in this  document  are  open-ended.
            It  is definitely not expected that all implementations will
            support all of the Content-Types described,  nor  that  they
            will  all  share  the  same extensions.  In order to promote
            interoperability,  however,  it  is  useful  to  define  the
            concept  of  "MIME-conformance" to define a certain level of
            implementation  that  allows  the  useful  interworking   of
            messages  with  content that differs from US ASCII text.  In
            this  section,  we  specify  the   requirements   for   such
            conformance.

            A mail user agent that is MIME-conformant MUST:

                 1.  Always generate a "MIME-Version:  1.0"  header
                 field.

                 2.  Recognize the Content-Transfer-Encoding header
                 field,  and  decode all received data encoded with
                 either    the    quoted-printable    or     base64
                 implementations.    Encode  any  data sent that is
                 not in seven-bit mail-ready  representation  using
                 one  of  these  transformations  and  include  the
                 appropriate    Content-Transfer-Encoding    header
                 field,  unless  the underlying transport mechanism
                 supports non-seven-bit data, as SMTP does not.

                 3.   Recognize  and  interpret  the   Content-Type
                 header  field,  and  avoid  showing users raw data
                 with a Content-Type field  other  than  text.   Be
                 able  to  send  at least text/plain messages, with
                 the character set specified as a parameter  if  it
                 is not US-ASCII.

                 4.  Explicitly handle the  following  Content-Type
                 values, to at least the following extents:

                 Text:
                      -- Recognize  and  display  "text"  mail
                           with the character set "US-ASCII."
                      -- Recognize  other  character  sets  at
                           least  to  the extent of being able
                           to  inform  the  user  about   what
                           character set the message uses.
                      -- Recognize the "ISO-8859-*"  character
                           sets to the extent of being able to
                           display those characters  that  are
                           common  to ISO-8859-* and US-ASCII,
                           namely all  characters  represented
                           by octet values 0-127.
                      -- For unrecognized  subtypes,  show  or
                           offer  to  show  the user the "raw"
                           version of the data.  An ability at
                           least to convert "text/richtext" to
                           plain text, as shown in Appendix D,
                           is encouraged, but not required for
                           conformance.
                 Message:
                      -- Recognize and  display  at  least  the
                           primary (822) encapsulation.
                 Multipart:
                      -- Recognize   the   primary   (mixed)
                           subtype.    Display   all  relevant
                           information on  the  message  level
                           and  the body part header level and
                           then display or  offer  to  display
                           each     of    the    body    parts
                           individually.
                      -- Recognize the "alternative"  subtype,
                           and    avoid   showing   the   user
                           redundant         parts          of
                           multipart/alternative mail.
                      -- Treat any unrecognized subtypes as if
                           they were "mixed".
                 Application:
                      -- Offer the ability to remove either of
                           the  two types of Content-Transfer-
                           Encoding defined in  this  document
                           and  put  the resulting information
                           in a user file.

                 5.  Upon encountering  any  unrecognized  Content-
                 Type, an implementation must treat it as if it had
                 a Content-Type of "application/octet-stream"  with
                 no  parameter  sub-arguments.  How  such  data are
                 handled is up to  an  implementation,  but  likely
                 options   for   handling  such  unrecognized  data
                 include offering to let the user write it into a
                 file (decoded from its mail transport format) or
                 offering to let the user name a program to  which
                 the decoded data should be passed as input.
                 Unrecognized predefined types, which  in  a  MIME-
                 conformant   mailer  might  still  include  audio,
                 image, or video, should also be  treated  in  this
                 way.
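
            The fallback rules in items 4 and 5 can be sketched as a
            small lookup.  This is an illustration under stated
            assumptions, not a definitive implementation: the function
            name effective_content_type and the two tables describe a
            hypothetical minimal agent that renders only what item 4
            requires and has no audio, image, or video support.

```python
# What this hypothetical minimal agent can handle directly.
KNOWN_MULTIPART = {"mixed", "alternative"}
KNOWN_OTHER = {("message", "rfc822"), ("application", "octet-stream")}

FALLBACK = "application/octet-stream"


def effective_content_type(content_type):
    """Map a received type/subtype to the type the agent will act on."""
    main, _, sub = content_type.strip().lower().partition("/")
    if main == "text":
        # Unknown text subtypes are shown (or offered) raw, not re-typed.
        return f"text/{sub}"
    if main == "multipart":
        # Unrecognized multipart subtypes are treated as "mixed".
        return f"multipart/{sub}" if sub in KNOWN_MULTIPART else "multipart/mixed"
    if (main, sub) in KNOWN_OTHER:
        return f"{main}/{sub}"
    # Everything else, including predefined types this agent cannot
    # render, is handled as undifferentiated binary.
    return FALLBACK


for value in ("Text/Plain", "multipart/parallel", "image/gif", "X-foo/bar"):
    print(value, "->", effective_content_type(value))
```

            Data mapped to application/octet-stream would then be offered
            to the user for saving or for passing to a named program,
            never displayed as text.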

            A user agent that meets the above conditions is said  to  be
            MIME-conformant.   The  meaning of this phrase is that it is
            assumed  to  be  "safe"  to  send  virtually  any  kind   of
            properly-marked  data to users of such mail systems, because
            such systems will at least be able  to  treat  the  data  as
            undifferentiated  binary, and will not simply splash it onto
            the screen of unsuspecting users.   There is  another  sense
            in  which  it is always "safe" to send data in a format that
            is MIME-conformant, which is that such data will  not  break
            or  be  broken by any known systems that are conformant with
            RFC 821 and RFC 822.  User agents that  are  MIME-conformant
            have  the  additional  guarantee  that  the user will not be
            shown data that were never intended to be viewed as text.

Appendix B -- General Guidelines For Sending Email Data

            Internet email is not a perfect, homogeneous  system.   Mail
            may  become  corrupted  at several stages in its travel to a
            final destination. Specifically, email sent  throughout  the
            Internet  may  travel  across  many networking technologies.
            Many networking and mail technologies  do  not  support  the
            full   functionality   possible   in   the   SMTP  transport
            environment. Mail traversing these systems is likely  to  be
            modified in such a way that it can be transported.

            There exist many widely-deployed non-conformant MTAs in  the
            Internet.  These  MTAs,  speaking  the  SMTP protocol, alter
            messages on the fly to take advantage of the  internal  data
            structure  of the hosts they are implemented on, or are just
            plain broken.

            The following guidelines may be useful to anyone devising  a
            data  format  (Content-Type)  that  will  survive the widest
            range of  networking  technologies  and  known  broken  MTAs
            unscathed.    Note  that  anything  encoded  in  the  base64
            encoding will satisfy these rules, but that some  well-known
            mechanisms,  notably  the  UNIX uuencode facility, will not.
            Note also that  anything  encoded  in  the  Quoted-Printable
            encoding will survive most gateways intact, but possibly not
            some gateways to systems that use the EBCDIC character set.
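
            The claim about base64 can be checked informally with
            Python's standard base64 and quopri modules, which are
            assumed here to behave as the encodings defined in this
            document.  The sketch encodes every octet value, then
            confirms that the base64 output uses short lines drawn from
            a small alphabet and that both encodings round-trip.  The
            variable names are illustrative only.

```python
import base64
import quopri
import string

# Every possible octet value, as a worst case for binary data.
binary = bytes(range(256))
b64 = base64.encodebytes(binary)
b64_lines = b64.splitlines()

# Letters, digits, "+", "/" and "=" are the only characters base64 emits.
B64_ALPHABET = set((string.ascii_letters + string.digits + "+/=").encode("ascii"))
longest = max(len(line) for line in b64_lines)
only_safe = all(ch in B64_ALPHABET for line in b64_lines for ch in line)

# Mostly-ASCII text with one 8-bit character suits quoted-printable.
text = "caf\u00e9 au lait\n".encode("latin-1")
qp = quopri.encodestring(text)

print(longest, only_safe, qp)
```

            Quoted-printable leaves most printable characters as they
            are, which keeps text readable but also explains why it is
            more exposed to character-set translation than base64.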

                 (1) Under some circumstances the encoding used for
                 data  may change as part of normal gateway or user
                 agent operation. In  particular,  conversion  from
                 base64  to  quoted-printable and vice versa may be
                 necessary. This may result  in  the  confusion  of
                 CRLF  sequences  with  line  breaks  in  text body
                 parts.  As  such,  the  persistence  of  CRLF   as
                 something  other  than  a line break should not be
                 relied on.

                 (2) Many systems may elect to represent and  store
                 text  data  using local newline conventions. Local
                 newline conventions may not match the RFC822  CRLF
                 convention -- systems are known that use plain CR,
                 plain LF, CRLF, or counted records.  The result is
                 that isolated CR and LF characters  are  not  well
                 tolerated  in    general;  they  may  be  lost  or
                 converted to delimiters on some systems, and hence
                 should not be relied on.

                 (3) TAB (HT) characters may be  misinterpreted  or
                 may be automatically converted to variable numbers
                 of  spaces.    This   is   unavoidable   in   some
                 environments, notably those not based on the ASCII
                 character  set.  Such   conversion   is   STRONGLY
                 DISCOURAGED,  but  it  may occur, and mail formats
                 should not rely on the  persistence  of  TAB  (HT)
                 characters.

                 (4) Lines longer than 76 characters may be wrapped
                 or  truncated  in some environments. Line wrapping
                 and line truncation are STRONGLY DISCOURAGED,  but
                 unavoidable  in  some  cases.  Applications  which
                 require long lines  should  somehow  differentiate
                 between  soft and hard line breaks.  (A simple way
                 to  do  this  is  to  use   the   quoted-printable
                 encoding.)

                 (5)  Trailing "white space" characters (SPACE, TAB
                 (HT)) on a line may be discarded by some transport
                 agents, while other transport agents may pad lines
                 with  these characters so that all lines in a mail
                 file are of equal  length.    The  persistence  of
                 trailing  white  space,  therefore,  should not be
                 relied on.

                 (6)  Many mail domains use variations on the ASCII
                 character  set,  or  use  character  sets  such as
                 EBCDIC which contain most but not all of  the  US-
                 ASCII  characters.   The  correct  translation  of
                 characters not in the "invariant"  set  cannot  be
                 depended  on across character converting gateways.
                 For example, this  situation  is  a  problem  when
                 sending  uuencoded  information  across BITNET, an
                 EBCDIC system.  Similar problems can occur without
                 crossing  a gateway, since many Internet hosts use
                 character sets other than ASCII  internally.   The
                 definition  of  Printable  Strings  in  X.400 adds
                 further restrictions in certain special cases.  In
                 particular,  the only characters that are known to
                 be consistent  across  all  gateways  are  the  73
                 characters  that correspond to the upper and lower
                 case letters A-Z and a-z, the 10 digits  0-9,  and
                 the following eleven special characters:

                                "'"  (ASCII code 39)
                                "("  (ASCII code 40)
                                ")"  (ASCII code 41)
                                "+"  (ASCII code 43)
                                ","  (ASCII code 44)
                                "-"  (ASCII code 45)
                                "."  (ASCII code 46)
                                "/"  (ASCII code 47)
                                ":"  (ASCII code 58)
                                "="  (ASCII code 61)
                                "?"  (ASCII code 63)

                 A maximally portable mail representation, such  as
                 the   base64  encoding,  will  confine  itself  to
                 relatively short lines of text in which  the  only
                 meaningful  characters  are taken from this set of
                 73 characters.
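
            As an illustration only, membership in the invariant set  of
            item (6) can be tested with a few lines of C.  The  function
            names is_invariant_char and is_invariant_string are
            hypothetical and are not defined by this document.  The  set
            is spelled out as a string literal so that the test does not
            depend on the character set of the host.

```c
#include <string.h>

/* The 73 characters named in item (6): letters, digits, and the
 * eleven special characters. */
static const char invariant_set[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "'()+,-./:=?";

/* Returns 1 if c is in the invariant set, 0 otherwise.
 * The c != 0 test keeps strchr from matching the terminator. */
int is_invariant_char(int c)
{
    return c != 0 && strchr(invariant_set, c) != NULL;
}

/* Returns 1 if every character of s is in the invariant set. */
int is_invariant_string(const char *s)
{
    for (; *s != '\0'; ++s)
        if (!is_invariant_char((unsigned char)*s))
            return 0;
    return 1;
}
```

            A  base64-encoded  line  passes  this  check,  while  a line
            containing, say, "!" or SPACE does not.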

            Please note that the above list is NOT a list of recommended
            practices  for  MTAs.  RFC  821  MTAs  are  prohibited  from
            altering the character  of  white  space  or  wrapping  long
            lines.   These  BAD and illegal practices are known to occur
            on established networks, and implementations should be robust
            in dealing with the bad effects they can cause.
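
            A sending agent that wishes to be robust can check each line
            for the hazards described in items (3) through  (5)  before
            transmission.  The following is a minimal sketch, not a
            required procedure.  The name line_hazards and the HAZ_ flag
            names are assumptions made for this example, and the line is
            assumed to be passed without its line terminator.

```c
#include <string.h>

/* Hypothetical flag names, one per hazard. */
enum {
    HAZ_LONG     = 1,  /* over 76 characters: may be wrapped or truncated */
    HAZ_TRAILING = 2,  /* trailing SPACE or TAB: may be stripped or padded */
    HAZ_TAB      = 4   /* TAB anywhere: may be converted to spaces */
};

/* Returns a bitwise OR of HAZ_ flags for one line (terminator excluded). */
unsigned line_hazards(const char *line)
{
    size_t n = strlen(line);
    unsigned flags = 0;

    if (n > 76)
        flags |= HAZ_LONG;
    if (n > 0 && (line[n - 1] == ' ' || line[n - 1] == '\t'))
        flags |= HAZ_TRAILING;
    if (strchr(line, '\t') != NULL)
        flags |= HAZ_TAB;
    return flags;
}
```

            A non-zero result suggests that the body should be sent with
            a Content-Transfer-Encoding such as quoted-printable  rather
            than as unencoded text.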

Appendix C -- A Complex Multipart Example

            What follows is the outline of a complex multipart  message.
            This  message  has five parts to be displayed serially:  two
            introductory  plain  text  parts,  an   embedded   multipart
            message,  a  richtext  part, and a closing encapsulated text
            message  in  a  non-ASCII  character  set.    The   embedded
            multipart message has two parts to be displayed in parallel,
            a picture and an audio fragment.

                 MIME-Version: 1.0
                 From: Nathaniel Borenstein <>
                 Subject: A multipart example
                 Content-Type: multipart/mixed;
                      boundary=unique-boundary-1

                 This is the preamble area of a multipart message.
                 Mail readers that understand multipart format
                 should ignore this preamble.
                 If you are reading this text, you might want to
                 consider changing to a mail reader that understands
                 how to properly display multipart messages.
                 --unique-boundary-1

                 ...Some text appears here...
                 [Note that the preceding blank line means
                 no header fields were given and this is text,
                 with charset US ASCII.  It could have been
                 done with explicit typing as in the next part.]

                 --unique-boundary-1
                 Content-type: text/plain; charset=US-ASCII

                 This could have been part of the previous part,
                 but illustrates explicit versus implicit
                 typing of body parts.

                 --unique-boundary-1
                 Content-Type: multipart/parallel;
                      boundary=unique-boundary-2


                 --unique-boundary-2
                 Content-Type: audio/basic
                 Content-Transfer-Encoding: base64

                 ... base64-encoded 8000 Hz single-channel
                     u-law-format audio data goes here....

                 --unique-boundary-2
                 Content-Type: image/gif
                 Content-Transfer-Encoding: Base64

                 ... base64-encoded image data goes here....

                 --unique-boundary-2--

                 --unique-boundary-1
                 Content-type: text/richtext

                 This is <bold><italic>richtext.</italic></bold>
                 <nl><nl>Isn't it
                 <bigger><bigger>cool?</bigger></bigger>

                 --unique-boundary-1
                 Content-Type: message/rfc822

                 From: (name in US-ASCII)
                 Subject: (subject in US-ASCII)
                 Content-Type: Text/plain; charset=ISO-8859-1
                 Content-Transfer-Encoding: Quoted-printable

                 ... Additional text in ISO-8859-1 goes here ...

                 --unique-boundary-1--
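
            A receiving agent finds the parts of a multipart body by
            looking for delimiter lines built  from  the  boundary
            parameter.  The sketch below  shows  one  way  this  might be
            done for a single nesting level.  The names  classify_line
            and count_parts are hypothetical, lines are assumed to have
            been split already with their  CRLF  removed,  and  matching
            is strict (nothing may follow the boundary except the  "--"
            of a close-delimiter).

```c
#include <string.h>

enum line_kind { LINE_TEXT = 0, LINE_DELIMITER = 1, LINE_CLOSE = 2 };

/* Classifies one line (terminator removed) against a boundary value. */
enum line_kind classify_line(const char *line, const char *boundary)
{
    size_t blen = strlen(boundary);

    if (strncmp(line, "--", 2) != 0)
        return LINE_TEXT;
    if (strncmp(line + 2, boundary, blen) != 0)
        return LINE_TEXT;
    line += 2 + blen;               /* what follows the boundary */
    if (*line == '\0')
        return LINE_DELIMITER;
    if (strcmp(line, "--") == 0)
        return LINE_CLOSE;
    return LINE_TEXT;               /* e.g. a longer, different boundary */
}

/* Counts body parts: every delimiter line seen before the
 * close-delimiter begins one part.  Preamble and epilogue lines
 * are ordinary text and are not counted. */
int count_parts(const char *const *lines, int nlines, const char *boundary)
{
    int i, parts = 0;

    for (i = 0; i < nlines; ++i) {
        enum line_kind k = classify_line(lines[i], boundary);
        if (k == LINE_CLOSE)
            break;
        if (k == LINE_DELIMITER)
            ++parts;
    }
    return parts;
}
```

            A nested multipart body part would be handled by  applying
            the same scan to that part with its own boundary value.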

Appendix D -- A Simple Richtext-to-Text Translator in C

            One of the major goals in the design of the richtext subtype
            of the text Content-Type is to make formatted text so simple
            that even  text-only  mailers  will  implement  richtext-to-
            plain-text  translators, thus increasing the likelihood that
            multifont text will become "safe" to use  very  widely.   To
            demonstrate  this  simplicity,  what follows is an extremely
            simple C program that converts  richtext  input  into  plain
            text output:

                 #include <stdio.h>
                 #include <ctype.h>
                 #include <string.h>

                 int main(void) {
                     int c, i;
                     char token[50];

                     while ((c = getc(stdin)) != EOF) {
                         if (c == '<') {
                             /* Collect the command, in lower case. */
                             for (i=0; (i<49 && (c = getc(stdin)) != '>'
                                       && c != EOF); ++i) {
                                 token[i] = isupper(c) ? tolower(c) : c;
                             }
                             if (c == EOF) break;
                             /* Skip the rest of an over-long command. */
                             if (c != '>') while ((c = getc(stdin)) != '>'
                                       && c != EOF) {;}
                             if (c == EOF) break;
                             token[i] = '\0';
                             if (!strcmp(token, "lt")) {
                                 putc('<', stdout);
                             } else if (!strcmp(token, "nl")) {
                                 putc('\n', stdout);
                             } else if (!strcmp(token, "/paragraph")) {
                                 fputs("\n\n", stdout);
                             } else if (!strcmp(token, "comment")) {
                                 int commct=1; /* comment nesting depth */
                                 while (commct > 0) {
                                     while ((c = getc(stdin)) != '<'
                                      && c != EOF) ;
                                     if (c == EOF) break;
                                     for (i=0; (c = getc(stdin)) != '>'
                                        && c != EOF; ++i) {
                                         /* Keep at most 49 characters. */
                                         if (i < 49) token[i] = isupper(c) ?
                                          tolower(c) : c;
                                     }
                                     if (c == EOF) break;
                                     token[i < 49 ? i : 49] = '\0';
                                     if (!strcmp(token, "/comment")) --commct;
                                     if (!strcmp(token, "comment")) ++commct;
                                 }
                             } /* Ignore all other tokens */
                         } else if (c != '\n') putc(c, stdout);
                     }
                     putc('\n', stdout); /* for good measure */
                     return 0;
                 }

            It should be noted that one can do considerably better  than
            this  in  displaying  richtext  data on a dumb terminal.  In
            particular, one can replace font information such as  "bold"
            with textual emphasis (like *this* or   _T_H_I_S_).  One can
            also  properly  handle  the  richtext  formatting   commands
            regarding  indentation, justification, and others.  However,
            the above program is all  that  is  necessary  in  order  to
            present richtext on a dumb terminal.
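
            As a hedged illustration of the "textual emphasis" idea, the
            sketch  below  maps  the bold commands to "*" and the italic
            commands to "_".  The function name richtext_emphasis and
            its string-to-string interface are assumptions made for this
            example; they are not part of this document.  Like the
            program above, the sketch drops literal newlines and ignores
            all other commands.

```c
#include <ctype.h>
#include <string.h>

/* Converts richtext in `in` to plain text in `out` (capacity outsz).
 * The output is always NUL-terminated when outsz > 0 and is silently
 * truncated if it does not fit.  Returns the number of characters
 * written, excluding the terminating NUL. */
size_t richtext_emphasis(const char *in, char *out, size_t outsz)
{
    size_t n = 0;
    char token[50];

    if (outsz == 0)
        return 0;
    while (*in != '\0' && n + 1 < outsz) {
        if (*in == '<') {
            size_t i = 0;
            ++in;
            /* Collect the command name in lower case, at most 49 chars. */
            while (*in != '\0' && *in != '>') {
                if (i < sizeof token - 1)
                    token[i++] = (char)tolower((unsigned char)*in);
                ++in;
            }
            if (*in == '\0')
                break;                    /* unterminated command */
            ++in;                         /* skip '>' */
            token[i] = '\0';
            if (!strcmp(token, "lt"))
                out[n++] = '<';
            else if (!strcmp(token, "nl"))
                out[n++] = '\n';
            else if (!strcmp(token, "bold") || !strcmp(token, "/bold"))
                out[n++] = '*';
            else if (!strcmp(token, "italic") || !strcmp(token, "/italic"))
                out[n++] = '_';
            /* all other commands are ignored */
        } else {
            if (*in != '\n')
                out[n++] = *in;
            ++in;
        }
    }
    out[n] = '\0';
    return n;
}
```

            With this mapping, the input "This is <bold>richtext.</bold>"
            is rendered as "This is *richtext.*".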

Appendix E -- Collected Grammar

            This appendix contains the complete BNF grammar for all  the
            syntax specified by this document.

            By itself, however, this grammar is incomplete.   It  refers
            to  several  entities  that  are defined by RFC 822.  Rather
            than   reproduce   those   definitions   here,   and    risk
            unintentional  differences  between  the  two, this document
            simply refers the  reader  to  RFC  822  for  the  remaining
            definitions.  Wherever a term is undefined, it refers to the
            RFC 822 definition.

            attribute := token

            body-part := <"message" as defined in RFC 822,
                     with all header fields optional, and with the
                     specified delimiter not occurring anywhere in
                     the message body, either on a line by itself
                     or as a substring anywhere.>

            boundary := 0*69<bchars> bcharsnospace

            bchars := bcharsnospace / " "

            bcharsnospace :=    DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_"
                           / "," / "-" / "." / "/" / ":" / "=" / "?"

            close-delimiter := delimiter "--"

            Content-Description := *text

            Content-ID := msg-id

            Content-Transfer-Encoding := "BASE64" / "QUOTED-PRINTABLE" /
                                         "8BIT"   / "7BIT" /
                                         "BINARY" / x-token

            Content-Type := type "/" subtype *[";" parameter]

            delimiter := CRLF "--" boundary   ; taken from Content-Type field
                                              ; when content-type is multipart.
                                              ; There should be no space
                                              ; between "--" and boundary.

            encapsulation := delimiter CRLF body-part

            epilogue :=  *text                  ;  to  be  ignored  upon
                                                ;  receipt.

            MIME-Version := 1*text

            multipart-body := preamble  1*encapsulation  close-delimiter

            parameter := attribute "=" value

            preamble :=  *text                  ;  to  be  ignored  upon
                                                ;  receipt.

            subtype := token

            token := 1*<any CHAR except SPACE, CTLs, or tspecials>

            tspecials :=  "(" / ")" / "<" / ">" / "@"  ; Must be in
                       /  "," / ";" / ":" / "\" / <">  ; quoted-string,
                       /  "/" / "[" / "]" / "?" / "."  ; to use within
                       /  "="                        ; parameter values

            type :=     "application" / "audio"     ; case-insensitive
                      / "image"       / "message"
                      / "multipart"   / "text"
                      / "video"       / x-token

            value := token / quoted-string

            x-token := <The two characters "X-" followed, with no
                       intervening white space, by any token>
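
            The token, tspecials, and x-token rules above translate
            directly into code.  The following is a sketch only; the
            function  names  are  hypothetical.   It  assumes  an  ASCII
            host, takes CHAR to mean codes 0-127 and CTLs to mean codes
            0-31 and 127 as in RFC 822, and accepts the "X-" prefix  in
            either case, which is an assumption of this example.

```c
#include <string.h>

/* Returns 1 if c is one of the tspecials listed in the grammar. */
int is_tspecial(int c)
{
    return c != 0 && strchr("()<>@,;:\\\"/[]?.=", c) != NULL;
}

/* token := 1*<any CHAR except SPACE, CTLs, or tspecials> */
int is_token(const char *s)
{
    if (*s == '\0')
        return 0;                       /* at least one character */
    for (; *s != '\0'; ++s) {
        unsigned char c = (unsigned char)*s;
        if (c > 127)
            return 0;                   /* not a CHAR */
        if (c <= 32 || c == 127)
            return 0;                   /* CTL or SPACE */
        if (is_tspecial(c))
            return 0;
    }
    return 1;
}

/* x-token := "X-" followed, with no white space, by any token */
int is_x_token(const char *s)
{
    return (s[0] == 'X' || s[0] == 'x') && s[1] == '-' && is_token(s + 2);
}
```

            Under these rules "US-ASCII" is a token, while "a/b" and
            "a=b" are not, because "/" and "=" are tspecials.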

Appendix F -- IANA Registration Procedures

            MIME  has  been  carefully  designed  to   have   extensible
            mechanisms,  and  it  is  expected  that the set of content-
            type/subtype pairs and their associated parameters will grow
            significantly with time.  Several other MIME fields, notably
            character  set  names,  access-type   parameters   for   the
            message/external-body  type,  conversions parameters for the
            application  type,  and  possibly   even   Content-Transfer-
            Encoding  values, are likely to have new values defined over
            time.  In order to ensure that the set  of  such  values  is
            developed  in an orderly, well-specified, and public manner,
            MIME defines a registration process which uses the  Internet
            Assigned  Numbers Authority (IANA) as a central registry for
            such values.

            In general, parameters in the content-type header field  are
            used  to convey supplemental information for various content
            types, and their use is defined when  the  content-type  and
            subtype  are  defined.  New parameters should not be defined
            as a way to introduce new functionality.

            In  order  to  simplify  and  standardize  the  registration
            process,  this appendix gives templates for the registration
            of new values with IANA.  Each of these is given in the form
            of  an  email  message  template,  to  be  filled  in by the
            registering party.

F.1  Registration of New Content-type/subtype Values

            Note that MIME is  generally  expected  to  be  extended  by
            subtypes.   If  a  new fundamental top-level type is needed,
            its  specification  should  be  published  as  an   RFC   or
            submitted  in  a  form   suitable  to  become an RFC, and be
            subject to the Internet standards process.

                 Subject:  Registration of new MIME content-type/subtype

                 MIME type name:

                 (If the above is not an existing top-level MIME type,
                 please explain why an existing type cannot be used.)

                 MIME subtype name:

                 Required parameters:

                 Optional parameters:

                 Encoding considerations:

                 Security considerations:

                 Published specification:

                 (The published specification must be an Internet RFC or
                 RFC-to-be if a new top-level type is being defined, and
                 must be a publicly available specification in any

                 Person & email address to contact for further
F.2  Registration of New Character Set Values

                 Subject:  Registration of new MIME character set value

                 MIME character set name:

                 Published specification:

                 (The published specification must be an Internet RFC or
                 RFC-to-be or an international standard.)

                 Person & email address to contact for further
                 information:

F.3  Registration of New Access-type Values for Message/external-body

                 Subject:  Registration of new MIME Access-type for
                      Message/external-body content-type

                 MIME access-type name:

                 Required parameters:

                 Optional parameters:

                 Published specification:

                 (The published specification must be an Internet RFC or
                 RFC-to-be.)

                 Person & email address to contact for further
                 information:

F.4  Registration of New Conversions Values for Application

                 Subject:  Registration of new MIME Conversions value
                 for Application content-type

                 MIME Conversions name:

                 Published specification:

                 (The published specification must be an Internet RFC or
                 RFC-to-be.)

                 Person & email address to contact for further
                 information:

Appendix G -- Summary of the Seven Content-types

            Content-type: text

            Subtypes defined by this document:  plain, richtext

            Important Parameters: charset

            Encoding notes: quoted-printable generally preferred  if  an
                 encoding  is  needed and the character set is mostly an
                 ASCII superset.

            Security considerations:  Rich text formats such as TeX  and
                 Troff  often contain mechanisms for executing arbitrary
                 commands or file system operations, and should  not  be
                 used  automatically unless these security problems have
                 been addressed.  Even plain text  may  contain  control
                 characters that can be used to exploit the capabilities
                 of   "intelligent"   terminals   and   cause   security
                 violations.   User  interfaces  designed to run on such
                 terminals should be aware of and try  to  prevent  such
                 problems.

            Content-type: multipart

            Subtypes defined by  this  document:    mixed,  alternative,
                 digest, parallel.

            Important Parameters: boundary

            Encoding notes: No content-transfer-encoding is permitted.


            Content-type: message

            Subtypes  defined  by  this  document:    rfc822,   partial,
                 external-body

            Important Parameters: id, number, total

            Encoding notes: No content-transfer-encoding is permitted.


            Content-type: application

            Subtypes  defined   by   this   document:      octet-stream,
                 postscript, oda

            Important Parameters: profile

            Encoding notes: base64 generally preferred for  octet-stream
                 or other unreadable subtypes.

            Security considerations:  This  type  is  intended  for  the
            transmission  of data to be interpreted by locally-installed
            programs.  If used,  for  example,  to  transmit  executable
            binary  programs  or programs in general-purpose interpreted
            languages, such as LISP programs or  shell  scripts,  severe
            security  problems  could  result.   In  general, authors of
            mail-reading  agents  are  cautioned  against  giving  their
            systems  the  power  to  execute mail-based application data
            without carefully  considering  the  security  implications.
            While  it  is  certainly possible to define safe application
            formats and even safe interpreters for unsafe formats,  each
            interpreter  should  be  evaluated  separately  for possible
            security problems.

            Content-type: image

            Subtypes defined by this document:  jpeg, gif

            Important Parameters: none

            Encoding notes: base64 generally preferred


            Content-type: audio

            Subtypes defined by this document:  basic

            Important Parameters: none

            Encoding notes: base64 generally preferred


            Content-type: video

            Subtypes defined by this document:  mpeg

            Important Parameters: none

            Encoding notes: base64 generally preferred

Appendix H -- Canonical Encoding Model

            There was some confusion, in earlier drafts  of  this  memo,
            regarding  the model for when email data was to be converted
            to canonical form and encoded, and in  particular  how  this
            process  would affect the treatment of CRLFs, given that the
            representation of newlines varies  greatly  from  system  to
            system.   For this reason, a canonical model for encoding is
            presented below.

            The process of composing a MIME message part can be modelled
            as  being  done in a number of steps.  Note that these steps
            are roughly similar to those steps used in RFC1113:

            Step 1.  Creation of local form.

            The body part to be transmitted is created in  the  system's
            native format.   The native character set is used, and where
            appropriate local end of line conventions are used as  well.
            This may be a UNIX-style text file, or a Sun raster image, or
            a VMS indexed file, or  audio  data  in  a  system-dependent
            format   stored  only  in  memory,  or  anything  else  that
            corresponds to the local model  for  the  representation  of
            some form of information.

            Step 2.  Conversion to canonical form.

            The entire body part,  including  "out-of-band"  information
            such   as   record   lengths  and  possibly  file  attribute
            information, is converted to  a  universal  canonical  form.
            The  specific  content  type of the body part as well as its
            associated attributes dictate the nature  of  the  canonical
            form  that is used.  Conversion to the proper canonical form
            may involve  character  set  conversion,  transformation  of
            audio   data,   compression,  or  various  other  operations
            specific to the various content types.

            For example, in the case of text/plain data, the  text  must
            be  converted to a supported character set and lines must be
            delimited with CRLF delimiters in  accordance  with  RFC822.
            Note  that the restriction on line lengths implied by RFC822
            is eliminated  if  the  next  step  employs  either  quoted-
            printable or base64 encoding.

            Step 3.  Apply transfer encoding.

            A Content-Transfer-Encoding appropriate for this  body  part
            is  applied.   Note  that  there  is  no  fixed relationship
            between the content  type  and  the  transfer  encoding.  In
            particular,  it  may  be  appropriate  to base the choice of
            base64 or quoted-printable  on  character  frequency  counts
            which are specific to a given instance of body part.

            Step 4.  Insertion into message.

            The encoded object is inserted  into  a  MIME  message  with
            appropriate body part headers and boundary markers.
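
            As an illustration of Step 2 for text/plain data, the sketch
            below converts text that uses a bare LF as its local newline
            (an assumption of this example; other systems use other
            conventions) into the canonical CRLF form.   The  function
            name lf_to_crlf is hypothetical.  Character set conversion,
            which Step 2 may also require, is not shown.

```c
#include <stddef.h>

/* Copies `in` to `out` (capacity outsz), writing CRLF for every LF
 * that is not already preceded by CR.  The output is NUL-terminated
 * when outsz > 0 and truncated if it does not fit.  Returns the
 * length the full canonical form requires, excluding the NUL, so a
 * caller can detect truncation by comparing it with outsz. */
size_t lf_to_crlf(const char *in, char *out, size_t outsz)
{
    size_t n = 0;
    char prev = '\0';

    for (; *in != '\0'; ++in) {
        if (*in == '\n' && prev != '\r') {
            if (n + 1 < outsz)
                out[n] = '\r';          /* supply the missing CR */
            ++n;
        }
        if (n + 1 < outsz)
            out[n] = *in;
        ++n;
        prev = *in;
    }
    if (outsz > 0)
        out[n < outsz ? n : outsz - 1] = '\0';
    return n;
}
```

            Lines that already end in CRLF pass through unchanged, so
            the conversion can be applied more than once without harm.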

            It is vital to note that these steps are only a model;  they
            are  specifically  NOT  a blueprint for how an actual system
            would be built.  In particular, the model fails  to  account
            for two common designs:

                 1.  In many cases the conversion  to  a  canonical
                 form  prior  to encoding will be subsumed into the
                 encoder itself, which  understands  local  formats
                 directly.    For   example,   the   local  newline
                 convention for text  bodyparts  might  be  carried
                 through to the encoder itself along with knowledge
                 of what that format is.

                 2.  The output of the encoders may  have  to  pass
                 through  one  or  more  additional  steps prior to
                 being transmitted as  a  message.   As  such,  the
                 output  of  the  encoder may not be compliant with
                 the formats specified by RFC822.   In  particular,
                 once   again   it   may  be  appropriate  for  the
                 converter's output to  be  expressed  using  local
                 newline conventions rather than using the standard
                 RFC822 CRLF delimiters.

            Other implementation variations  are  conceivable  as  well.
            The  only  important  aspect  of this discussion is that the
            resulting messages are consistent with those produced by the
            model described here.
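
            Step 3 notes that the choice between base64 and  quoted-
            printable may be based on character frequency counts.   One
            possible heuristic, shown here as an assumption and not as
            a rule of this document, is to estimate the  encoded  size
            under each encoding and pick the smaller.  The estimate
            ignores line breaks added by either encoding, and the names
            below are hypothetical.

```c
#include <stddef.h>

enum transfer_encoding { ENC_QUOTED_PRINTABLE, ENC_BASE64 };

/* Picks an encoding from octet frequency counts.  An octet is taken
 * to be "literal" in quoted-printable if it is a printable US-ASCII
 * code other than "=" (61), or SPACE, TAB, CR, or LF; every other
 * octet costs three characters ("=XX") instead of one. */
enum transfer_encoding choose_encoding(const unsigned char *data, size_t len)
{
    size_t i, escaped = 0;
    size_t qp_size, b64_size;

    for (i = 0; i < len; ++i) {
        unsigned char c = data[i];
        int literal = (c >= 33 && c <= 126 && c != 61)
                   || c == 32 || c == 9 || c == 13 || c == 10;
        if (!literal)
            ++escaped;
    }
    qp_size  = len + 2 * escaped;       /* one char, or three if escaped */
    b64_size = 4 * ((len + 2) / 3);     /* four chars per three octets */
    return qp_size <= b64_size ? ENC_QUOTED_PRINTABLE : ENC_BASE64;
}
```

            Mostly-ASCII text thus tends toward quoted-printable, which
            also keeps it readable, while binary data tends toward
            base64.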

References

            [US-ASCII] Coded Character Set--7-Bit American Standard Code
            for Information Interchange, ANSI X3.4-1986.

            [ATK]  Borenstein,  Nathaniel  S.,  Multimedia  Applications
            Development with the Andrew Toolkit, Prentice-Hall, 1990.

            [GIF] Graphics Interchange Format (Version 89a), Compuserve,
            Inc., Columbus, Ohio, 1990.

            [ISO-2022] International Standard--Information  Processing--
            ISO  7-bit  and  8-bit  coded character sets--Code extension
            techniques, ISO 2022:1986.

            [ISO-8859] Information Processing -- 8-bit Single-Byte Coded
            Graphic  Character Sets -- Part 1: Latin Alphabet No. 1, ISO
            8859-1:1987.  Part 2: Latin  alphabet  No.  2,  ISO  8859-2,
            1987.  Part 3: Latin alphabet No. 3, ISO 8859-3, 1988.  Part
            4:  Latin  alphabet  No.  4,  ISO  8859-4,  1988.   Part  5:
            Latin/Cyrillic   alphabet,  ISO  8859-5,  1988.     Part  6:
            Latin/Arabic  alphabet,  ISO  8859-6,   1987.      Part   7:
            Latin/Greek   alphabet,   ISO   8859-7,   1987.     Part  8:
            Latin/Hebrew alphabet, ISO 8859-8, 1988.     Part  9:  Latin
            alphabet No. 5, ISO 8859-9, 1990.

            [ISO-646] International  Standard--Information  Processing--
            ISO  7-bit coded  character set for information interchange,
            ISO 646:1983.

            [MPEG]  Video  Coding  Draft  Standard  ISO  11172  CD,  ISO/IEC
            JTC1/SC2/WG11 (Moving Picture Experts Group), May, 1991.

            [ODA] ISO 8613;  Information  Processing:  Text  and  Office
            System;  Office  Document Architecture (ODA) and Interchange
            Format (ODIF), Part 1-8, 1989.

            [PCM] CCITT, Fascicle III.4 - Recommendation G.711,  Geneva,
            1972, "Pulse Code Modulation (PCM) of Voice Frequencies".

            [POSTSCRIPT]  Adobe  Systems,  Inc.,   PostScript   Language
            Reference Manual,  Addison-Wesley, 1985.

            [X400]  Schicker, Pietro, "Message Handling Systems, X.400",
            Message  Handling  Systems  and Distributed Applications, E.
            Stefferud, O-j. Jacobsen,  and  P.  Schicker,  eds.,  North-
            Holland, 1989, pp. 3-41.

            [RFC-783]  Sollins, K.R.  TFTP Protocol (revision 2).  June,
            1981, MIT, RFC-783.

            [RFC-821]  Postel,  J.B.   Simple  Mail  Transfer  Protocol.
            August, 1982, USC/Information Sciences Institute, RFC-821.

            [RFC-822]   Crocker, D.  Standard for  the  format  of  ARPA
            Internet  text  messages. August, 1982, UDEL, RFC-822.

            [RFC-934]   Rose, M.T.; Stefferud, E.A.   Proposed  standard
            for    message     encapsulation.  January,   1985, Delaware
            and NMA, RFC-934.

            [RFC-959]   Postel,  J.B.;  Reynolds,  J.K.   File  Transfer
            Protocol.      October,   1985,   USC/Information   Sciences
            Institute, RFC-959.

            [RFC-1049]   Sirbu,  M.A.   Content-Type  header  field  for
            Internet messages.  March, 1988, CMU,  RFC-1049.

            [RFC-1113]   Linn,  J.   Privacy  enhancement  for  Internet
            electronic    mail:  Part    I  -  message  encipherment and
            authentication procedures.   August,  1989, IAB Privacy Task
            Force, RFC-1113.

            [RFC-1154]  Robinson, D.; Ullmann, R.  Encoding header field
            for   Internet   messages.  April,   1990,   Prime Computer,
            Inc., RFC-1154.

            [RFC-1342] Moore, Keith, Representation of Non-ASCII Text in
            Internet   Message   Headers.   June,  1992,  University  of
            Tennessee, RFC-1342.

Security Considerations

            Security issues  are  discussed  in  Section  7.4.2  and  in
            Appendix  G.   Implementors should pay special attention  to
            the security implications of any mail content-types that can
            cause the remote execution of any actions in the recipient's
            environment.   In  such  cases,  the   discussion   of   the
            application/postscript  content-type  in  Section  7.4.2 may
            serve as a model for considering  other  content-types  with
            remote execution capabilities.

Authors' Addresses

            For more information, the authors of this  document  may  be
            contacted via Internet mail:

                                Nathaniel S. Borenstein
                                 MRE 2D-296, Bellcore
                                     445 South St.
                               Morristown, NJ 07962-1910

                                Phone: +1 201 829 4270
                                 Fax:  +1 201 829 7019

                                       Ned Freed
                             Innosoft International, Inc.
                                 250 West First Street
                                       Suite 240
                                  Claremont, CA 91711

                                Phone:  +1 714 624 7907
                                 Fax: +1 714 621 5319

                               Table of Contents

            1     Introduction
            2     Notations, Conventions, and Generic BNF Grammar
            3     The MIME-Version Header Field
            4     The Content-Type Header Field
            5     The Content-Transfer-Encoding Header Field
            5.1   Quoted-Printable Content-Transfer-Encoding
            5.2   Base64 Content-Transfer-Encoding
            6     Additional Optional Content- Header Fields
            6.1   Optional Content-ID Header Field
            6.2   Optional Content-Description Header Field
            7     The Predefined Content-Type Values
            7.1   The Text Content-Type
            7.1.1 The charset parameter
            7.1.2 The Text/plain subtype
            7.1.3 The Text/richtext subtype
            7.2   The Multipart Content-Type
            7.2.1 Multipart:  The common syntax
            7.2.2 The Multipart/mixed (primary) subtype
            7.2.3 The Multipart/alternative subtype
            7.2.4 The Multipart/digest subtype
            7.2.5 The Multipart/parallel subtype
            7.3   The Message Content-Type
            7.3.1 The Message/rfc822 (primary) subtype
            7.3.2 The Message/Partial subtype
            7.3.3 The Message/External-Body subtype
            7.4   The Application Content-Type
            7.4.1 The Application/Octet-Stream (primary) subtype
            7.4.2 The Application/PostScript subtype
            7.4.3 The Application/ODA subtype
            7.5   The Image Content-Type
            7.6   The Audio Content-Type
            7.7   The Video Content-Type
            7.8   Experimental Content-Type Values
                  Summary
                  Acknowledgements
                  Appendix A -- Minimal MIME-Conformance
                  Appendix B -- General Guidelines For Sending Email Data
                  Appendix C -- A Complex Multipart Example
                  Appendix D -- A Simple Richtext-to-Text Translator in C
                  Appendix E -- Collected Grammar
                  Appendix F -- IANA Registration Procedures
                  F.1  Registration of New Content-type/subtype Values
                  F.2  Registration of New Character Set Values
                  F.3  Registration of New Access-type Values for Message/external-body
                  F.4  Registration of New Conversions Values for Application
                  Appendix G -- Summary of the Seven Content-types
                  Appendix H -- Canonical Encoding Model
                  References
                  Security Considerations
                  Authors' Addresses