Internet Architecture Board (IAB) P. Hoffman
Request for Comments: 7991 ICANN
Obsoletes: 7749 December 2016
The "xml2rfc" Version 3 Vocabulary
This document defines the "xml2rfc" version 3 vocabulary: an
XML-based language used for writing RFCs and Internet-Drafts. It is
heavily derived from the version 2 vocabulary that is also under
discussion. This document obsoletes the v2 grammar described in
RFC 7749.
Status of This Memo
This document is not an Internet Standards Track specification; it is
published for informational purposes.
This document is a product of the Internet Architecture Board (IAB)
and represents information that the IAB has deemed valuable to
provide for permanent record. It represents the consensus of the
Internet Architecture Board (IAB). Documents approved for
publication by the IAB are not a candidate for any level of Internet
Standard; see Section 2 of RFC 7841.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
Copyright (c) 2016 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
8. IANA Considerations
   8.1. Internet Media Type Registration
   8.2. Link Relation Registration
9. References
   9.1. Normative References
   9.2. Informative References
Appendix A. Front-Page ("Boilerplate") Generation
   A.1. The "ipr" Attribute
      A.1.1. Current Values: "*trust200902"
      A.1.2. Historic Values
   A.2. The "submissionType" Attribute
   A.3. The "consensus" Attribute
Appendix B. The v3 Format and Processing Tools
   B.1. Including External Text with XInclude
   B.2. Anchors and IDs
      B.2.1. Overlapping Values
   B.3. Attributes Controlled by the Prep Tool
Appendix C. RELAX NG Schema
Appendix D. Schema Differences from v2
IAB Members at the Time of Approval
Author's Address
This document describes version 3 ("v3") of the "xml2rfc" vocabulary:
an XML-based language ("Extensible Markup Language" [XML]) used for
writing RFCs [RFC7322] and Internet-Drafts [IDGUIDE].
This document obsoletes the version 2 vocabulary ("v2") [RFC7749],
which contains the extended language definition. That document in
turn obsoletes the original version ("v1") [RFC2629]. This document
directly copies the material from [RFC7749] where possible.
The v3 format will be used as part of the new RFC Series format
described in [RFC6949]. The new format will be handled by one or
more new tools for preparing the XML and converting it to other
representations. Features of the expected tools are described in
Appendix B. That section defines some terms used throughout this
document, such as "prep tool" and "formatter".
Note that the vocabulary contains certain constructs that might not
be used when generating the final text; however, they can provide
useful data for other uses (such as index generation, populating a
keyword database, or syntax checks).
In this document, the term "format" is used when describing types of
documents, primarily XML and HTML. The term "representation" is used
when talking about a specific instantiation of a format, such as an
XML document or an HTML document that was created from an XML document.
1.1. Expected Updates to the Specification
Non-interoperable changes in later versions of this specification are
likely based on experience gained in implementing the new publication
toolsets. Revised documents will be published capturing those
changes as the toolsets are completed. Other implementers must not
expect those changes to remain backwards-compatible with the details
described in this document.
1.2. Design Criteria for the Changes in v3
The design criteria of the changes from v2 to v3 are as follows:
o The intention is that starting and editing a v3 document will be
easier than for a v2 document.
o There will be good v2-to-v3 conversion tools for when an author
wants to change versions.
o There are no current plans to make v3 XML the required submission
format for drafts or RFCs. That might happen eventually, but it
is likely to be years away.
There is a desire to keep as much of the v2 grammar as makes sense
within the above design criteria and not to make gratuitous changes
to the v2 grammar. Another way to say this is "we would rather
encourage backwards compatibility but not be constrained by it."
Still, the goal of starting and editing a v3 document being easier
than for a v2 document is more important than backwards compatibility
with v2, given the latter two design criteria.
v3 is upwards compatible with v2, meaning that a v2 document is meant
to be a valid v3 document as well. However, some features of v2 are
deprecated in v3 in favor of new elements. Deprecated features are
listed in Section 1.3.3 and are described in [RFC7749].
1.3. Differences from v2 to v3
This is a (hopefully) complete list of all the technical changes
between [RFC7749] and this document.
1.3.1. New Elements in v3
o Add <dl>, <ul>, and <ol> as new ways to make lists. This is a
significant change from v2 in that the child under these elements
is <li>, not <t>. <li> has a model of either containing one or
more <t> elements, or containing the flowing text normally found
in <t>. These lists are children of <section>s and other lists
instead of <t>.
o Add <strong>, <em>, <tt>, <sub>, and <sup> for character
formatting.
o Add <aside> for incidental text that will be indented when
displayed.
o Add <sourcecode> to differentiate from <artwork>.
o Add <table>, <thead>, <tbody>, <tfoot>, <tr>, <td>, and <th> to
give table functionality like that in HTML.
o Add <boilerplate> to hold the automatically generated boilerplate
text.
o Add <blockquote> to indicate a quotation as in a paragraph-like
format.
o Add <name> to sections, notes, figures, and texttables to allow
character formatting (fixed-width font) in their titles and to
allow references in the names.
o Add <postalLine>, free text that represents one line of the
address.
o Add <displayreference> to allow display of more mnemonic anchor
names for automatically included references.
o Add <refcontent> to allow better control of text in a reference.
o Add <referencegroup> to allow referencing multi-RFC documents such
as STDs and BCPs.
o Add <relref> to allow referencing specific sections or anchors in
other RFCs or Internet-Drafts.
o Add <link> to point to a resource related to the RFC.
o Add <br> to allow line breaks (but not blank lines) in the
generated output for table cells.
o Add <svg> to allow easy inclusion of SVG drawings in <artwork>.
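As an illustration of the new list model, the following Python sketch parses a small <ul> fragment with the standard library's xml.etree.ElementTree and extracts the paragraphs of each <li>, whether the item holds flowing text directly or wraps it in <t> elements. The fragment and the helper name item_paragraphs are assumptions made for this example; they are not taken from the schema in Appendix C.

```python
import xml.etree.ElementTree as ET

# Hypothetical v3 fragment: one <li> with flowing text, one with <t> children.
FRAGMENT = """
<ul>
  <li>First item as flowing text.</li>
  <li>
    <t>Second item, first paragraph.</t>
    <t>Second item, second paragraph.</t>
  </li>
</ul>
"""

def item_paragraphs(li):
    """Return the paragraphs of one <li> as a list of strings."""
    paragraphs = li.findall("t")
    if paragraphs:
        # Block form: each <t> child is one paragraph.
        return [" ".join("".join(t.itertext()).split()) for t in paragraphs]
    # Inline form: the <li> itself carries the flowing text.
    return [" ".join("".join(li.itertext()).split())]

items = [item_paragraphs(li) for li in ET.fromstring(FRAGMENT).findall("li")]
```

Both forms yield a list of paragraphs, so a consumer can treat an <li> uniformly regardless of which content model the author chose.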
1.3.2. New Attributes for Existing Elements
o Add "sortRefs", "symRefs", "tocDepth", and "tocInclude" attributes
to <rfc> to cover Processing Instructions (PIs) that were in v2
that are still needed in the grammar. Add "prepTime" to indicate
the time that the XML went through a preparation step. Add
"version" to indicate the version of xml2rfc vocabulary used in
the document. Add "scripts" to indicate which scripts are needed
to render the document. Add "expiresDate" to indicate when an
Internet-Draft expires.
o Add "ascii" attributes to <email>, <organization>, <street>,
<city>, <region>, <country>, and <code>. Also add
"asciiFullname", "asciiInitials", and "asciiSurname" to <author>.
This allows an author to specify their information in their native
scripts as the primary entry and still allow the ASCII-equivalent
values to appear in the processed documents.
o Add "anchor" attributes to many block elements to allow them to be
linked with <relref> and <xref>.
o Add the "section", "relative", and "sectionFormat" attributes to
<xref>.
o Add the "numbered" and "removeInRFC" attributes to <section>.
o Add the "removeInRFC" attribute to <note>.
o Add "pn" to <artwork>, <aside>, <blockquote>, <boilerplate>, <dt>,
<figure>, <iref>, <li>, <references>, <section>, <sourcecode>,
<t>, and <table> to hold automatically generated numbers for items
in a section that don't have their own numbering (namely figures
and tables).
o Add "display" to <cref> to indicate to tools whether or not to
display the comment.
o Add "keepWithNext" and "keepWithPrevious" to <t> as a hint to
tools that do pagination that they should try to keep the
paragraph with the next/previous element.
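To show how the "ascii" family of attributes might be consumed, the sketch below picks the name to print for an <author>, depending on whether the output is limited to ASCII. It relies on the "fullname" attribute that <author> already had in v2. The function display_name, its ascii_only flag, and the sample author are hypothetical; this document does not prescribe how a formatter makes that choice.

```python
import xml.etree.ElementTree as ET

def display_name(author, ascii_only):
    """Pick the author name for an output that may be limited to ASCII."""
    fullname = author.get("fullname", "")
    if ascii_only:
        # Fall back to the primary entry when no ASCII equivalent is given.
        return author.get("asciiFullname", fullname)
    return fullname

# Hypothetical author entry with a native-script name and an ASCII equivalent.
author = ET.fromstring(
    '<author fullname="José Ejemplo" asciiFullname="Jose Ejemplo"/>'
)
```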
1.3.3. Elements and Attributes Deprecated from v2
Deprecated elements and attributes are legacy vocabulary from v2 that
are supported for input to v3 tools. They are likely to be removed
from those tools in the future. Deprecated attributes are still
listed in Section 2, and deprecated elements are listed in Section 3.
See Appendix B for more information on tools and how they will handle
deprecated features.
o Deprecate <list> in favor of <dl>, <ul>, and <ol>.
o Deprecate <spanx>; replace it with <strong>, <em>, and <tt>.
o Deprecate <vspace> because the major use for it, creating pseudo-
paragraph-breaks in lists, is now handled properly.
o Deprecate <texttable>, <ttcol>, and <c>; replace them with the new
table elements (<table> and the elements that can be contained
within it).
o Deprecate <facsimile> because it is rarely used.
o Deprecate <format> because it is not useful and has caused
surprise for authors in the past. If the goal is to provide a
single URI (Uniform Resource Identifier) for a reference, use the
"target" attribute in <reference> instead.
o Deprecate <preamble> and <postamble> in favor of simply using <t>
before or after the figure. This also deprecates the "align"
attribute in <figure>.
o Deprecate the "title" attribute in <section>, <note>, <figure>,
<references>, and <texttable> in favor of the new <name>.
o Deprecate the "alt" and "src" attributes in <figure> because they
overlap with the attributes in <artwork>.
o Deprecate the "xml:space" attribute in <artwork> because there was
only one useful value. Deprecate the "height" and "width"
attributes in both <artwork> and <figure> because they are not
needed for the new output formats.
o Deprecate the "pageno" attribute in <xref> because it was unused
in v2. Deprecate the "none" value for the "format" attribute in
<xref> because it makes no sense semantically.
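A tool author might want to report deprecated elements when reading a v2-style document. The following Python sketch walks a parsed document and lists each deprecated element together with the replacement named in the list above. The mapping is transcribed from that list; the names DEPRECATED_ELEMENTS and find_deprecated, and the sample fragment, are assumptions of this example rather than part of any existing tool.

```python
import xml.etree.ElementTree as ET

# Replacement hints taken from the list above; None means the list names
# no direct replacement.
DEPRECATED_ELEMENTS = {
    "list": "dl, ul, or ol",
    "spanx": "strong, em, or tt",
    "vspace": None,
    "texttable": "table",
    "ttcol": "table",
    "c": "table",
    "facsimile": None,
    "format": 'the "target" attribute of reference',
    "preamble": "t",
    "postamble": "t",
}

def find_deprecated(root):
    """Return (element name, replacement hint) pairs in document order."""
    return [
        (elem.tag, DEPRECATED_ELEMENTS[elem.tag])
        for elem in root.iter()
        if elem.tag in DEPRECATED_ELEMENTS
    ]

# Hypothetical v2-style fragment used only to exercise the function.
doc = ET.fromstring(
    "<section><t>See <spanx>this</spanx>.</t><list><t>item</t></list></section>"
)
```

The sketch covers deprecated elements only; deprecated attributes (such as "title" or "pageno") would need a similar table keyed by element and attribute name.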
1.3.4. Additional Changes from v2
o Allow non-ASCII characters in the format; the characters that are
actually allowed will be determined by the RFC Series Editor.
o Allow <artwork> and <sourcecode> to be used on their own in
<section> (no longer confine them to a figure).
o Give more specifics of handling the "type" attribute in <artwork>.
o Allow <strong>, <em>, <tt>, <eref>, and <xref> in <cref>.
o Allow the sub-elements inside a <reference> to be in any order.
o Turn off the autogeneration of anchors in <cref> because there is
no use case for them that cannot be achieved in other ways.
o Allow more than one <artwork>, or more than one <sourcecode>, in
<figure>.
o In <front>, make <date> optional.
o In <date>, add restrictions to the "date" and "year" attributes
when used in the <front> for the document's boilerplate text.
o In <postal>, allow the sub-elements to be in any order. Also
allow the inclusion of the new <postalLine> instead of the older
elements.
o In <section>, restrict the names of the anchors that can be used
on some types of sections.
o Make <seriesInfo> a child of <front>, and deprecate it as a child
of <reference>. This also deprecates some of the attributes from
<rfc> and moves them into <seriesInfo>.
o <t> now only contains non-block elements, so it no longer contains
<figure> elements.
o Do not generate the grammar from a DTD, but instead get it
directly from the RELAX Next Generation (RNG) grammar [RNG].
1.4. Syntax Notation
The XML vocabulary here is defined in prose, based on the RELAX NG
schema [RNC] contained in Appendix C (specified in RELAX NG Compact
Notation (RNC)).
Note that the schema can be used for automated validity checks, but
certain constraints are only described in prose (example: the
conditionally required presence of the "abbrev" attribute).
The sections below describe all elements and their attributes.
Note that attributes not labeled "mandatory" are optional.
Many elements have an optional "anchor" attribute. In all cases, the
value of the "anchor" attribute needs to be a valid XML "Name"
(Section 2.3 of [XML]), additionally constrained to US-ASCII
characters [USASCII]. Thus, the character repertoire consists of
"A-Z", "a-z", "0-9", "_", "-", ".", and ":", where "0-9", ".", and
"-" are disallowed as start characters. Anchors are described in
more detail in Appendix B.2.
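The character rules above can be expressed as a regular expression. The following Python sketch is one possible check, assuming that the prose above is the complete rule; the names ANCHOR_RE and is_valid_anchor are hypothetical.

```python
import re

# Start character: a letter, "_" or ":".
# Remaining characters: letters, digits, "_", "-", "." or ":".
ANCHOR_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:-]*")

def is_valid_anchor(value):
    """True if value fits the US-ASCII anchor repertoire described above."""
    return ANCHOR_RE.fullmatch(value) is not None
```

fullmatch is used rather than match so that trailing characters outside the repertoire (including a trailing newline) cause the check to fail.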
Tools interpreting the XML described here will collapse horizontal
whitespace and line breaks to a single whitespace (except inside
<artwork> and <sourcecode>) and will trim leading and trailing
whitespace. Tab characters (U+0009) inside <artwork> and
<sourcecode> are prohibited.
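A minimal Python sketch of these two rules follows. It collapses whitespace in a plain text string and, separately, reports <artwork> and <sourcecode> elements that contain a tab. The helper names collapse and elements_with_tabs are hypothetical, and the sketch handles a single text string rather than mixed content with inline elements.

```python
import re
import xml.etree.ElementTree as ET

# Elements whose whitespace is preserved rather than collapsed.
VERBATIM = {"artwork", "sourcecode"}

def collapse(text):
    """Collapse runs of spaces, tabs, and line breaks; trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")

def elements_with_tabs(root):
    """Names of <artwork>/<sourcecode> elements whose content has a tab."""
    return [
        elem.tag
        for elem in root.iter()
        if elem.tag in VERBATIM and "\t" in "".join(elem.itertext())
    ]

# Hypothetical fragment: a paragraph with extra whitespace and an
# <artwork> that (incorrectly) contains a tab character.
doc = ET.fromstring(
    "<section><t>  Two\n   lines.  </t><artwork>a\tb</artwork></section>"
)
paragraph = collapse(doc.find("t").text)
```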
Some of the elements have attributes that are not described in this
section because those attributes are specific to the prep tool.
People writing tools to process this format should read all of the
appendices for a complete description of these attributes.
Every element in the v3 vocabulary can have an "xml:lang" attribute,
an "xml:base" attribute, or both. The xml:lang attribute specifies
the language used in the element. This is sometimes useful for
renderers that display different fonts for ideographic characters
used in China and Japan. The xml:base attribute is sometimes added
to an XML file when doing XML-to-XML conversion where the base file
has XInclude attributes (see Appendix B.1).
2.1. <abstract>
Contains the Abstract of the document. See [RFC7322] for more
information on restrictions for the Abstract.
This element appears as a child element of <front> (Section 2.26).
In any order, but at least one of:
o <dl> elements (Section 2.20)
o <ol> elements (Section 2.34)
o <t> elements (Section 2.53)
o <ul> elements (Section 2.63)
2.1.1. "anchor" Attribute
Document-wide unique identifier for the Abstract.
o <strong> elements (Section 2.50)
o <sub> elements (Section 2.51)
o <sup> elements (Section 2.52)
o <tt> elements (Section 2.62)
o <xref> elements (Section 2.66)
2.4. <area>
Provides information about the IETF area to which this document
relates (currently not used when generating documents).
The value ought to be either the full name or the abbreviation of one
of the IETF areas as listed on <http://www.ietf.org/iesg/area.html>.
A list of full names and abbreviations will be kept by the RFC Series
Editor.
This element appears as a child element of <front> (Section 2.26).
Content model: only text content.
2.5. <artwork>
This element allows the inclusion of "artwork" in the document.
<artwork> provides full control of horizontal whitespace and line
breaks; thus, it is used for a variety of things, such as diagrams
("line art") and protocol unit diagrams. Tab characters (U+0009)
inside of this element are prohibited.
Alternatively, the "src" attribute allows referencing an external
graphics file, such as a vector drawing in SVG or a bitmap graphic
file, using a URI. In this case, the textual content acts as a
fallback for output representations that do not support graphics;
thus, it ought to contain either (1) a "line art" variant of the
graphics or (2) prose that describes the included image in sufficient
detail.
In [RFC7749], the <artwork> element was also used for source code and
formal languages; in v3, this is now done with <sourcecode>.
There are at least five ways to include SVG in artwork in
Internet-Drafts and RFCs:
o Inline, by including all of the SVG in the content of the element,
such as: <artwork type="svg"><svg xmlns="http://www.w3.org/2000/
o Inline, but using XInclude (see Appendix B.1), such as: <artwork
o As a data: URI, such as: <artwork type="svg" src="data:image/
o As a URI to an external entity, such as: <artwork type="svg"
o As a local file, such as: <artwork type="svg" src="diagram12.svg">
The use of SVG in Internet-Drafts and RFCs is covered in much more
detail in [RFC7996].
The above methods for inclusion of SVG art can also be used for
including text artwork, but using a data: URI is probably confusing
for text artwork.
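A processing tool may need to tell these inclusion methods apart. The following Python sketch classifies an <artwork> element using xml.etree.ElementTree and urllib.parse.urlsplit. The function name inclusion_method and the returned labels are hypothetical. Note that ElementTree does not resolve XInclude while parsing, so an unresolved <xi:include> child is still visible to the function.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SVG_TAG = "{http://www.w3.org/2000/svg}svg"
XINCLUDE_TAG = "{http://www.w3.org/2001/XInclude}include"

def inclusion_method(artwork):
    """Classify how an <artwork> element supplies its art."""
    if artwork.find(SVG_TAG) is not None:
        return "inline-svg"
    if artwork.find(XINCLUDE_TAG) is not None:
        return "xinclude"
    src = artwork.get("src")
    if src is None:
        return "text-only"
    scheme = urlsplit(src).scheme
    if scheme == "data":
        return "data-uri"
    if scheme:
        # Any other explicit scheme, such as http:, https:, or file:.
        return "uri"
    return "local-filename"

local = ET.fromstring('<artwork type="svg" src="diagram12.svg"/>')
method = inclusion_method(local)
```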
Formatters that do pagination should attempt to keep artwork on a
single page. This is to prevent artwork that is split across pages
from looking like two separate pieces of artwork.
See Section 5 for a description of how to deal with issues of using
"&" and "<" characters in artwork.
This element appears as a child element of <aside> (Section 2.6),
<blockquote> (Section 2.10), <dd> (Section 2.18), <figure>
(Section 2.25), <li> (Section 2.29), <section> (Section 2.46), <td>
(Section 2.56), and <th> (Section 2.58).
<svg> elements (Section 4)
2.5.1. "align" Attribute
Controls whether the artwork appears left justified (default),
centered, or right justified. Artwork is aligned relative to the
left margin of the document.
o "left" (default)
o "center"
o "right"
2.5.2. "alt" Attribute
Alternative text description of the artwork (which is more than just
a summary or caption). When the art comes from the "src" attribute
and the format of that artwork supports alternate text, the
alternative text comes from the text of the artwork itself, not from
this attribute. The contents of this attribute are important to
readers who are visually impaired, as well as those reading on
devices that cannot show the artwork well, or at all.
2.5.3. "anchor" Attribute
Document-wide unique identifier for this artwork.
2.5.4. "height" Attribute
Deprecated.
2.5.5. "name" Attribute
A filename suitable for the contents (such as for extraction to a
local file). This attribute can be helpful for other kinds of tools
(such as automated syntax checkers, which work by extracting the
artwork). Note that the "name" attribute does not need to be unique
for <artwork> elements in a document. If multiple <artwork> elements
have the same "name" attribute, a processing tool might assume that
the elements are all fragments of a single file, and the tool can
collect those fragments for later processing. See Section 7 for a
discussion of possible problems with the value of this attribute.
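The fragment-collection behavior described above might look like the following Python sketch, which concatenates the contents of all <artwork> elements that share a "name" value, in document order. The function collect_by_name and the sample document are hypothetical. The sketch returns the collected text only and deliberately does not write files, because the "name" value would first need the checks discussed in Section 7.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def collect_by_name(root):
    """Map each "name" value to its <artwork> contents, in document order."""
    files = defaultdict(list)
    for artwork in root.iter("artwork"):
        name = artwork.get("name")
        if name:
            files[name].append(artwork.text or "")
    return {name: "".join(parts) for name, parts in files.items()}

# Hypothetical document with two fragments of the same file.
doc = ET.fromstring(
    "<section>"
    '<artwork name="example.txt">part one\n</artwork>'
    '<artwork name="example.txt">part two\n</artwork>'
    "<artwork>unnamed</artwork>"
    "</section>"
)
collected = collect_by_name(doc)
```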
2.5.6. "src" Attribute
The URI reference of a graphics file [RFC3986], or the name of a file
on the local disk. This can be a "data" URI [RFC2397] that contains
the contents of the graphics file. Note that the inclusion of art
with the "src" attribute depends on the capabilities of the
processing tool reading the XML document. Tools need to be able to
handle the file: URI, and they should be able to handle http: and
https: URIs as well. The prep tool will be able to handle reading
the "src" attribute.
If no URI scheme is given in the attribute, the attribute is
considered to be a local filename relative to the current directory.
Processing tools must be careful to not accept dangerous values for
the filename, particularly those that contain absolute references
outside the current directory. Document creators should think hard
before using relative URIs due to possible later problems if files
move around on the disk. Also, documents should most likely use
explicit URI schemes wherever possible.
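One way a tool could screen scheme-less "src" values is sketched below in Python. It accepts only relative filenames that cannot climb above the current directory. The function is_safe_local_src is hypothetical, and the policy it encodes (rejecting backslashes and any value with a URI scheme) is a conservative assumption of this example, not a requirement stated in this document.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

def is_safe_local_src(src):
    """True if src is a scheme-less filename that stays under the current directory."""
    # Reject empty values, backslash separators, and anything with a scheme.
    if not src or "\\" in src or urlsplit(src).scheme:
        return False
    path = PurePosixPath(src)
    if path.is_absolute():
        return False
    depth = 0
    for part in path.parts:
        if part == "..":
            depth -= 1
            if depth < 0:
                # The path would climb above the current directory.
                return False
        else:
            depth += 1
    return True
```

The check is purely lexical. A real tool would also need to consider symbolic links and platform-specific path rules before opening the file.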
In some cases, the prep tool may remove the "src" attribute after
processing its value. See [RFC7998] for a description of this.
It is an error to have both a "src" attribute and content in the
<artwork> element.
2.5.7. "type" Attribute
Specifies the type of the artwork. The value of this attribute is
free text with certain values designated as preferred.
The preferred values for <artwork> types are:
The RFC Series Editor will maintain a complete list of the preferred
values on the RFC Editor web site, and that list is expected to be
updated over time. Thus, a consumer of v3 XML should not cause a
failure when it encounters an unexpected type or no type is
specified. The table will also indicate which type of art can appear
in plain-text output (for example, type="svg" cannot).
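The tolerance asked of consumers can be sketched as follows in Python: an unknown or missing type is reported, not rejected. The function classify_type is hypothetical, and the PREFERRED set shown here is a deliberately incomplete placeholder containing only the one value mentioned above; the authoritative list is the one maintained on the RFC Editor web site.

```python
def classify_type(type_value, preferred):
    """Return (type, is_preferred) without failing on unknown or missing types."""
    if type_value is None or not type_value.strip():
        # No type specified: report it, but do not raise.
        return (None, False)
    normalized = type_value.strip()
    return (normalized, normalized in preferred)

# Placeholder only; not the complete list of preferred values.
PREFERRED = frozenset({"svg"})
```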
2.5.8. "width" Attribute
Deprecated.
2.5.9. "xml:space" Attribute
Deprecated.
2.6. <aside>
This element is a container for content that is semantically less
important or tangential to the content that surrounds it.
This element appears as a child element of <section> (Section 2.46).
In any order:
o <artwork> elements (Section 2.5)
o <dl> elements (Section 2.20)
o <figure> elements (Section 2.25)
o <iref> elements (Section 2.27)
o <list> elements (Section 3.4)
o <ol> elements (Section 2.34)
o <t> elements (Section 2.53)
o <table> elements (Section 2.54)
o <ul> elements (Section 2.63)
2.6.1. "anchor" Attribute
Document-wide unique identifier for this aside.