Internet Engineering Task Force (IETF)                      S. McGlashan
Request for Comments: 6231                               Hewlett-Packard
Category: Standards Track                                   T. Melanchuk
ISSN: 2070-1721                                               Rainwillow
                                                                May 2011
An Interactive Voice Response (IVR) Control Package
for the Media Control Channel Framework
Abstract
This document defines a Media Control Channel Framework Package for
Interactive Voice Response (IVR) dialog interaction on media
connections and conferences. The package defines dialog management
request elements for preparing, starting, and terminating dialog
interactions, as well as associated responses and notifications.
Dialog interactions are specified in a dialog language. This package
defines a lightweight IVR dialog language (supporting prompt
playback, runtime controls, Dual-Tone Multi-Frequency (DTMF)
collection, and media recording) and allows other dialog languages to
be used. The package also defines elements for auditing package
capabilities and IVR dialogs.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 5741.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
Copyright Notice
Copyright (c) 2011 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document.
1. Introduction
The Media Control Channel Framework [RFC6230] provides a generic
approach for establishment and reporting capabilities of remotely
initiated commands. The Channel Framework -- an equivalent term for
the Media Control Channel Framework -- utilizes many functions
provided by the Session Initiation Protocol (SIP) [RFC3261] for the
rendezvous and establishment of a reliable channel for control
interactions. The Control Framework also introduces the concept of a
Control Package. A Control Package is an explicit usage of the
Control Framework for a particular interaction set. This document
defines a Control Package for Interactive Voice Response (IVR)
dialogs on media connections and conferences. The term 'dialog' in
this document refers to an IVR dialog and is completely unrelated to
the notion of a SIP dialog. The term 'IVR' is used in its inclusive
sense, allowing media other than voice for dialog interaction.
The package defines dialog management request elements for preparing,
starting, and terminating dialog interactions, as well as associated
responses and notifications. Dialog interactions are specified using
a dialog language where the language specifies a well-defined syntax
and semantics for permitted operations (play a prompt, record input
from the user, etc.). This package defines a lightweight IVR dialog
language (supporting prompt playback, runtime controls, DTMF
collection, and media recording) and allows other dialog languages to
be used. These dialog languages are specified inside dialog
management elements for preparing and starting dialog interactions.
The package also defines elements for auditing package capabilities
and IVR dialogs.
This package has been designed to satisfy IVR requirements documented
in "Media Server Control Protocol Requirements" [RFC5167] -- more
specifically, REQ-MCP-28, REQ-MCP-29, and REQ-MCP-30. It achieves
this by building upon two major approaches to IVR dialog design.
These approaches address a wide range of IVR use cases and are used
in many applications that are extensively deployed today.
First, the package is designed to provide the major IVR functionality
of SIP media server languages such as netann [RFC4240], Media Server
Control Markup Language (MSCML) [RFC5022], and Media Server Markup
Language (MSML) [RFC5707], which themselves build upon more
traditional non-SIP languages ([H.248.9], [RFC2897]). A key
differentiator is that this package provides IVR functionality using
the Channel Framework.
Second, its design is aligned with key concepts of the web model as
defined in W3C Voice Browser languages. The key dialog management
mechanism is closely aligned with Call Control XML (CCXML) [CCXML10].
The dialog functionality defined in this package can be largely seen
as a subset of VoiceXML ([VXML20], [VXML21]): where possible, basic
prompting, DTMF collection, and media recording features are
incorporated, but not any advanced VoiceXML constructs (such as
<form>, its interpretation algorithm, or a dynamic data model). As
W3C develops VoiceXML 3.0 [VXML30], we expect to see further
alignment, especially in providing a set of basic independent
primitive elements (such as prompt, collect, record, and runtime
controls) that can be reused in different dialog languages.
By reusing and building upon design patterns from these approaches to
IVR languages, this package is intended to provide a foundation that
is familiar to current IVR developers and sufficient for most IVR
applications, as well as a path to other languages that address more
advanced applications.
This Control Package defines a lightweight IVR dialog language. The
scope of this dialog language is the following IVR functionality:
o playing one or more media resources as a prompt to the user
o runtime controls (including VCR controls like speed and volume)
o collecting DTMF input from the user according to a grammar
o recording user media input
Out of scope for this dialog language are more advanced functions
including ASR (Automatic Speech Recognition), TTS (Text-to-Speech),
fax, automatic prompt recovery ('media fallback'), and media
transformation. Such functionality can be addressed by other dialog
languages (such as VoiceXML) used with this package, extensions to
this package (addition of foreign elements or attributes from another
namespace), or other Control Packages.
The functionality of this package is defined by messages, containing
XML [XML] elements, transported using the Media Control Channel
Framework. The XML elements can be divided into three types: dialog
management elements; a dialog element that defines a lightweight IVR
dialog language used with dialog management elements; and finally,
elements for auditing package capabilities as well as dialogs managed
by the package.
Dialog management elements are designed to manage the general
lifecycle of a dialog. Elements are provided for preparing a dialog,
starting the dialog on a conference or connection, and terminating
execution of a dialog. Each of these elements is contained in a
Media Control Channel Framework CONTROL message sent to the media
server. When the appropriate action has been executed, the media
server sends a REPORT message (or a 200 response to the CONTROL
message if it can execute in time) with a response element indicating
whether or not the operation was successful (e.g., if the dialog
cannot be started, then the error is reported in this response).
Once a dialog has been successfully started, the media server can
send further event notifications in a framework CONTROL message.
This package defines two event notifications: a DTMF event indicating
the DTMF activity, and a dialogexit event indicating that the dialog
has exited. If the dialog has executed successfully, the dialogexit
event includes information collected during the dialog. If an error
occurs during execution (e.g., a media resource failed to play, no
recording resource available, etc.), then error information is
reported in the dialogexit event. Once a dialogexit event is sent,
the dialog lifecycle is terminated.
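
As a non-normative illustration, the lifecycle described above can be sketched as a small state machine. This is a simplified sketch under stated assumptions: the state names, the TRANSITIONS table, and the DialogLifecycle class are hypothetical and are not defined by this package; only the trigger names (dialogprepare, dialogstart, dialogterminate, dialogexit) come from the text. DTMF event notifications and failed requests are deliberately left out.

```python
from enum import Enum


class State(Enum):
    # Hypothetical state names, chosen for this sketch only.
    IDLE = "idle"
    PREPARED = "prepared"
    STARTED = "started"
    TERMINATING = "terminating"
    TERMINATED = "terminated"


# (current state, trigger) -> next state.
# Assumption: a dialog may be started directly or after being prepared, and
# the lifecycle only ends once the dialogexit event has been sent.
TRANSITIONS = {
    (State.IDLE, "dialogprepare"): State.PREPARED,
    (State.IDLE, "dialogstart"): State.STARTED,
    (State.PREPARED, "dialogstart"): State.STARTED,
    (State.STARTED, "dialogterminate"): State.TERMINATING,
    (State.STARTED, "dialogexit"): State.TERMINATED,
    (State.TERMINATING, "dialogexit"): State.TERMINATED,
}


class DialogLifecycle:
    """Tracks one dialog through the simplified lifecycle above."""

    def __init__(self):
        self.state = State.IDLE

    def apply(self, trigger):
        """Apply a request or event name; reject it if not allowed here."""
        key = (self.state, trigger)
        if key not in TRANSITIONS:
            raise ValueError(
                f"{trigger!r} is not allowed in state {self.state.name}"
            )
        self.state = TRANSITIONS[key]
        return self.state
```

In this sketch, nothing is accepted once the TERMINATED state is reached, which mirrors the statement that the dialog lifecycle is terminated once a dialogexit event is sent.
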
The dialog management elements for preparing and starting a dialog
specify the dialog using a dialog language. A dialog language has
well-defined syntax and semantics for defined dialog operations.
Typically, dialog languages are written in XML: the root element
has a designated XML namespace, and standalone documents have an
associated MIME media type. For example, VoiceXML
is an XML dialog language with the root element <vxml> with the
designated namespace 'http://www.w3.org/2001/vxml' and standalone
documents are associated with the MIME media type 'application/
This Control Package defines its own lightweight IVR dialog language.
The language has a root element (<dialog>) with the same designated
namespace as used for other elements defined in this package (see
Section 8.2). The root element contains child elements for playing
prompts to the user, specifying runtime controls, collecting DTMF
input from the user, and recording media input from the user. The
child elements can co-occur so as to provide 'play announcement',
'prompt and collect', as well as 'prompt and record' functionality.
The dialog management elements for preparing and starting a dialog
can specify the dialog language either by including inline a fragment
with the root element or by referencing an external dialog document.
The dialog language defined in this package is specified inline.
Other dialog languages, such as VoiceXML, can be used by referencing
an external dialog document.
The document is organized as follows. Section 3 describes how this
Control Package fulfills the requirements for a Media Control Channel
Framework Control Package. Section 4 describes the syntax and
semantics of defined elements, including dialog management
(Section 4.2), the IVR dialog element (Section 4.3), and audit
elements (Section 4.4). Section 5 describes an XML schema for these
elements and provides extensibility by allowing attributes and
elements from other namespaces. Section 6 provides examples of
package usage. Section 7 describes important security considerations
for use of this Control Package. Section 8 provides information on
IANA registration of this Control Package, including its name, XML
namespace, and MIME media type. It also establishes a registry for
prompt variables. Finally, Section 9 provides additional information
on using VoiceXML when supported as an external dialog language.
2. Conventions and Terminology
In this document, BCP 14 [RFC2119] defines the key words "MUST",
"MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT",
"RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL". In
addition, BCP 14 indicates requirement levels for compliant
implementations.
The following additional terms are defined for use in this document:
Dialog: A dialog performs media interaction with a user following
the concept of an IVR (Interactive Voice Response) dialog (this
sense of 'dialog' is completely unrelated to a SIP dialog). A
dialog is specified as inline XML or via a URI reference to an
external dialog document. Traditional IVR dialogs typically
feature capabilities such as playing audio prompts, collecting
DTMF input, and recording audio input from the user. More
inclusive definitions include support for other media types,
runtime controls, synthesized speech, recording and playback of
video, recognition of spoken input, and mixed-initiative
conversations.
Application Server: A SIP [RFC3261] application server (AS) hosts
and executes services such as interactive media and conferencing
in an operator's network. An AS influences and impacts the SIP
session, in particular by terminating SIP sessions on a media
server, which is under its control.
Media Server: A media server (MS) processes media streams on behalf
of an AS by offering functionality such as interactive media,
conferencing, and transcoding to the end user. Interactive media
functionality is realized by way of dialogs that are initiated by
the application server.
3. Control Package Definition
This section fulfills the mandatory requirement for information that
MUST be specified during the definition of a Control Framework
Package, as detailed in Section 7 of [RFC6230].
3.1. Control Package Name
The Control Framework requires a Control Package to specify and
register a unique name.
The name of this Control Package is "msc-ivr/1.0" (Media Server
Control - Interactive Voice Response - version 1.0). Its IANA
registration is specified in Section 8.1.
Since this is the initial ("1.0") version of the Control Package,
there are no backwards-compatibility issues to address.
3.2. Framework Message Usage
The Control Framework requires a Control Package to explicitly detail
the CONTROL messages that can be used as well as provide an
indication of directionality between entities. This will include
which role type is allowed to initiate a request type.
This package specifies Control and response messages in terms of XML
elements defined in Section 4, where the message bodies have the MIME
media type defined in Section 8.4. These elements describe requests,
responses, and notifications and all are contained within a root
<mscivr> element (Section 4.1).
In this package, the MS operates as a Control Server in receiving
requests from, and sending responses to, the AS (operating as Control
Client). Dialog management requests and responses are defined in
Section 4.2. Audit requests and responses are defined in
Section 4.4. Dialog management and audit responses are carried in a
framework 200 response or REPORT message bodies. This package's
response codes are defined in Section 4.5.
Note that package responses are different from framework response
codes. Framework error response codes (see Section 7 of [RFC6230])
are used when the request or event notification is invalid; for
example, a request is invalid XML (400), or not understood (500).
The MS also operates as a Control Client in sending event
notifications to the AS (Control Server). Event notifications
(Section 4.2.5) are carried in CONTROL message bodies. The AS MUST
respond with a Control Framework 200 response.
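
The directionality rules in this section can be summarized, non-normatively, as a lookup table. The element names and message types below come from this section; the ALLOWED_BODIES table and the may_carry() helper are hypothetical names used only for this sketch. The empty entry for the 200 response sent by the AS is an assumption, since the text only requires that the AS respond with a 200.

```python
# (sender, framework message) -> package elements that message may carry.
ALLOWED_BODIES = {
    # AS as Control Client: dialog management and audit requests.
    ("AS", "CONTROL"): {
        "dialogprepare", "dialogstart", "dialogterminate", "audit",
    },
    # MS as Control Server: responses in a 200 response or a REPORT.
    ("MS", "200"): {"response", "auditresponse"},
    ("MS", "REPORT"): {"response", "auditresponse"},
    # MS as Control Client: event notifications.
    ("MS", "CONTROL"): {"event"},
    # AS as Control Server: acknowledges the notification.
    # Assumption: this 200 response carries no package element.
    ("AS", "200"): set(),
}


def may_carry(sender, message, element):
    """Return True if `sender` may send `element` in `message`."""
    return element in ALLOWED_BODIES.get((sender, message), set())
```

A receiver could use such a table to reject, for instance, an <event> element arriving in a CONTROL message from the AS.
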
3.3. Common XML Support
The Control Framework requires a Control Package definition to
specify if the attributes for media dialog or conference references
are required.
This package requires that the XML schema in Section A.1 of [RFC6230]
MUST be supported for media dialogs and conferences.
The package uses "connectionid" and "conferenceid" attributes for
various element definitions (Section 4). The XML schema (Section 5)
imports the definitions of these attributes from the framework
schema.
3.4. CONTROL Message Body
The Control Framework requires a Control Package to define the
control body that can be contained within a CONTROL command request
and to indicate the location of detailed syntax definitions and
semantics for the appropriate body types.
When operating as Control Server, the MS receives Control message
bodies with the MIME media type defined in Section 8.4 and containing
an <mscivr> element (Section 4.1) with either a dialog management or
audit request child element.
The following dialog management request elements are carried in
CONTROL message bodies to the MS: <dialogprepare> (Section 4.2.1),
<dialogstart> (Section 4.2.2), and <dialogterminate> (Section 4.2.3).
The <audit> request element (Section 4.4.1) is also carried in
CONTROL message bodies.
When operating as Control Client, the MS sends CONTROL messages with
the MIME media type defined in Section 8.4 and a body containing an
<mscivr> element (Section 4.1) with a notification <event> child
element (Section 4.2.5).
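
A receiver might classify a CONTROL message body along the lines described above, as in the following non-normative sketch. The namespace URN is the one registered in Section 8.2; the classify_control_body() function is hypothetical. The sketch assumes that <mscivr> carries exactly one child element from the package namespace, and it does not attempt to handle extension elements from other namespaces.

```python
import xml.etree.ElementTree as ET

# Assumption: the package namespace as registered in Section 8.2.
NS = "urn:ietf:params:xml:ns:msc-ivr"
PREFIX = f"{{{NS}}}"

# Request elements the MS receives when acting as Control Server.
REQUESTS = {"dialogprepare", "dialogstart", "dialogterminate", "audit"}


def classify_control_body(xml_text):
    """Return ("request", name) or ("notification", "event")."""
    root = ET.fromstring(xml_text)
    if root.tag != PREFIX + "mscivr":
        raise ValueError("root element is not <mscivr> in the package namespace")
    children = list(root)
    if len(children) != 1:
        raise ValueError("expected exactly one child element of <mscivr>")
    tag = children[0].tag
    if not tag.startswith(PREFIX):
        raise ValueError("child element is not in the package namespace")
    name = tag[len(PREFIX):]
    if name in REQUESTS:
        return ("request", name)
    if name == "event":
        return ("notification", name)
    raise ValueError(f"<{name}> is not valid in a CONTROL message body")
```

Response elements such as <response> are rejected here because, per Section 3.2, they travel in 200 responses or REPORT messages rather than in CONTROL bodies.
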
3.5. REPORT Message Body
The Control Framework requires a Control Package definition to define
the REPORT body that can be contained within a REPORT command
request, or to state that no REPORT package body is required. This section
indicates the location of detailed syntax definitions and semantics
for the appropriate body types.
When operating as Control Server, the MS sends REPORT bodies with the
MIME media type defined in Section 8.4 and containing an <mscivr>
element (Section 4.1) with a response child element. The response
element for dialog management requests is a <response> element
(Section 4.2.4). The response element for an audit request is an
<auditresponse> element (Section 4.4.2).
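
On the AS side, a response body could be inspected as in the following non-normative sketch. The status, reason, and dialogid attribute names and the use of status 200 for success are taken from Sections 4.2.4, 4.4.2, and 4.5 of this specification rather than from the text above; the parse_response() function and the shape of its return value are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the package namespace as registered in Section 8.2.
NS = "urn:ietf:params:xml:ns:msc-ivr"


def parse_response(xml_text):
    """Extract the package-level result from an <mscivr> response body."""
    root = ET.fromstring(xml_text)
    for kind in ("response", "auditresponse"):
        elem = root.find(f"{{{NS}}}{kind}")
        if elem is not None:
            break
    else:
        raise ValueError("no <response> or <auditresponse> child element")
    status_text = elem.get("status")
    if status_text is None:
        raise ValueError("missing status attribute")
    status = int(status_text)
    return {
        "kind": kind,
        "status": status,
        "ok": status == 200,  # package-level success, not a framework code
        "reason": elem.get("reason"),
        "dialogid": elem.get("dialogid"),
    }
```

Note that the status examined here is the package response code carried inside the XML body; it is separate from the framework response code of the message that transports it.
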
3.6. Audit
The Control Framework encourages Control Packages to specify whether
auditing is available, how it is triggered, as well as the query/
response formats.
This Control Package supports auditing of package capabilities and
dialogs on the MS. An audit request is carried in a CONTROL message
(see Section 3.4) and an audit response in a REPORT message (or a 200
response to the CONTROL if it can execute the audit in time) (see
Section 3.5).
The syntax and semantics of audit request and response elements are
defined in Section 4.4.
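
An audit request body could be constructed as in the following non-normative sketch. The capabilities, dialogs, and dialogid attributes are those described for the <audit> element in Section 4.4.1, not in the text above; the build_audit() helper and its parameter defaults are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Assumption: the package namespace as registered in Section 8.2.
NS = "urn:ietf:params:xml:ns:msc-ivr"


def build_audit(capabilities=True, dialogs=True, dialogid=None):
    """Build an <mscivr> body containing a single <audit> request."""
    ET.register_namespace("", NS)
    root = ET.Element(f"{{{NS}}}mscivr", {"version": "1.0"})
    attrs = {
        # XML Schema booleans are written in lowercase.
        "capabilities": str(bool(capabilities)).lower(),
        "dialogs": str(bool(dialogs)).lower(),
    }
    if dialogid is not None:
        attrs["dialogid"] = dialogid
    ET.SubElement(root, f"{{{NS}}}audit", attrs)
    return ET.tostring(root, encoding="unicode")
```

For instance, build_audit(capabilities=False, dialogid="d1") would ask only about the dialog identified as "d1"; the resulting text is what the AS would place in the CONTROL message body.
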
3.7. Examples
The Control Framework recommends that Control Packages provide a range
of message flows that represent common flows using the package and
this framework document.
This Control Package provides examples of such message flows in