Tech-invite3GPPspaceIETF RFCsSIP
in Index   Prev   Next

RFC 8610

Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures

Pages: 64
Proposed Standard
Part 3 of 4 – Pages 36 to 52
First   Prev   Next

Top   ToC   RFC8610 - Page 36   prevText

4. Making Use of CDDL

In this section, we discuss several potential ways to employ CDDL.

4.1. As a Guide for a Human User

CDDL can be used to efficiently define the layout of CBOR data, such that a human implementer can easily see how data is supposed to be encoded. Since CDDL maps parts of the CBOR data to human-readable names, tools could be built that use CDDL to provide a human-friendly representation of the CBOR data and allow them to edit such data while remaining compliant with its CDDL definition.

4.2. For Automated Checking of CBOR Data Structures

CDDL has been specified such that a machine can handle the CDDL definition and related CBOR data (and, thus, also JSON data). For example, a machine could use CDDL to check whether or not CBOR data is compliant with its definition.
Top   ToC   RFC8610 - Page 37
   The need for thoroughness of such compliance checking depends on the
   application.  For example, an application may decide not to check the
   data structure at all and use the CDDL definition solely as a means
   to indicate the structure of the data to the programmer.

   On the other hand, the application may also implement a checking
   mechanism that goes as far as checking that all mandatory map members
   are available.

   The matter of how far the data description must be enforced by an
   application is left to the designers and implementers of that
   application, keeping in mind related security considerations.

   In no case is it intended that a CDDL tool would be "writing code"
   for an implementation.

4.3. For Data Analysis Tools

In the long run, it can be expected that more and more data will be stored using the CBOR data format. Where there is data, there is data analysis and the need to process such data automatically. CDDL can be used for such automated data processing, allowing tools to verify data, clean it, and extract particular parts of interest from it. Since CBOR is designed with constrained devices in mind, a likely use of it would be small sensors. An interesting use would thus be automated analysis of sensor data.

5. Security Considerations

This document presents a content rules language for expressing CBOR data structures. As such, it does not bring any security issues on itself, although specifications of protocols that use CBOR naturally need security analyses when defined. General guidelines for writing security considerations are defined in [RFC3552] (BCP 72). Specifications using CDDL to define CBOR structures in protocols need to follow those guidelines. Additional topics that could be considered in a security considerations section for a specification that uses CDDL to define CBOR structures include the following: o Where could the language maybe cause confusion in a way that will enable security issues?
Top   ToC   RFC8610 - Page 38
   o  Where a CDDL matcher is part of the implementation of a system,
      the security of the system ought not depend on the correctness of
      the CDDL specification or CDDL implementation without any further
      defenses in place.

   o  Where the CDDL specification includes extension points, the impact
      of extensions on the security of the system needs to be carefully

   Writers of CDDL specifications are strongly encouraged to value
   clarity and transparency of the specification over its elegance.
   Keep it as simple as possible while still expressing the needed data

   A related observation about formal description techniques in general
   that is strongly recommended to be kept in mind by writers of CDDL
   specifications: just because CDDL makes it easier to handle
   complexity in a specification, that does not make that complexity
   somehow less bad (except maybe on the level of the humans having to
   grasp the complex structure while reading the spec).

6. IANA Considerations

6.1. CDDL Control Operators Registry

IANA has created a registry for control operators (Section 3.8). The "CDDL Control Operators" registry has been created within the "Concise Data Definition Language (CDDL)" registry. Each entry in the subregistry must include the name of the control operator (by convention given with the leading dot) and a reference to its documentation. Names must be composed of the leading dot followed by a text string conforming to the production "id" in Appendix B.
Top   ToC   RFC8610 - Page 39
   Initial entries in this registry are as follows:

                       | Name     | Documentation |
                       | .size    | RFC 8610      |
                       | .bits    | RFC 8610      |
                       | .regexp  | RFC 8610      |
                       | .cbor    | RFC 8610      |
                       | .cborseq | RFC 8610      |
                       | .within  | RFC 8610      |
                       | .and     | RFC 8610      |
                       | .lt      | RFC 8610      |
                       | .le      | RFC 8610      |
                       | .gt      | RFC 8610      |
                       | .ge      | RFC 8610      |
                       | .eq      | RFC 8610      |
                       | .ne      | RFC 8610      |
                       | .default | RFC 8610      |

   All other control operator names are Unassigned.

   The IANA policy for additions to this registry is "Specification
   Required" as defined in [RFC8126] (which involves an Expert Review)
   for names that do not include an internal dot and "IETF Review" for
   names that do include an internal dot.  The expert reviewer is
   specifically instructed that other Standards Development
   Organizations (SDOs) may want to define control operators that are
   specific to their fields (e.g., based on a binary syntax already in
   use at the SDO); the review process should strive to facilitate such
   an undertaking.
Top   ToC   RFC8610 - Page 40

7. References

7.1. Normative References

[ISO6093] ISO, "Information processing -- Representation of numerical values in character strings for information interchange", ISO 6093, 1985. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <>. [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC Text on Security Considerations", BCP 72, RFC 3552, DOI 10.17487/RFC3552, July 2003, <>. [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November 2003, <>. [RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, <>. [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, January 2008, <>. [RFC7049] Bormann, C. and P. Hoffman, "Concise Binary Object Representation (CBOR)", RFC 7049, DOI 10.17487/RFC7049, October 2013, <>. [RFC7493] Bray, T., Ed., "The I-JSON Message Format", RFC 7493, DOI 10.17487/RFC7493, March 2015, <>. [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 8126, DOI 10.17487/RFC8126, June 2017, <>. [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <>.
Top   ToC   RFC8610 - Page 41
   [RFC8259]  Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
              Interchange Format", STD 90, RFC 8259,
              DOI 10.17487/RFC8259, December 2017,

              Biron, P. and A. Malhotra, "XML Schema Part 2: Datatypes
              Second Edition", World Wide Web Consortium Recommendation
              REC-xmlschema-2-20041028, October 2004,

7.2. Informative References

[CDDL-Freezer] Bormann, C., "A feature freezer for the Concise Data Definition Language (CDDL)", Work in Progress, draft-bormann-cbor-cddl-freezer-01, August 2018. [GRASP] Bormann, C., Carpenter, B., Ed., and B. Liu, Ed., "A Generic Autonomic Signaling Protocol (GRASP)", Work in Progress, draft-ietf-anima-grasp-15, July 2017. [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE Std 754-2008. [JCR] Newton, A. and P. Cordell, "A Language for Rules Describing JSON Content", Work in Progress, draft-newton-json-content-rules-09, September 2017. [PEG] Ford, B., "Parsing expression grammars: a recognition- based syntactic foundation", Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL '04, DOI 10.1145/964001.964011, January 2004. [RELAXNG] ISO/IEC, "Information technology -- Document Schema Definition Language (DSDL) -- Part 2: Regular-grammar- based validation -- RELAX NG", ISO/IEC 19757-2, December 2008. [RFC7071] Borenstein, N. and M. Kucherawy, "A Media Type for Reputation Interchange", RFC 7071, DOI 10.17487/RFC7071, November 2013, <>. [RFC7950] Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language", RFC 7950, DOI 10.17487/RFC7950, August 2016, <>.
Top   ToC   RFC8610 - Page 42
   [RFC8007]  Murray, R. and B. Niven-Jenkins, "Content Delivery Network
              Interconnection (CDNI) Control Interface / Triggers",
              RFC 8007, DOI 10.17487/RFC8007, December 2016,

   [RFC8152]  Schaad, J., "CBOR Object Signing and Encryption (COSE)",
              RFC 8152, DOI 10.17487/RFC8152, July 2017,

   [RFC8428]  Jennings, C., Shelby, Z., Arkko, J., Keranen, A., and C.
              Bormann, "Sensor Measurement Lists (SenML)", RFC 8428,
              DOI 10.17487/RFC8428, August 2018,

   [YAML]     Ben-Kiki, O., Evans, C., and I. Net, "YAML Ain't Markup
              Language (YAML[TM]) Version 1.2", 3rd Edition,
              October 2009, <>.
Top   ToC   RFC8610 - Page 43

Appendix A. Parsing Expression Grammars (PEGs)

This appendix is normative. Since the 1950s, many grammar notations are based on Backus-Naur Form (BNF), a notation for context-free grammars (CFGs) within Chomsky's generative system of grammars. The Augmented Backus-Naur Form (ABNF) [RFC5234], widely used in IETF specifications and also inspiring the syntax of CDDL, is an example of this. Generative grammars can express ambiguity well, but this very property may make them hard to use in recognition systems, spawning a number of subdialects that pose constraints on generative grammars to be used with parser generators; this scenario may be hard for the specification writer to manage. PEGs [PEG] provide an alternative formal foundation for describing grammars that emphasizes recognition over generation and resolves what would have been ambiguity in generative systems by introducing the concept of "prioritized choice". The notation for PEGs is quite close to BNF, with the usual "Extended BNF" features, such as repetition, added. However, where BNF uses the unordered (symmetrical) choice operator "|" (incidentally notated as "/" in ABNF), PEG provides a prioritized choice operator "/". The two alternatives listed are to be tested in left-to-right order, locking in the first successful match and disregarding any further potential matches within the choice (but not disabling alternatives in choices containing this choice, as a cut (Section 3.5.4) would). For example, the ABNF expressions A = "a" "b" / "a" (1) and A = "a" / "a" "b" (2) are equivalent in ABNF's original generative framework but are very different in PEG: in (2), the second alternative will never match, as any input string starting with an "a" will already succeed in the first alternative, locking in the match. Similarly, the occurrence indicators ("?", "*", "+") are "greedy" in PEG, i.e., they consume as much input as they match (and, as a consequence, "a* a" in PEG notation or "*a a" in CDDL syntax never can match anything, as all input matching "a" is already consumed by the initial "a*", leaving nothing to match the second "a").
Top   ToC   RFC8610 - Page 44
   Incidentally, the grammar of CDDL itself, as written in ABNF in
   Appendix B, can be interpreted both (1) in the generative framework
   on which RFC 5234 is based and (2) as a PEG.  This was made possible
   by ordering the choices in the grammar such that a successful match
   made on the left-hand side of a "/" operator is always the intended
   match, instead of relying on the power of symmetrical choices (for
   example, note the sequence of alternatives in the rule for "uint",
   where the lone zero is behind the longer match alternatives that
   start with a zero).

   The syntax used for expressing the PEG component of CDDL is based on
   ABNF, interpreted in the obvious way with PEG semantics.  The ABNF
   convention of notating occurrence indicators before the controlled
   primary, and of allowing numeric values for minimum and maximum
   occurrence around a "*" sign, is copied.  While PEG is only about
   characters, CDDL has a richer set of elements, such as types and
   groups.  Specifically, the following constructs map:

       | CDDL  | PEG   | Remark                                    |
       | "="   | "<-"  | /= and //= are abbreviations              |
       | "//"  | "/"   | prioritized choice                        |
       | "/"   | "/"   | prioritized choice, limited to types only |
       | "?" P | P "?" | zero or one                               |
       | "*" P | P "*" | zero or more                              |
       | "+" P | P "+" | one or more                               |
       | A B   | A B   | sequence                                  |
       | A, B  | A B   | sequence, comma is decoration only        |

   The literal notation and the use of square brackets, curly braces,
   tildes, ampersands, and hash marks are specific to CDDL and unrelated
   to the conventional PEG notation.  The DOT (".") from PEG is replaced
   by the unadorned "#" or its alias "any".  Also, CDDL does not provide
   the syntactic predicate operators NOT ("!") or AND ("&") from PEG,
   reducing expressiveness as well as complexity.

   For more details about PEG's theoretical foundation and interesting
   properties of the operators such as associativity and distributivity,
   the reader is referred to [PEG].
Top   ToC   RFC8610 - Page 45

Appendix B. ABNF Grammar

This appendix is normative. The following is a formal definition of the CDDL syntax in ABNF [RFC5234]. Note that, as is defined in ABNF, the quote-delimited strings below are case insensitive (while string values and names are case sensitive in CDDL). cddl = S 1*(rule S) rule = typename [genericparm] S assignt S type / groupname [genericparm] S assigng S grpent typename = id groupname = id assignt = "=" / "/=" assigng = "=" / "//=" genericparm = "<" S id S *("," S id S ) ">" genericarg = "<" S type1 S *("," S type1 S ) ">" type = type1 *(S "/" S type1) type1 = type2 [S (rangeop / ctlop) S type2] ; space may be needed before the operator if type2 ends in a name type2 = value / typename [genericarg] / "(" S type S ")" / "{" S group S "}" / "[" S group S "]" / "~" S typename [genericarg] / "&" S "(" S group S ")" / "&" S groupname [genericarg] / "#" "6" ["." uint] "(" S type S ")" / "#" DIGIT ["." uint] ; major/ai / "#" ; any rangeop = "..." / ".." ctlop = "." id group = grpchoice *(S "//" S grpchoice) grpchoice = *(grpent optcom)
Top   ToC   RFC8610 - Page 46
     grpent = [occur S] [memberkey S] type
            / [occur S] groupname [genericarg]  ; preempted by above
            / [occur S] "(" S group S ")"

     memberkey = type1 S ["^" S] "=>"
               / bareword S ":"
               / value S ":"

     bareword = id

     optcom = S ["," S]

     occur = [uint] "*" [uint]
           / "+"
           / "?"

     uint = DIGIT1 *DIGIT
          / "0x" 1*HEXDIG
          / "0b" 1*BINDIG
          / "0"

     value = number
           / text
           / bytes

     int = ["-"] uint

     ; This is a float if it has fraction or exponent; int otherwise
     number = hexfloat / (int ["." fraction] ["e" exponent ])
     hexfloat = ["-"] "0x" 1*HEXDIG ["." 1*HEXDIG] "p" exponent
     fraction = 1*DIGIT
     exponent = ["+"/"-"] 1*DIGIT

     text = %x22 *SCHAR %x22
     SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
     SESC = "\" (%x20-7E / %x80-10FFFD)

     bytes = [bsqual] %x27 *BCHAR %x27
     BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
     bsqual = "h" / "b64"
Top   ToC   RFC8610 - Page 47
     id = EALPHA *(*("-" / ".") (EALPHA / DIGIT))
     ALPHA = %x41-5A / %x61-7A
     EALPHA = ALPHA / "@" / "_" / "$"
     DIGIT = %x30-39
     DIGIT1 = %x31-39
     HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
     BINDIG = %x30-31

     S = *WS
     WS = SP / NL
     SP = %x20
     PCHAR = %x20-7E / %x80-10FFFD
     CRLF = %x0A / %x0D.0A

                           Figure 13: CDDL ABNF

   Note that this ABNF does not attempt to reflect the detailed rules of
   what can be in a prefixed byte string.

Appendix C. Matching Rules

This appendix is normative. In this appendix, we go through the ABNF syntax rules defined in Appendix B and briefly describe the matching semantics of each syntactic feature. In this context, an instance (data item) "matches" a CDDL specification if it is allowed by the CDDL specification; this is then broken down into parts of specifications (type and group expressions) and parts of instances (data items). cddl = S 1*(rule S) A CDDL specification is a sequence of one or more rules. Each rule gives a name to a right-hand-side expression, either a CDDL type or a CDDL group. Rule names can be used in the rule itself and/or other rules (and tools can output warnings if that is not the case). The order of the rules is significant only in two cases: 1. The first rule defines the semantics of the entire specification; hence, there is no need to give that root rule a special name or special syntax in the language (as, for example, with "start" in RELAX NG); its name can therefore be chosen to be descriptive. (As with all other rule names, the name of the initial rule may be used in itself or in other rules.)
Top   ToC   RFC8610 - Page 48
   2.  Where a rule contributes to a type or group choice (using "/=" or
       "//="), that choice is populated in the order the rules are
       given; see below.

   rule = typename [genericparm] S assignt S type
        / groupname [genericparm] S assigng S grpent

   typename = id
   groupname = id

   A rule defines a name for a type expression (production "type") or
   for a group expression (production "grpent"), with the intention that
   the semantics does not change when the name is replaced by its
   (parenthesized if needed) definition.  Note that whether the name
   defined by a rule stands for a type or a group isn't always
   determined by syntax alone: e.g., "a = b" can make "a" a type if "b"
   is a type, or a group if "b" is a group.  More subtly, in "a = (b)",
   "a" may be used as a type if "b" is a type, or as a group both when
   "b" is a group and when "b" is a type (a good convention to make the
   latter case stand out to the human reader is to write "a = (b,)").
   (Note that the same dual meaning of parentheses applies within an
   expression but often can be resolved by the context of the
   parenthesized expression.  On the more general point, it may not be
   clear immediately either whether "b" stands for a group or a type --
   this semantic processing may need to span several levels of rule
   definitions before a determination can be made.)

   assignt = "=" / "/="
   assigng = "=" / "//="

   A plain equals sign defines the rule name as the equivalent of the
   expression to the right; it is an error if the name was already
   defined with a different expression.  A "/=" or "//=" extends a named
   type or a group by additional choices; a number of these could be
   replaced by collecting all the right-hand sides and creating a single
   rule with a type choice or a group choice built from the right-hand
   sides in the order of the rules given.  (It is not an error to extend
   a rule name that has not yet been defined; this makes the right-hand
   side the first entry in the choice being created.)

   genericparm = "<" S id S *("," S id S ) ">"
   genericarg = "<" S type1 S *("," S type1 S ) ">"

   Rule names can have generic parameters, which cause temporary
   assignments within the right-hand sides to the parameter names from
   the arguments given when citing the rule name.

   type = type1 *(S "/" S type1)
Top   ToC   RFC8610 - Page 49
   A type can be given as a choice between one or more types.  The
   choice matches a data item if the data item matches any one of the
   types given in the choice.  The choice uses PEG semantics as
   discussed in Appendix A: the first choice that matches wins.  (As a
   result, the order of rules that contribute to a single rule name can
   very well matter.)

   type1 = type2 [S (rangeop / ctlop) S type2]

   Two types can be combined with a range operator (see below) or a
   control operator (see Section 3.8).

   type2 = value

   A type can be just a single value (such as 1 or "icecream" or
   h'0815'), which matches only a data item with that specific value (no
   conversions defined),

      / typename [genericarg]

   or be defined by a rule giving a meaning to a name (possibly after
   supplying generic arguments as required by the generic parameters),

      / "(" S type S ")"

   or be defined in a parenthesized type expression (parentheses may be
   necessary to override some operator precedence), or

      / "{" S group S "}"

   a map expression, which matches a valid CBOR map the key/value pairs
   of which can be ordered in such a way that the resulting sequence
   matches the group expression, or

      / "[" S group S "]"

   an array expression, which matches a CBOR array the elements of which
   -- when taken as values and complemented by a wildcard (matches
   anything) key each -- match the group, or

      / "~" S typename [genericarg]

   an "unwrapped" group (see Section 3.7), which matches the group
   inside a type defined as a map or an array by wrapping the group, or

      / "&" S "(" S group S ")"
      / "&" S groupname [genericarg]
Top   ToC   RFC8610 - Page 50
   an enumeration expression, which matches any value that is within the
   set of values that the values of the group given can take, or

      / "#" "6" ["." uint] "(" S type S ")"

   a tagged data item, tagged with the "uint" given and containing the
   type given as the tagged value, or

      / "#" DIGIT ["." uint]                ; major/ai

   a data item of a major type (given by the DIGIT), optionally
   constrained to the additional information given by the uint, or

      / "#"                                 ; any

   any data item.

   rangeop = "..." / ".."

   A range operator can be used to join two type expressions that stand
   for either two integer values or two floating-point values; it
   matches any value that is between the two values, where the first
   value is always included in the matching set and the second value is
   included for ".." and excluded for "...".

   ctlop = "." id

   A control operator ties a _target_ type to a _controller_ type as
   defined in Section 3.8.  Note that control operators are an extension
   point for CDDL; additional documents may want to define additional
   control operators.

   group = grpchoice *(S "//" S grpchoice)

   A group matches any sequence of key/value pairs that matches any of
   the choices given (again using PEG semantics).

   grpchoice = *(grpent optcom)

   Each of the component groups is given as a sequence of group entries.
   For a match, the sequence of key/value pairs given needs to match the
   sequence of group entries in the sequence given.

   grpent = [occur S] [memberkey S] type

   A group entry can be given by a value type, which needs to be matched
   by the value part of a single element; and, optionally, a memberkey
   type, which needs to be matched by the key part of the element, if
Top   ToC   RFC8610 - Page 51
   the memberkey is given.  If the memberkey is not given, the entry can
   only be used for matching arrays, not for maps.  (See below for how
   that is modified by the occurrence indicator.)

       / [occur S] groupname [genericarg]  ; preempted by above

   A group entry can be built from a named group, or

       / [occur S] "(" S group S ")"

   from a parenthesized group, again with a possible occurrence

   memberkey = type1 S ["^" S] "=>"
             / bareword S ":"
             / value S ":"

   Key types can be given by a type expression, a bareword (which stands
   for a type that just contains a string value created from this
   bareword), or a value (which stands for a type that just contains
   this value).  A key value matches its key type if the key value is a
   member of the key type, unless a cut preceding it in the group
   applies (see Section 3.5.4 for how map matching is influenced by the
   presence of the cuts denoted by "^" or ":" in previous entries).

   bareword = id

   A bareword is an alternative way to write a type with a single text
   string value; it can only be used in the syntactic context given

   optcom = S ["," S]

   (Optional commas do not influence the matching.)

   occur = [uint] "*" [uint]
         / "+"
         / "?"

   An occurrence indicator modifies the group given to its right by
   requiring the group to match the sequence to be matched exactly for a
   certain number of times (see Section 3.2) in sequence, i.e., it acts
   as a (possibly infinite) group choice that contains choices with the
   group repeated each of the occurrences times.
Top   ToC   RFC8610 - Page 52
   The rest of the ABNF describes syntax for value notation that should
   be familiar to readers from programming languages, with the possible
   exception of h'..' and b64'..' for byte strings, as well as syntactic
   elements such as comments and line ends.

(page 52 continued on part 4)

Next Section