ABNF Notation and Rules -- RFC 5234

This page reviews the ABNF (Augmented Backus-Naur Form) notation described in
the following RFC:

RFC 5234, January 2008 (16 pages)
D. Crocker, P. Overell
Augmented BNF for Syntax Specifications: ABNF

Internet technical specifications often need to define a formal
syntax. Over the years, a modified version of Backus-Naur Form
(BNF), called Augmented BNF (ABNF), has been popular among many
Internet specifications. The current specification documents ABNF.
It balances compactness and simplicity with reasonable
representational power. The differences between standard BNF and
ABNF involve naming rules, repetition, alternatives, order-
independence, and value ranges. This specification also supplies
additional rule definitions and encoding for a core lexical analyzer
of the type common to several Internet specifications.

Status: Standard (STD 68)

Concatenation
    Rule1 Rule2

Alternatives
    Rule1 / Rule2

Incremental Alternatives
    Rule1 =/ Rule2

    The rule set:

        ruleset = alt1 / alt2
        ruleset =/ alt3

    is the same as specifying:

        ruleset = alt1 / alt2 / alt3

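To make the effect of `=/` concrete, here is a minimal Python sketch of how a tool might fold incremental alternatives into a single rule. The helper name `merge_rules` and the representation of a rule as a list of alternative strings are assumptions made for this illustration; they do not come from RFC 5234. Letting a later `=` replace an earlier definition, and rejecting `=/` on an undefined rule, are also policy choices of the sketch.

```python
def merge_rules(definitions):
    """Fold (rulename, operator, alternatives) triples into one dict.

    Hypothetical helper: `operator` is "=" or "=/", and `alternatives`
    is a list of strings, one per alternative.
    """
    rules = {}
    for name, operator, alternatives in definitions:
        key = name.lower()  # ABNF rule names are case-insensitive
        if operator == "=":
            # Assumption of this sketch: a plain "=" (re)defines the rule.
            rules[key] = list(alternatives)
        elif operator == "=/":
            # Assumption of this sketch: "=/" needs an earlier definition.
            if key not in rules:
                raise ValueError(f"'=/' used before {name!r} was defined")
            rules[key].extend(alternatives)
        else:
            raise ValueError(f"unknown operator: {operator!r}")
    return rules


merged = merge_rules([
    ("ruleset", "=", ["alt1", "alt2"]),
    ("ruleset", "=/", ["alt3"]),
])
print(" / ".join(merged["ruleset"]))
```

The merged rule holds the three alternatives in the order they were declared, which mirrors the equivalence stated above.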
Value Range Alternatives
    %c##-##

    Here c is the base indicator (b, d, or x) and ## are values in that base,
    for example %x30-39 for the digits 0 to 9.

Sequence Group
    (Rule1 Rule2)

Variable Repetition
    *Rule

    The full form is <a>*<b>element, where <a> and <b> are optional decimal
    values indicating at least <a> and at most <b> occurrences of element.
    The default values are 0 and infinity, so that:

        *<element>      allows any number, including zero
        1*<element>     requires at least one
        3*3<element>    allows exactly 3
        1*2<element>    allows one or two

Specific Repetition
    nRule

    A rule of the form <n>element is equivalent to <n>*<n>element, that is,
    exactly <n> occurrences of element. Thus 2DIGIT is a 2-digit number, and
    3ALPHA is a string of three alphabetic characters.

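The two repetition forms can be reduced to a (minimum, maximum) pair. The following Python sketch shows one way to do that; the function name `parse_repeat` and the use of `None` for "infinity" are assumptions of this example, not something the RFC prescribes.

```python
import re

# Mirrors the shape of the ABNF rule:
#   repeat = 1*DIGIT / (*DIGIT "*" *DIGIT)
_REPEAT = re.compile(r"(?:([0-9]*)\*([0-9]*)|([0-9]+))")


def parse_repeat(prefix):
    """Return (minimum, maximum) for a repetition prefix.

    A maximum of None stands for "no upper bound" (hypothetical convention).
    """
    match = _REPEAT.fullmatch(prefix)
    if match is None:
        raise ValueError(f"not a repetition prefix: {prefix!r}")
    low, high, exact = match.groups()
    if exact is not None:
        # Specific repetition: <n>element is the same as <n>*<n>element.
        count = int(exact)
        return count, count
    # Variable repetition: missing bounds default to 0 and infinity.
    return (int(low) if low else 0, int(high) if high else None)


for prefix in ["*", "1*", "3*3", "1*2", "2"]:
    print(prefix, parse_repeat(prefix))
```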
Optional Sequence
    [Rule]

Comment
    ; Comment

Here are the core rules defined in Appendix B.1 of RFC 5234
(Augmented BNF for Syntax Specifications: ABNF) as a common set
for specific grammars.

    OCTET   =  %x00-FF              ; 8 bits of data
    CHAR    =  %x01-7F              ; any 7-bit US-ASCII character,
                                    ; excluding NUL
    VCHAR   =  %x21-7E              ; visible (printing) characters
    ALPHA   =  %x41-5A / %x61-7A    ; A-Z / a-z
    DIGIT   =  %x30-39              ; 0-9
    CTL     =  %x00-1F / %x7F       ; any US-ASCII control character:
                                    ; (octets 0 - 31) and DEL (127)
    HTAB    =  %x09                 ; horizontal tab
    LF      =  %x0A                 ; linefeed
    CR      =  %x0D                 ; carriage return
    SP      =  %x20                 ; space
    DQUOTE  =  %x22                 ; " (Double Quote)
    BIT     =  "0" / "1"
    HEXDIG  =  DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
                                    ; Note: according to the 'char-val' rule,
                                    ; letters (A-F) are case insensitive
    CRLF    =  CR LF                ; Internet standard newline
    WSP     =  SP / HTAB            ; white space
    LWSP    =  *(WSP / CRLF WSP)    ; linear white space (past newline)

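For readers who think in regular expressions, most of the core rules map directly onto character classes. The Python sketch below is one such mapping; the names `CORE_RULES` and `matches` are hypothetical, and the patterns operate on `str` code points. OCTET is left out because it describes raw 8-bit data rather than text.

```python
import re

# Core rules from Appendix B.1 written as regular-expression fragments.
CORE_RULES = {
    "ALPHA": r"[A-Za-z]",            # %x41-5A / %x61-7A
    "DIGIT": r"[0-9]",               # %x30-39
    "HEXDIG": r"[0-9A-Fa-f]",        # quoted "A".."F" are case-insensitive
    "BIT": r"[01]",
    "CHAR": r"[\x01-\x7F]",
    "VCHAR": r"[\x21-\x7E]",
    "CTL": r"[\x00-\x1F\x7F]",
    "HTAB": r"\x09",
    "LF": r"\x0A",
    "CR": r"\x0D",
    "SP": r"\x20",
    "DQUOTE": r"\x22",
    "CRLF": r"\x0D\x0A",
    "WSP": r"[\x20\x09]",
    "LWSP": r"(?:[\x20\x09]|\x0D\x0A[\x20\x09])*",
}


def matches(rule, text):
    """Return True if the whole of `text` matches the named core rule."""
    return re.fullmatch(CORE_RULES[rule], text) is not None


print(matches("HEXDIG", "f"))        # lowercase hex digit is accepted
print(matches("LWSP", " \r\n\t "))   # CRLF followed by WSP is allowed
print(matches("LWSP", "\r\n"))       # a bare CRLF is not
```

Note that LWSP accepts a line break only when white space follows it, which is why the last call reports no match.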
The ABNF grammar itself is defined in ABNF in Section 4 of RFC 5234:

    rulelist       =  1*( rule / (*c-wsp c-nl) )

    rule           =  rulename defined-as elements c-nl
                           ; continues if next line starts
                           ;  with white space

    rulename       =  ALPHA *(ALPHA / DIGIT / "-")

    defined-as     =  *c-wsp ("=" / "=/") *c-wsp
                           ; basic rules definition and
                           ;  incremental alternatives

    elements       =  alternation *c-wsp

    c-wsp          =  WSP / (c-nl WSP)

    c-nl           =  comment / CRLF
                           ; comment or newline

    comment        =  ";" *(WSP / VCHAR) CRLF

    alternation    =  concatenation
                      *(*c-wsp "/" *c-wsp concatenation)

    concatenation  =  repetition *(1*c-wsp repetition)

    repetition     =  [repeat] element

    repeat         =  1*DIGIT / (*DIGIT "*" *DIGIT)

    element        =  rulename / group / option /
                      char-val / num-val / prose-val

    group          =  "(" *c-wsp alternation *c-wsp ")"

    option         =  "[" *c-wsp alternation *c-wsp "]"

    char-val       =  DQUOTE *(%x20-21 / %x23-7E) DQUOTE
                           ; quoted string of SP and VCHAR
                           ;  without DQUOTE
                           ; NOTE: ABNF strings are case-insensitive.
                           ; Hence rulename = "abc" and rulename = "aBc"
                           ; will match "abc", "Abc", "aBc", "abC",
                           ; "ABc", "aBC", "AbC", and "ABC".
                           ; To specify a rule that IS case SENSITIVE,
                           ; specify the characters individually.
                           ; For example:
                           ;     rulename = %d97 %d98 %d99
                           ; or
                           ;     rulename = %d97.98.99
                           ; will match only the string that comprises
                           ; only the lowercased characters, abc.

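The difference between a quoted string and a numeric sequence is easy to demonstrate. The Python sketch below compares the two; `match_char_val` and `match_dec_sequence` are hypothetical helpers written for this page, and case folding is restricted to US-ASCII letters because a char-val can only contain characters in the range %x20-7E.

```python
def _ascii_lower(text):
    """Lowercase only the US-ASCII letters A-Z, leaving everything else alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)


def match_char_val(literal, text):
    """Compare like an ABNF quoted string: case-insensitively."""
    return _ascii_lower(literal) == _ascii_lower(text)


def match_dec_sequence(spec, text):
    """Compare against a %d sequence such as "%d97.98.99": exact match only."""
    if not spec.startswith("%d"):
        raise ValueError(f"not a decimal num-val: {spec!r}")
    expected = "".join(chr(int(part)) for part in spec[2:].split("."))
    return text == expected


variants = ["abc", "Abc", "aBc", "abC", "ABc", "aBC", "AbC", "ABC"]
print(all(match_char_val("aBc", v) for v in variants))
print([v for v in variants if match_dec_sequence("%d97.98.99", v)])
```

All eight spellings satisfy the quoted string, while only the all-lowercase one satisfies the %d form.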
    num-val        =  "%" (bin-val / dec-val / hex-val)

    bin-val        =  "b" 1*BIT
                      [ 1*("." 1*BIT) / ("-" 1*BIT) ]
                           ; series of concatenated bit values
                           ;  or single ONEOF range

    dec-val        =  "d" 1*DIGIT
                      [ 1*("." 1*DIGIT) / ("-" 1*DIGIT) ]

    hex-val        =  "x" 1*HEXDIG
                      [ 1*("." 1*HEXDIG) / ("-" 1*HEXDIG) ]

    prose-val      =  "<" *(%x20-3D / %x3F-7E) ">"
                           ; bracketed string of SP and VCHAR
                           ;  without angles
                           ; prose description, to be used as
                           ;  last resort

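As an illustration of the num-val rules, the Python sketch below decodes a numeric value into either a range or a sequence of code points. The function name `parse_num_val` and the tuple shapes it returns are assumptions of this example. The base letter is accepted in either case because "b", "d", and "x" are quoted strings in the grammar.

```python
# Base indicator -> (radix, digits allowed in that base)
_BASES = {
    "b": (2, "01"),
    "d": (10, "0123456789"),
    "x": (16, "0123456789abcdefABCDEF"),
}


def parse_num_val(text):
    """Decode an ABNF num-val.

    Returns ("range", low, high) for %c##-## and
    ("sequence", [v1, v2, ...]) for %c## or %c##.##.## (hypothetical shapes).
    """
    if len(text) < 3 or text[0] != "%" or text[1].lower() not in _BASES:
        raise ValueError(f"not a num-val: {text!r}")
    radix, digits = _BASES[text[1].lower()]
    body = text[2:]

    def value(token):
        # Each token must be one or more digits of the chosen base.
        if not token or any(ch not in digits for ch in token):
            raise ValueError(f"bad value {token!r} in {text!r}")
        return int(token, radix)

    if "-" in body:
        # A range has exactly two values; anything else fails in value().
        low, _, high = body.partition("-")
        return ("range", value(low), value(high))
    return ("sequence", [value(token) for token in body.split(".")])


print(parse_num_val("%x41-5A"))
print(parse_num_val("%d97.98.99"))
print(parse_num_val("%b1010"))
```

Mixing the two forms, as in `%x41.42-43`, raises `ValueError`, matching the grammar, which allows either a dotted series or a single range but not both.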