Everipedia Logo
Everipedia is now IQ.wiki - Join the IQ Brainlist and our Discord for early access to editing on the new platform and to participate in the beta testing.
Extended Backus–Naur Form

Extended Backus–Naur Form

In computer science, extended Backus–Naur form (EBNF) is a family of metasyntax notations, any of which can be used to express a context-free grammar. EBNF is used to make a formal description of a formal language such as a computer programming language. They are extensions of the basic Backus–Naur form (BNF) metasyntax notation.

The earliest EBNF was developed by Niklaus Wirth incorporating some of the concepts (with a different syntax and notation) from Wirth syntax notation. However, many variants of EBNF are in use. The International Organization for Standardization adopted an EBNF standard (ISO/IEC 14977 [20] ) in 1996. However, according to Zaytsev this standard "only ended up adding yet another three dialects to the chaos" and, after noting its lack of success, also notes that the ISO EBNF is not even used in all ISO standards. Wheeler argues against using the ISO standard when using an EBNF, and recommends considering alternative EBNF notations such as the one from the W3C Extensible Markup Language (XML) 1.0 (Fifth Edition).

This article uses EBNF as specified by the ISO for examples applying to all EBNFs. Other EBNF variants use somewhat different syntactic conventions.

Basics

EBNF is a code that expresses the grammar of a formal language. An EBNF consists of terminal symbols and non-terminal production rules which are the restrictions governing how terminal symbols can be combined into a legal sequence. Examples of terminal symbols include alphanumeric characters, punctuation marks, and whitespace characters.

The EBNF defines production rules where sequences of symbols are respectively assigned to a nonterminal:

This production rule defines the nonterminal digit which is on the left side of the assignment. The vertical bar represents an alternative and the terminal symbols are enclosed with quotation marks followed by a semicolon as terminating character. Hence a digit is a 0 or a digit excluding zero that can be 1 or 2 or 3 and so forth until 9.

A production rule can also include a sequence of terminals or nonterminals, each separated by a comma:

Expressions that may be omitted or repeated can be represented through curly braces { ... }:

In this case, the strings 1, 2, ..., 10, ..., 12345, ... are correct expressions. To represent this, everything that is set within the curly braces may be repeated arbitrarily often, including not at all.

An option can be represented through squared brackets [ ... ]. That is, everything that is set within the square brackets may be present just once, or not at all:

Therefore, an integer is a zero (0) or a natural number that may be preceded by an optional minus sign.

EBNF also provides, among other things, the syntax to describe repetitions (of a specified number of times), to exclude some part of a production, and to insert comments in an EBNF grammar.

Table of symbols

The following represents a proposed ISO/IEC 14977 standard, by R. S. Scowen, page 7, table 1.

UsageNotation
definition=
concatenation,
termination;
alternation|
optional[ ... ]
repetition{ ... }
grouping( ... )
terminal string" ... "
terminal string' ... '
comment(* ... *)
special sequence? ... ?
exception

Examples

Even EBNF can be described using EBNF. Consider the sketched grammar below:

A Pascal-like programming language that allows only assignments can be defined in EBNF as follows:

A syntactically correct program then would be:

The language can easily be extended with control flows, arithmetical expressions, and Input/Output instructions. Then a small, usable programming language would be developed.

Advantages over BNF

Any grammar defined in EBNF can also be represented in BNF, though representations in the latter are generally lengthier. E.g., options and repetitions cannot be directly expressed in BNF and require the use of an intermediate rule or alternative production defined to be either nothing or the optional production for option, or either the repeated production of itself, recursively, for repetition. The same constructs can still be used in EBNF.

The BNF uses the symbols (<, >, |, ::=) for itself, but does not include quotes around terminal strings. This prevents these characters from being used in the languages, and requires a special symbol for the empty string. In EBNF, terminals are strictly enclosed within quotation marks ("..." or '...'). The angle brackets ("<...>") for nonterminals can be omitted.

BNF syntax can only represent a rule in one line, whereas in EBNF a terminating character, the semicolon, marks the end of a rule.

Furthermore, EBNF includes mechanisms for enhancements, defining the number of repetitions, excluding alternatives, comments, etc.

Conventions

As examples, the following syntax rules illustrate the facilities for expressing repetition:

Terminal strings defined by these rules are as follows:

Extensibility

According to the ISO 14977 standard EBNF is meant to be extensible, and two facilities are mentioned. The first is part of EBNF grammar, the special sequence, which is arbitrary text enclosed with question marks. The interpretation of the text inside a special sequence is beyond the scope of the EBNF standard. For example, the space character could be defined by the following rule:

The second facility for extension is using the fact that parentheses cannot in EBNF be placed next to identifiers (they must be concatenated with them). The following is valid EBNF:

The following is not valid EBNF:

Therefore, an extension of EBNF could use that notation. For example, in a Lisp grammar, function application could be defined by the following rule:

  • The W3C used a different EBNF [21] to specify the XML syntax.

  • The British Standards Institution published a standard for an EBNF: BS 6154 in 1981.

  • The IETF uses augmented BNF (ABNF), specified in RFC 5234 [22] .

See also

  • Regular expression

  • Spirit Parser Framework

  • Phrase structure rules, the direct equivalent of EBNF in natural languages.

References

[1]
Citation Linkwww.w3.orga different EBNF
Sep 26, 2019, 9:14 PM
[2]
Citation Linktools.ietf.orgRFC 5234
Sep 26, 2019, 9:14 PM
[3]
Citation Linkdoi.acm.orgWhat can we do about the unnecessary diversity of notation for syntactic definitions?
Sep 26, 2019, 9:14 PM
[4]
Citation Linkdwheeler.comDon't Use ISO/IEC 14977 Extended Backus-Naur Form (EBNF)
Sep 26, 2019, 9:14 PM
[5]
Citation Linkwww.grammarware.netBNF WAS HERE: What Have We Done About the Unnecessary Diversity of Notation for Syntactic Definitions?
Sep 26, 2019, 9:14 PM
[6]
Citation Linkwww.iso.orgISO 14977
Sep 26, 2019, 9:14 PM
[7]
Citation Linkstandards.iso.orgZip-compressed PDF file
Sep 26, 2019, 9:14 PM
[8]
Citation Linkwww.ics.uci.eduEBNF: A Notation to Describe Syntax (PDF)
Sep 26, 2019, 9:14 PM
[9]
Citation Linkwww.garshol.priv.noBNF and EBNF: What are they and how do they work?
Sep 26, 2019, 9:14 PM
[10]
Citation Linkxml.comThe Naming of Parts
Sep 26, 2019, 9:14 PM
[11]
Citation Linkwww.cl.cam.ac.ukISO/IEC 14977 : 1996(E)
Sep 26, 2019, 9:14 PM
[12]
Citation Linktools.ietf.orgRFC 4234
Sep 26, 2019, 9:14 PM
[13]
Citation Linktools.ietf.orgRFC 5234
Sep 26, 2019, 9:14 PM
[14]
Citation Linkwww.cs.man.ac.ukBNF/EBNF variants
Sep 26, 2019, 9:14 PM
[15]
Citation Linkweb.archive.orgCreate syntax diagrams from EBNF
Sep 26, 2019, 9:14 PM
[16]
Citation Linkkarmin.chEBNF Parser & Renderer
Sep 26, 2019, 9:14 PM
[17]
Citation Linkgithub.comParser & Renderer in PHP5
Sep 26, 2019, 9:14 PM
[18]
Citation Linkwww.lukas-renggli.chWriting Parsers with PetitParser
Sep 26, 2019, 9:14 PM
[19]
Citation Linkgithub.comSRFB Syntax Diagram representation by Function Basis + EBNF generation (javascript)
Sep 26, 2019, 9:14 PM
[20]
Citation Linkstandards.iso.orgISO/IEC 14977
Sep 26, 2019, 9:14 PM