Parser

A parser is a computer program, or a component of a program, that analyses the grammatical structure of an input with respect to a given formal grammar, a process known as parsing. Parsers can be built for both natural languages and programming languages. Programming language parsers tend to be based on context-free grammars, because fast and efficient parsers can be written for them. For example, LALR parsers can efficiently analyse a wide class of context-free grammars.

The task of the parser can be summarized as determining whether, and how, the input can be derived from the start symbol using the rules of the grammar. A parser can do this in essentially two ways: it can start with the input and attempt to rewrite it to the start symbol (a so-called bottom-up parser), or it can start with the start symbol and try to rewrite it to the input (a so-called top-down parser). For example, LL parsers are top-down parsers and LR parsers are bottom-up parsers.
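The bottom-up direction can be illustrated with a minimal sketch; it is not a description of how any production LR parser is built. It assumes a hypothetical toy grammar (E -> E + n | n) and a naive shift-reduce loop that reduces whenever a right-hand side appears on top of the stack. The names RULES, START and shift_reduce are invented for this illustration. Real LR parsers decide when to shift or reduce using precomputed tables and lookahead rather than this greedy strategy.

```python
# Hypothetical toy grammar (an assumption, not taken from the article):
#   E -> E + n
#   E -> n
RULES = [
    ("E", ["E", "+", "n"]),  # listed first so the longer right-hand side wins
    ("E", ["n"]),
]
START = "E"


def shift_reduce(tokens):
    """Return the reductions performed if tokens reduce to START, else None."""
    stack = []
    reductions = []

    def reduce_all():
        # Keep replacing a right-hand side on top of the stack
        # by its left-hand side until no rule applies.
        changed = True
        while changed:
            changed = False
            for lhs, rhs in RULES:
                if stack[-len(rhs):] == rhs:
                    del stack[-len(rhs):]
                    stack.append(lhs)
                    reductions.append((lhs, rhs))
                    changed = True
                    break

    for tok in tokens:
        stack.append(tok)  # shift the next input token
        reduce_all()       # then reduce while possible

    # The input is accepted only if it was rewritten to the start symbol.
    return reductions if stack == [START] else None
```

For the tokens n + n, the function first reduces n to E and then E + n to E, arriving at the start symbol. An input such as n + cannot be rewritten to E, so it is rejected and the function returns None.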

Another important distinction is whether the parser generates a leftmost derivation or a rightmost derivation (see context-free grammar). LL parsers will generate a leftmost derivation and LR parsers will generate a rightmost derivation (although usually in reverse).
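The difference between the two kinds of derivation can be shown with a small sketch. It assumes a hypothetical grammar (S -> A B, A -> a, B -> b) with exactly one production per nonterminal, so that the only choice at each step is which nonterminal to expand. The names GRAMMAR and derive are invented for this illustration.

```python
# Hypothetical toy grammar (an assumption, not taken from the article).
# Each nonterminal has a single production.
GRAMMAR = {
    "S": ["A", "B"],
    "A": ["a"],
    "B": ["b"],
}


def derive(start, leftmost=True):
    """Return the sentential forms from start down to a string of terminals."""
    form = [start]
    steps = [" ".join(form)]
    while True:
        # Positions of all nonterminals in the current sentential form.
        positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
        if not positions:
            return steps  # only terminals remain
        # Expand the leftmost or the rightmost nonterminal.
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = GRAMMAR[form[i]]
        steps.append(" ".join(form))
```

With leftmost=True the forms are S, A B, a B, a b; with leftmost=False they are S, A B, A b, a b. A bottom-up parser reading a b from left to right would reduce a to A, then b to B, then A B to S, which is the rightmost derivation traced backwards.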

A parser generator is a program that takes a formal description of a grammar (e.g. in BNF) and outputs source code for a parser. The generated parser then recognises valid strings obeying that grammar and performs the associated actions. Unix's yacc is a well-known example; SableCC is another.

Table of contents
1 Overview of Parsers
2 See also

Overview of Parsers

Top-down parsers

As the name suggests, a top-down parser works in principle by constructing a syntax tree from the top node (the root) down, usually in a pre-order tree traversal pattern. The syntax tree is derived according to the rules of the grammar and the current input token. See the links below for common types of top-down parsers.
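One common way to realise this idea is a recursive descent parser, sketched below under stated assumptions. The grammar (E -> T R, R -> + T R | empty, T -> n | ( E )) is a hypothetical example, and the names Node and Parser are invented for the illustration. Each node is attached to its parent before its own children are built, so the tree (here a full parse tree, with one node per grammar symbol) grows from the root down in pre-order, and the current input token selects which rule to apply.

```python
class Node:
    def __init__(self, symbol):
        self.symbol = symbol
        self.children = []

    def preorder(self):
        """Return symbols in pre-order: a node first, then its children."""
        out = [self.symbol]
        for child in self.children:
            out.extend(child.preorder())
        return out


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def expect(self, tok, parent):
        # Consume one terminal and record it as a leaf under parent.
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        parent.children.append(Node(tok))
        self.pos += 1

    def child(self, parent, symbol, rule):
        node = Node(symbol)
        parent.children.append(node)  # attach first, expand afterwards
        rule(node)

    def parse(self):
        root = Node("E")  # the top node is created before anything else
        self.expr(root)
        if self.peek() != "$":
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return root

    def expr(self, node):  # E -> T R
        self.child(node, "T", self.term)
        self.child(node, "R", self.rest)

    def rest(self, node):  # R -> + T R | empty
        if self.peek() == "+":
            self.expect("+", node)
            self.child(node, "T", self.term)
            self.child(node, "R", self.rest)
        # Otherwise R derives the empty string and gets no children.

    def term(self, node):  # T -> n | ( E )
        if self.peek() == "n":
            self.expect("n", node)
        elif self.peek() == "(":
            self.expect("(", node)
            self.child(node, "E", self.expr)
            self.expect(")", node)
        else:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
```

Calling Parser(["n", "+", "n"]).parse().preorder() lists the nodes in the order in which they were created. An input that the grammar cannot derive, such as n +, raises SyntaxError.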

Bottom-up parsers

See also


This article (or an earlier version of it) contains material from FOLDOC, used with permission.