Main Page | See live article | Alphabetical index

Tuple calculus

The tuple calculus is a calculus that was introduced by Edgar F. Codd as part of the relational model in order to give a declarative database query language for this data model. It formed the inspiration for the database query languages QUEL and SQL of which the latter, although far less faithful to the original relational model and calculus, is now used in almost all relational database management systems as the ad-hoc query language. Along with the tuple calculus Codd also introduced the domain calculus which is closer to first-order logic and showed that these two calculi (and the relational algebra) are equivalent in expressive power.

Definition of the Calculus

The fundamental assumption in this calculus is that variables range over tuples as they are defined in the relational model. (In what follows the term tuple is always used in that meaning) The atoms of our language are going to be statements about such variables and defined by the following context-free grammar:

B ::= (T.A = T.A) | (T.A = C) | R(T)

where T denotes the set of tuple variables, A denotes the set of attribute names, C denotes the set of constant values and R denotes the set of relation names. Examples of atoms are These atoms may be combined into formulas, as is usual in first-order logic, with the logicial operators ∧ (and), ∨ (or) and ¬ (not), and we can use the existential quantifier (∃) and the universal quantifier (∀) to bind the variables:

F ::= B | ( FF ) | ( FF ) | ¬ F | ∃ T ( F ) | ∀ T ( F )

Examples of formulas: Note that the last formula states that all books that are written by C. J. Date have as their subject the relational model. As usual we omit brackets if this causes no ambiguity about the semantics of the formula.

We will assume that the quantifiers quantify over the universe of all tuples. It is easy to see that then a closed formula, i.e., a formula with no free variables, is either true or false for a certain database instance. This means that we can already use those to ask boolean queries, i.e., queries that result in either "yes" or "no". If it is not closed then given a database instance a formula defines a set of bindings, i.e., a function b of the free variables to tuples, such that the formula becomes true iff every free variable x in it is replaced with b(x).

A query language should not only make statements about the contents of the database but it should also be able to construct new tuples that are not yet in the database. For this purpose the tuple constructor is introduced that can construct a tuple given a certain variable binding. Its syntax is defined as follows:

K::= ( A:T.A, ..., A:T.A )

Examples of tuple constructors: The semantics of a tuple constructor is defined as a partial function that maps a variable binding, i.e., a partial function from tuple variables to tuples, to a tuple. In order for this function to be defined the variable binding should satisfy the following constraints: Finally we define what a query expression looks like:

Q ::= { K | F }

The result of a query { k | f } is the set of tuples that is constructed by the tuple constructor k from all the bindings that make the formula f true. Examples of query expressions are: Since the result of the tuple constructor is only defined for certain bindings we have to add certain restrictions such that the result of the tupel constructor is defined for all bindings generated by the formula. Fortunately we can decide this as follows.

We define P(f) and N(f) as functions that map a formula f to a set of pairs (t, a) with t a tuple variable and a an attribute name. The intended meaning of these functions is that P(f) contains a pair (t, a) if in all bindings that make f true the variable t is bound to a tuple with an attribute a, and for N(f) this holds if this is the case for all bindings that make f false. The value of these functions can be inductively defined (and therefore computed) as follows:

P( t.a = s.b ) = { (t, a), (s, b) } N( t.a = s.b ) = ∅
P( t.a = c ) = { (t, a) } N( t.a = c ) = ∅
P( r(t) ) = { (t, a) : a is in the header r } N( r(t) ) = ∅
P( fg ) = P(f) ∪ P(g) N( fg ) = N(f) ∩ N(g)
P( fg ) = P(f) ∩ P(g) N( fg ) = N(f) ∪ N(g)
P( ¬ f ) = N(f) N( ¬ f ) = P(f)
P( ∀ t ( f ) ) = { (s, a) ∈ P(f) : st } N( ∀ t ( f ) ) = { (s, a) ∈ N(f) : st }
P( ∃ t ( f ) ) = { (s, a) ∈ P(f) : st } N( ∃ t ( f ) ) = { (s, a) ∈ N(f) : st }

We then call a query expression { k | f } well formed if for every occurrence of t.a in k it holds that (t, a) is in P(f). It will be clear that the result of every well formed query expression is well defined.

Discuss safeness to prevent that the result may be infinite. Actually we should also check if the comparisos are legal according to the schema, but this is not strictly necessary.

Extensions of Tuple Calculus

Linda, extensions by Leler et al)