Combinatory logic is a simplified model of computation, used in computability theory (the study of what can be computed) and proof theory (the study of what can be mathematically proven.) The theory, despite its simplicity, captures many essential features of the nature of computation.
Combinatory logic is a variation of the lambda calculus, in which lambda expressions (used to allow for functional abstraction) are replaced by a limited set of primitive functions.
For complete details about the lambda calculus, see the article under that head. We will summarize here. The lambda calculus is concerned with objects called lambda-terms, which are strings of symbols of one of the following forms:
The term λv.E1 represents the function which, if applied an argument, binds the formal parameter v to the argument and then computes the resulting value of E1---that is, it returns E1, with every occurrence of v replaced by the argument.
Terms of the form (E1 E2) are called applications. Applications model function invocation or execution: The function represented by E1 is to be invoked, with E2 as its argument, and the result is computed. If E1 (sometimes called the applicand) is an abstraction, the term may be reduced: E2, the argument, may be substituted into the body of E1 in place of the formal parameter of E1, and the result is a new lambda term which is equivalent to the old one. If a lambda term contains no subterms of the form (λv.E1 E2) then it cannot be reduced, and is said to be in normal form.
The expression E[a/v] represents the result of taking the term E and replacing all free occurrences of v with a. Thus we write
(λv.E a) => E[a/v]By convention, we take (a b c d ... z) as short for ''(...(((a b) c) d) ... z)''.
The motivation for this definition of reduction is that it captures the essential behavior of all mathematical functions. For example, consider the function that computes the square of a number. We might write
The square of x is x*x(Using "*" to indicate multiplication.) x here is the formal parameter of the function. To evaluate the square for a particular argument, say 3, we insert it into the definition in place of the formal parameter:
The square of 3 is 3*3To evaluate the resulting expression 3*3, we would have to resort to our knowledge of multiplication and the number 3. Since any computation is simply a composition of the evaluation of suitable functions on suitable primitve arguments, this simple substitution principle suffices to capture the essential mechanism of computation. Moreover, in the lambda calculus, notions such as '3' and '*' can be represented without any need for externally defined primitive operators or constants. It is possible to identify terms in the lambda calculus, which, when suitably interpreted, behave like the number 3 and like the multiplication operator.
The lambda calculus is known to be computationally equivalent in power to many other plausible models for computation (including Turing machines); that is, any calculuation that can be accomplished in any of these other models can be expressed in the lambda calculus, and vice versa. According to the Church-Turing thesis, both models can express any possible computation.
It is perhaps surprising that lambda-calculus can represent any conceivable computation using only the simple notions of function abstraction and application based on simple textual substitution of terms for variables. But even more remarkable is that abstraction is not even required. Combinatory logic is a model of computation equivalent to the lambda calculus, but without abstraction.
A combinatory term has one of the following forms:
V P (E1 E2)where V is a variable, P is one of the primitive functions, or E1 and E2 are combinatorial terms. The primitive functions themselves are combinators, or functions that contain no free variables.
The simplest example of a combinator is I, the identity combinator, defined by
(I x) = xfor all terms x. Another simple combinator is K, which manufactures constant functions: (K x) is the function which, for any argument, returns x, so we say
((K x) y) = xfor all terms x and y. Or, following the same convention for multiple application as in the lambda-calculus,
(K x y) = xA third combinator is S, which is a generalized version of application:
(S x y z) = (x z (y z))S applies x to y after first substituting z into each of them.
Given S and K, I itself is unnecessary, since it can be built from the other two:
((S K K) x) = (S K K x) = (K x (K x)) = xfor any term x. Note that although ((S K K) x) = (I x) for any x, (S K K) itself is not equal to I. We say the terms are extensionally equal. Extensional equality captures the mathematical notion of the equality of functions: that two functions are 'equal' if they always produce the same results for the same arguments. In contrast, the terms themselves capture the notion of intensional equality of functions: that two functions are 'equal' only if they have identical implemenations. There are many ways to implement an identity function; (S K K) and I are among these ways. (S K S) is yet another. We will use the word equivalent to indicate extensional equality, reserving equal for identical combinatorial terms.
T[ ] may be defined as follows:
For example, we will convert the lambda term λx.λy.(y x)) to a combinator:
T[λx.λy.(y x)] = T[λx.T[λy.(y x)]] (by 5) = T[λx.(S T[λy.y] T[λy.x])] (by 6) = T[λx.(S I T[λy.x])] (by 4) = T[λx.(S I (K x))] (by 3) = (S T[λx.(S I)] T[λx.(K x)]) (by 6) = (S (K (S I)) T[λx.(K x)]) (by 3) = (S (K (S I)) (S T[λx.K] T[λx.x])) (by 6) = (S (K (S I)) (S (K K) T[λx.x])) (by 3) = (S (K (S I)) (S (K K) I)) (by 4)If we apply this combinator to any two terms x and y, it reduces as follows:
(S (K (S I)) (S (K K) I) x y) = (K (S I) x (S (K K) I x) y) = (S I (S (K K) I x) y) = (I y (S (K K) I x y)) = (y (S (K K) I x y)) = (y (K K x (I x) y)) = (y (K (I x) y)) = (y (I x)) = (y x)The combinatory representation, (S (K (S I)) (S (K K) I)) is much longer than the representation as a lambda term, λx.λy.(y x). This is typical. In general, the T[ ] construction may expand a lambda term of length n to a combinatorial term of length &Theta(3^{n}).
(E1 E2)[a/x] = (E1[a/x] E2[a/x])
(λx.(E1 E2) a) = ((λx.E1 a) (λx.E2 a)) = (S λx.E1 λx.E2 a) = ((S λx.E1 λx.E2) a)By extensional equality,
λx.(E1 E2) = (S λx.E1 λx.E2)Therefore, to find a combinator equivalent to λx.(E1 E2), it is sufficient to find a combinator equivalent to (S λx.E1 λx.E2), and
(S T[λx.E1] T[λx.E2])evidently fits the bill. E1 and E2 each contain strictly fewer applications than (E1 E2), so the recursion must terminate in a lambda term with no applications at all---either a variable, or a term of the form λx.E.
T[λx.(E x)] = T[E] (if x is not free in E)λx.(E x) is the function which takes an argument, x, and applies the function E to it; this is extensionally equal to the function E itself. It is therefore sufficient to convert E to combinatorial form.
Taking this simplification into account, the example above becomes:
T[λx.λy.(y x)] = ... = (S (K (S I)) T[λx.(K x)])This combinator is equivalent to the earlier, longer one:= (\S (K (S I)) K) (by η-reduction)
(S (K (S I)) K x y) = (K (S I) x (K x) y) = (S I (K x) y) = (I y (K x y)) = (y (K x y)) = (y x)Similarly, the original version of the T[] transformation transformed the identity function λf.λx.(f x) into (S (S (K S) (S (K
K) I)) (K I)). With the η-reduction rule, λf.λx.(f x) is transformed into I.
(B a b c) = (a c b) (C a b c) = (a (b c))Using these combinators, we can extend the rules for the transformation as follows:
T[λx.λy.(y x)] = T[λx.T[λy.(y x)]] = T[λx.(B T[λy.y] x)] (by rule 7) = T[λx.(B I x)] = (B I) (η-reduction)And indeed, (B I x y) does reduce to (y x):
(B I x y) = (I y x) = (y x)The motivation here is that B and C are limited versions of S. Whereas S takes a value and substitutes it into both the applicand and its argument before performing the application, B performs the substitution only in the applicand, and C only in the argument.
The conversion L[ ] from combinatorial terms to lambda terms is trivial:
L[I] = λx.x L[K] = λx.λy.x L[B] = λx.λy.λz.(x z y) L[C] = λx.λy.λz.(x (y z)) L[S] = λx.λy.λz.(x z (y z)) L[(E1 E2)] = (L[E1] L[E2])Note, however, that this transformation is not the inverse transformation of any of the versions of T[ ] that we have seen.
W = (S I I (S I I))has no normal form, because it reduces to itself after three steps, as follows:
(S I I (S I I)) = (I (S I I) (I (S I I))) = (S I I (I (S I I))) = (S I I (S I I))and clearly no other reduction order can make the expression shorter.
Now, suppose N were a combinator for detecting normal forms, such that
(N x) => T, if x has a normal form F, otherwise.(Where T and F the transformations of the conventional lambda-calculus definitions of true and false, λx.λy.x and λx.λy.y. The combinatory versions have T = K and F = (K I).)
Now let
Z = (B (B (C N (S I I)) W) I)now consider the term (S I I Z). Does (S I I Z) have a normal form? It does if and only if the following do also:
(S I I Z) = (I Z (I Z)) = (Z (I Z)) = (Z Z) = (B (B (C N (S I I)) W) I Z) (definition of Z) = (B (C N (S I I)) W Z I) = (C N (S I I) Z W I) = (N (S I I Z) W I)Now we need to apply N to (S I I Z). Either (S I I Z) has a normal form, or it does not. If it does have a normal form, then the foregoing reduces as follows:
(N (S I I Z) W I) = (K W I) (definition of N) = Wbut W does nothave a normal form, so we have a contradiction. But if (S I I Z) does not have a normal form, the foregoing reduces as follows:
(N (S I I Z) W I) = (K I W I) (definition of N) = (I I) Iwhich means that the normal form of (S I I Z) is simply I, another contradiction. Therefore, the hypothetical normal-form combinator N cannot exist.
(Add details with illustrations; don't forget to discuss the effect of fixed-point combinators.)
(Discuss strict vs. lazy evaluation semantics. Note implications of graph reduction implementation for lazy evaluation. Point out efficiency problem in lazy semantics: Repeated evaluation of the same expression, in, e.g. (square COMPLICATED) => (* COMPLICATED COMPLICATED), normally avoided by eager evaluation and call-by-value. Discuss benefit of graph reduction in this case: when (square COMPLICATED) is evaluated, the representation of COMPLICATED can be shared by both branches of the resulting graph for (* COMPLICATED COMPLICATED), and evaluated only once.)
The Curry-Howard isomorphism implies a relationship between logic and programming: Every valid proof of a theorem of logic corresponds directly to a reduction of a lambda term, and vice versa. Theorems themselves are identified with function type signatures.
(Add: Demonstration that the axioms
(a -> a) (a -> b -> a) (a -> b -> c) -> (a -> b) -> (a -> c)are complete for the intuitionistic fragment of propositional logic.)
See also: