SSA was developed by researchers at IBM in the 80's.

Table of contents |

2 Converting to SSA |

The primary usefulness of SSA comes from how it simultaneously simplifies and improves the results of a variety of compiler optimizations, by simplifying the properties of variables. For example, consider this piece of code:

y := 1

y := 2

x := y

As humans, we can see that the first assignment isn't necessary, and that the value of `y`

being used in the third line comes from the second assignment of `y`

. A program would have to perform reaching definition analysis to determine this. But if the program is in SSA form, both of these are immediate:

y_{1} := 1

y_{2} := 2

x_{1} := y_{2}

Compiler optimization algorithms which are either permitted or strongly enhanced by the use of SSA include:

- constant propagation
- dead code elimination
- global value numbering
- partial redundancy elimination
- register allocation
- sparse conditional constant propagation

Converting ordinary code into SSA form is primarily a simple matter of replacing the target of each assignment with a new variable, and replacing each use of a variable with the "version" of the variable reaching that point. For example, consider the following control flow graph:

Notice that we could change the name on the left side of "x x - 3", and change the following uses of `x` to use that new name, and the program would still do the same thing. We exploit this in SSA by create two new variables, `x`_{1} and `x`_{2}, each of which is assigned only once. We likewise give distinguishing subscripts to all the other variables, and we get this:

We've figured out which definition each use is referring to, except for one thing: the uses of `y` in the bottom block could be referring to *either* `y`_{1} or `y`_{2}, depending on which way the control flow came from. So how do we know which one to use?

The answer is that we add a special statement, called a *φ (phi) function*, to the beginning of the last block. This statement will generate a new definition of `y`, `y`_{3}, by "choosing" either `y`_{1} or `y`_{2}, depending on which arrow control arrived from:

Now, the uses of `y` in the last block can simply use `y`_{3}, and they'll obtain the correct value either way. You might ask at this point, do we need to add a φ function for `x` too? The answer is no; only one version of `x`, namely `x`_{2} is reaching this place, so there's no problem.

A more general question along the same lines is, given an arbitrary control flow graph, how can I tell where to insert φ functions, and for what variables? This is a difficult question, but one that has an efficient solution that can be computed using a concept called *dominance frontiers*.