
In-line expansion

In-line expansion is a technique compilers use when generating target code: a call site is "expanded" into the actual implementation of the subprogram being called, rather than each call transferring control to a common chunk of code. This avoids the overhead associated with the function call, which is especially important for small functions, and it also opens up many more call-site-specific compiler optimizations, especially constant propagation. The main drawback is that the expansion usually results in larger binary code, which can actually hurt performance if it damages locality of reference.

Table of contents
1 Comparison to macros
2 Language support
3 Problems with in-lining
4 Implementing in-lining

Comparison to macros

Traditionally, in languages such as C, the same effect was accomplished using preprocessor macros. In-lining offers the programmer several benefits over this approach. For example, some compilers can also in-line some recursive functions, whereas recursive macros are typically illegal.
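One commonly cited difference between the two approaches is that a macro pastes its argument text into the expansion, so an argument with a side effect may be evaluated more than once, while an in-line function evaluates each argument exactly once. The following is a minimal sketch of that difference; the names MAX_MACRO, max_inline, increments_with_macro and increments_with_function are hypothetical and chosen only for illustration.

```cpp
// Preprocessor version: the argument text is pasted into the expansion,
// so an argument can end up being evaluated twice.
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

// In-line function version: each argument is evaluated exactly once and
// is type-checked as an int.
inline int max_inline(int a, int b) { return a > b ? a : b; }

// Counts how many times i is incremented when i++ is passed to the macro.
int increments_with_macro() {
    int i = 5, j = 3;
    int m = MAX_MACRO(i++, j);  // expands to ((i++) > (j) ? (i++) : (j))
    (void)m;                    // result unused; only the side effect matters
    return i - 5;
}

// Counts how many times i is incremented when i++ is passed to the function.
int increments_with_function() {
    int i = 5, j = 3;
    int m = max_inline(i++, j);
    (void)m;
    return i - 5;
}
```

With these starting values, increments_with_macro() returns 2 and increments_with_function() returns 1, because the macro's expansion contains the text i++ twice and the condition selects the first operand.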

Language support

C++, C99, and GNU C each support in-line functions, which the original ANSI C standard (C89) does not. In the Ada programming language, a pragma can be used to in-line functions. Most functional languages in-line functions aggressively. Compilers vary in how complex a function they can manage to in-line. Some Java runtimes (notably the HotSpot compiler) support aggressive in-lining based on actual runtime call patterns; only the parts that are frequently used are in-lined. Mainstream C++ compilers such as Microsoft Visual C++ and GCC offer an option that lets the compiler automatically in-line any small subprogram, even in C code.

An in-line function can be written in C++ like this:

inline int max (int a, int b)
{
  if (a > b) return a;
  else return b;
}

a = max (x, y); // This is now equivalent to "a = (x > y ? x : y);"
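When some of the arguments at a call site are constants, the expanded body can often be simplified further by constant propagation. The sketch below shows this by hand: the "_expanded" and "_folded" functions are written manually to illustrate what a compiler may produce, not what any particular compiler is guaranteed to emit. The name max_int is a renamed copy of the function above (to avoid clashing with std::max), and the other names are hypothetical.

```cpp
// Same logic as the max function above, renamed to avoid clashing with std::max.
inline int max_int(int a, int b)
{
  if (a > b) return a;
  else return b;
}

// As written by the programmer: one argument is the constant 0.
int clamp_to_zero(int x) { return max_int(x, 0); }

// Hand-expanded equivalent: the body is placed at the call site and the
// constant 0 is propagated into the comparison.
int clamp_to_zero_expanded(int x) { return x > 0 ? x : 0; }

// As written by the programmer: both arguments are constants.
int buffer_size() { return max_int(64, 128); }

// Hand-folded equivalent: with both operands known, the comparison
// (64 > 128) can be decided at compile time, leaving only the result.
int buffer_size_folded() { return 128; }
```

Whether a given compiler actually performs these simplifications depends on the compiler and its optimization settings; the point is only that they become possible once the body is visible at the call site.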

Problems with in-lining

Replacing a call site with an expanded function body can present several problems that may make this "optimization" actually hurt performance. The most direct is the increase in code size noted above, which can damage locality of reference.

Implementing in-lining

Once the compiler has decided to in-line a particular function, performing the expansion is usually a simple matter. Depending on whether cross-language in-line functions are wanted, the in-lining can be done on either a high-level intermediate representation, such as abstract syntax trees, or a low-level intermediate representation. In either case, one simply computes the arguments, stores them in variables corresponding to the function's parameters, and then inserts the body of the function at the call site.
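The steps just described can be sketched on a toy abstract syntax tree. Everything in the sketch is hypothetical: the Expr node type, the Let node used to store an argument in a variable, and the function names are invented for illustration and do not correspond to any real compiler. It also assumes that parameter names never collide with variable names in the caller; a real implementation would rename parameters to fresh names to avoid capturing the caller's variables.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

// A toy expression tree (hypothetical, for illustration only).
struct Expr {
    enum Kind { Const, Var, Add, Call, Let };
    Kind kind;
    int value;                  // Const: the literal value
    std::string name;           // Var/Let: variable name; Call: callee name
    std::vector<ExprPtr> kids;  // Add: operands; Call: arguments; Let: {init, body}
};

struct Function {
    std::vector<std::string> params;
    ExprPtr body;
};
using Program = std::map<std::string, Function>;

ExprPtr make(Expr::Kind k, int v, std::string n, std::vector<ExprPtr> kids) {
    return std::make_shared<Expr>(Expr{k, v, std::move(n), std::move(kids)});
}
ExprPtr num(int v) { return make(Expr::Const, v, "", {}); }
ExprPtr var(const std::string& n) { return make(Expr::Var, 0, n, {}); }
ExprPtr add(ExprPtr a, ExprPtr b) { return make(Expr::Add, 0, "", {a, b}); }
ExprPtr call(const std::string& f, std::vector<ExprPtr> args) {
    return make(Expr::Call, 0, f, std::move(args));
}

// Expand every call to `callee` inside `e`. A call f(a1, ..., an) becomes
//   let p1 = a1 in ... let pn = an in <body of f>
// Assumption: parameter names do not collide with caller variable names.
// Calls inside f's own body are left alone, so recursion is not unrolled.
ExprPtr inline_calls(const ExprPtr& e, const std::string& callee, const Function& f) {
    std::vector<ExprPtr> kids;
    for (const ExprPtr& k : e->kids) kids.push_back(inline_calls(k, callee, f));
    if (e->kind != Expr::Call || e->name != callee)
        return make(e->kind, e->value, e->name, kids);
    assert(kids.size() == f.params.size());
    ExprPtr result = f.body;
    // Wrap from the last parameter outward so the first one ends up outermost.
    for (std::size_t i = f.params.size(); i-- > 0;)
        result = make(Expr::Let, 0, f.params[i], {kids[i], result});
    return result;
}

// Reference interpreter, used to check that in-lining preserves the result.
int eval(const ExprPtr& e, const Program& prog, std::map<std::string, int> env) {
    switch (e->kind) {
    case Expr::Const: return e->value;
    case Expr::Var:   return env.at(e->name);
    case Expr::Add:   return eval(e->kids[0], prog, env) + eval(e->kids[1], prog, env);
    case Expr::Let: {
        int v = eval(e->kids[0], prog, env);
        env[e->name] = v;  // env is a local copy, so the binding is scoped
        return eval(e->kids[1], prog, env);
    }
    case Expr::Call: {
        const Function& f = prog.at(e->name);
        std::map<std::string, int> callee_env;
        for (std::size_t i = 0; i < f.params.size(); ++i)
            callee_env[f.params[i]] = eval(e->kids[i], prog, env);
        return eval(f.body, prog, callee_env);
    }
    }
    return 0;
}

// Number of Call nodes remaining in the tree.
int count_calls(const ExprPtr& e) {
    int n = (e->kind == Expr::Call) ? 1 : 0;
    for (const ExprPtr& k : e->kids) n += count_calls(k);
    return n;
}
```

As a usage example, take a function twice(t) = t + t and the caller expression twice(x + 1) + 5. After inline_calls, the caller becomes (let t = x + 1 in t + t) + 5: it contains no Call node and evaluates to the same value as before. Storing the argument in t, rather than substituting x + 1 textually, keeps the argument evaluated only once.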
