In-line expansion

In-line expansion is a technique compilers use when generating target code: a call site is "expanded" into the actual implementation of the subprogram being called, rather than each call transferring control to a common chunk of code. This avoids the overhead associated with the function call, which is especially important for small functions, and it also creates opportunities for many more call-site-specific compiler optimizations, especially constant propagation. The main drawback is that the expansion usually results in a larger binary, which can actually hurt performance if it damages locality of reference.

Table of contents

1 Comparison to macros
2 Language support
3 Problems with in-lining
4 Implementing in-lining

Comparison to macros

Traditionally, in languages such as C, the same effect was accomplished using preprocessor macros. For the programmer, inlining offers several benefits over this approach:

Macros do not perform type checking.
Macros can introduce unintended side-effects, because an argument may be evaluated more than once and because operator precedence in the expanded text may differ from what was intended.
Compiler errors within macros are often difficult to understand, because they refer to the expanded code.
Many constructs are awkward or impossible to express using macros, or use a significantly different syntax. In-line functions use the same syntax as ordinary functions, and can be inlined and un-inlined at will.
The compiler can automatically choose which functions are best to in-line.
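
The argument re-evaluation problem can be illustrated with a minimal sketch. The names MAX_MACRO, max_inline, counter_after_macro and counter_after_inline are hypothetical and chosen only for this example; they do not come from any library.

```cpp
#include <cassert>

// Hypothetical macro version: each argument is pasted into the
// expansion wherever it is named, so it may be evaluated twice.
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

// Hypothetical in-line function version: each argument is evaluated
// exactly once, and its type is checked against the parameter type.
inline int max_inline(int a, int b)
{
    return a > b ? a : b;
}

// Returns the value of the counter after taking a maximum via the macro.
int counter_after_macro()
{
    int i = 5;
    // Expands to ((i++) > (3) ? (i++) : (3)), so i is incremented twice.
    int m = MAX_MACRO(i++, 3);
    (void)m;   // m is 6 here, not the 5 a reader might expect
    return i;  // 7
}

// Returns the value of the counter after taking a maximum via the function.
int counter_after_inline()
{
    int i = 5;
    int m = max_inline(i++, 3);  // i++ is evaluated once
    (void)m;   // m is 5
    return i;  // 6
}
```

Calling counter_after_macro() yields 7, while counter_after_inline() yields 6. The function behaves the same way whether or not the compiler actually in-lines it.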

Some compilers can also in-line some recursive functions; C preprocessor macros cannot recurse, because a macro's name is not expanded again inside its own expansion.

Language support

C++, C99, and GNU C each support inline functions, which the original C standard (C89/C90) does not. In the Ada programming language, a pragma (pragma Inline) can be used to request that a subprogram be inlined. Most functional languages aggressively in-line functions. Compilers vary in how complex a function they can manage to in-line. Some Java runtimes (notably the HotSpot compiler) support aggressive inlining based on actual runtime call patterns; only frequently executed code is in-lined. Mainstream C++ compilers like Microsoft Visual C++ and GCC support an option that lets the compiler automatically inline any small subprogram, even in C code.

An in-line function can be written in C++ like this (the inline keyword is a request that the compiler is free to ignore):

inline int max(int a, int b)
{
    if (a > b)
        return a;
    else
        return b;
}

a = max(x, y); // Once in-lined, this is equivalent to "a = (x > y ? x : y);"
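
In-lining also exposes call-site-specific optimizations such as constant propagation. The following is a hand-written sketch of what a compiler might effectively produce, not the output of any particular compiler. The functions scale, doubled and doubled_expanded are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical helper with a guard that depends on one argument.
inline int scale(int value, int factor)
{
    if (factor == 0)
        return 0;
    return value * factor;
}

// Call site where the second argument is a compile-time constant.
int doubled(int v)
{
    return scale(v, 2);
}

// Roughly what remains after scale is in-lined into doubled and the
// constant 2 is propagated: the test "factor == 0" is known to be
// false, so the guard disappears and only the multiplication is left.
int doubled_expanded(int v)
{
    return v * 2;
}
```

Without in-lining, scale has to be compiled once for all callers, so the comparison against zero must stay in the shared code even though this particular call site can never take that branch.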

Problems with in-lining

Replacing a call site with an expanded function body can present several problems that may make this "optimization" actually hurt performance:

In applications where code size is more important than speed, such as many embedded systems, in-lining is usually disadvantageous.
The increase in code size may cause a small, performance-critical piece of code to no longer fit in the cache, causing cache misses and slowdown.
The added variables from the in-lined procedure may consume additional registers, and in an area where register pressure is already high this may force spilling, which means additional memory accesses.
A language specification may allow the compiler to make additional assumptions about the arguments to a procedure (for example, that they do not alias one another), and it may no longer be able to make them after the procedure is in-lined.

Implementing in-lining

Once the decision to in-line a particular function has been made, performing the expansion is usually straightforward. Depending on whether cross-language inlining is wanted, it can be done on either a high-level intermediate representation, such as abstract syntax trees, or a low-level intermediate representation. In either case, the compiler evaluates the arguments, stores them in variables corresponding to the function's parameters, and then inserts the body of the function at the call site.
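
The steps above can be sketched by performing the expansion by hand at the source level. This is only an illustration under stated assumptions: a real compiler works on its intermediate representation, and the names square_plus, caller_with_call and caller_inlined are hypothetical.

```cpp
#include <cassert>

// Hypothetical callee.
int square_plus(int x, int y)
{
    return x * x + y;
}

// Original caller: control is transferred to square_plus.
int caller_with_call(int a, int b)
{
    return square_plus(a + 1, b * 2);
}

// The same caller after a manual in-line expansion.
int caller_inlined(int a, int b)
{
    // Step 1: evaluate the arguments and store them in fresh variables
    // that stand for the callee's parameters.
    int x = a + 1;
    int y = b * 2;

    // Step 2: insert the callee's body at the call site. The callee's
    // "return" becomes an assignment to a result variable.
    int result = x * x + y;

    return result;
}
```

For example, caller_with_call(3, 4) and caller_inlined(3, 4) both evaluate to 24. Storing the arguments in variables first preserves the rule that each argument expression is evaluated exactly once, even if the body uses a parameter several times, as x is used here.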
