Register allocation by graph coloring

From Citizendium
Jump to navigation Jump to search
This article is developing and not approved.
Main Article
Definition [?]
Related Articles  [?]
Bibliography  [?]
External Links  [?]
Citable Version  [?]
This editable Main Article is under development and subject to a disclaimer.

In Computer Science, register allocation by graph coloring is a graph theoretic solution to the register allocation problem.

The Register Allocation Problem

Main Article: register allocation

Typically, a compiler will translate a source written in a high-level language (HLL) to a target in assembly language or machine code. There are a few critical differences between the source and target languages. In particular, HLLs allow for a large or unlimited number of variables, while the typical microprocessor only allows for efficient access to a small number of variables, called registers, and inefficient access to main memory. For efficiency reasons, there is incentive to place as many variables as possible into registers, and as few as possible into memory.

Since only a small percentage of the variables within a computer program are used at one time, there is an opportunity to place multiple variables into the same register---but only if those variables are not used at the same time.

More formally, if at any point during program execution two variables each hold some value that will later be read by a future operation, then those two variables are said to interfere. Unfortunately, it is not always possible to determine if two variables interfere (a consequence of the halting problem), and thus a compiler must act conservatively and assume that they do interfere unless they can be shown not to.

The register allocation problem is to assign a register or memory location to each variable in a computer program, in such a way that no two interfering variables share a register or memory location. A trivial solution to the register allocation problem is to assign each variable a distinct memory location using a stack frame. Although this solution produces correct results, they are generally quite inefficient in computation time and memory space.

The Interference Graph

The register allocation problem can be viewed as a graph coloring problem as follows: every variable or intermediate calculation in a program is treated as a vertex in an undirected graph, and two vertices are connected by an edge if and only if their corresponding variables interfere. This graph is called the interference graph, and a complete vertex coloring of this graph can be viewed as a register allocation, with each register represented by a vertex color.

Determining a -coloring for a graph is NP-Hard. Nonetheless, the compiler must solve such a problem. Thus, a compiler uses a heuristic algorithm to produce a sub-optimal solution to the vertex coloring problem, attempting to minimize the number of colors used. If, ultimately, the number of colors needed is greater than the number of registers in the target language, then the graph must be transformed to reduce the number of necessary colors. A compiler accomplishes this by spilling one or more variables, storing them in a memory location instead of a register.

The Algorithm

Similar work was done by Lavrov in 1961, though his work was on the analogous problem of memory allocation. The use of graph coloring to solve register allocation was first presented by Chaitin in 1982, and was later improved in the PhD dissertation of Briggs in 1992. Below is high level description of Briggs' algorithm:

To determine a k-coloring for a computer program, represented in some intermediate representation that assumes an infinite number of registers,

  1. Compute an interference graph for the program.
  2. Repeatedly simplify the graph until it is empty,
    1. If there are vertices with degree less than k, then
      1. Choose the vertex V with the smallest degree.
    2. Otherwise,
      1. Choose the vertex V that is most favorable to spill.
    3. Remove V, along with its edges, from the interference graph, and place it onto the stack.
      1. In doing so, adjust the degree of all adjacent vertices.
  3. Repeatedly select a vertex vertex V by popping it from the stack, until the stack is empty,
    1. Add V, along with its edges, back to the interference graph.
    2. If possible, select a color for V which does not interfere with its adjacent vertices in the partially reconstructed graph.
    3. Otherwise, leave that vertex uncolored for now, and continue.
  4. If any vertex remained uncolored, then the coloring has failed, and
    1. Insert "spill code" into the program for each uncolored node, hopefully reducing the complexity of the problem.
    2. Repeat the whole process.

The primary motivation for this algorithm is the observation that, when returning a vertex from the stack to the interference graph in step 3.1, if the vertex previously had degree less than k, then it necessarily can be colored.

The major difference between this algorithm and Chaintin's is in step 2.2.1. Chaitin's algorithm would pessimistically spill the variable at this point. Briggs' algorithm behaves less pessimistically, observing that a vertex with degree greater than or equal to k does not necessarily require to be spilled, and that spilling one variable may reduce other high-degree variables, enabling them to be colored.

Also, one should note that, during the select phase, the algorithm does not quit after the first non-colorable vertex is found. In this way, the algorithm attempts to collect a complete list of the fewest variables to spill.

Implementation of the Algorithm

Briggs also calls for a coalescing phase before graph simplification. The coalescing phase is quite similar to the copy propagation and constant folding optimizations that may be performed at other times during compilation. Coalescing is theoretically unnecessary, because it replaces two non-interfering variables with one. However, it does greatly reduce the size of the graph coloring problem (fewer vertices in the graph), and thus improves performance.

It should be noted that, before register allocation, the compiler has produced an intermediate representation of the source program. In this IR, the compiler has assumed that there are an infinite number of registers available on the target machine; i.e. the IR assumes that every variable or intermediate computation is stored to a register.

As there are a large number of variables and intermediate computations in a program, a naive representation of the interference graph can grow to be quite large. In general, for variables and intermediates, space will be required. Since the interference graph is reflexive and symmetric, the interference graph can be efficiently represented as a triangular bit matrix.

Also, since the simplify procedure will need to repeatedly choose minimum vertices from the graph, it is profitable to store all vertices in a priority queue, sorted by the degree of the vertex if less than k, or by the spill cost if degree greater than or equal to k. One should select a priority queue that allows for repeated efficient key updates, such as a Fibonacci heap. Note that when adjusting the degree of vertices upon vertex simplification, if the degree changes from greater than or equal to to less than , then it should be sorted in the queue against degree instead of by spill cost.

On Spilling and Spill Costs

Spilling a variable means modifying the program so that it stores said variable into a memory location, instead of assuming that there will be a register available for it. In other words, all writes to the variable are changed into writes to a memory location, and all reads from that variable are changed into reads from a memory location. On some architectures, most notably load-store architectures, such modifications will require the insertion of some instructions.

Spill cost is a heuristic measure of the decrease in execution performance associated with spilling a variable from a register into a memory location. Unfortunately, this is a very complicated concept, and is based upon many factors, such as the number of times the section of code will be executed and the organization of the processor's memory cache. Many suggestions have been made, none of which are optimal.

See Also