Approximate Graph Coloring

Coloring a graph's vertices with a minimal number of colors is a classic NP-complete problem, but there are many useful polynomial-time approximation algorithms.

1

Motivation

What is vertex coloring?

A graph is k-colorable if each of its vertices can be assigned a color, chosen from at most k unique colors, such that no two vertices connected directly by an edge share the same color. The chromatic number of a graph is the minimum possible k for which a valid coloring is possible.

A complete graph with 7 vertices (left) requires 7 colors; a tree with 7 vertices (right) requires only 2 colors.

Determining the chromatic number of an arbitrary graph is one of Karp's famous 21 NP-complete problems; a polynomial-time algorithm would imply P = NP, which is considered unlikely.

Besides its fundamental theoretical importance, minimum vertex coloring has numerous practical applications. (Vertex coloring is colloquially known as graph coloring; edge coloring is also studied.) In the examination scheduling problem, exams at a university must be scheduled such the number of exam periods is minimized and no student has two exams at once. This can be modeled as a vertex coloring problem: represent each course as a vertex, and add edges between courses that share students. A vertex coloring of this graph maps each course to an exam period; if two courses share students, they must have a different color and therefore must have examinations at different times. Vertex coloring also arises in the study of compilers, where it is used to model the register allocation problem.

Easier problems

There are many bounds, approximation algorithms, and heuristics for vertex coloring. All planar graphs have a chromatic number of at most 4 and can be 5-colored in polynomial time. Nontrivial trees and bipartite graphs have a chromatic number of 2 and can be easily colored via depth-first search; cycles have a chromatic number of at most 3 and are also easy to color. More generally, it is well-known that \chi(G) \leq \Delta(G) + 1, where \chi(G) is the chromatic number of a graph G and \Delta(G) is the maximum degree of all the vertices of G. Such a coloring can be attained in polynomial time.

In this project, we investigate approximate vertex coloring algorithms for graphs where there is a large gap between the trivial \Delta(G) + 1-coloring and the best attainable coloring. We focus specifically on 3-colorable graphs with high maximum degree. We investigate heuristics for vertex coloring, show where they fail, and describe Wigderson's vertex coloring algorithm for 3-colorable graphs that offers provably better worst-case performance than prior-known heuristics. We conclude with a survey of improvements made since the publication of Wigderson's 1983 paper.

A 2-colorable graph with a maximum degree of 29.

Random graphs

To generate test cases for the heuristics, we will primarily consider the Erdős–Rényi random graph model G(n, p). This model is widely studied in the approximate graph coloring literature. To sample a graph from G(n, p), we simply construct a graph with n vertices and then insert each of the {n \choose 2} unique possible edges with probability p. At the extremes, G(n, 0) contains only the graph with n vertices and no edges, and G(n, 1) only the complete graph K_n with all possible edges.

Despite this simple construction, Erdős–Rényi random graphs exhibit rich and surprising behavior, perhaps most notably rapid phase transitions. When p \gt \frac{1}{n}, a graph drawn from G(n, p) almost always has a "giant component" that contains the vast majority of the vertices in the graph, but when p \lt \frac{1}{n}, a graph drawn from G(n, p) usually has many small components of size O(\log n). Similarly, for sufficiently large n, graphs drawn from G(n, \frac{1-\epsilon}{n}) are almost always 3-colorable, but this does not necessarily hold for graphs drawn from G(n, \frac{1+\epsilon}{n}).

2

Heuristics

Simple greedy coloring

Perhaps the simplest graph coloring heuristic is the following: choose an uncolored vertex at random and assign it a color not used by its neighbors, introducing a new color if the vertex's neighbors are using all previously available colors. Repeat this operation until all vertices are colored.

While this algorithm often performs poorly compared to other heuristics, it is a useful baseline. Grimett and McDiarmid show that this algorithm, in expectation, colors Erdős–Rényi random graphs with only twice as many colors as required.

As a simple improvement to the random coloring algorithm, sort the vertices of a graph by degree and color the vertices with largest degree first. Intuitively, this algorithm handles the hardest vertices—that is, the vertices most likely to be constrained by their neighbors—first and thus avoids excessively introducing new colors in some cases.

Brélaz's algorithm (DSatur)

Brélaz's widely used degree saturation (DSatur) algorithm often performs better than random coloring.The original Brélaz paper does not provide comprehensive benchmarks against the random coloring algorithm; instead, benchmarks against other heuristics, which were presumably all invented as improvements over naive greedy coloring, are provided. Some cursory benchmarking on small Erdős–Rényi random graphs suggests that Brélaz's algorithm almost always performs better than random coloring.A vertex's degree saturation is the number of unique colors used by its neighbors. Brélaz's algorithm can be viewed as an extension of the largest-degree-first algorithm. It simply introduces the additional intuition that, in a partially colored graph, vertices already surrounded by many different colors are more likely to eventually run out of available colors, regardless of degree.

Brélaz's degree saturation algorithm

Input: a graph G (chromatic number unknown).

Output: a coloring of G.

Initialization: i \leftarrow 1


while G is partially uncolored:

  1. Choose the uncolored vertex v \in G with maximum saturation degree. If more than one vertex has this saturation degree, choose the vertex with maximum degree instead.
  2. Attempt to color v with the minimum possible color in [1, i]. If this is not possible, introduce color i + 1 and use it to color v.

Brelaz here

Worst-case behavior

Spinrad and Vijayan construct a class of 3-colorable graphs that Brélaz's algorithm may color with O(n) colors.

While these heuristics are useful in practice, they yield no worst-case asymptotic improvement over simply assigning each vertex its own color. Johnson constructs examples for random coloring and largest-degree-first coloring that use O(n) colors for a 3-colorable graph. Brélaz's algorithm can fail similarly, as shown above.

Greedy independent set coloring

Not all simple heuristics for vertex coloring have such poor worst-case performance. Johnson and Wigderson describe the greedy independent set heuristic, which can color a k colorable graph with O(n \ / \log_k n) colors.

Greedy independent set coloring

Input: a graph G (chromatic number unknown).

Output: a coloring of G with at most 3\frac{n}{\log_k n} colors.

Initialization:

  1. i \leftarrow 1 (current maximum color)
  2. W \leftarrow V (the set of uncolored vertices)
  3. U \leftarrow V (the set of uncolored vertices in the current greedy independent set)

while G is partially uncolored (W is nonempty):

  1. Choose the vertex v \in U with minimal degree in the subgraph of G induced by U. Color v with i. Remove v from W, as it is now colored.
  2. Remove v and its neighbors from the current greedy independent set U.
  3. If U is empty, let i \leftarrow i + 1 and U \leftarrow W (initialize a new greedy independent set with a different color).

For any proper coloring, each color corresponds to an independent set of vertices—by definition, no two vertices with the same color share an edge. Step 2 maintains a greedy independent set in U—removing the neighbors of v at each step guarantees that the coloring produced by the algorithm is proper, but the number of independent sets found by algorithm is not necessarily minimal.

Nonetheless, this algorithm exhibits better worst-case performance than its competitors. For the first stage of the algorithm, when there are at least |V| \ / \log_k(|V|) vertices left in W, each color is assigned to O(\log_k |V|) vertices—for each iteration, the degree of v is bounded such that step 2 cannot remove all vertices in U at once.

Independent set interactive here

3

Wigderson's approximations

In a 1983 paper, Avi Wigderson introduced an algorithm for coloring 3-colorable graphs with at most O(\sqrt{n}) colors; this algorithm can be extended to produce k-colorings with a better performance guarantee than the greedy independent set algorithm. Though there are now more advanced and more performant graph coloring algorithms, Wigderson's collection of algorithms are an instructive example of how coloring heuristics can be improved to achieve better performance guarantees with little added complexity.

3-colorable graphs

It is easy to achieve a reasonable coloring of a 3-colorable graph with low maximum degree. For instance, if \Delta(G) \in O(\sqrt{n}), there is a O(\sqrt{n})-coloring of G. This follows from the more general result that every graph admits a \Delta(G) + 1-coloring; simple polynomial-time algorithms exist to find such colorings. However, this bound is not always helpful. In the counterexample for Brélaz's algorithm shown above, most vertices have degree O(n), but the graph is still 3-colorable.

For graphs with at least \lceil \sqrt{n} \rceil maximum degree, Wigderson suggests the following algorithm.

Algorithm A (3-coloring with high maximum degree)

Input: a 3-colorable graph G = (V, E).

Output: a coloring of G with at most 3 \lceil \sqrt{n} \rceil colors.

Initialization: i \leftarrow 1, n \leftarrow |V|


while \Delta(G) \geq \lceil \sqrt{n} \rceil:

  1. Choose the vertex v with maximum degree from G.
  2. Let H be the subgraph of G induced by the neighbors of v.
  3. 2-color H with colors i, i + 1.
  4. Color v with color i + 2.
  5. Let i \leftarrow i + 2.
  6. Delete v and its neighbors from G.

Color any remaining vertices in G with colors i, i+1, … using the maximum-degree algorithm.

The maximum-degree algorithm. A graph G can be \Delta(G)+1-colored in the following way: for each vertex v, recursively remove v from the graph, color G - \{v\}, and reinsert v. In the worst case, each neighbor of some vertex v_{\text{max}} with degree \Delta(G) uses a different color. Then v_{\text{max}} must have a different color than any of its neighbors, so \Delta(G) + 1 colors are used in total.

Proof of correctness (sketch). Why can H be 2-colored at each step? For a 3-colorable graph, the subgraph induced by the neighborhood of any vertex (but not including that vertex) is bipartite and therefore 2-colorable. Each iteration of the loop uses two new colors and colors at least \lceil \sqrt{n} \rceil + 1 vertices, so at most 2 \lceil \sqrt{n} \rceil colors are used in the loop. Once the graph has a maximum degree of less than \lceil \sqrt{n} \rceil, it is clearly \lceil \sqrt{n} \rceil-colorable. So Algorithm A produces a proper coloring using at most 3 \lceil \sqrt{n} \rceil colors.

It took nearly ten years after its publication for a 3-coloring algorithm beating the "O(n^{\frac{1}{2} - o(1)}) barrier" to be developed. Berger and Rompel (1990) offer a minor improvement to Wigderson's algorithm that uses O(\sqrt{n}/\sqrt{\log n}) colors. Arvim Blum's 3-coloring algorithm (1991), the first to break the barrier, uses O(n^\frac{3}{8} \text{polylog}(n)) colors and is considerably more complicated. Blum later published an improved algorithm that uses O(n^\frac{3}{14} \text{polylog}(n)) colors.

3-coloring interactive here

k-coloring

Just as Wigderson's algorithm for 3-colorable graphs lifts a polynomial-time 2-coloring algorithm, we can create an approximate algorithm for k-colorable graphs that lifts the core idea of Wigderson's algorithm for 3-colorable graphs.This approach is not specific to Wigderson's algorithms. Similar techniques can be used to achieve loose bounds for other algorithms—for instance, see section 7.3 of .

Algorithm B (k-coloring)

Input: a k-colorable graph G and a starting color i.

Output: a coloring of G with at most 2k \lceil n^{1-1/(k-1)} \rceil colors.

Initialization: n \leftarrow |V|

Base cases: if k=2, 2-color G. If k \geq \log n, assign each vertex its own color.


function B(k, G, i):

Recursive stage

while \Delta(G) \geq \lceil n^{1-1/(k-1)} \rceil:

  1. Choose v \in G with maximum degree.
  2. Let H be the subgraph of G induced by the neighbors of v.
  3. Color H recursively: call B(k-1, H, i). Let j be the number of new colors.
  4. Color v with i + j. Let i \leftarrow i + j .
  5. Delete v and its neighbors from G.

Brute-force stage

Color G with at most \lceil n^{1-1/(k-1)} \rceil colors.

The details of showing that the performance guarantee is achieved are complicated and involve a clever application of Jensen's inequality, but the fundamental insight is that we can use an algorithm for k-colorable graphs to color k+1-colorable graphs. Specifically, if G is k+1-colorable, then the subgraph induced by the neighbors of some v \in G must be k-colorable—by removing edges to v, an extra color is made available at each of the neighbors. This subgraph can be colored with the available k-coloring algorithm and v can be colored with the k+1-th color if necessary, yielding a coloring algorithm for k+1-colorable graphs.

This algorithm only beats the greedy independent set algorithm for graphs of low degree, but the two algorithms can be trivially combined to yield an algorithm that colors a graph with O(n (\log \log n)^2 \ / (\log n)^2) colors. This is an improvement over the O(n \ / \log n) performance guarantee achieved by the greedy independent set algorithm alone. Halldórsson slightly improved this algorithm, yielding a O(n (\log \log n)^2 \ / (\log n)^3) performance guarantee.

Inapproximability

Garey and Johnson showed in 1976 that no polynomial-time vertex coloring algorithm exists with an approximation factor beter than 2. This was later improved to a much stronger result: unless NP = ZPP, Wigderson's algorithm for k-coloring is close to optimal, polylogarithmic improvements notwithstanding—Feige and Kilian showed that the chromatic number of a graph cannot be approximated within a factor of n^{1-\epsilon} for any \epsilon \gt 0. Thus, minimal graph coloring is a particularly thorny combinatorial problem—it is not just difficult to solve but difficult to even roughly approximate in polynomial time. In contrast, many other famous NP-complete problems, such as the 0/1 knapsack problem and the subset sum problem, have tight approximations.

4

Other algorithms

Metaheuristics

Local search, simulated annealing, tabu search, and evolutionary algorithms are popular techniques for heuristically solving a broad range of combinatorial optimization problems. All of these approaches use an objective function and a notion of a solution neighborhood to guide the search process. Defining a useful objective function and a way of moving between solutions is nontrivial; some definitions allow for improper coloring in intermediate steps and use an objective function that encourages a proper coloring. Johnson et al. considered three different simulated annealing formulations of the vertex and found that, while simulated annealing often performs better than simpler heuristics such as DSatur, there is not a universally superior simulated annealing formulation. Tabu search, an enhanced version of local search, also performs well in practice.

Evolutionary algorithms, which are loosely inspired by biological evolution, maintain a population of solution candidates that is continually improved through mutation events (analogous to random DNA mutation) and crossover events (analogous to mating). Defining the crossover step is again nontrivial—vertex colorings are fragile, and the combination of two proper colorings might yield an improper coloring. Carefully tailored evolutionary approaches, particularly when combined with local search, are competitive in practice.

Semidefinite programming

Karger et al. give a semidefinite programming formulation for 3-coloring and k-coloring problem. Semidefinite programming is a generalization of linear programming; semidefinite programs can be (approximately) solved in polynomial time. Relaxing an NP-hard problem so that it can be expressed as a linear program and then rounding the solution appropriately is a popular approximation technique; Karger's formulation employs a generalization of this method to produce a O(n^{\frac{1}{4}} \log^{\frac{1}{2}} n)-coloring of a 3-colorable graph. Karger uses the notion of a vector k-coloring, which is a relaxed, continuous version of the discrete coloring problem. Vector coloring is loosely related to fractional coloring, another relaxation of the coloring problem.

k-satisfiability reduction

For many small instances, it is possible to solve vertex coloring problems exactly despite the hardness of the general problem. One exact technique involves converting a given coloring problem to a Boolean satisfiability (SAT) problem. SAT problems are NP-complete, but there are many industrial SAT solvers that can handle problem instances with hundreds of thousands of variables.

Converting the k-satisfiability problem to the k-coloring problem—that is, showing that any algorithm that can optimally color a graph in polynomial time could also efficiently solve SAT problems—is a common way to show that the coloring problem is NP-complete. However, we can also express coloring problems as Boolean satisfiability problems. This approach works well for many graphs with hundreds of vertices and thousands of edges.

5

Conclusion

Determining the chromatic number of a graph is very hard—the best known algorithms perform only slightly better than the simplest heuristics. Nonetheless, there is a wide range of efficient approaches to graph coloring that often produce good colorings in practice.