# Beyond Backpacks

A Genetic Algorithm for the Multidimensional Knapsack Problem

By now we are all familiar with formulation of the one-dimensional knapsack problem: given a set of $n$ items, each with value $v_i$ and cost $c_i$, select a subset of maximal value without exceeding a budget $B$. That is,

max $\sum_{i=1}^n v_i x_i$ such that

$\sum_{i=1}^n c_ix_i \le B$

$x_i \in \{0, 1\} \enspace \forall \enspace i = 1, 2, \ldots$

We can extend this problem to several dimensions, considering, for example, a set of projects $J = \{1,2,...,n\}$, each of which consumes resources $r_{ij}$ and produces profit $p_j$. We want to invest in the set of projects that produces maximal value without exhausting our budget, which consists of a set $I$ of resources, $I = \{1, 2, . . . m\}$. Mathematically,

max $\sum_{j=1}^n p_j x_j$ such that

$\sum_{j=1}^n r_{ij}x_j \le b_i$ for $i = 1, 2, ..., n$

$x_j \in \{0, 1\}$ for $j = 1,2, \ldots, n$

This problem is NP-hard, so most approaches involve solving the LP-relaxation of the multidimensional knapsack problem through branch-and-bound techniques, dynamic programming, and heuristics. Chu and Beasley (1998) propose a genetic algorithm heuristic that, they contend, dominates the existing solutions on the set of 57 canonical multidimensional knapsack problems and on a set of more complex problems they invented to assess their algorithm’s efficacy.

A genetic algorithm is an algorithm that applies the principles of evolution through natural selection to optimization. The algorithm initializes a population of candidate so- lutions, then evaluates the solutions’ fitness according to an objective function. Highly fit solutions, selected according to specific protocols, reproduce by blending their charac- teristics. The ‘child’ solution is then subjected to mutation via the alteration of various components in a randomized procedure. Offspring then replace the least-fit solutions in the current population and the process is repeated until the new child’s fitness exceeds the fitness of all existing solutions.

Chu and Beasley (1998) propose a genetic algorithm for the multidimensional knapsack problem. Each candidate solution is a string of length $n$ of binary variables, where a 1 in position $i$ indicates that in the solution $x_i = 1$, etc. It is important to note that a randomly-generated $n$-bit string may not represent a feasible solution; the authors choose to address this problem by using a heuristic operator to convert infeasible solutions to feasible solutions.

After generating a random population of solutions, the algorithm proceeds to evaluate fitness to select the parents of the next generation. Chu and Beasley apply the binary tournament method, forming two groups of $T$ solutions each and selecting the solution in each group with the highest objective function value as a parent. A binary random number generator then determines which bits of each parent solution will be copied into the child solution. If the random number generator has a 0 in position $j$, for example, then bit $j$ in the child solution is copied from parent 2, and vice versa. After creating the blended child solution in this way, the algorithm introduces mutation by randomly flipping a few bits in the child solution.

At this stage, the child solution may not be feasible, as previously discussed. The authors propose a repair operator that relies on the surrogate LP-relaxation of the multidimensional knapsack problem to convert the child solution to a feasible solution. This algorithm guarantees a feasible solution regardless of input.

To assess the computational efficiency of the algorithm, Chu and Beasley test it against CPLEX on 55 known test problems. Their GA approach finds the optimal solution to all these problems, but the problems are too simple to permit any efficiency distinctions between CPLEX and the GA. To determine whether their algorithm improves computational efficiency, therefore, they generated 270 new problems with larger variable and constraint sets.

Since these problems were generated at random, their solutions are not known. Assessment of the algorithm’s performance therefore depends on the gap between the algorithm’s optimal solution and the solution to the LP-relaxation of the problem in question. In gen- eral, the GA produced solutions at least as good as those produced by CPLEX, with an average percentage gap of 0.54% compared to 3.14% for CPLEX. The GA also performs well compared to other heuristics, but the time complexity of the GA per iteration is ap- proximately O(mn). The authors nonetheless contend that the dramatically higher quality of solutions obtained from the GA validate its computational intensiveness.

References: P.C. Chu and J.E. Beasley. “A Genetic Algorithm for the Multidimensional Knapsack Problem.” Journal of Heuristics 4: 63-86, 1998.