# 4.8. [응용4] The 0-1 Knapsack Problem

In 
  • Problem
    • Given two sets of positive integers \{w_1, w_2, ..., w_n\} and \{p_1, p_2, ..., p_n\} of size n and a positive integer W, find a subset A of \{1,2,...,n\} that maximizes \Sigma_{i \in A} p_i subject to \Sigma_{i \in A} w_i \leq W
  • Example
    • \{w_1, w_2, ..., w_5\} = \{6,5,10,3,4\}
    • \{p_1, p_2, ..., p_5\} = \{9,7,11,6,8\}, W=15
    • \rightarrow \{1,2,5\}
  • An intuitive interpretation
    • There are n items in a store.
    • The i th item weighs w_i kilograms and is worth p_i wons, where w_i and p_i are positive integers.
    • A thief has a knapsack that can carry at most W kilograms, where W is a positive integer.
    • What items should the thief take to maximize his “profit”?

# 0. A 0-1 Knapsack Problem in Real Life

  • ref

  • Problem

    • Given two sets of positive integers \{w_1, w_2, ..., w_n\} and \{p_1, p_2, ..., p_n\} of size n and a positive integer W, find a subset A of \{1,2,...,n\} that maximizes \Sigma_{i \in A} p_i subject to \Sigma_{i \in A} w_i \leq W
    • You have a marketing budget of 5 million dollars.
    • You have the following marketing options and their paybacks in new potential customers:
Option Cost (dollars) Expected reach (people)
Super bowl 3M 80M
Radio ad campaign for 40 metro areas 800K 20M
TV non peak hour campaign 500K 22M
City top paper network 2M 75M
Viral marketing campaign 50K 4M
Web advertising 600K 10M
  • Which marketing campaigns would you choose to maximize the total expected reach under the condition that, for each of these marketing campaigns, you either select it or you don’t?

# 1. How to Solve the 0-1 Knapsack Problem

  • Naïve approach
    • There are 2^n subsets of \{1, 2, ..., n\}!
  • Dynamic programming approach
    • Let P(i,w) be the maximized profit obtained when choosing items *only from the first i items under the restriction that the total weight cannot exceed w.
    • If we let A* be an optimal subset of \{1, 2, ..., n\},
    1. n \in A* : P(n,W) = p_n + P(n-1, W-w_n)
    2. n \notin A* : P(n,W) = P(n-1, W)
    • Optimal substructure
    • P(i,w)=
      • 0 if i=0 || w = 0
      • P(i-1,w) if i>0 || w_i >w
      • \max \{(P(i-1,w), p_i+P(i-1,w-w_i))\} if i>0 || w_i \geq w
  • Example
    • \{w_1, w_2, ..., w_4\} = \{4,3,2,3\}
    • \{p_1, p_2, ..., p_4\} = \{3,2,4,4\}, W=6
    • P(2,4) = \max{(P(1,4), p_2 + P(1, 4-w_2))} = 3
    • P(4,2) = P(3,2) = 4
    • P(3,5) = \max{(P(2,5), p_3 + P(2,5-w_3))} = 6 | | 0 | 1 | 2 | 3 | 4 | 5 | 6 | | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | | 1 | 0 | 0 | 0 | 0 | 3 | 3 | 3 | | 2 | 0 | 0 | 0 | 2 | 3 | 3 | 3 | | 3 | 0 | 0 | 4 | 4 | 4 | 6 | 7 | | 4 | 0 | 0 | 4 | 4 | 4 | 8 | 8 |

# 3. How to Reconstruct the Solution

  • P(4,6) = \max{(P(3,6), p_4 + P(3, 6-w_4))} = 8

  • P(3,3) = \max{(P(2,3), p_3 + P(2, 3-w_3))} = 4

  • P(2,1) = P(1,1) = 0

  • P(1,1) = P(0,1) = 0

    0 1 2 3 4 5 6
    0 0 0 0 0 0 0 0
    1 0 0 0 0 3 3 3
    2 0 0 0 2 3 3 3
    3 0 0 4 4 4 6 7
    4 0 0 4 4 4 8 8

# 4. Implementation and Time Complexity

  • O(nW) Time

    int zero_one_knapsack(int *p, int *w, int n, int W)
    {
        int i, ww, tmp;
        for (ww = 0; ww <= W; ww++) P[0][ww] = 0;
        for (i = 1; i <= n; i++)
        {
            P[i][0] = 0;
            for (ww = 1; ww <= W; ww++)
            {
                if (w[i] <= ww)
                {
                    if ((tmp = p[i] + P[i - 1][ww - w[i]]) > P[i - 1][ww])
                        P[i][ww] = tmp;
                    else
                        P[i][ww] = P[i - 1][ww];
                }
                else P[i][ww] = P[i - 1][ww];
            }
        }
        return P[n][W];
    }

# 5. 0-1 Knapsack Ex. 1: n = 6, W = 10

1 2 3 4 5 6
p_i 6 4 5 3 9 7
w_i 4 2 3 1 6 4
P 0 1 2 3 4 5 6 7 8 9 10
0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 6 6 6 6 6 6 6
2 0 0 4 4 6 6 10 10 10 10 10
3 0 0 4 5 6 9 10 11 11 15 15
4 0 3 4 7 8 9 12 13 14 15 18
5 0 3 4 7 8 9 12 13 14 16 18
6 0 3 4 7 8 10 12 14 15 16 19
Q 0 1 2 3 4 5 6 7 8 9 10
0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 1 1 1 1 1 1 1
2 0 0 1 1 0 0 1 1 1 1 1
3 0 0 0 1 0 1 0 1 1 1 1
4 0 1 0 1 1 0 1 1 1 0 1
5 0 0 0 0 0 0 0 0 0 1 0
6 0 0 0 0 0 1 0 1 1 0 1
? 0 1 2 3 4 5 6 7 8 9 10
0
1 0
2 2
3 5
4 6
5 6
6 10
  • Selected items: i = 2, 3, 4, 6
  • Obtained profit: 19
  • Is the time-complexity O(nW) an efficient one?
    • This is not a linear-time algorithm!
      • A problem is that W is not bounded with respect to n.
      • What if n = 20 and W = 20!? → O(n*n!)
      • When W is extremely large in comparison with n, this algorithm is worse than the brute-force algorithm that simply considers all subsets.
    • This algorithm can be improved so that the worst-case number of entries computed is O(2^n).
    • No one has ever found an algorithm for the 0-1 Knapsack problem whose worst-case time complexity is better than exponential, yet no one has proven that such an algorithm is not possible!

# 6. A Variation of the 0-1 Knapsack Problem

  • Problem

    • Decision Problem
    • Given n items of length l_1, l_2, ..., l_n, is there a subset of these items with total length exactly L?
  • Example \{ 1, 2, 7, 14, 49, 98, 343, 686, 2409, 2793, 16808, 17206, 117705, 117993 \}, \\ L = 138457\{1, 2, 7, 98, 343, 686, 2409, 17206, 117705\}

  • Dynamic programming approach

    • Let P(i,w) be the maximized profit obtained when choosing items *only from the first i items under the restriction that the total weight cannot exceed w.
    • If we let A* be an optimal subset of \{1, 2, ..., n\},
    1. n \in A* : P(n,W) = p_n + P(n-1, W-w_n)
    2. n \notin A* : P(n,W) = P(n-1, W)
    • \rightarrow fill (i, j)

# 7. A Divide-and-Conquer Approach

  • Let fill(i,j) return TRUE \iff \exists subset of the first i items that has total length j.
  • When fill(i,j) returns TRUE,
    1. If the $i$th item is used, fill(i - 1, j - l_i) must return TRUE.
    2. If the $i$th item is not used, fill(i - 1, j) must return TRUE.
  • To solve fill(int n, int L),
    • T(n) \geq c if n=0
    • T(n) \geq 2T(n-1) +d if n>0
    • \rightarrow T(n) = \Theta(2^n)
int fill(int i, int j) { 
  // l[i]: global variable if (i == 0) 
  {
    if(j == 0) return TRUE;
    else return FALSE;
  }

if (fill(i-1, j)) 
  return TRUE;
else if (l[i] <= j)
  return fill(i-1, j-l[i]);
}

# 8. A Dynamic Programming Approach

  • The optimal substructure :
    • F(i,j)= FALSE if i=0, j!=0
    • F(i,j)= TRUE if i=0, j=0
    • F(i,j)= F(i-1,j) || ((l_i \geq j) || F(i-1, j-l_i)) if i>0
  • O(nL) time implementation
    ...
    F[0][0] = TRUE;
    for (ll = 1; ll <= L; ll++) 
      F[0][ll] =FALSE; 
    for (i = 1; i <= n; i++) {
      for (ll = 0; ll <= L; ll++) { 
        F[i][ll] = F[i-1][ll];
        if (ll – l[i] >= 0)
          F[i][ll] = F[i][ll] || F[i-1][ll-l[i]]; 
      }
    }
    return (F[n][L]);
  • Example
    • L = 15, (l_1, l_2, l_3, l_4, l_5, l_6, l_7)= (1, 2, 2, 4, 5, 2, 4) | | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | | 0 | T | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | | 1 | T | T | F | F | F | F | F | F | F | F | F | F | F | F | F | F | | 2 | T | T | T | T | F | F | F | F | F | F | F | F | F | F | F | F | | 3 | T | T | T | T | T | T | F | F | F | F | F | F | F | F | F | F | | 4 | T | T | T | T | T | T | T | T | T | T | F | F | F | F | F | F | | 5 | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | F | | 6 | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | | 7 | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T |

# 9. Subset Sum

  • Problem
    • Given a set of positive integers \{w_1, w_2, ..., w_n\} of size n and a postivie integer W, Find a subset A of \{1,2,...,n\} that maximizes \Sigma_{i\in A}{w_i} subject to \Sigma_{i\in A}{w_i} \geq W
  • Example
    • \{w_1, ..., w_9\} = \{20,30,14,70,40,50,15,25,80,60,10,95\}, W=99 \rightarrow \{20,14,40,25\}
  • Application
    • There are n jobs, each of which takes w_i time.
    • Now we have a CPU with W free cycles, and want to choose the set of jobs that minimizes the number of idle cycles.
  • Relation to the 0-1 Knapsack problem
    • Given two sets of positive integers \{w_1, w_2, ..., w_n\} and \{p_1, p_2, ..., p_n\} of size n and a positive integer W, find a subset A of \{1,2,...,n\} that maximizes \Sigma_{i \in A} p_i subject to \Sigma_{i \in A} w_i \geq W \iff
    • Given a set of positive integers \{w_1, w_2, ..., w_n\} of size n and a postivie integer W, Find a subset A of \{1,2,...,n\} that maximizes \Sigma_{i\in A}{w_i} subject to \Sigma_{i\in A}{w_i} \geq W
  • 참고
    • If it is possible to solve the 0-1 knapsack problem in polynomial time, the subset sum problem can be solved in polynomial time too.
    • Somebody has already proven that the subset sum problem is very hard.
    • In other words, the subset sum problem is NP-complete. \rightarrow Hence, the 0-1 knapsack problem is also a very hard problem. In other words, the 0-1 knapsack problem is also NP-complete.

# 10. The Fractional Knapsack Problem

  • Problem
    • Given two sets of positive integers \{w_1, w_2, ..., w_n\} and \{p_1, p_2, ..., p_n\} of size n and a positive integer W, find a set of ratios of \{r_1, r_2, ..., r_n\} (0 \leq r)i \geq 1) that maximizes \Sigma_{i} r_i \cdot p_i subject to \Sigma_{i} r_i \cdot w_i \leq W
  • A greedy approach
    1. Sort the items in nonincreasing order by profits per unit weight \frac{p_i}{w_i}.
    2. Choose the items, possibly partially, one by one until the knapsack is full.
  • Example: \{w_1, w_2, w_3\} = \{5, 10, 20\},\\ \{p_1, p_2, p_3\} = \{50, 60, 140\}, w = 30 \\ \frac{p_1}{w_1} = 10, \frac{p_2}{w_2} = 6, \frac{p_3}{w_3} = 7
    • Choose all of the 1st item: (5, 50)
    • Choose all of the 3rd item: (20, 140)
    • Choose half of the 2nd item: (\frac {10}{2}, \frac{60}{2})
  • Implementation 1 : $O(n \log n +n) = O(n \log n) $
    • Sort the items → $O(n \log n) $
    • Repeat the choice → O(n)
  • Implementation 2 : O(n + k \log n) = ?
    • Put the items in a heap → O(n)
    • Repeat the choice → O(k \log n)
    • \rightarrow Could be faster if only a small number of items are necessary to fill the knapsack.
  • The greedy method always find an optimal solution to the fractional Knapsack problem! \leftarrow Correctness
  • Does the greedy approach always find an optimal solution to the 0- 1 Knapsack problem?

# 0-1 Knapsack Example 2: n = 6, W = 10

  • 0-1 knapsack (dynamic programming)
    • Selected items: i = 3, 4
    • Obtained profit: 15
    • Time Complexity: O(nW) | | 1 | 2 | 3 | 4 | 5 | 6 | | ---- | ---- | ---- | ---- | ---- | ---- | ---- | | p_i | 4 | 5 | 12 | 3 | 4 | 3 | | w_i | 4 | 2 | 9 | 1 | 6 | 2 |
  • Fractional knapsack (greedy)
    • Selected items: i = 4, 2, 6, 3(5)
    • Obtained profit: 17.67
    • Time Complexity: O(n \log n) | | 4 | 2 | 6 | 3 | 1 | 5 | | ----- | ----- | ----- | ----- | ----- | ----- | ----- | | p_i | 3 | 5 | 3 | 12 | 4 | 4 | | w_i | 1 | 2 | 2 | 9 | 4 | 6 | | p_i/w_i | 3.000 | 2.500 | 1.500 | 1.333 | 1.000 | 0.667 |
  • 0-1 knapsack (greedy 1)
    • Selected items: $i = 4, 2, 6 $
    • Obtained profit: 11
    • Time Complexity: O(n \log n) | | 4 | 2 | 6 | 3 | 1 | 5 | | ----- | ----- | ----- | ----- | ----- | ----- | ----- | | p_i | 3 | 5 | 3 | 12 | 4 | 4 | | w_i | 1 | 2 | 2 | 9 | 4 | 6 | | p_i/w_i | 3.000 | 2.500 | 1.500 | 1.333 | 1.000 | 0.667 |
  • 0-1 knapsack (greedy 2)
    • Selected items: i = 3
    • Obtained profit: 12
    • Time Complexity: O(n \log n) | | 4 | 2 | 6 | 3 | 1 | 5 | | ----- | ----- | ----- | ----- | ----- | ----- | ----- | | p_i | 3 | 5 | 3 | 12 | 4 | 4 | | w_i | 1 | 2 | 2 | 9 | 4 | 6 | | p_i/w_i | 3.000 | 2.500 | 1.500 | 1.333 | 1.000 | 0.667 |