ECE 250 Algorithms and Data Structures
Dynamic programming
Douglas Wilhelm Harder, M.Math. LEL
Department of Electrical and Computer Engineering
University of Waterloo
Waterloo, Ontario, Canada
ece.uwaterloo.ca
dwharder@alumni.uwaterloo.ca
© 2006-2013 by Douglas Wilhelm Harder. Some rights reserved.
Dynamic programming
To begin, the word programming is used by mathematicians to
describe a set of rules which must be followed to solve a problem
– Thus, linear programming describes a set of rules for solving a
linear problem
– In our context, the adjective dynamic describes how the set of rules
works
Outline
We will begin by looking at three problems:
– Fibonacci numbers
– Newton polynomial coefficients
– Determining connectedness in a DAG
We will examine the naïve top-down approach
– We will implement the naïve top-down approach
– We will discuss the run-times of these algorithms
– We will consider the application of memoization
– We will discuss the benefits of a bottom-up approach
Fibonacci numbers
Consider this function:
double F( int n ) {
return ( n <= 1 ) ? 1.0 : F(n - 1) + F(n - 2);
}
The run-time of this algorithm is

$$T(n) = \begin{cases} \Theta(1) & n \le 1 \\ T(n-1) + T(n-2) + \Theta(1) & n > 1 \end{cases}$$

This recurrence mirrors the definition of the Fibonacci numbers themselves:

$$F(n) = \begin{cases} 1 & n \le 1 \\ F(n-1) + F(n-2) & n > 1 \end{cases}$$

Therefore, T(n) = Ω(F(n)) = Ω(φⁿ)
– In actual fact, T(n) = Θ(φⁿ); indeed, $\lim_{n \to \infty} T(n)/F(n) = 2$
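A one-line induction makes the lower bound concrete: supposing T(k) ≥ F(k) for all k < n,

$$T(n) = T(n-1) + T(n-2) + \Theta(1) \ge F(n-1) + F(n-2) = F(n)$$

so the naïve recursion performs at least F(n) = Θ(φⁿ) work.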
To demonstrate, consider:
#include <iostream>
#include <ctime>

using namespace std;

double F( int n ) {
    return ( n <= 1 ) ? 1.0 : F(n - 1) + F(n - 2);
}

int main() {
    // print 16 decimal digits of precision for doubles:  53/lg(10) = 15.95458977...
    cout.precision( 16 );

    for ( int i = 33; i < 100; ++i ) {
        cout << "F(" << i << ") = " << F(i) << '\t' << time(0) << endl;
    }

    return 0;
}
The output:
F(33) = 5702887      1206474355
F(34) = 9227465      1206474355     ← F(33), F(34), and F(35) in 1 s
F(35) = 14930352     1206474355
F(36) = 24157817     1206474356
F(37) = 39088169     1206474358
F(38) = 63245986     1206474360
F(39) = 102334155    1206474363
F(40) = 165580141    1206474368
F(41) = 267914296    1206474376
F(42) = 433494437    1206474389
F(43) = 701408733    1206474411
F(44) = 1134903170   1206474469    ← ~1 min to calculate F(44)
Problem:
– To calculate F(44), it is necessary to calculate F(43) and F(42)
– However, to calculate F(43), it is also necessary to calculate F(42)
– It gets worse, for example
• F(40) is called 5 times
• F(30) is called 620 times
• F(20) is called 75 025 times
• F(10) is called 9 227 465 times
• F(0) is called 433 494 437 times
Surely we don’t have to recalculate F(10) almost ten million times…
Here is a possible solution:
– To avoid calculating values multiple times, store intermediate
calculations in a table
– This process of storing intermediate results is called memoization
  • The root is memo
– We save (memoize) computed answers for possible later reuse, rather
than re-computing the answer multiple times
Once we have calculated a value, can we not store that value and
return it if a function is called a second time?
– One solution: use an array
– Another: use an associative hash table
static const int ARRAY_SIZE = 1000;
double *array = new double[ARRAY_SIZE];

array[0] = 1.0;
array[1] = 1.0;

// use 0.0 to indicate we have not yet calculated a value
for ( int i = 2; i < ARRAY_SIZE; ++i ) {
    array[i] = 0.0;
}

double F( int n ) {
    if ( array[n] == 0.0 ) {
        array[n] = F( n - 1 ) + F( n - 2 );
    }

    return array[n];
}

Problems:
– What if you go beyond the end of the array?
– What if our problem is not indexed by integers?
Recall the characteristics of an associative container:
template <typename S, typename T>
class Hash_table {
public:
// is something stored with the given key?
bool member( S key ) const;
// returns value associated with the inserted key
T retrieve( S key ) const;
void insert( S key, T value );
// ...
};
This program uses the Standard Template Library:
#include <map>
/* calculate the nth Fibonacci number */
double F( int n ) {
static std::map<int, double> memo;
// shared by all calls to F
// the key is int, the value is double
if ( n <= 1 ) {
return 1.0;
} else {
if ( memo[n] == 0.0 ) {
memo[n] = F(n - 1) + F(n - 2);
}
return memo[n];
}
}
This prints the output up to F(1476):

int main() {
    std::cout.precision( 16 );

    for ( int i = 0; i <= 1476; ++i ) {
        std::cout << "F(" << i << ") = " << F(i) << std::endl;
    }

    return 0;
}

The last two lines are
F(1475) = 1.306989223763399e+308
F(1476) = inf

Exact value: F(1475) =
1306989223763399318036311553802719830983924439074126407260066594601927930704792317402886810877770177210954631549790122762343222469369396471853667063684893626608441474499413484628009227558189696347433489829164249540627441359698656154072764924106537217745906695448014908376491617320959726580646300337933471716 32
Top-down and bottom-up algorithms
This also allows for another approach:
– Our implementation is top-down
– From the very top, we break the problem into sub-problems
– All divide-and-conquer algorithms we have seen are top-down

An alternative is the bottom-up approach:
– Solve smaller problems and use these to build a solution for the larger problem
– Merge sort, for example, could be implemented bottom-up (see the sketch below):
  • Divide the array into groups of size m and sort each group with insertion sort
  • Merge adjacent pairs into groups of size 2m where possible
  • Repeat this successively until the entire list is sorted
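For instance, a minimal bottom-up merge sort sketch (starting from runs of
width 1 rather than insertion-sorted groups, and using std::merge for the
merging step):

#include <algorithm>
#include <vector>

// Bottom-up merge sort: repeatedly merge adjacent sorted runs of
// doubling width until the entire vector is sorted
void merge_sort_bottom_up( std::vector<int> &a ) {
    int n = static_cast<int>( a.size() );
    std::vector<int> buffer( n );

    for ( int width = 1; width < n; width *= 2 ) {
        for ( int lo = 0; lo < n; lo += 2*width ) {
            int mid = std::min( lo + width,   n );
            int hi  = std::min( lo + 2*width, n );

            // merge the sorted runs a[lo, mid) and a[mid, hi) into the buffer
            std::merge( a.begin() + lo,  a.begin() + mid,
                        a.begin() + mid, a.begin() + hi,
                        buffer.begin() + lo );
        }

        std::copy( buffer.begin(), buffer.end(), a.begin() );
    }
}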
Here is a bottom-up approach to calculating Fibonacci numbers:
double F( int n ) {
    if ( n <= 1 ) {
        return 1.0;
    }

    double a = 1.0, b = 1.0;   // a stores F(i), b stores F(i - 1)

    for ( int i = 2; i <= n; ++i ) {
        a += b;       // now a == F(i - 1) + F(i - 2) == F(i)
        b = a - b;    // now b == F(i - 1)
    }

    return a;
}
Newton polynomials
The coefficients of the Newton polynomial of degree n − 1 passing
through n points (x₁, y₁), …, (xₙ, yₙ) may be calculated by

$$c_k = y_{k,\ldots,1} = \frac{y_{k,\ldots,2} - y_{k-1,\ldots,1}}{x_k - x_1}$$

where k = 1, …, n and the divided differences are defined recursively:

$$y_{i,\ldots,j} = \frac{y_{i,\ldots,j+1} - y_{i-1,\ldots,j}}{x_i - x_j} \qquad\qquad y_{j+1,j} = \frac{y_{j+1} - y_j}{x_{j+1} - x_j}$$
A recursive implementation for calculating Newton polynomial
coefficients would have a run time of Θ(2ⁿ):

double Newton_polynomials( int i, int j, double *xs, double *ys ) {
    if ( i == j ) {
        return ys[i];
    } else if ( i + 1 == j ) {
        return ( ys[j] - ys[i] )/( xs[j] - xs[i] );
    } else {
        return (
            Newton_polynomials( i + 1, j, xs, ys )
          - Newton_polynomials( i, j - 1, xs, ys )
        ) / ( xs[j] - xs[i] );
    }
}
Note, however, that there are only n(n + 1)/2 coefficients
– This shouldn't require such recursion

x₀   y₀
            y₁,₀ = (y₁ − y₀)/(x₁ − x₀)
x₁   y₁                                y₂,₁,₀ = (y₂,₁ − y₁,₀)/(x₂ − x₀)
            y₂,₁ = (y₂ − y₁)/(x₂ − x₁)                                 y₃,₂,₁,₀ = (y₃,₂,₁ − y₂,₁,₀)/(x₃ − x₀)
x₂   y₂                                y₃,₂,₁ = (y₃,₂ − y₂,₁)/(x₃ − x₁)
            y₃,₂ = (y₃ − y₂)/(x₃ − x₂)
x₃   y₃
Here is an array-based implementation:
#include <cmath>

static const int ARRAY_SIZE = 1000;
static double array[ARRAY_SIZE][ARRAY_SIZE];

// initialize every entry to NaN: "not yet calculated"
for ( int i = 0; i < ARRAY_SIZE; ++i ) {
    for ( int j = 0; j < ARRAY_SIZE; ++j ) {
        array[i][j] = nan( "" );
    }
}

double Newton_polynomials( int i, int j, double *xs, double *ys ) {
    if ( i == j ) {
        return ys[i];
    } else if ( i + 1 == j ) {
        return ( ys[j] - ys[i] )/( xs[j] - xs[i] );
    } else {
        if ( isnan( array[i][j] ) ) {
            array[i][j] = (
                Newton_polynomials( i + 1, j, xs, ys )
              - Newton_polynomials( i, j - 1, xs, ys )
            ) / ( xs[j] - xs[i] );
        }

        return array[i][j];
    }
}
Here is a hash-table based implementation:
#include <map>
#include <utility>

double Newton_polynomials( int i, int j, double *xs, double *ys, bool clear = true ) {
    static std::map< std::pair<int, int>, double > memo;

    if ( clear ) {
        memo.clear();
    }

    if ( i == j ) {
        return ys[i];
    } else if ( i + 1 == j ) {
        return ( ys[j] - ys[i] )/( xs[j] - xs[i] );
    } else {
        if ( memo.find( std::pair<int, int>( i, j ) ) == memo.end() ) {
            memo[std::pair<int, int>( i, j )] = (
                Newton_polynomials( i + 1, j, xs, ys, false )
              - Newton_polynomials( i, j - 1, xs, ys, false )
            ) / ( xs[j] - xs[i] );
        }

        return memo[std::pair<int, int>( i, j )];
    }
}
Here is a bottom-up approach:
void Newton_polynomials( int n, double *xs, double *ys, double *coeffs ) {
    for ( int i = 0; i < n; ++i ) {
        coeffs[i] = ys[i];
    }

    // the ith pass computes the ith divided differences in place,
    // working from the bottom of the array up
    for ( int i = 1; i < n; ++i ) {
        for ( int j = n - 1; j >= i; --j ) {
            coeffs[j] = ( coeffs[j] - coeffs[j - 1] ) / ( xs[j] - xs[j - i] );
        }
    }
}
Note that the memory required is now Q(n) and not Q(n2)
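As a quick check, one could interpolate the points (0, 1), (1, 2), (2, 5),
which lie on y = x² + 1 (a hypothetical test harness, assuming the function
above is in scope):

#include <iostream>

// ... Newton_polynomials( int n, double *xs, double *ys, double *coeffs ) from above ...

int main() {
    double xs[3] = { 0.0, 1.0, 2.0 };
    double ys[3] = { 1.0, 2.0, 5.0 };
    double coeffs[3];

    Newton_polynomials( 3, xs, ys, coeffs );

    // expect 1 1 1:  p(x) = 1 + 1(x - 0) + 1(x - 0)(x - 1) = x^2 + 1
    std::cout << coeffs[0] << ' ' << coeffs[1] << ' ' << coeffs[2] << std::endl;

    return 0;
}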
Connected
To determine whether two vertices are connected in a DAG, we could
implement the following:
bool Weighted_graph::connected( int i, int j ) {
if ( adjacent( i, j ) ) {
return true;
}
for ( int v : neighbors( i ) ) {
if ( connected( v, j ) ) {
return true;
}
}
return false;
}
– What are the issues with this implementation?
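One issue is overlapping sub-problems: the same call connected(v, j) is
repeated once for every path from i to v, so the run time can be exponential
even though the graph being a DAG guarantees termination. A memoized sketch,
assuming the same hypothetical adjacent and neighbors members (and <map> and
<utility>):

bool Weighted_graph::connected_memo( int i, int j ) {
    // cache of already-answered reachability queries, shared by all calls
    static std::map< std::pair<int, int>, bool > memo;

    std::pair<int, int> key( i, j );

    if ( memo.find( key ) != memo.end() ) {
        return memo[key];   // this sub-problem was already solved
    }

    bool result = adjacent( i, j );

    for ( int v : neighbors( i ) ) {
        if ( result ) {
            break;
        }

        result = connected_memo( v, j );
    }

    return memo[key] = result;
}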
Dynamic programming
In solving optimization problems, the top-down approach may
require repeatedly obtaining optimal solutions for the same sub-problem
– The mathematician Richard Bellman first formulated the concept of
dynamic programming in 1953 to solve such problems
– The idea isn't new, but Bellman formalized the process
Dynamic programming is distinct from divide-and-conquer, as the
divide-and-conquer approach works well if the sub-problems are
essentially unique
– Storing intermediate results would only waste memory
If sub-problems re-occur, the problem is said to have overlapping
sub-problems
Matrix chain multiplication
Suppose A is k × m and B is m × n
– Then AB is k × n, and calculating AB is Θ(kmn)
– The number of multiplications is exactly kmn
Suppose we are multiplying three matrices ABC
– Matrix multiplication is associative so we may choose (AB)C or A(BC)
– The order of the multiplications may significantly affect the run time
For example, if A and B are n × n matrices and v is an n-dimensional
column vector:
– Calculating (AB)v is Θ(n³)
– Calculating A(Bv) is Θ(n²)
Suppose we want to multiply four matrices ABCD
– There are many ways of parenthesizing this product:
((AB)C)D
(AB)(CD)
(A(BC))D
A((BC)D)
A(B(CD))
– Which has the least number of operations?
For example, consider these four:
Matrix Dimensions
A 20 × 5
B 5 × 40
C 40 × 50
D 50 × 10
Considering each order:
((AB)C)D
The required number of multiplications is:
AB 20 × 5 × 40 = 4000
(AB)C 20 × 40 × 50 = 40000
((AB)C)D 20 × 50 × 10 = 10000
This totals to 54000 multiplications
Considering the next order:
(AB)(CD)
The required number of multiplications is:
AB 20 × 5 × 40 = 4000
CD 40 × 50 × 10 = 20000
(AB)(CD) 20 × 40 × 10 = 8000
This totals to 32000 multiplications
Considering the next order:
(A(BC))D
The required number of multiplications is:
BC 5 × 40 × 50 = 10000
A(BC) 20 × 5 × 50 = 5000
(A(BC))D 20 × 50 × 10 = 10000
This totals to 25000 multiplications
And the next order:
A((BC)D)
The required number of multiplications is:
BC 5 × 40 × 50 = 10000
(BC)D 5 × 50 × 10 = 2500
A((BC)D) 20 × 5 × 10 = 1000
This totals to 13500 multiplications
Repeating this for the last, we get the following table:
Order Multiplications
((AB)C)D 54000
(AB)(CD) 32000
(A(BC))D 25000
A((BC)D) 13500
A(B(CD)) 23000
The optimal order is A((BC)D)
Thus, the optimal run time may be found by calculating the product
in the order
A((BC)D)
Problem: what if we are multiplying n matrices?
Can we devise a greedy algorithm to achieve this?
– Greedy by the smallest number of operations?
  • Unfortunately, for matrices of dimensions 2 × 1, 1 × 2, and 2 × 3,
    2 × 1 × 2 + 2 × 2 × 3 = 16 > 12 = 1 × 2 × 3 + 2 × 1 × 3
    even though 2 × 1 × 2 = 4 < 6 = 1 × 2 × 3
– Greedy by generating the smallest matrices (sum of dimensions)?
  • Unfortunately, for matrices of dimensions 2 × 1, 1 × 2, and 2 × 4,
    2 × 1 × 2 + 2 × 2 × 4 = 20 > 16 = 1 × 2 × 4 + 2 × 1 × 4
    even though the intermediate product is smaller: 2 + 2 = 4 < 5 = 1 + 4
If we are multiplying A1A2A3A4⋯An, starting top-down, there are
n − 1 different ways of parenthesizing this sequence:
(A1)(A2A3A4 ⋅⋅⋅An)
(A1A2)(A3A4 ⋅⋅⋅An)
(A1A2A3)(A4 ⋅⋅⋅An)
⋅⋅⋅
(A1⋅⋅⋅An – 2)(An – 1An)
(A1⋅⋅⋅An – 2An – 1)(An)
For each one we must ask:
– What is the work required to perform this final multiplication?
– What is the minimal amount of work required to perform each of the two
sub-products?
For example, in finding the best product of
    (A1⋯Ai)(Ai+1⋯An)
the work required is:
– The final product, which takes rows(A1) · columns(Ai) · columns(An) multiplications
  • Note that columns(Ai) and rows(Ai+1) must be equal
– The minimal work required to multiply A1⋯Ai
– The minimal work required to multiply Ai+1⋯An
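In symbols, writing M(i, j) for the minimal number of multiplications
needed to compute Ai⋯Aj, this yields the standard recurrence that the code
on the next slide implements:

$$M(i,j) = \begin{cases} 0 & i = j \\ \displaystyle\min_{i \le k < j} \Bigl( M(i,k) + M(k+1,j) + \text{rows}(A_i) \cdot \text{columns}(A_k) \cdot \text{columns}(A_j) \Bigr) & i < j \end{cases}$$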
int matrix_chain( Matrix *ms, int i, int j ) {
    // There is only one matrix
    if ( i + 1 == j ) {
        return 0;
    } else if ( i + 2 == j ) {
        // There are exactly two matrices
        assert( ms[i].columns() == ms[i + 1].rows() );
        return ms[i].rows() * ms[i].columns() * ms[i + 1].columns();
    }

    // We are multiplying at least three matrices
    // Start with calculating the work for M[i] * (M[i + 1] * ... * M[j - 1])
    assert( ms[i].columns() == ms[i + 1].rows() );
    int minimum = matrix_chain( ms, i + 1, j )
                + ms[i].rows() * ms[i].columns() * ms[j - 1].columns();

    // Find the work for (M[i] * ... * M[k - 1]) * (M[k] * ... * M[j - 1]) and update if it is less
    for ( int k = i + 2; k < j; ++k ) {
        assert( ms[k - 1].columns() == ms[k].rows() );
        int current = matrix_chain( ms, i, k ) + matrix_chain( ms, k, j )
                    + ms[i].rows() * ms[k].rows() * ms[j - 1].columns();

        if ( current < minimum ) {
            minimum = current;
        }
    }

    return minimum;
}
Because of the recursive nature, we will on numerous occasions be
asking for the optimal behaviour of a given subsequence
– We will be asked the optimal way to multiply A3 ⋯ An−2 when we ask the
optimal way to multiply any of the following:

A1(A2((A3 ⋯ An−2)(An−1 An)))
A1(A2(((A3 ⋯ An−2)An−1)An))
A1((A2(A3 ⋯ An−2))(An−1 An))
A1((A2((A3 ⋯ An−2)An−1))An)
A1(((A2(A3 ⋯ An−2))An−1)An)
(A1 A2)((A3 ⋯ An−2)(An−1 An))
(A1 A2)(((A3 ⋯ An−2)An−1)An)
(A1(A2(A3 ⋯ An−2)))(An−1 An)
((A1 A2)(A3 ⋯ An−2))(An−1 An)
(A1(A2((A3 ⋯ An−2)An−1)))An
(A1((A2(A3 ⋯ An−2))An−1))An
((A1 A2)((A3 ⋯ An−2)An−1))An
((A1(A2(A3 ⋯ An−2)))An−1)An
(((A1 A2)(A3 ⋯ An−2))An−1)An
The actual number of possible orderings is given by the following
recurrence relation:

$$P(n) = \begin{cases} 1 & n = 1 \\ \displaystyle\sum_{k=1}^{n-1} P(k)\,P(n-k) & n > 1 \end{cases}$$

Without proof, this recurrence relation is solved by P(n) = C(n − 1),
where C(n − 1) is the (n − 1)th Catalan number:

$$C(n) = \frac{1}{n+1}\binom{2n}{n} = \Theta\!\left(\frac{4^n}{n^{3/2}}\right)$$

C(1) = 1     C(6)  = 132
C(2) = 2     C(7)  = 429
C(3) = 5     C(8)  = 1430
C(4) = 14    C(9)  = 4862
C(5) = 42    C(10) = 16796
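Computing this count is itself a small bottom-up dynamic-programming
exercise; a sketch (illustrative names, not code from these slides):

#include <iostream>
#include <vector>

// P(n): the number of ways to parenthesize a product of n matrices
long long orderings( int n ) {
    std::vector<long long> P( n + 1, 0 );
    P[1] = 1;

    for ( int m = 2; m <= n; ++m ) {
        for ( int k = 1; k < m; ++k ) {
            P[m] += P[k] * P[m - k];   // split after the kth matrix
        }
    }

    return P[n];
}

int main() {
    // prints C(1) = 1 through C(10) = 16796
    for ( int n = 2; n <= 11; ++n ) {
        std::cout << "P(" << n << ") = " << orderings( n ) << std::endl;
    }

    return 0;
}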
The number of function calls made by this implementation is given by

$$T(n) = \begin{cases} 1 & n = 1, 2 \\ 4 \cdot 3^{n-3} & n \ge 3 \end{cases} = \Theta(3^n)$$

Can we speed this up?
– Memoization: once we've found the optimal number of operations for a
given subsequence, store it
#include <map>
#include <utility>

int matrix_chain_memo( Matrix *ms, int i, int j, bool clear = true ) {
    // associate a pair (i, j) with an integer
    static std::map< std::pair<int, int>, int > memo;

    if ( clear ) {
        memo.clear();
    }

    if ( i + 1 == j ) {
        return 0;
    } else if ( memo[std::pair<int, int>(i, j)] == 0 ) {
        if ( i + 2 == j ) {
            memo[std::pair<int, int>(i, j)] = ms[i].rows() * ms[i].columns() * ms[i + 1].columns();
        } else {
            int minimum = matrix_chain_memo( ms, i + 1, j, false )
                        + ms[i].rows() * ms[i].columns() * ms[j - 1].columns();

            for ( int k = i + 2; k < j; ++k ) {
                int current = matrix_chain_memo( ms, i, k, false ) + matrix_chain_memo( ms, k, j, false )
                            + ms[i].rows() * ms[k].rows() * ms[j - 1].columns();

                if ( current < minimum ) {
                    minimum = current;
                }
            }

            memo[std::pair<int, int>(i, j)] = minimum;
        }
    }

    return memo[std::pair<int, int>(i, j)];
}
Our memoized version now runs in

$$T(n) = \begin{cases} 1 & n = 1 \\ \dfrac{n(n+1)(n-1)}{2} & n \ge 2 \end{cases}$$
This is a top-down implementation
– Can we implement a bottom-up version?
For a bottom-up implementation, we need a table that stores our
best current solution for each sub-product Ai⋯Aj; the diagonal entries
(single matrices) cost 0:

        j = 1    2    3    4   ···   n
i = 1     0
    2          0
    3               0
    4                    0
    ···                       ···
    n                                0
As we calculate the minimum number of multiplications required for
a specific sequence Ai⋯Aj, we fill in the entry aij above the diagonal
of the same table
This table has Θ(n²) entries, and therefore our run time must be at least
Ω(n²)
However, each entry requires up to Θ(n) work, so the actual run time is Θ(n³)
For example, given the matrices from our earlier example
Matrix Dimensions
A1 20 × 5
A2 5 × 40
A3 40 × 50
A4 50 × 10
we can calculate the table as follows...
We can calculate the off-diagonal entries easily
(each entry shows the cost and the dimensions of the resulting product):

      1         2         3         4
1   0         4000
    20×5      20×40
2             0         10000
              5×40      5×50
3                       0         20000
                        40×50     40×10
4                                 0
                                  50×10
Next, we may calculate the entry for A1A2A3 as the better of:
A1(A2A3)   20 × 5 × 50 + 10000 = 15000
(A1A2)A3   4000 + 20 × 40 × 50 = 44000

      1         2         3         4
1   0         4000      15000
    20×5      20×40     20×50
2             0         10000
              5×40      5×50
3                       0         20000
                        40×50     40×10
4                                 0
                                  50×10
We continue with the entry for A2A3A4:
A2(A3A4)   5 × 40 × 10 + 20000 = 22000
(A2A3)A4   10000 + 5 × 50 × 10 = 12500

      1         2         3         4
1   0         4000      15000
    20×5      20×40     20×50
2             0         10000     12500
              5×40      5×50      5×10
3                       0         20000
                        40×50     40×10
4                                 0
                                  50×10
Finally we calculate the entry for the full product A1A2A3A4:
A1((A2A3)A4)   20 × 5 × 10 + 12500 = 13500
(A1A2)(A3A4)   4000 + 20 × 40 × 10 + 20000 = 32000
(A1(A2A3))A4   15000 + 20 × 50 × 10 = 25000

      1         2         3         4
1   0         4000      15000     13500
    20×5      20×40     20×50     20×10
2             0         10000     12500
              5×40      5×50      5×10
3                       0         20000
                        40×50     40×10
4                                 0
                                  50×10
Thus, counting the number of calculations required (each Θ(1)):
– the first off-diagonal has n − 1 entries, each requiring one calculation
– the next has n − 2 entries, each requiring two
– the next has n − 3 entries, each requiring three, and so on
This suggests the sum

$$\sum_{i=1}^{n-1} (n-i)\,i = n\sum_{i=1}^{n-1} i - \sum_{i=1}^{n-1} i^2 = \frac{n^2(n-1)}{2} - \frac{n(n-1)(2n-1)}{6} = \frac{n^3 - n}{6}$$

Therefore, the run time is Θ(n³)
int matrix_chain_iterative( Matrix *ms, int n ) {
    int array[n][n];   // a variable-length array: a compiler extension

    // a chain of one matrix requires no multiplications
    for ( int i = 0; i < n; i++ ) {
        array[i][i] = 0;
    }

    // i is the chain length minus one; j is where the chain starts
    for ( int i = 1; i < n; i++ ) {
        for ( int j = 0; j < n - i; j++ ) {
            // start with the split M[j] * (M[j + 1] * ... * M[j + i])
            array[j][j + i] = array[j][j] + array[j + 1][j + i]
                            + ms[j].rows()*ms[j + 1].rows()*ms[j + i].columns();

            // try every other split point k
            for ( int k = j + 1; k < j + i; k++ ) {
                int current = array[j][k] + array[k + 1][j + i]
                            + ms[j].rows()*ms[k + 1].rows()*ms[j + i].columns();

                if ( current < array[j][j + i] ) {
                    array[j][j + i] = current;
                }
            }
        }
    }

    return array[0][n - 1];
}
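For testing without the Matrix class assumed above, here is an equivalent
self-contained sketch in which the chain is described by a dimension array p,
matrix Ai being p[i − 1] × p[i] (a standard reformulation; the names are
illustrative):

#include <iostream>
#include <vector>

// minimal cost of multiplying A1...An, where Ai is p[i - 1] x p[i]
int matrix_chain_dims( const std::vector<int> &p ) {
    int n = static_cast<int>( p.size() ) - 1;
    std::vector< std::vector<int> > m( n + 1, std::vector<int>( n + 1, 0 ) );

    for ( int len = 2; len <= n; ++len ) {            // chain length
        for ( int i = 1; i + len - 1 <= n; ++i ) {    // chain start
            int j = i + len - 1;                      // chain end

            // start with the split A[i] * (A[i + 1] ... A[j])
            m[i][j] = m[i + 1][j] + p[i - 1]*p[i]*p[j];

            for ( int k = i + 1; k < j; ++k ) {
                int current = m[i][k] + m[k + 1][j] + p[i - 1]*p[k]*p[j];

                if ( current < m[i][j] ) {
                    m[i][j] = current;
                }
            }
        }
    }

    return m[1][n];
}

int main() {
    // A1 = 20 x 5, A2 = 5 x 40, A3 = 40 x 50, A4 = 50 x 10
    std::vector<int> p = { 20, 5, 40, 50, 10 };
    std::cout << matrix_chain_dims( p ) << std::endl;   // prints 13500
    return 0;
}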
How can you estimate run times?
– Which is faster: the top-down implementation with memoization, or the
bottom-up implementation?
– You could plot run times…
– Question: what is the growth of the plotted data?
  • Quadratic?
  • Cubic?
  • Θ(n² ln(n))?
If a function grows in polynomial time, it is of the form
    T(n) = a nᵇ
Take the logarithm of both sides:
    ln(T(n)) = ln(a nᵇ) = ln(a) + b ln(n)
– On a log-log plot, the data grows linearly with slope b

Top-down with memoization: b ≈ 2          Bottom-up: b ≈ 3
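The slope b can be estimated from measured timings by a least-squares fit
of ln(T(n)) against ln(n); a minimal sketch, assuming hypothetical arrays
ns and t of problem sizes and run times:

#include <cmath>
#include <cstddef>

// least-squares slope of ln(t) versus ln(ns): an estimate of the
// exponent b in T(n) = a n^b
double estimate_exponent( const double *ns, const double *t, std::size_t m ) {
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;

    for ( std::size_t k = 0; k < m; ++k ) {
        double x = std::log( ns[k] );
        double y = std::log( t[k] );

        sx  += x;
        sy  += y;
        sxx += x*x;
        sxy += x*y;
    }

    return ( m*sxy - sx*sy )/( m*sxx - sx*sx );
}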
Optimal polygon triangulation
In graphics and geometry, convex polygons are a basic unit
– Applications in graphics and finite-element methods
– Dividing such a polygon into simpler triangles is a common operation
– Some triangulations may be better than others; for example, ones with
fewer thin triangles
Now, a convex quadrilateral (tetragon) can only be triangulated in
two different ways
A convex pentagon can be triangulated in five different ways
And a convex hexagon can be triangulated in 14 different ways
The sequence 2, 5, 14 may already appear familiar
– The Catalan numbers...
If we can put a weight on each generated triangle, can we find an
optimal triangulation?
– Can we come up with a good algorithm?
– Consider the previous problem of finding an optimal order for multiplying
matrices
Choose a side and begin numbering the sides in order
– Any two adjacent sides can be joined to create a triangle,
just as any two adjacent matrices in a chain can be multiplied
Taking two adjacent sides and creating a triangle is similar to
bracketing
Instead of sides 1 and 2, we could choose 2 and 3
Or 3 and 4
Suppose the triangle 3/4 was optimal
– Next, do we add the side 2, or do we add the side 5?
The analogy is not exact, because there is no bracketing that multiplies
matrices A1 and A7 together directly
Nevertheless, this strongly suggests that there is an efficient
algorithm based on dynamic programming that will find an optimal
triangulation
Interval scheduling
A problem we saw in the topic on greedy algorithms was interval
scheduling:

Process   Interval
A         5 – 8
B         10 – 13
C         6 – 9
D         12 – 15
E         3 – 7
F         8 – 11
G         1 – 6
H         8 – 12
J         3 – 5
The earliest-deadline-first greedy algorithm will maximize the number
of processes that can be scheduled
Suppose, however, that there are weights associated with the processes
– Can we find the schedulable processes that have maximal weight?
Currently, no greedy algorithm is known that can solve this problem

Process   Interval   Weight
A         5 – 8      1.7
B         10 – 13    1.3
C         6 – 9      3.9
D         12 – 15    3.2
E         3 – 7      5.9
F         8 – 11     3.3
G         1 – 6      5.4
H         8 – 12     1.2
J         3 – 5      5.8
Note: if we wanted to optimize based on processor usage, the weight
would be equal to the computation time

Process   Interval   Weight
A         5 – 8      3
B         10 – 13    3
C         6 – 9      3
D         12 – 15    3
E         3 – 7      4
F         8 – 11     3
G         1 – 6      5
H         8 – 12     4
J         3 – 5      2
First, as before, we sort the processes by their deadlines:

Place   Process   Interval   Weight
1       K         2 – 4      4.8
2       J         3 – 5      5.8
3       G         1 – 6      5.4
4       E         3 – 7      5.9
5       A         5 – 8      1.7
6       C         6 – 9      3.9
7       F         8 – 11     3.3
8       H         8 – 12     1.2
9       B         10 – 13    1.3
Process L could only be run if nothing after the 7th process is chosen

Place   Process   Interval   Weight   Previous
1       K         2 – 4      4.8      0
2       J         3 – 5      5.8      0
3       G         1 – 6      5.4      0
4       E         3 – 7      5.9      0
5       A         5 – 8      1.7      2
6       C         6 – 9      3.9      3
7       F         8 – 11     3.3      5
8       H         8 – 12     1.2      5
9       B         10 – 13    1.3      6
Process M could only be run if nothing after the 6th process is chosen
If a process must be run first, we mark its previous process as 0
Thus, given process n, the optimal weight is either:
– the weight of this process plus the optimal weight of the previous
process that could have been run, or
– the optimal weight of the first n − 1 processes
This can be quickly converted to code:

int optimal_weight( int n ) const {
    // The zeroth process has weight 0
    if ( n == 0 ) {
        return 0;
    }

    return std::max(
        weight[n] + optimal_weight( previous[n] ),
        optimal_weight( n - 1 )
    );
}
Unfortunately, consider the worst-case run-time analysis:
– Suppose that previous[n] == (n - 1) for all n; then both recursive calls
are made on n − 1, and the run time satisfies

$$T(n) = \begin{cases} 1 & n = 0 \\ 2\,T(n-1) + 1 & n > 0 \end{cases}$$
In the optimal case, where we can run every process, the run time is
exponential:

$$T(n) = \begin{cases} 1 & n = 0 \\ 2\,T(n-1) + 1 & n > 0 \end{cases} = \Theta(2^n)$$
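Unrolling the recurrence confirms the bound:

$$T(n) = 2T(n-1) + 1 = 4T(n-2) + 3 = \cdots = 2^n\,T(0) + \sum_{k=0}^{n-1} 2^k = 2^{n+1} - 1 = \Theta(2^n)$$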
However, using memoization:
int optimal_weight_memo( int n, bool clear = true ) const {
static std::map<int, int> memo;
if ( clear ) {
memo.clear();
}
if ( n == 0 ) {
return 0;
} else if ( memo[n] == 0 ) {
memo[n] = std::max(
weight[n] + optimal_weight_memo( previous[n], false ),
optimal_weight_memo( n - 1, false )
);
}
return memo[n];
}
We can also implement an iterative bottom-up approach:
int optimal_weight_iterative( int n ) const {
    int optimal[n + 1];
    optimal[0] = 0;

    for ( int k = 1; k <= n; ++k ) {
        optimal[k] = std::max(
            weight[k] + optimal[previous[k]],
            optimal[k - 1]
        );
    }

    return optimal[n];
}
There is only one loop, so the run time is T(n) = Θ(n)
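To check this against the table above, here is a free-standing sketch
using double weights indexed by the Place column (weight[0] and previous[0]
are sentinels; the names are illustrative):

#include <algorithm>
#include <iostream>

int main() {
    // processes K, J, G, E, A, C, F, H, B in deadline order (places 1 through 9)
    double weight[10]   = { 0.0, 4.8, 5.8, 5.4, 5.9, 1.7, 3.9, 3.3, 1.2, 1.3 };
    int    previous[10] = { 0,   0,   0,   0,   0,   2,   3,   5,   5,   6   };

    double optimal[10];
    optimal[0] = 0.0;

    for ( int k = 1; k <= 9; ++k ) {
        optimal[k] = std::max(
            weight[k] + optimal[previous[k]],
            optimal[k - 1]
        );
    }

    // prints 11: the optimal schedule takes J, C, and B (5.8 + 3.9 + 1.3)
    std::cout << optimal[9] << std::endl;

    return 0;
}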
Project management
0/1 knapsack problem
Recall our problem of project management, or the 0/1 knapsack problem:
– The constraint W is an integer value
– There are n items we could include:
  • The kth item has an integral weight wk
  • Its value is vk
– The optimal value of objects we can place in the knapsack is

$$\text{knapsack}(k, w) = \begin{cases} 0 & k = 0 \\ \text{knapsack}(k-1,\, w) & w_k > w \\ \max\bigl( \text{knapsack}(k-1,\, w),\ \text{knapsack}(k-1,\, w - w_k) + v_k \bigr) & w_k \le w \end{cases}$$

– Again, this is a recursive optimization problem
Here is a hash-table based implementation:
#include <map>
#include <utility>

double knapsack( int *weight, double *value, int k, int W, bool clear = true ) {
    static std::map< std::pair<int, int>, double > memo;

    if ( clear ) {
        memo.clear();
    }

    if ( k == 0 ) {
        return 0.0;
    } else {
        if ( memo.find( std::pair<int, int>( k, W ) ) == memo.end() ) {
            if ( weight[k - 1] > W ) {
                memo[std::pair<int, int>( k, W )] = knapsack( weight, value, k - 1, W, false );
            } else {
                memo[std::pair<int, int>( k, W )] = std::max(
                    knapsack( weight, value, k - 1, W, false ),
                    knapsack( weight, value, k - 1, W - weight[k - 1], false ) + value[k - 1]
                );
            }
        }

        return memo[std::pair<int, int>( k, W )];
    }
}
The run time is O(nW)
– This is significantly slower than our greedy algorithm, which is O(n ln(n))
– The greedy algorithm also works if the weights are not integral
For our previous example, the output of

int main() {
    int    weight[10] = {  15,  12,  10,   9,   8,   7,  5,  4,  3,  1 };
    double value[10]  = { 210, 220, 180, 120, 160, 170, 90, 40, 60, 10 };

    std::cout << knapsack( weight, value, 10, 26 ) << std::endl;

    return 0;
}

is 520, which is the optimal result
Arrays versus hash tables
Both simple arrays and hash tables may be dynamically resized
As we saw before, this does not significantly affect the run time
Two problems with arrays are:
– They are indexed only by integers
– They are dense (we may not be solving all sub-problems)
Remember tables in Maple
Maple is a programming language with built-in memoization
– In Maple, memoization is achieved through the use of hash tables
termed remember tables
– To demonstrate, we will implement the calculation of Fibonacci numbers
in Maple
In Maple, we state that a function must have option remember
This gives the function a remember table

> F := proc(n)
      option remember;
      printf( "Calling F(%d)\n", n );
      procname( n - 1 ) + procname( n - 2 );  # procname is Maple's version of "this" function
  end proc:
F( 0 ) := 1:  # assign F(0) = 1 to the remember table
F( 1 ) := 1:  # assign F(1) = 1 to the remember table
> F(5); # the first time it is called,
# we call F(5), F(4), F(3), and F(2)
Calling F(5)
Calling F(4)
Calling F(3)
Calling F(2)
8
> F(5); # the second time, the result comes
# from the remember table
8
> F(7); # should only call F(7) and F(6)
Calling F(7)
Calling F(6)
21
Recall, however, that remember tables are hash tables (with 1024 bins)
Therefore, they can fill up, and accessing previous results can
become expensive
Maple uses a statically sized chained hash table with 1024 bins
– Making n accesses to a hash table should take Θ(n) time
– Plotting the time (in seconds) taken to calculate F(1024n) for n up to 80
shows that the behaviour is initially linear, but eventually degrades to
quadratic growth as the bins fill
Applications
Other applications of dynamic programming include:
– The Floyd-Warshall algorithm for the all-pairs shortest path problem
– Finding the longest common subsequence of two strings
  • This has applications in bioinformatics, such as DNA matching
– Solving the travelling salesman problem in O(n²2ⁿ) time
– The Earley parser for context-free languages (ECE 351 Compilers)
– The Levenshtein distance between two strings (used in spell checkers)
These will be discussed in the homework
And finally:
– the Duckworth-Lewis method for calculating the target score in an
interrupted cricket match:
  • In a 2006 One-Day International between India and Pakistan, India batted first and
was all out in the 49th over for 328
  • Pakistan was 7 wickets down for 311 when bad light stopped play after the 47th over
  • With three full overs left to play and three wickets in hand, the Duckworth-Lewis
method calculated the target to be 304, so the result of the match is officially listed
as "Pakistan won by 7 runs (D/L Method)"
http://en.wikipedia.org/wiki/Duckworth-Lewis_method
Summary
In this class, we have considered the algorithm design strategy of
dynamic programming
– Useful when recursive algorithms have overlapping sub-problems
– Storing calculated values allows significant reductions in time
• Memoization
– Considered top-down and bottom-up approaches to solving problems
– Looked at both simple and more complex solutions:
• Fibonacci numbers
• Newton polynomial coefficients
• Matrix chain multiplication
• Optimal polygon triangulation
• Interval scheduling
– Also considered Maple’s remember tables
These slides are provided for the ECE 250 Algorithms and Data Structures course. The
material in them reflects Douglas W. Harder's best judgment in light of the information
available to him at the time of preparation. Any reliance on these course slides by any
party for any other purpose is the responsibility of such parties. Douglas W. Harder
accepts no responsibility for damages, if any, suffered by any party as a result of
decisions made or actions based on these course slides for any other purpose than that
for which they were intended.