Module 1 - 1
Complexity and Order Notations
and Binary Search
Instructor – Gajendra Shrimal
Algorithm
• An algorithm is a set of instructions to be followed to
solve a problem.
– There can be more than one solution (more than one
algorithm) to solve a given problem.
– An algorithm can be implemented using different
programming languages on different platforms.
• An algorithm must be correct: it should correctly solve
the problem.
– e.g. a sorting algorithm must work even if (1) the input is
already sorted, or (2) it contains repeated elements.
• Once we have a correct algorithm for a problem, we
have to determine the efficiency of that algorithm.
Algorithmic Performance
There are two aspects of algorithmic performance:
• Time
• Instructions take time.
• How fast does the algorithm perform?
• What affects its runtime?
• Space
• Data structures take space.
• What kind of data structures can be used?
• How does choice of data structure affect the runtime?
We will focus on time:
– How to estimate the time required for an algorithm
– How to reduce the time required
Analysis of Algorithms
• Analysis of Algorithms is the area of computer science that
provides tools to analyze the efficiency of different methods
of solutions.
• How do we compare the time efficiency of two algorithms that
solve the same problem?
Naïve Approach: implement these algorithms in a programming
language (C++), and run them to compare their time
requirements. Comparing the programs (instead of algorithms)
has difficulties.
– How are the algorithms coded?
• Comparing running times means comparing the implementations.
• We should not compare implementations, because they are sensitive to programming
style that may cloud the issue of which algorithm is inherently more efficient.
– What computer should we use?
• We should compare the efficiency of the algorithms independently of a particular
computer.
– What data should the program use?
• Any analysis must be independent of specific data.
Analysis of Algorithms
• When we analyze algorithms, we should employ
mathematical techniques that analyze algorithms
independently of specific
implementations, computers, or data.
• To analyze algorithms:
– First, we count the number of significant
operations in a particular solution to assess its
efficiency.
– Then, we will express the efficiency of algorithms
using growth functions.
The Execution Time of Algorithms
• Each operation in an algorithm (or a program) has a
cost.
– Each operation takes a certain amount of time.
count = count + 1;  takes a certain amount of time, but it is constant
A sequence of operations:
count = count + 1;    Cost: c1
sum = sum + count;    Cost: c2
→ Total Cost = c1 + c2
The Execution Time of Algorithms (cont.)
Example: Simple If-Statement
                      Cost   Times
if (n < 0)            c1     1
    absval = -n;      c2     1
else
    absval = n;       c3     1

Total Cost <= c1 + max(c2, c3)
The Execution Time of Algorithms (cont.)
Example: Simple Loop
Cost Times
i = 1; c1 1
sum = 0; c2 1
while (i <= n) { c3 n+1
i = i + 1; c4 n
sum = sum + i; c5 n
}
Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*c5
The time required for this algorithm is proportional to n.
The Execution Time of Algorithms (cont.)
Example: Nested Loop
                          Cost   Times
i = 1;                    c1     1
sum = 0;                  c2     1
while (i <= n) {          c3     n+1
    j = 1;                c4     n
    while (j <= n) {      c5     n*(n+1)
        sum = sum + i;    c6     n*n
        j = j + 1;        c7     n*n
    }
    i = i + 1;            c8     n
}
Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*(n+1)*c5 + n*n*c6 + n*n*c7 + n*c8
The time required for this algorithm is proportional to n².
General Rules for Estimation
• Loops: The running time of a loop is at most the running
time of the statements inside the loop times the number of
iterations.
• Nested Loops: The running time of a statement in the
innermost loop is its own running time multiplied by the
product of the sizes of all the loops.
• Consecutive Statements: Just add the running times of the
consecutive statements.
• If/Else: Never more than the running time of the test plus
the larger of the running times of S1 and S2.
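As a quick sanity check of these rules, here is a minimal runnable C++ sketch (the variable names and the size n = 8 are invented for the illustration); the comments mark which rule applies where:

#include <iostream>

int main() {
    int n = 8;
    int evens = 0, odds = 0, executions = 0;   // consecutive statements: costs simply add

    // Nested loops: the innermost statement runs n * n times
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            // If/else: at most the cost of the test plus the larger branch
            if ((i + j) % 2 == 0)
                evens++;        // branch S1
            else
                odds++;         // branch S2
            executions++;       // counts innermost iterations
        }
    }
    // Prints 64 = n * n, the product of the loop sizes
    std::cout << "innermost statement executed " << executions << " times\n";
    return 0;
}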
Algorithm Growth Rates
• We measure an algorithm’s time requirement as a function of the
problem size.
– Problem size depends on the application: e.g. the number of elements in a list for a
sorting algorithm, or the number of disks for the Towers of Hanoi.
• So, for instance, we say that (if the problem size is n)
– Algorithm A requires 5*n² time units to solve a problem of size n.
– Algorithm B requires 7*n time units to solve a problem of size n.
• The most important thing to learn is how quickly the algorithm’s
time requirement grows as a function of the problem size.
– Algorithm A requires time proportional to n².
– Algorithm B requires time proportional to n.
• An algorithm’s proportional time requirement is known as its
growth rate.
• We can compare the efficiency of two algorithms by comparing
their growth rates.
Algorithm Growth Rates (cont.)
[Figure: time requirements as a function of the problem size n]
Common Growth Rates
Function   Growth Rate Name
c          Constant
log N      Logarithmic
log²N      Log-squared
N          Linear
N log N    Linearithmic
N²         Quadratic
N³         Cubic
2ᴺ         Exponential
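To get a feel for how differently these functions grow, the following small C++ sketch (illustrative only, not part of the original slides) tabulates each of them for a few doubling values of N:

#include <cmath>
#include <cstdio>

int main() {
    std::printf("%6s %8s %10s %10s %12s %22s\n",
                "N", "logN", "NlogN", "N^2", "N^3", "2^N");
    for (int n = 2; n <= 64; n *= 2) {
        double lg = std::log2(n);               // logarithm base 2
        std::printf("%6d %8.1f %10.1f %10d %12d %22.0f\n",
                    n, lg, n * lg, n * n, n * n * n, std::pow(2.0, n));
    }
    return 0;
}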
[Figure 6.1: running times for small inputs]
[Figure 6.2: running times for moderate inputs]
Input Size
● Input size (number of elements in the input)
● size of an array
● # of elements in a matrix
● # of bits in the binary representation of the input
● vertices and edges in a graph
Types of Analysis
● Worst case
● Provides an upper bound on running time
● An absolute guarantee that the algorithm would not run longer, no
matter what the inputs are
● Best case
● Provides a lower bound on running time
● Input is the one for which the algorithm runs the fastest
Lower Bound ≤ Running Time ≤ Upper Bound
● Average case
● Provides a prediction about the running time
● Assumes that the input is random
Types of Analysis: Example
● Example: Linear Search Complexity
● Best Case: item found at the beginning → one comparison
● Worst Case: item found at the end → n comparisons
● Average Case: item may be found at index 0, 1, 2, ..., or n-1
  ● Average number of comparisons: (1 + 2 + ... + n) / n = (n+1)/2
● Worst-case, average-case, and best-case complexities of common sorting algorithms:

Method          Worst Case   Average Case   Best Case
Selection Sort  n²           n²             n²
Insertion Sort  n²           n²             n
Merge Sort      n log n      n log n        n log n
Quick Sort      n²           n log n        n log n
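The (n+1)/2 average for linear search can be checked numerically. A short C++ sketch (n = 100 is an arbitrary choice; it uses the fact that finding the item at 0-based position p costs p + 1 comparisons):

#include <iostream>

int main() {
    const int n = 100;
    long long total = 0;
    for (int p = 0; p < n; p++)
        total += p + 1;                       // 1 + 2 + ... + n comparisons in total
    std::cout << "average comparisons = " << static_cast<double>(total) / n
              << ", (n+1)/2 = " << (n + 1) / 2.0 << "\n";   // both print 50.5
    return 0;
}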
How do we compare algorithms?
● We need to define a number of objective measures.
(1) Compare execution times?
    Not good: times are specific to a particular computer!
(2) Count the number of statements executed?
    Not good: the number of statements varies with the
    programming language as well as with the style of the
    individual programmer.
Ideal Solution
● Express running time as a function of the input size n (i.e., f(n)).
● Compare different functions corresponding to running times.
● Such an analysis is independent of machine time, programming style, etc.
Example
● Associate a "cost" with each statement.
● Find the "total cost” by finding the total number of
times each statement is executed.
Algorithm 1                   Cost
arr[0] = 0;                   c1
arr[1] = 0;                   c1
arr[2] = 0;                   c1
...
arr[N-1] = 0;                 c1
c1 + c1 + ... + c1 = c1 x N

Algorithm 2                   Cost
for (i = 0; i < N; i++)       c2
    arr[i] = 0;               c1
(N+1) x c2 + N x c1 = (c2 + c1) x N + c2
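The execution counts claimed for Algorithm 2 can be verified directly. In this C++ sketch (the counter names are invented for the illustration), the loop test is evaluated N+1 times and the assignment runs N times:

#include <iostream>

int main() {
    const int N = 10;
    int arr[N];
    int tests = 0, assigns = 0;

    for (int i = 0; tests++, i < N; i++) {    // tests++ counts every evaluation of the loop test
        arr[i] = 0;
        assigns++;                            // counts every execution of the body
    }
    std::cout << "tests = " << tests << ", assigns = " << assigns << "\n";  // 11 and 10
    return 0;
}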
Another Example
● Algorithm 3                 Cost
sum = 0;                      c1
for (i = 0; i < N; i++)       c2
    for (j = 0; j < N; j++)   c2
        sum += arr[i][j];     c3

Total: c1 + c2 x (N+1) + c2 x N x (N+1) + c3 x N²
Asymptotic Analysis
● To compare two algorithms with running
times f(n) and g(n), we need a rough
measure that characterizes how fast each
function grows.
● Hint: use rate of growth
● Compare functions in the limit, that is,
asymptotically!
(i.e., for large values of n)
Rate of Growth
● Consider the example of buying elephants and goldfish:
  Cost: cost_of_elephants + cost_of_goldfish
  Cost ~ cost_of_elephants (approximation)
● The low-order terms in a function are relatively insignificant for large n:
  n⁴ + 100n² + 10n + 50  ~  n⁴
  i.e., we say that n⁴ + 100n² + 10n + 50 and n⁴ have the
  same rate of growth.
  (For example, at n = 100 the n⁴ term is 10⁸, while the remaining
  terms sum to only about 10⁶.)
[Figures: rate-of-growth comparisons and common orders of magnitude]
Asymptotic Notation
● O notation: asymptotic “less than”:
  ● f(n) = O(g(n)) implies: f(n) “≤” g(n)
● Ω notation: asymptotic “greater than”:
  ● f(n) = Ω(g(n)) implies: f(n) “≥” g(n)
● Θ notation: asymptotic “equality”:
  ● f(n) = Θ(g(n)) implies: f(n) “=” g(n)
Big-O Notation
● We say fA(n) = 30n + 8 is order n, or O(n). It is, at most,
roughly proportional to n.
● fB(n) = n² + 1 is order n², or O(n²). It is, at most,
roughly proportional to n².
● In general, an n² function eventually grows faster than
an n function.
Visualizing Orders of Growth
● On a graph, as you go to the right, a faster growing
function eventually becomes larger...
[Figure: value of function vs. increasing n, with fB(n) = n² + 1 eventually exceeding fA(n) = 30n + 8]
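The crossover point sketched in the figure can be located by brute force; this short C++ check (not part of the original slides) finds the first n where fB overtakes fA:

#include <iostream>

int main() {
    // fA(n) = 30n + 8 vs. fB(n) = n^2 + 1
    for (int n = 1; n <= 100; n++) {
        if (n * n + 1 > 30 * n + 8) {
            std::cout << "n^2 + 1 exceeds 30n + 8 from n = " << n << "\n";  // prints n = 31
            break;
        }
    }
    return 0;
}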
More Examples …
● n⁴ + 100n² + 10n + 50 is O(n⁴)
● 10n³ + 2n² is O(n³)
● n³ - n² is O(n³)
● constants:
  ● 10 is O(1)
  ● 1273 is O(1)
Back to Our Example
Algorithm 1                   Cost
arr[0] = 0;                   c1
arr[1] = 0;                   c1
arr[2] = 0;                   c1
...
arr[N-1] = 0;                 c1
c1 + c1 + ... + c1 = c1 x N

Algorithm 2                   Cost
for (i = 0; i < N; i++)       c2
    arr[i] = 0;               c1
(N+1) x c2 + N x c1 = (c2 + c1) x N + c2

● Both algorithms are of the same order: O(N)
Example (cont’d)
Algorithm 3                   Cost
sum = 0;                      c1
for (i = 0; i < N; i++)       c2
    for (j = 0; j < N; j++)   c2
        sum += arr[i][j];     c3

c1 + c2 x (N+1) + c2 x N x (N+1) + c3 x N² = O(N²)
Asymptotic notations
● O-notation
[Figure: formal definition and picture of O(g(n))]
Big-O Visualization
O(g(n)) is the set of functions with smaller or same order of growth as g(n).
[Figure: c*g(n) bounding f(n) from above for n ≥ n₀]
Examples
● 2n² = O(n³): 2n² ≤ cn³ ⟹ 2 ≤ cn ⟹ c = 1 and n₀ = 2
● n² = O(n²): n² ≤ cn² ⟹ c ≥ 1 ⟹ c = 1 and n₀ = 1
● 1000n² + 1000n = O(n²): 1000n² + 1000n ≤ 1000n² + n² = 1001n², so c = 1001 and n₀ = 1000
● n = O(n²): n ≤ cn² ⟹ cn ≥ 1 ⟹ c = 1 and n₀ = 1
More Examples
● Show that 30n + 8 is O(n).
● Show ∃c, n₀: 30n + 8 ≤ cn, ∀n > n₀.
● Let c = 31, n₀ = 8. Assume n > n₀ = 8. Then
  cn = 31n = 30n + n > 30n + 8, so 30n + 8 < cn.
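A numeric spot-check of the bound used in the proof, as a small C++ sketch (the tested range 8..1,000,000 is an arbitrary choice):

#include <cassert>
#include <iostream>

int main() {
    for (long n = 8; n <= 1000000; n++)
        assert(30 * n + 8 <= 31 * n);         // the bound 30n + 8 <= 31n holds for n >= 8
    std::cout << "bound holds on the tested range\n";
    return 0;
}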
Big-O example, graphically
● Note 30n + 8 isn’t less than n anywhere (n > 0).
● It isn’t even less than 31n everywhere.
● But it is less than 31n everywhere to the right of n = 8.
[Figure: value of function vs. increasing n, with cn = 31n above 30n + 8 for n > n₀ = 8]
No Uniqueness
● There is no unique set of values for n₀ and c in proving the
asymptotic bounds.
● Prove that 100n + 5 = O(n²):
  ● 100n + 5 ≤ 100n + n = 101n ≤ 101n² for all n ≥ 5
    ⟹ n₀ = 5 and c = 101 is a solution
  ● 100n + 5 ≤ 100n + 5n = 105n ≤ 105n² for all n ≥ 1
    ⟹ n₀ = 1 and c = 105 is also a solution
● We must find SOME constants c and n₀ that satisfy the asymptotic notation relation.
Asymptotic notations (cont.)
● Ω-notation
Ω(g(n)) is the set of functions with larger or same order of growth as g(n).
[Figure: c*g(n) bounding f(n) from below for n ≥ n₀]
Examples
● 5n² = Ω(n)
  ∃c, n₀ such that 0 ≤ cn ≤ 5n² ⟹ cn ≤ 5n² ⟹ c = 1 and n₀ = 1
● 100n + 5 ≠ Ω(n²)
  ∃c, n₀ such that 0 ≤ cn² ≤ 100n + 5
  100n + 5 ≤ 100n + 5n (∀n ≥ 1) = 105n
  cn² ≤ 105n ⟹ n(cn – 105) ≤ 0
  Since n is positive ⟹ cn – 105 ≤ 0 ⟹ n ≤ 105/c
  ⟹ contradiction: n cannot be smaller than a constant
Asymptotic notations (cont.)
● Θ-notation
Θ(g(n)) is the set of functions with the same order of growth as g(n).
[Figure: f(n) sandwiched between c₁*g(n) and c₂*g(n) for n ≥ n₀]
Examples
● n²/2 – n/2 = Θ(n²)
  ● ½ n² – ½ n ≤ ½ n²  (∀n ≥ 0) ⟹ c₂ = ½
  ● ½ n² – ½ n ≥ ½ n² – ½ n * ½ n  (∀n ≥ 2) = ¼ n² ⟹ c₁ = ¼
● n ≠ Θ(n²): c₁ n² ≤ n ≤ c₂ n² only holds for n ≤ 1/c₁
Examples
● 6n³ ≠ Θ(n²): c₁ n² ≤ 6n³ ≤ c₂ n² only holds for n ≤ c₂/6
● n ≠ Θ(log n): c₁ log n ≤ n ≤ c₂ log n ⟹ c₂ ≥ n/log n for all n ≥ n₀ – impossible
Relations Between Different Sets
● Subset relations between order-of-growth sets (functions R → R):
[Figure: f is a member of O(f), Ω(f), and Θ(f); Θ(f) lies in the overlap of O(f) and Ω(f)]
Logarithms and properties
● In algorithm analysis we often use the notation “log n”
without specifying the base.
  Binary logarithm: lg n = log₂ n          lgᵏ n = (lg n)ᵏ
  Natural logarithm: ln n = logₑ n         lg lg n = lg(lg n)
● Properties:
  log xʸ = y log x
  log(xy) = log x + log y
  log(x/y) = log x – log y
  a^(log_b x) = x^(log_b a)
  log_b x = (log_a x) / (log_a b)
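These identities can be sanity-checked with <cmath>. In the sketch below, the values x = 7, y = 3, a = 2, b = 10 are arbitrary choices for the illustration; each printed pair should match:

#include <cmath>
#include <cstdio>

int main() {
    double x = 7.0, y = 3.0, a = 2.0, b = 10.0;

    std::printf("%f %f\n", std::log(std::pow(x, y)), y * std::log(x));              // log x^y = y log x
    std::printf("%f %f\n", std::log(x * y), std::log(x) + std::log(y));             // log xy = log x + log y
    std::printf("%f %f\n", std::log(x / y), std::log(x) - std::log(y));             // log x/y = log x - log y
    std::printf("%f %f\n", std::pow(a, std::log10(x)), std::pow(x, std::log10(a))); // a^(log_b x) = x^(log_b a), with b = 10
    std::printf("%f %f\n", std::log10(x), std::log(x) / std::log(b));               // log_b x = log_a x / log_a b
    return 0;
}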
More Examples
● For each of the following pairs of functions, either f(n) = O(g(n)),
f(n) = Ω(g(n)), or f(n) = Θ(g(n)). Determine which relationship is correct.
  ● f(n) = log n²; g(n) = log n + 5      ⟹ f(n) = Θ(g(n))
  ● f(n) = n; g(n) = log n²              ⟹ f(n) = Ω(g(n))
  ● f(n) = log log n; g(n) = log n       ⟹ f(n) = O(g(n))
  ● f(n) = n; g(n) = log² n              ⟹ f(n) = Ω(g(n))
  ● f(n) = n log n + n; g(n) = log n     ⟹ f(n) = Ω(g(n))
  ● f(n) = 10; g(n) = log 10             ⟹ f(n) = Θ(g(n))
  ● f(n) = 2ⁿ; g(n) = 10n²               ⟹ f(n) = Ω(g(n))
  ● f(n) = 2ⁿ; g(n) = 3ⁿ                 ⟹ f(n) = O(g(n))
Properties
● Theorem: f(n) = Θ(g(n)) ⟺ f(n) = O(g(n)) and f(n) = Ω(g(n))
● Transitivity:
  ● f(n) = Θ(g(n)) and g(n) = Θ(h(n)) ⟹ f(n) = Θ(h(n))
  ● Same for O and Ω
● Reflexivity:
  ● f(n) = Θ(f(n))
  ● Same for O and Ω
● Symmetry:
  ● f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n))
● Transpose symmetry:
  ● f(n) = O(g(n)) if and only if g(n) = Ω(f(n))
Order-of-Magnitude Analysis and Big O Notation
• If Algorithm A requires time proportional to f(n), Algorithm A
is said to be order f(n), and it is denoted as O(f(n)).
• The function f(n) is called the algorithm’s growth-rate
function.
• Since the capital O is used in the notation, this notation is
called the Big O notation.
• If Algorithm A requires time proportional to n², it is O(n²).
• If Algorithm A requires time proportional to n, it is O(n).
Definition of the Order of an Algorithm
Definition:
Algorithm A is order f(n) – denoted as O(f(n)) – if constants
k and n₀ exist such that A requires no more than k*f(n) time
units to solve a problem of size n ≥ n₀.
• The requirement of n ≥ n₀ in the definition of O(f(n))
formalizes the notion of sufficiently large problems.
– In general, many values of k and n₀ can satisfy this definition.
Order of an Algorithm
• If an algorithm requires n²–3*n+10 seconds to solve a problem
of size n, and if constants k and n₀ exist such that
  k*n² > n²–3*n+10 for all n ≥ n₀,
then the algorithm is order n². In fact, k is 3 and n₀ is 2:
  3*n² > n²–3*n+10 for all n ≥ 2.
Thus, the algorithm requires no more than k*n² time units for
n ≥ n₀, so it is O(n²).
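The claim that k = 3 and n₀ = 2 work can be spot-checked by brute force (a sketch; the upper limit 100,000 is arbitrary):

#include <cassert>
#include <iostream>

int main() {
    for (long long n = 2; n <= 100000; n++)
        assert(3 * n * n > n * n - 3 * n + 10);   // k*n^2 > n^2 - 3n + 10 for all n >= 2
    std::cout << "inequality holds on the tested range\n";
    return 0;
}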
Order of an Algorithm (cont.)
[Figure: 3n² bounding n² – 3n + 10 from above for n ≥ 2]
A Comparison of Growth-Rate Functions
[Figures: values and plots of the common growth-rate functions]
Growth-Rate Functions
O(1)        Time requirement is constant, and it is independent of the problem’s size.
O(log₂n)    Time requirement for a logarithmic algorithm increases slowly
            as the problem size increases.
O(n)        Time requirement for a linear algorithm increases directly with the size
            of the problem.
O(n*log₂n)  Time requirement for an n*log₂n algorithm increases more rapidly than
            a linear algorithm.
O(n²)       Time requirement for a quadratic algorithm increases rapidly with the
            size of the problem.
O(n³)       Time requirement for a cubic algorithm increases more rapidly with the
            size of the problem than the time requirement for a quadratic algorithm.
O(2ⁿ)       As the size of the problem increases, the time requirement for an
            exponential algorithm increases too rapidly to be practical.
Growth-Rate Functions
• If an algorithm takes 1 second to run with problem size
8, what is the time requirement (approximately) for that
algorithm with problem size 16?
• If its order is:
  O(1)        T(n) = 1 second
  O(log₂n)    T(n) = (1*log₂16) / log₂8 = 4/3 seconds
  O(n)        T(n) = (1*16) / 8 = 2 seconds
  O(n*log₂n)  T(n) = (1*16*log₂16) / (8*log₂8) = 8/3 seconds
  O(n²)       T(n) = (1*16²) / 8² = 4 seconds
  O(n³)       T(n) = (1*16³) / 8³ = 8 seconds
  O(2ⁿ)       T(n) = (1*2¹⁶) / 2⁸ = 2⁸ seconds = 256 seconds
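The same predictions can be computed mechanically. A short C++ sketch (here n0 and n1 are the two problem sizes 8 and 16, not the n₀ of the Big-O definition):

#include <cmath>
#include <cstdio>

int main() {
    double n0 = 8, n1 = 16;    // T(n0) = 1 second; predict T(n1) for each growth rate
    std::printf("O(log n):   %g s\n", std::log2(n1) / std::log2(n0));                 // 4/3
    std::printf("O(n):       %g s\n", n1 / n0);                                       // 2
    std::printf("O(n log n): %g s\n", (n1 * std::log2(n1)) / (n0 * std::log2(n0)));   // 8/3
    std::printf("O(n^2):     %g s\n", (n1 * n1) / (n0 * n0));                         // 4
    std::printf("O(n^3):     %g s\n", (n1 * n1 * n1) / (n0 * n0 * n0));               // 8
    std::printf("O(2^n):     %g s\n", std::pow(2.0, n1) / std::pow(2.0, n0));         // 256
    return 0;
}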
Properties of Growth-Rate Functions
1. We can ignore low-order terms in an algorithm’s growth-rate
function.
   – If an algorithm is O(n³+4n²+3n), it is also O(n³).
   – We only use the highest-order term as the algorithm’s growth-rate function.
2. We can ignore a multiplicative constant in the highest-order term
of an algorithm’s growth-rate function.
   – If an algorithm is O(5n³), it is also O(n³).
3. O(f(n)) + O(g(n)) = O(f(n)+g(n))
   – We can combine growth-rate functions.
   – If an algorithm is O(n³) + O(4n²), it is also O(n³+4n²). So, it is O(n³).
   – Similar rules hold for multiplication.
Some Mathematical Facts
• Some useful mathematical equalities are:
  1 + 2 + ... + n = n*(n+1)/2 ≈ n²/2
  1 + 4 + ... + n² = n*(n+1)*(2n+1)/6 ≈ n³/3
  1 + 2 + 4 + ... + 2ⁿ⁻¹ = 2ⁿ – 1
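A brute-force check of these equalities for one value of n (n = 20 is an arbitrary choice):

#include <cstdio>

int main() {
    const long long n = 20;
    long long s1 = 0, s2 = 0, s3 = 0;
    for (long long i = 1; i <= n; i++) { s1 += i; s2 += i * i; }
    for (long long i = 0; i < n; i++)  s3 += 1LL << i;            // 2^0 + ... + 2^(n-1)

    std::printf("%lld %lld\n", s1, n * (n + 1) / 2);               // 210 210
    std::printf("%lld %lld\n", s2, n * (n + 1) * (2 * n + 1) / 6); // 2870 2870
    std::printf("%lld %lld\n", s3, (1LL << n) - 1);                // 1048575 1048575
    return 0;
}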
Growth-Rate Functions – Example1
Cost Times
i = 1; c1 1
sum = 0; c2 1
while (i <= n) { c3 n+1
i = i + 1; c4 n
sum = sum + i; c5 n
}
T(n) = c1 + c2 + (n+1)*c3 + n*c4 + n*c5
= (c3+c4+c5)*n + (c1+c2+c3)
= a*n + b
So, the growth-rate function for this algorithm is O(n)
Growth-Rate Functions – Example2
Cost Times
i=1; c1 1
sum = 0; c2 1
while (i <= n) { c3 n+1
j=1; c4 n
while (j <= n) { c5 n*(n+1)
sum = sum + i; c6 n*n
j = j + 1; c7 n*n
}
i = i +1; c8 n
}
T(n) = c1 + c2 + (n+1)*c3 + n*c4 + n*(n+1)*c5+n*n*c6+n*n*c7+n*c8
= (c5+c6+c7)*n2 + (c3+c4+c5+c8)*n + (c1+c2+c3)
= a*n2 + b*n + c
So, the growth-rate function for this algorithm is O(n2)
Growth-Rate Functions – Example3
                               Cost   Times
for (i=1; i<=n; i++)           c1     n+1
    for (j=1; j<=i; j++)       c2     Σ_{i=1..n} (i+1)
        for (k=1; k<=j; k++)   c3     Σ_{i=1..n} Σ_{j=1..i} (j+1)
            x = x + 1;         c4     Σ_{i=1..n} Σ_{j=1..i} j

T(n) = c1*(n+1) + c2*Σ_{i=1..n}(i+1) + c3*Σ_{i=1..n}Σ_{j=1..i}(j+1) + c4*Σ_{i=1..n}Σ_{j=1..i} j
     = a*n³ + b*n² + c*n + d
So, the growth-rate function for this algorithm is O(n³)
Growth-Rate Functions – Recursive Algorithms
void hanoi(int n, char source, char dest, char spare) {     // Cost
  if (n > 0) {                                              // c1
    hanoi(n-1, source, spare, dest);                        // c2
    cout << "Move top disk from pole " << source
         << " to pole " << dest << endl;                    // c3
    hanoi(n-1, spare, dest, source);                        // c4
  }
}
• The time-complexity function T(n) of a recursive algorithm is
defined in terms of itself; this is known as the recurrence equation
for T(n).
• To find the growth-rate function for a recursive algorithm, we have
to solve its recurrence relation.
Growth-Rate Functions – Hanoi Towers
• What is the cost of hanoi(n, 'A', 'B', 'C')?
  when n = 0: T(0) = c1
  when n > 0: T(n) = c1 + c2 + T(n-1) + c3 + c4 + T(n-1)
                   = 2*T(n-1) + (c1+c2+c3+c4)
                   = 2*T(n-1) + c
  → this is the recurrence equation for the growth-rate function of the
  Hanoi Towers algorithm.
• Now, we have to solve this recurrence equation to find the growth-rate
function of the Hanoi Towers algorithm.
Growth-Rate Functions – Hanoi Towers (cont.)
• There are many methods for solving recurrence equations; here we use a
simple one known as repeated substitutions.
T(n) = 2*T(n-1) + c
     = 2 * (2*T(n-2) + c) + c
     = 2 * (2 * (2*T(n-3) + c) + c) + c
     = 2³ * T(n-3) + (2² + 2¹ + 2⁰)*c        (assuming n > 2)
     ... when the substitution is repeated i-1 times ...
     = 2ⁱ * T(n-i) + (2ⁱ⁻¹ + ... + 2¹ + 2⁰)*c
     ... when i = n ...
     = 2ⁿ * T(0) + (2ⁿ⁻¹ + ... + 2¹ + 2⁰)*c
     = 2ⁿ * c1 + (2ⁿ – 1)*c
     = 2ⁿ * (c1 + c) – c
So, the growth rate function is O(2ⁿ).
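The closed form can be cross-checked by counting moves directly. The helper hanoiMoves below is not from the slides; it simply mirrors the recursive structure of hanoi, charging one move per call with n > 0:

#include <iostream>

long long hanoiMoves(int n) {
    if (n <= 0) return 0;
    return hanoiMoves(n - 1) + 1 + hanoiMoves(n - 1);   // matches T(n) = 2*T(n-1) + c
}

int main() {
    for (int n = 1; n <= 10; n++)
        std::cout << "n = " << n << "  moves = " << hanoiMoves(n)
                  << "  2^n - 1 = " << ((1LL << n) - 1) << "\n";  // the two columns agree
    return 0;
}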
What to Analyze
• An algorithm can require different times to solve different
problems of the same size.
  – E.g., searching for an item in a list of n elements using sequential search.
  Cost: 1, 2, ..., n
• Worst-Case Analysis – the maximum amount of time that an
algorithm requires to solve a problem of size n.
  – This gives an upper bound for the time complexity of an algorithm.
  – Normally, we try to find the worst-case behavior of an algorithm.
• Best-Case Analysis – the minimum amount of time that an
algorithm requires to solve a problem of size n.
  – The best-case behavior of an algorithm is NOT so useful.
• Average-Case Analysis – the average amount of time that an
algorithm requires to solve a problem of size n.
  – Sometimes, it is difficult to find the average-case behavior of an algorithm.
  – We have to look at all possible data organizations of a given size n and their
  distribution probabilities.
  – Worst-case analysis is more common than average-case analysis.
What is Important?
• An array-based list retrieve operation is O(1); a linked-list-based
list retrieve operation is O(n).
• But insert and delete operations are much easier on a
linked-list-based list implementation.
→ When selecting the implementation of an Abstract Data
Type (ADT), we have to consider how frequently particular ADT
operations occur in a given application.
• If the problem size is always small, we can probably ignore
the algorithm’s efficiency.
  – In this case, we should choose the simplest algorithm.
What is Important? (cont.)
• We have to weigh the trade-offs between an algorithm’s time
requirement and its memory requirements.
• We have to compare algorithms for both style and efficiency.
  – The analysis should focus on gross differences in efficiency, not reward coding
  tricks that save small amounts of time.
  – That is, there is no need for coding tricks if the gain is not substantial.
  – An easily understandable program is also important.
• Order-of-magnitude analysis focuses on large problems.
Sequential Search
int sequentialSearch(const int a[], int item, int n) {
    int i;
    for (i = 0; i < n && a[i] != item; i++)
        ;                        // empty body; the loop header does the search
    if (i == n)
        return -1;               // unsuccessful search
    return i;                    // item found at index i
}

Unsuccessful Search: O(n)
Successful Search:
  Best-Case: item is in the first location of the array → O(1)
  Worst-Case: item is in the last location of the array → O(n)
  Average-Case: the number of key comparisons is 1, 2, ..., n
    (1 + 2 + ... + n) / n = (n² + n) / (2n) = (n + 1) / 2 → O(n)
Binary Search
int binarySearch(int a[], int size, int x) {
    int low = 0;
    int high = size - 1;
    int mid;                     // mid will be the index of x when it's found

    while (low <= high) {
        mid = (low + high) / 2;
        if (a[mid] < x)
            low = mid + 1;       // x can only be in the upper half
        else if (a[mid] > x)
            high = mid - 1;      // x can only be in the lower half
        else
            return mid;          // found
    }
    return -1;                   // not found
}
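A small usage sketch (assuming it is compiled together with the binarySearch definition above; the array contents are arbitrary but must be sorted):

#include <iostream>

int binarySearch(int a[], int size, int x);   // defined above

int main() {
    int a[] = {2, 5, 8, 12, 16, 23, 38, 56};
    int size = sizeof(a) / sizeof(a[0]);

    std::cout << binarySearch(a, size, 23) << "\n";   // prints 5 (index of 23)
    std::cout << binarySearch(a, size, 7) << "\n";    // prints -1 (not found)
    return 0;
}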
Binary Search – Analysis
• For an unsuccessful search:
  – The number of iterations in the loop is ⌊log₂n⌋ + 1 → O(log₂n)
• For a successful search:
  – Best-Case: the number of iterations is 1 → O(1)
  – Worst-Case: the number of iterations is ⌊log₂n⌋ + 1 → O(log₂n)
  – Average-Case: the average number of iterations is < log₂n → O(log₂n)

  Index (array of size 8):   0  1  2  3  4  5  6  7
  # of iterations:           3  2  3  1  3  2  3  4

  The average # of iterations = 21/8 < log₂8
How much better is O(log₂n)?
  n                      O(log₂n)
  16                     4
  64                     6
  256                    8
  1,024 (1KB)            10
  16,384                 14
  131,072                17
  262,144                18
  524,288                19
  1,048,576 (1MB)        20
  1,073,741,824 (1GB)    30