Algorithms Analysis, Lecture 6: Quicksort
Quick Sort (Divide and Conquer). Input: 88 14 98 25 62 52 79 30 23 31
Quick Sort: 88 14 98 25 62 52 79 30 23 31. Partition the set into two using a randomly chosen pivot: {14 25 30 23 31} ≤ 52 ≤ {88 98 62 79}
Quick Sort: {14 25 30 23 31} ≤ 52 ≤ {88 98 62 79}. Sort the first half: 14, 23, 25, 30, 31. Sort the second half: 62, 79, 88, 98.
Quick Sort: glue the pieces together: 14,23,25,30,31 + 52 + 62,79,88,98 → 14, 23, 25, 30, 31, 52, 62, 79, 88, 98
Quicksort
• Quicksort pros [advantages]:
 – Sorts in place
 – Runs in O(n lg n) time in the average case
 – Very efficient in practice; it is quick
• Quicksort cons [disadvantages]:
 – Runs in O(n²) time in the worst case
 – The worst case (e.g., already-sorted input) does not happen often
Quicksort
• Another divide-and-conquer algorithm:
• Divide: A[p…r] is partitioned (rearranged) into two (possibly empty) subarrays A[p…q-1] and A[q+1…r] such that each element of A[p…q-1] is less than or equal to A[q], which in turn is less than or equal to each element of A[q+1…r]. The index q of the pivot is computed as part of this step.
• Conquer: the two subarrays are sorted by recursive calls to quicksort.
• Combine: unlike merge sort, no work is needed, since the subarrays are already sorted in place.
Quicksort
• The basic algorithm to sort an array A consists of the following four easy steps:
 – If the number of elements in A is 0 or 1, then return
 – Pick any element v in A; this is called the pivot
 – Partition A−{v} (the remaining elements of A) into two disjoint groups:
  • A1 = {x ∈ A−{v} | x ≤ v}, and
  • A2 = {x ∈ A−{v} | x ≥ v}
 – Return {quicksort(A1) followed by v followed by quicksort(A2)}
Quicksort
• A small instance has n ≤ 1
 – Every small instance is a sorted instance
• To sort a large instance:
 – Select a pivot element from among the n elements
 – Partition the n elements into 3 groups: left, middle, and right
  • The middle group contains only the pivot element
  • All elements in the left group are ≤ pivot
  • All elements in the right group are ≥ pivot
 – Sort the left and right groups recursively
 – The answer is the sorted left group, followed by the middle group, followed by the sorted right group
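The three-group scheme above can be sketched in Python. This is a sketch, not the textbook's in-place version: it builds new lists, and by assumption ties go into the left group.

```python
def quicksort3(a):
    """Sort a list using the three-group scheme: pick a pivot, split
    the remaining elements into a left group (<= pivot) and a right
    group (> pivot), recurse, and concatenate."""
    if len(a) <= 1:              # a small instance is already sorted
        return a
    pivot = a[-1]                # rightmost element as pivot
    rest = a[:-1]
    left = [x for x in rest if x <= pivot]    # ties go left
    right = [x for x in rest if x > pivot]
    return quicksort3(left) + [pivot] + quicksort3(right)
```

On the running example, `quicksort3([88, 14, 98, 25, 62, 52, 79, 30, 23, 31])` returns the fully sorted list.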
Quicksort Code
p: first element, r: last element
Quicksort(A, p, r) {
 if (p < r) {
  q = Partition(A, p, r)
  Quicksort(A, p, q-1)
  Quicksort(A, q+1, r)
 }
}
• The initial call is Quicksort(A, 1, n), where n is the length of A
Partition • Clearly, all the action takes place in the partition() function – Rearranges the subarray in place – End result: • Two subarrays • All values in first subarray ≤ all values in second – Returns the index of the “pivot” element separating the two subarrays
Partition Code
Partition(A, p, r) {
 x = A[r]  // x is the pivot
 i = p - 1
 for j = p to r-1 {
  if A[j] ≤ x then {
   i = i + 1
   exchange A[i] ↔ A[j]
  }
 }
 exchange A[i+1] ↔ A[r]
 return i+1
}
partition() runs in O(n) time
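The partition pseudocode above, together with the driver from the previous slide, translates directly into runnable Python. A sketch with 0-based indices replacing the slides' 1-based ones:

```python
def partition(a, p, r):
    """Lomuto partition of a[p..r] (inclusive): a[r] is the pivot;
    returns the pivot's final index."""
    x = a[r]                      # pivot
    i = p - 1                     # right edge of the <= region
    for j in range(p, r):         # j = p .. r-1
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]   # put the pivot in its slot
    return i + 1

def quicksort(a, p, r):
    """In-place quicksort driver from the previous slide."""
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
```

On the example traced on the next slide, A = {2, 8, 7, 1, 3, 5, 6, 4}, a single `partition` call leaves the array as 2 1 3 4 7 5 6 8 and returns the index of the pivot 4.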
Partition Example
A = {2, 8, 7, 1, 3, 5, 6, 4}, pivot x = A[r] = 4:
 initial (i = p-1, j = p): 2 8 7 1 3 5 6 4
 j = 1, A[j] = 2 ≤ 4: 2 8 7 1 3 5 6 4 (i = 1)
 j = 2, 3 (8, 7 > 4): no change
 j = 4, A[j] = 1 ≤ 4: 2 1 7 8 3 5 6 4 (i = 2)
 j = 5, A[j] = 3 ≤ 4: 2 1 3 8 7 5 6 4 (i = 3)
 j = 6, 7 (5, 6 > 4): no change
 final exchange A[i+1] ↔ A[r]: 2 1 3 4 7 5 6 8, return q = 4
Partition Example Explanation
• Red shaded elements are in the first partition, with values ≤ x (pivot)
• Gray shaded elements are in the second partition, with values > x (pivot)
• The unshaded elements have not yet been put into one of the first two partitions
• The final white element is the pivot
Choice Of Pivot
Three ways to choose the pivot:
• The pivot is the rightmost element of the list to be sorted
 – When sorting A[6:20], use A[20] as the pivot
 – The textbook implementation does this
• Randomly select one of the elements to be sorted as the pivot
 – When sorting A[6:20], generate a random number r in the range [6, 20]
 – Use A[r] as the pivot
Choice Of Pivot
• Median-of-three rule: from the leftmost, middle, and rightmost elements of the list to be sorted, select the one with the median key as the pivot
 – When sorting A[6:20], examine A[6], A[13] (13 = (6+20)/2), and A[20]
 – Select the element with the median (i.e., middle) key
 – If A[6].key = 30, A[13].key = 2, and A[20].key = 10, then A[20] becomes the pivot
 – If A[6].key = 3, A[13].key = 2, and A[20].key = 10, then A[6] becomes the pivot
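The median-of-three rule can be sketched as follows; `median_of_three` is a hypothetical helper name, and indices are 0-based rather than the slides' A[6:20]:

```python
def median_of_three(a, lo, hi):
    """Return the index of the median of a[lo], a[mid], a[hi],
    where mid = (lo + hi) // 2 -- the median-of-three pivot rule."""
    mid = (lo + hi) // 2
    # sort the three (key, index) pairs and take the middle one
    return sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])[1][1]
```

With keys 30, 2, 10 at the left, middle, and right positions, the rightmost element is chosen, matching the slide's first example.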
Worst Case Partitioning
• The running time of quicksort depends on whether the partitioning is balanced or not.
• Partitioning an array of n elements takes Θ(n) time
• Let T(n) be the time needed to sort n elements
• T(0) = T(1) = c, where c is a constant
• When n > 1,
 – T(n) = T(|left|) + T(|right|) + Θ(n)
• T(n) is maximal (worst case) when either |left| = 0 or |right| = 0 after each partitioning
Worst Case Partitioning
Worst Case Partitioning
• Worst-case performance (unbalanced):
 – T(n) = T(1) + T(n-1) + Θ(n)  (partitioning takes Θ(n))
  = [2 + 3 + 4 + … + (n-1) + n] + n
  = [∑k=2..n k] + n = Θ(n²)
• This occurs when
 – the input is completely sorted, or when
 – the pivot is always the smallest (or largest) element
• Recall: ∑k=1..n k = 1 + 2 + … + n = n(n+1)/2 = Θ(n²)
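The worst-case recurrence can be checked by unrolling it directly. The cost unit here is an assumption for illustration: one unit per element touched by partition.

```python
def worst_case_cost(n):
    """Total partitioning cost when every split is 0 / (n-1):
    unrolls T(n) = T(n-1) + n down to T(1) = 1."""
    return sum(range(1, n + 1))   # 1 + 2 + ... + n
```

The result agrees with the closed form n(n+1)/2, which is Θ(n²).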
Best Case Partition
• When the partitioning procedure produces two regions of size n/2, we get a balanced partition with best-case performance:
 – T(n) = 2T(n/2) + Θ(n) = Θ(n lg n)
• The average complexity is also Θ(n lg n)
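Unrolling the best-case recurrence numerically, with the assumed base case T(1) = 0, gives exactly n lg n when n is a power of two:

```python
def best_case_cost(n):
    """Unroll T(n) = 2*T(n/2) + n with T(1) = 0, for n a power of two."""
    return 0 if n <= 1 else 2 * best_case_cost(n // 2) + n
```

For example, `best_case_cost(1024)` equals 1024 · lg 1024 = 1024 · 10.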
Best Case Partitioning
Average Case
• Assuming random input, the average-case running time is much closer to Θ(n lg n) than to Θ(n²)
• First, a more intuitive explanation/example:
 – Suppose that partition() always produces a 9-to-1 proportional split. This looks quite unbalanced!
 – The recurrence is thus: T(n) = T(9n/10) + T(n/10) + Θ(n) = Θ(n lg n)? [Use the recursion-tree method to solve]
Average Case
T(n) = T(n/10) + T(9n/10) + Θ(n) = Θ(n log n)!
(Change of base: log2 n = log10 n / log10 2, so log10 n = Θ(lg n).)
Average Case
• Every level of the recursion tree has cost cn, until a boundary condition is reached at depth log10 n = Θ(lg n); after that, the levels have cost at most cn.
• The recursion terminates at depth log10/9 n = Θ(lg n).
• The total cost of quicksort is therefore O(n lg n).
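The 9-to-1 recurrence can be evaluated numerically to confirm the bound. The base case T(1) = 0 is assumed, and floors replace the exact fractions:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def t(n):
    """The 9-to-1 recurrence T(n) = T(n/10) + T(9n/10) + n, T(1) = 0."""
    if n <= 1:
        return 0
    return t(n // 10) + t(9 * n // 10) + n
```

Each level of the tree costs at most n and the tree has depth log10/9 n (about 131 for n = 10⁶), so T(n) stays well below n · 132 there, i.e., within O(n lg n).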
Average Case
• What happens if we bad-split the root node, then good-split the resulting size-(n-1) node?
 – We end up with three subarrays of sizes 1, (n-1)/2, (n-1)/2
 – The combined cost of the two splits is n + (n-1) = 2n - 1 = Θ(n)
[Figure: recursion trees comparing a bad split n → 1, (n-1) followed by a good split into (n-1)/2 and (n-1)/2, against a single good split of n; both have Θ(n) cost at the top levels.]
Intuition for the Average Case
• Suppose we alternate lucky and unlucky cases to get an average behavior:
 L(n) = 2U(n/2) + Θ(n)  (lucky)
 U(n) = L(n-1) + Θ(n)  (unlucky)
Substituting, we consequently get
 L(n) = 2(L(n/2 - 1) + Θ(n/2)) + Θ(n) = 2L(n/2 - 1) + Θ(n) = Θ(n log n)
• The combination of good and bad splits still results in T(n) = O(n lg n), but with a slightly larger constant hidden by the O-notation.
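The alternating lucky/unlucky recurrences can also be evaluated numerically; the base cases L(1) = U(1) = 0 are assumptions for illustration:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def lucky(n):
    """L(n) = 2*U(n/2) + n: a perfect split whose children are unlucky."""
    if n <= 1:
        return 0
    return 2 * unlucky(n // 2) + n

@lru_cache(maxsize=None)
def unlucky(n):
    """U(n) = L(n-1) + n: a worst split followed by a lucky subproblem."""
    if n <= 1:
        return 0
    return lucky(n - 1) + n
```

Numerically, L(n) stays within a small constant factor of n lg n (roughly 2 n lg n), matching the "larger hidden constant" remark above.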
Randomized Quicksort
• An algorithm is randomized if its behavior is determined not only by the input but also by values produced by a random-number generator.
• In Partition, exchange A[r] with an element chosen at random from A[p…r].
• This ensures that the pivot element is equally likely to be any of the input elements.
• We can sometimes add randomization to an algorithm in order to obtain good average-case performance over all inputs.
Randomized Quicksort
Randomized-Partition(A, p, r)
1. i ← Random(p, r)  // pick the pivot index at random
2. exchange A[r] ↔ A[i]  // pivot swap
3. return Partition(A, p, r)
Randomized-Quicksort(A, p, r)
1. if p < r
2.  then q ← Randomized-Partition(A, p, r)
3.   Randomized-Quicksort(A, p, q-1)
4.   Randomized-Quicksort(A, q+1, r)
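The randomized pseudocode translates directly; a sketch with 0-based indices, where Random(p, r) becomes `random.randint(p, r)` (inclusive on both ends):

```python
import random

def randomized_partition(a, p, r):
    """Swap a random element of a[p..r] into the pivot slot, then run
    a standard Lomuto partition; returns the pivot's final index."""
    i = random.randint(p, r)          # Random(p, r), inclusive
    a[r], a[i] = a[i], a[r]           # pivot swap
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def randomized_quicksort(a, p, r):
    """In-place randomized quicksort, mirroring the pseudocode above."""
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)
```

Because the pivot index is drawn uniformly, no fixed input (sorted or otherwise) forces the Θ(n²) behavior; the expected running time is Θ(n lg n) for every input.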
Review: Analyzing Quicksort • What will be the worst case for the algorithm? – Partition is always unbalanced • What will be the best case for the algorithm? – Partition is balanced
Summary: Quicksort
• In the worst case, efficiency is Θ(n²)
 – But the worst case is easy to avoid
• On average, efficiency is Θ(n lg n)
• Better space complexity than mergesort
• In practice, it runs fast and is widely used
