Encyclopedia > Quicksort

Article Content

Quicksort

Quicksort is a sorting algorithm which, on average, needs Θ(n log(n)) comparisons to sort n items. It can operate in place, meaning that it requires only a constant amount of additional storage space. Its inner loop is inherently very fast on nearly all computers, which makes it significantly faster than other O(n log n) algorithms that can sort in place in the average case. Quicksort's worst-case performance is O(n²): much worse than some other sorting algorithms such as heapsort or merge sort. Because of its good average performance and simple implementation, Quicksort is one of the most popular sorting algorithms in use. Quicksort was invented by C. A. R. Hoare.

The Quicksort algorithm uses a recursive divide and conquer strategy to sort a list. The steps are:

pick a pivot element from the list.
reorder the list so that all elements less than the pivot precede all elements greater than the pivot.
recursively sort the sub-list of elements less than the pivot and the sub-list of elements greater than the pivot. If one of the sub-lists is empty, it can be ignored.

The inner loop which performs the partition is often very amenable to optimization as all comparisons are being done with a single pivot element. This is one reason why Quicksort is often the fastest sorting algorithm, at least on average. A second optimization is to switch to a different sorting algorithm once the list becomes small. For example, selection sort might be inefficient for large data sets, but it is often faster than Quicksort on small lists. An optimized implementation of Quicksort can choose a size at which it switches from using Quicksort to using another sorting algorithm, such as selection sort.

The most crucial problem of Quicksort is the choice of pivot element. A naive implementation of Quicksort, like the ones below, will be terribly inefficient for certain inputs. For example, if the pivot always turns out to be the smallest element in the list, then Quicksort degenerates to Selection sort with Θ(n²) running time. For reasonably large data sets this running time becomes unacceptable. A secondary problem is the recursion depth. This becomes linear, and the stack requires O(n) extra space. Quicksort can be re-written to be partially recursive. After the partitioning phase there are two sub-lists to be sorted. The shorter sub-list is sorted by a recursive call to Quicksort. The larger one is sorted by the main function using a tail-recursion elimination loop. This limits the addition storage of Quicksort to O(log n).

Simple rules for choosing the pivot, like picking the first, last, or middle element, work fine if the input data is random, but result in horrible performance on certain kinds of (realistic) input data. It is possible to select a pivot in linear time that is always good, the median of the data, but these linear-time median algorithms are so complex and slow that this is never done in practice.

The best practical method of choosing a pivot, when dealing with such data, is to choose it randomly. This requires a random number generator, but ensures that bad running times are extremely unlikely. One can also choose 3 or 5 random elements, and let the pivot be the median of the selected elements.

A final problem is the situation in which the input is provided by an adversary. This is quite possible in, for example, web services. Maybe the data to be sorted was carefully selected with no other aim than to exhaust the CPU time of the server for a denial of service attack. Randomised Quicksort is only safe against these attacks if the random generator is a cryptographically strong random generator. Standard random generators supplied with the programming language are almost never good enough for this purpose. For this reason, most Quicksort routines in current use are not safe against adversarially selected inputs. (See competitive analysis.)

The most direct competitor of Quicksort is heapsort. Heapsort is typically somewhat slower than Quicksort, but the worst-case running time is always O(n log n). Quicksort is usually faster, but needs lots of extra care to avoid the worst-case behaviour problems.

A simple implementation of Quicksort on an array of integers in C:

void quicksort(int * low, int * high)
{
   /* I naively use the first value in the array as the pivot */
   /* this will not give good performance real usage */
   
   int * lowbound = low + 1;       /* the high boundary of the low subarray */
   int * highbound = high - 1;     /* the low boundary of the high subarray */
   int temp;
   
   while(lowbound < highbound + 1) /* partition the array */
   {
      if(*lowbound < *low)         /* compare to pivot */
        lowbound++;                /* move lowbound toward the middle */
      else
      {
         temp = *lowbound;         /* swap *lowbound and *highbound */
         *lowbound = *highbound;
         *highbound = temp;
         highbound--;              /* move highbound toward the middle */
      }
   }
   
   highbound++;                    /* move bounds back to the correct positions */
   lowbound--;
   
   temp = *low;                    /* move the pivot into the middle */
   *low = *lowbound;
   *lowbound = temp;
   
   if(low != lowbound)             /* recurse on the subarrays */
     quicksort(low, lowbound);
   if(high != highbound)
     quicksort(highbound, high);
}

Here's an implementation in Python:

def partition(array, start, end, cmp):
    while start < end:
        # at the top of this loop, our partition element is at 'start'
        while start < end:
            if cmp(array[start], array[end]):
                (array[start], array[end]) = (array[end], array[start])
                break
            end = end - 1
        # at the top of this loop, our partition element is at 'end'
        while start < end:
            if cmp(array[start], array[end]):
                (array[start], array[end]) = (array[end], array[start])
                break
            start = start + 1
    return start

def quicksort(array, cmp=lambda x, y: x > y, start=None, end=None):
    """The fastest exchange sort for most purposes."""
    if start is None: start = 0
    if end is None: end = len(array)
    if start < end:
        i = partition(array, start, end-1, cmp)
        quicksort(array, cmp, start, i)
        quicksort(array, cmp, i+1, end)

Here's a very short version written in the functional programming language Haskell:

 quicksort :: Ord a => [a] -> [a]
 quicksort [] = []
 quicksort (h:t) = (quicksort [y| y<-t, y<h])  ++ [h] ++(quicksort [ y| y<-t,y>h])

Note the use of list comprehensions[?], and also that this uses the head of the list as a pivot element, so it does not give optimum results if the list is already nearly sorted. This version is very easy to understand, as it simply recursively sorts the lists formed by filtering all the items less than the head, and all the items greater than the head, and inserts the singleton list containing the head item in between. The code is almost self explanatory.

External link

Demonstration and comparison of Quicksort with Bubble sort, Merge sort and Radix sort implemented in JavaScript: http://he.net/~mmahoney/sort/sort

All Wikipedia text is available under the terms of the GNU Free Documentation License

Search Encyclopedia

Search over one million articles, find something about almost anything!