Encyclopedia > Talk:Sort algorithm

  Article Content

Talk:Sort algorithm

Here's a Python module for testing comparison-based sort routines written in Python. (Obviously this won't work for radix and trie sorts, but it should work for most others.) The purpose of this module is to verify that code presented as an implementation of a sort algorithm does, in fact, sort correctly. (Whether it is an implementation of the right algorithm is a little more difficult to verify mechanically.)

Reasons for giving algorithm implementations in Python rather than pseudocode are described on talk:Algorithm.

I call this module sorttest.py.

"""This is a Python module for testing sort routines.  The sort routines
must obey the following interface:

def sort(array, cmp=lambda x, y: x > y):

It doesn't matter what it returns, but it must not throw an exception
in any of the tests.

Note that in Python, the parameter names are part of the interface, so
the parameters must be called 'array' and 'cmp'.  (There can be
additional optional parameters, though.)

When it returns, it should have mutated the array 'array' so that it
contains the same items (in the sense of 'is'), but possibly in a
different order.  The new order must be such that, for any i in
range(len(array-1)), cmp(array[i], array[i+1]) is false.  So, by
default, it sorts ascending.

It must not mutate anything the array points to --- that's cheating
--- but this is hard to test.
"""

import random

def bubblesort(array, cmp=lambda x, y: x > y):
    """The simplest working sort routine, for testing purposes.

    This is here to make it possible to test sort-testing routines."""
    indices = range(len(array))
    indices.reverse()
    for j in indices:
        for i in range(j):
            if cmp(array[i], array[i+1]):
                (array[i], array[i+1]) = (array[i+1], array[i])

OutOfOrder = "sorttest.OutOfOrder"
ScrewedUpArray = "sorttest.ScrewedUpArray"

def brokensort1(array, cmp=lambda x, y: x > y):
    """Broken 'sort' routine that overwrites the whole array with 1 element."""
    for i in range(len(array)): array[i] = array[0]

def brokensort2(array, cmp=lambda x, y: x > y):
    """Broken 'sort' routine that bubblesorts backwards."""
    bubblesort(array, lambda x, y, cmp=cmp: cmp(y, x))

def testsort_onearray(array, sort, cmp=None):
    """Given a sort routine and an array, raise an exception if it sorts wrong.
    """
    arraycopy = array[:]  # copy array
    if cmp is None: sort(array)
    else: sort(array, cmp)

    # verify that the array is in order
    if cmp is None: cmp = lambda x, y: x > y
    for i in range(len(array)-1):
        if cmp(array[i], array[i+1]):
            raise OutOfOrder, (i, arraycopy, array, array[i], array[i+1])

    # verify that it consists of the elements of the original array:
    # doesn't contain any fewer elements:
    if len(array) != len(arraycopy):
        raise ScrewedUpArray, ("array length changed", arraycopy, len(array),
                               len(arraycopy))
    # and doesn't contain any more copies of any element than the original
    # array:
    idmap = {}
    for i in arraycopy:
        if not idmap.has_key(id(i)): idmap[id(i)] = 0
        idmap[id(i)] = idmap[id(i)] + 1
    for i in array:
        if not idmap.has_key(id(i)):
            raise ScrewedUpArray, ("element wasn't in original array",
                                   arraycopy, i)
        idmap[id(i)] = idmap[id(i)] - 1
        if idmap[id(i)] < 0:
            raise ScrewedUpArray, ("element was in original array fewer times",
                                   arraycopy, i)

def qwi(string):
    """Turn a string containing a list of ints into a list of ints."""
    return map(int, string.split())

def testsort(sort):
    """Test a sort routine on a variety of inputs.  Main entry point."""
    
    def backwards(x, y): return x < y
    
    # simplest case: empty array
    testsort_onearray([], sort)
    testsort_onearray([], sort, backwards)
    # sorting a short already-in-order array
    testsort_onearray([1, 2, 3], sort)
    testsort_onearray([3, 2, 1], sort, backwards)
    # an actual array that needs some sorting
    testsort_onearray([4, 2, 5, 3, 6, 0], sort)
    testsort_onearray([4, 2, 5, 3, 6, 0], sort, backwards)
    # and with duplicate elements
    testsort_onearray(qwi('0 0 1 2 2 3 3 3 4 5 5'), sort)
    testsort_onearray(qwi('5 5 5 4 3 2 1 1'), sort, backwards)
    testsort_onearray(qwi('0 0 1 2 2 3 3 3 4 5 5'), sort, backwards)
    testsort_onearray(qwi('5 5 5 4 3 2 1 1'), sort)
    # more more-or-less random tests with lists of integers
    testsort_onearray(qwi('1 33 1 3 1 3 42 1 5 5 1 -1 17 40'), sort)
    testsort_onearray(qwi('1 1 1 1 1'), sort)
    testsort_onearray([1], sort)
    # I'd like to use a bigger random list, but O(N^2) sorts get too slow
    rg = random.Random()
    testsort_onearray([rg.randint(0, 1000) for x in range(100)], sort)
    # verify that it works on things other than integers
    testsort_onearray('vow to bring enlightenment to all sentient beings'
                      .split(), sort)
    testsort_onearray(map(lambda x: [x], qwi('1 3 1 4 5')), sort,
                      lambda x, y: x[0] > y[0])

def test():
    """Test routine to verify that sort-testing routines work.

    This routine runs when the module loads to ensure that it still
    works correctly.

    """

    testsort_onearray([], bubblesort)
    testsort_onearray([], bubblesort, lambda x, y: x < y)
    testsort_onearray([1, 2, 3], bubblesort)
    testsort_onearray([1, 2, 3], bubblesort, lambda x, y: x < y)
    testsort_onearray([3, 2, 1], bubblesort)
    testsort_onearray([3, 2, 1], bubblesort, lambda x, y: x < y)
    testsort_onearray(map(int, '2 3 3 1 41 31 1 0 1'.split()), bubblesort)

    ok = 0
    try: testsort_onearray([1, 2], brokensort2)
    except: ok = 1
    assert ok

    ok = 0
    try: testsort_onearray([1, 2], brokensort1)
    except: ok = 1
    assert ok

    testsort(bubblesort)

    ok = 0
    try: testsort(brokensort1)
    except: ok = 1
    assert ok

    ok = 0
    try: testsort(brokensort2)
    except: ok = 1
    assert ok

test()


Could somebody elaborate a bit on the remark to Selection sort please?

works in-place, but loses stability or gains complexity

--HJH

I think they mean the following: if you use the obvious in-place implementation of selection sort (i.e.: find the smallest element in the list of not-yet-sorted elements, then swap it to the correct position), then it won't stable anymore. If you want to keep it stable, you need to use a more complex algorithm. AxelBoldt 18:39 Oct 17, 2002 (UTC)

This is true of all unstable sorting algorithms, so I don't think it's necessary. Martin


Combsort comments moved to talk:comb sort


I considered using an unordered list instead of a table - please take a look and decide which is (A) easiest to read and (B) easiest to edit... Martin

  • Bubble sort: O(n²), but O(n) on already sorted data; works in-place; stable
  • Cocktail sort (bidirectional bubble sort): O(n²), but O(n) on already sorted data; works in-place; stable
  • Comb sort: O(n log n); works in-place; stable
  • Selection sort: O(n²); works in-place, but loses stability or gains complexity; unstable
  • Straight insertion sort: O(n²), but O(n) on already sorted data; works in-place; stable
  • Bucket sort: O(n); requires O(n) extra memory; stable
  • Counting sort: O(n+k); requires O(n+k) extra memory; stable
  • Heapsort: O(n log n); works in-place; unstable
  • Smoothsort: O(n log n), but O(n) on already sorted data; works in-place; unstable
  • Merge sort: O(n log n); requires O(n) extra memory; stable
  • Quicksort: O(n log n), but O(n²) in worst case; requires O(log n) extra memory; unstable
  • binary tree sort: O(n log n), but O(n²) on already sorted data; requires O(n) extra memory, typically several pointers per input item; stable
  • Pigeonhole sort: O(n+k); requires O(k) extra memory; can only be used when all keys are unique
  • Radix sort: O(n log k); requires O(n) extra space; stable
  • Shell sort: Worst case O(n1.5), Average case O(n1.25, Best case O(n log n) on already sorted data; works in-place; stable
  • Bogosort: Worst case O(n!), Average case O((n/2)!), Best case O(n), depending on luck; requires O(n) extra space; unstable

The table is easier to read and not that difficult to edit (in fact it may be easier). The list gains flexibility, but it's not really needed. Verbose descriptions should be handled in the individual articles. This is just a quick overview and comparison, which tables are well suited for in this case. -nknight 10:25 Apr 26, 2003 (UTC)

Thanks for the feedback - I was uncertain, hence why I wasn't bold... :)
Another suggestion - what if the complexity columns were merged into one column? Might save space/width/... Martin 13:31 Apr 26, 2003 (UTC)

I found and corrected a Big Fat omission from the list of mathematical topics: It did not link to this page! Now it does, but it does not link to most of the separate articles on individual sort algorithms. Since there are only 24 hours in a day, I will not fix that now, so this notice is your chance to beat me to it. Michael Hardy 23:39 May 1, 2003 (UTC)

Is sort algorithm a mathematical topic? I think it is a computer science topic. Anyway, about table. I don't like the current table, which is not easy to read (remember rendering of tables heavily depends on your environment) and not eay to edit. But I don't see the list version you posted better. I will edit the article by my version, which looks better to me. -- Taku 00:10 May 2, 2003 (UTC)

It is a computer science topic, but it is also a mathematical topic. Michael Hardy 00:15 May 2, 2003 (UTC)



All Wikipedia text is available under the terms of the GNU Free Documentation License

 
  Search Encyclopedia

Search over one million articles, find something about almost anything!
 
 
  
  Featured Article
Kings Park, New York

... Park is located at 40°53'19" North, 73°14'33" West (40.888497, -73.242582)1. According to the United States Census Bureau, the town has a total area of 16.3 ...

 
 
 
This page was created in 70 ms