Compare sorting algorithms in C#, article 1 of 5

I'm going to be off the grid for a while so I'm posting a series of articles about sorting algorithms. Much of the text originally appeared in the December 1995 issue of Visual Basic Developer magazine, a publication that seems to no longer exist. All of these posts include the same download: an example program that demonstrates all of the algorithms described. You can use the example to compare the different algorithms under difference circumstances. Keep in mind that some of the algorithms may take a very long time depending on the kind of data you are sorting. For example, Bubblesort will take a very long time to sort 1 million randomly arranged items.

Why So Many Algorithms?

The .NET Framework's Array class provides a Sort method that sorts an array very quickly, apparently by using Quicksort. In most cases, you can simply use Array.Sort to get very good performance. The Framework's implementation is quite good so you are unlikely to beat it's performance for "random" data, so why should you bother to learn how these algorithms work?

Algorithm books spend a great deal of time on sorting algorithms. My old book Visual Basic Algorithms (which was written for Visual Basic 5/6), for example, spends an entire chapter describing different sorting algorithms. There are several reasons why sorting algorithms are so popular.

First, sorting is a common task in computer applications. Almost any data is more meaningful if you can sort it and display it in various ways.

Second, sorting algorithms demonstrate many important algorithmic techniques such as binary subdivision, recursion, and linear interpolation. Studying sorting algorithms will teach you techniques that you may be able to use in other parts of your applications.

Third, different sorting algorithms behave differently for different data so no algorithm is best under all circumstances. Often Quicksort provides excellent performance, but under certain circumstances Bubblesort or Countingsort can give you much better performance.

Finally complex algorithms are just plain fun! Okay, I admit I'm an algorithm nerd but I think of these things as puzzles. It's more fun for me to get a complex algorithm working correctly than to work a Sudoku puzzle.

This series of posts describe several important sorting algorithms, including a Quicksort algorithm that is similar to the one used by Array.Sort. Depending on the circumstances you can pick the algorithm that best suits your data.

Array.Sort

The Array class's Sort method sorts arrays very quickly. It's easy to use and you don't need to debug it so I recommend that you start by using it and decide whether it is fast enough for your needs. If Array.Sort is fast enough, then you don't need to implement, debug, and maintain a more complicated algorithm. If you have performance problems and know something about your data that will let one of the other algorithms give you better performance, then you can try implementing another algorithm.

The example program uses Array.Sort as a benchmark for the other algorithms. The following code shows how the program runs a test of the Array.Sort method.

// Sort with Array.Sort.
private void btnArraySort_Click(object sender, EventArgs e)
{
    RunAlgorithm(txtTimeArraySort, ArraySort);
}

// Run an algorithm.
private delegate void Algorithm(int[] values);
private void RunAlgorithm(TextBox time_textbox, Algorithm algorithm)
{
    time_textbox.Clear();
    Cursor = Cursors.WaitCursor;
    Refresh();
    int num_trials = int.Parse(txtNumTrials.Text);

    // Prepare the data.
    PrepareData();

    TimeSpan elapsed = new TimeSpan(0);
    DateTime stop_time, start_time;
    for (int trial = 0; trial < num_trials; trial++)
    {
        // Copy the unsorted data into the SortedValues array.
        Array.Copy(Values, SortedValues, NumItems);

        // Run the algorithm.
        start_time = DateTime.Now;
        algorithm(SortedValues);
        stop_time = DateTime.Now;

        // Add the elapsed time.
        elapsed += (stop_time - start_time);
    }

    // Display the total time.
    time_textbox.Text = elapsed.TotalSeconds.ToString("0.00") + " sec";

    // Verify that the algorithm worked.
    Verify();

    Cursor = Cursors.Default;
}

When you click the Array.Sort button, the button's event handler calls the RunAlgorithm method passing it the TextBox where it should display the elapsed time and an algorithm.

The next piece of code declares a delegate named Algorithm that represents a method that takes an array of integers as a parameter.

The RunAlgorithm method takes a TextBox and an Algorithm as parameters. It performs some setup and calls PrepareData to make the data that will be sorted.

PrepareData is relatively straightforward so it isn't shown here. It just generates random numbers as specified by the parameters you enter on the form. If the Sorted box is checked, the method sorts the data and then switches some items to make "# Unsorted" of them be unsorted. Download the example and look at the code to see how it works.

Next the RunAlgorithm method performs the trials. For each trial the code copies the unsorted values into the SortedValues array and then calls the current algorithm to sort the SortedValues array.

The method finishes by displaying the total elapsed time.

Note that the timing is approximate. Many trials can even out small random fluctuations but for extremely fast tests such as sorting 5 items, the timing may not be very good. In most cases the results are reasonable, however, and you can use them to compare different algorithms so I'm not going to worry about more accurate timers.

The following code shows how the program responds when you click the Array.Sort button.

// Sort with Array.Sort.
private void btnArraySort_Click(object sender, EventArgs e)
{
    RunAlgorithm(txtTimeArraySort, ArraySort);
}

This code simply calls RunAlgorithm, passing it the txtTimeArraySort TextBox and the ArraySort method to be tested. The code for the other algorithms is similar. Their event handlers just pass different result TextBoxes and algorithm methods.

The following code shows the ArraySort method.

// Sort with Array.Sort.
private void ArraySort(int[] values)
{
    // Sort.
    Array.Sort(values);
}

This code simply calls the Array class's Sort method. Nothing could be simpler!

If you look closely at the picture above, you'll see that Array.Sort beat Quicksort by a fair margin and did much better than Countingsort. (Bubblesort and Selectionsort would have taken forever so I didn't run them on this data.) It's important to note, however, that this test used 10,000 items with values between 0 and 1 million. Different algorithms perform differently with different kinds of data so Array.Sort does isn't always fastest. It's often fastest, however, and very easy to use so it's a good first choice.

The following articles describe the other algorithms in greater detail so you can learn when to use which.

   

 

What did you think of this article?




Trackbacks
  • No trackbacks exist for this post.
Comments

  • 6/18/2012 7:29 AM Ben wrote:
    It's difficult to judge relative performance when you're also timing test setup (the Array.Copy call in the RunAlgorithm loop).
    Reply to this
    1. 6/24/2012 4:42 PM Rod Stephens wrote:
      Actually if you look at the loop, you'll see that the timing code doesn't include the Array.Copy, only the call to the algorithm.

      You could include Array.Copy without messing up the results too much as long as all algorithms use Array.Copy the same number of times. The way it is now, it's still a bit uncertain because you start and stop timing so often but for large tests and special cases you can see the general effects.

      If you want better tests, you could make a big array of arrays ahead of time to sort. And/or you could use higher-performance timers.
      Reply to this
      1. 7/2/2012 1:27 PM Ben wrote:
        Of course your timings include the call to Array.Copy -- even in the first iteration. Read it again!
        Reply to this
        1. 7/6/2012 9:54 AM Rod Stephens wrote:
          Sorry about that. I've fixed it. I think the original version had pieces of the two approaches.
          Reply to this
  • 6/19/2012 4:35 AM Dave Gordon wrote:
    It is strange that so few people ever talk about code optimizations these days. It was not that long ago that those who did not optimized their code would have products that simply were not purchased.

    Thanks for your article - it is nice to see someone returning to the basics.

    I would love to have a blog dedicated to optimizations on MSDN - however, I will just have to search 1001 different blogs for snippets here and there.
    Reply to this
    1. 6/24/2012 4:49 PM Rod Stephens wrote:
      Back in the days when programmer time was cheap and memory, disk, and CPU cycles were expensive, this was a big issue. Now days if something takes too much memory you buy more and if something's too slow you wait a year and buy a new computer.

      You still need to be careful for some problems but for a hug number of general problems a relational database and some sloppy coding will get you by.

      But I still think it's worthwhile understanding these algorithms so you can apply their techniques to your own code.

      And in all fairness, the .NET Framework provides some pretty good algorithmic classes such as Array.Sort, Hashtable, List, etc.
      Reply to this
Leave a comment

Submitted comments are subject to moderation before being displayed.

 Name

 Email (will not be published)

 Website

Your comment is 0 characters limited to 3000 characters.