ToDADS errata: Chapter 5: Sorting

published: Tue, 17-Jun-2003   |   updated: Sat, 20-Aug-2016

1. Per Larsen thought that the section on shuffling in the book could do with a little more explanation, specifically why the code in Listing 5.3 should produce more varied results than the code in Listing 5.2.

The book, in the section on shuffling (pages 136-138), states that the simple shuffle code in Listing 5.2 is wrong because it produces 1 permutation out of a possible n^n permutations whereas the normal way of viewing shuffling is to select one particular permutation out of the n! possible permutations, and that's done by Listing 5.3. But does that make Listing 5.2 produce any less random shufflings?

The answer is yes. Let's see how. Take a deck of 3 cards. Listing 5.2 will produce 27 possible shuffles (3^3). To see this, start with the deck ordered as ABC.

After the first time through the loop, we'll have either ABC, BAC, or CBA. In other words, A will either be swapped with itself to give ABC, swapped with B to give BAC, or swapped with C to give CBA.

After the second time through the loop we'll have 9 possible variants:

  • BAC, ABC, ACB from the first loop's first possible order, ABC.
  • ABC, BAC, BCA from the first loop's second possible order, BAC.
  • BCA, CBA, CAB from the first loop's third possible order, CBA.

After the third and final time through the loop, we have an explosion of 27 possibilities:

  • CAB, BCA, BAC from the first possibility of the second loop, BAC.
  • CBA, ACB, ABC from the second possibility of the second loop, ABC.
  • BCA, ABC, ACB from the third possibility of the second loop, ACB.
  • CBA, ACB, ABC from the fourth possibility of the second loop, ABC.
  • CAB, BCA, BAC from the fifth possibility of the second loop, BAC.
  • ACB, BAC, BCA from the sixth possibility of the second loop, BCA.
  • ACB, BAC, BCA from the seventh possibility of the second loop, BCA.
  • ABC, CAB, CBA from the eighth possibility of the second loop, CBA.
  • BAC, CBA, CAB from the ninth possibility of the second loop, CBA.

However there are only really 6 possible orderings of 3 cards: ABC, ACB, BAC, BCA, CAB, CBA. So where do the other 21 shuffles come from? Well, obviously they're just repeats of the real 6. Unfortunately, since 6 does not divide 27, it must mean that certain orderings will be produced more often than others.

From the above I get

  • ABC 4 times
  • ACB 5 times
  • BAC 5 times
  • BCA 5 times
  • CAB 4 times
  • CBA 4 times

In other words, if you use listing 5.2 to shuffle, you will produce certain orders more often than others and that's even before you start worrying about how biased your random-number generator is. Listing 5.3 will produce non-biased shufflings, at least as unbiased as your random-number generator is unbiased.

Another way of thinking about it is this: would you trust Listing 5.2 as a die thrower algorithm, or Listing 5.3? For an array of 3 items, they'll both produce one permutation out of six, just like a die throw. Sneakily, though, listing 5.2 produces throws biased towards three values out of the six...

Thanks to Per Larsen.

2. The code for QSInsertionSort shown in Listing 5.18 works fine so long as aFirst is 0. If it isn't, then it does some truly wierd and wacky things. The bug is in the initial calculation of j: instead of being set to the cut-off value it should be set to aFirst plus the cut-off value. This will give a proper estimate for upper bound for the initial subset.

procedure QSInsertionSort(...);
var
  ...
begin
  {find the smallest element in the first QSCutOff items and put it in
   the first position}
  IndexOfMin := aFirst;
  j := aFirst + QSCutOff; {!!.01}
  if (j > aLast) then
    j := aLast;
  ...

Thanks to Peter Evans.

3. The code for TDListShuffle shown in Listing 5.3 has a daft bug. The upper limit for the shuffling loop starts at (aEnd - aStart) whereas it should start at aEnd. In essence, if aStart is 0 everything works just fine (and guess what? that's what I used in my testing), but when aStart has a positive non-zero value, the last part of the list doesn't get shuffled.

procedure TDListShuffle(aList : TList;
                        aStart, aEnd : integer);
var
  ...
begin
  TDValidateListRange(aList, aStart, aEnd, 'TDListShuffle');
  {for each element, counting from the right..,.}
  for Inx := aEnd downto aStart + 1 do begin
    ...

Thanks to Simon Wong.