Writing a priority queue in C# (part 2)

published: Tue, 27-Jan-2004   |   updated: Tue, 20-Apr-2004

In the previous installment of this occasional series, we started writing a priority queue implementation in C# for .NET programmers. My intent is to show you what needs to be done in order to implement a successful container class, rather than one that minimally works.

Last time we'd essentially implemented the basic heap algorithm version of a priority queue, and ensured its proper functioning by writing the code in a TDD (Test-Driven Development) manner. We could enqueue an item with its priority into the queue, and we could dequeue the item with the highest priority. The items we were adding to the queue were simple objects, the priority was an integer value. We also discussed which of the .NET Framework interfaces we should implement, although we made no progress towards properly implementing those interfaces (indeed, all we did was to insert throw statements in the methods to warn us if we decided to try using them(.

We'll start off in this installment by implementing IEnumerable's GetEnumerator method. We start off, of course, by writing the test first. We'll add three items to the priority queue, call the GetEnumerator method, and check to see that we'll get the items back when using the IEnumerator instance.

The issue here is to decide how we're going to return the items. It is my contention that to return them in priority order will be too expensive in terms of time, instead we should just treat the priority queue as a bag, a container that doesn't impose an order on its contained items. That way we are free to (essentially) return the items in the order they appear in the internal array. This does however pose a small problem for our test code: we'll be adding the items in one order, but we'll get them back in another; making it harder to work out what we've seen or not seen as we go through the enumerator.

[Test] public void TestEnumerator() {
    string s;
    // use a hashtable to check contents of PQ
    Hashtable ht = new Hashtable(); 
    for (int i = 0; i < 5; i++) {
        s = "item: " + i.ToString();
        ht.Add(s, null);
        pq.Enqueue(s, i * 2);
    }
    foreach(string str in pq) {
        Assertion.AssertEquals("String from PQ wasn't found in test hashtable",
            true, ht.Contains(str));
        ht.Remove(str); // so we don't see it again
    }
    Assertion.AssertEquals("the hashtable should be empty at the end",
        0, ht.Count);
}

(Note: in this test I'm using the fact that C#'s foreach operator will get the enumerator and iterate through it, without me having to explicitly code it all.)

Now we have the test and a failing test at that (it fails with the NotImplementedException, of course), we can turn to the implementation of GetEnumerator(). It would be lovely to just pass back the result of calling GetEnumerator() on our internal array, but that won't give us what we want. Why? Because our array has extra capacity in the form of 'empty' elements. Our caller would get too many items using this enumerator. So, there's nothing for it but to write a complete IEnumerator implementation. Fortunately, it's not that hard; mostly trivial work

public class PriorityQueue : ICollection {
    ...
    public IEnumerator GetEnumerator() {
        return new PriorityQueueEnumerator(this);
    }
    ...
    private class PriorityQueueEnumerator : IEnumerator {
        private int index;
        private PriorityQueue pq;

        public PriorityQueueEnumerator(PriorityQueue pq) {
            this.pq = pq;
            Reset();
        }

        public void Reset() {
            index = -1;
        }

        public object Current {
            get { return pq.heap[index].Item; }
        }

        public bool MoveNext() {
            if (index + 1 == pq.count) 
                return false;
            index++;
            return true;
        }
    }
}

The enumerator we've written should really check for the priority queue changing underneath as the enumerator is used, so that the caller doesn't get any suddenly missing items or items returned twice. The tests (and there are two) will get the enumerator, use it once and then call Enqueue() or Dequeue(). The next use of the enumerator should throw an exception (InvalidOperationException to be precise).

[ExpectedException(typeof(InvalidOperationException))]
[Test] public void TestEnumeratorWithEnqueue() {
    pq.Enqueue("one", 42);
    IEnumerator ie = pq.GetEnumerator();
    ie.MoveNext();
    pq.Enqueue("two", 13);
    ie.MoveNext(); // should fail
    Assertion.Fail("The previous call to MoveNext() should fail");
}

[ExpectedException(typeof(InvalidOperationException))]
[Test] public void TestEnumeratorWithDequeue() {
    pq.Enqueue("one", 42);
    IEnumerator ie = pq.GetEnumerator();
    ie.MoveNext();
    pq.Dequeue();
    ie.MoveNext(); // should fail
    Assertion.Fail("The previous call to MoveNext() should fail");
}

And the code to do this employs a neat trick: nested classes can access the private members of their outer, owning class. So, we define an int version member in the priority queue, increment it for Enqueue() and Dequeue(), and check it in the enumerator.

public class PriorityQueue : ICollection {
    ...    
    private int version;
    ...
    public object Dequeue() {
        if (count == 0) 
            throw new InvalidOperationException();
        
        object result = heap[0].Item;
        count--;
        trickleDown(0, heap[count]);
        version++;
        return result;
    }

    public void Enqueue(object item, int priority) {
        if (count == capacity)  
            growHeap();
        count++;
        bubbleUp(count - 1, new HeapEntry(item, priority));
        version++;
    }
    ...
    private class PriorityQueueEnumerator : IEnumerator {
        private int index;
        private PriorityQueue pq;
        private int version;

        public PriorityQueueEnumerator(PriorityQueue pq) {
            this.pq = pq;
            Reset();
        }

        private void checkVersion() {
            if (version != pq.version)
                throw new InvalidOperationException();
        }

        public void Reset() {
            index = -1;
            version = pq.version;
        }

        public object Current {
            get { 
                checkVersion();
                return pq.heap[index].Item; 
            }
        }

        public bool MoveNext() {
            checkVersion();
            if (index + 1 == pq.count) 
                return false;
            index++;
            return true;
        }
    }
}

Now we have the enumerator, let's take a look at the other interface we wanted to implement, ICollection. We've already implemented the Count property (and have been testing it in our tests), so the next one we should look at is the CopyTo() method.

For CopyTo(), the main usage I could think of was to get an array of items and their priorities; in other words, an array of HeapEntry items. There is precedence here: Hashtable.CopyTo() returns an array of DictionaryEntry items.

First, the test:

[Test] public void TestCopyTo() {
    string s;
    // use a hashtable to check contents of returned array
    Hashtable ht = new Hashtable(); 
    for (int i = 0; i < 5; i++) {
        s = "item: " + i.ToString();
        ht.Add(s, null);
        pq.Enqueue(s, i * 2);
    }

    HeapEntry[] heArray = new HeapEntry[6];

    pq.CopyTo(heArray, 1);

    foreach(HeapEntry he in heArray) {
        if (he.Item != null) {
            Assertion.AssertEquals("String from PQ wasn't found in test hashtable",
                true, ht.Contains(he.Item));
            ht.Remove(he.Item); // so we don't see it again
        }
    }
    Assertion.AssertEquals("the hashtable should be empty at the end",
        0, ht.Count);
}

Nothing too surprising. Here's the code that enables the test to pass; again nothing too controversial:

public class PriorityQueue : ICollection {
    ...
    public void CopyTo(Array array, int index) {
        System.Array.Copy(heap, 0, array, index, count);
    }
    ...
}

Finally we're left with SyncRoot and IsSyncronized. To be honest, the best way of dealing with synchronization of a priority queue is to leave them to the usual defaults: SyncRoot returns the object itself and IsSynchronized returns false. If we decide to write a fully synchronized version later, it'll be a different class.

At this point we've got a fully functional, fully tested, member of the collections family in .NET. (Well, at least it implements ICollection, which is the deciding factor.) To get to the next level of functionality we should do a couple of things: first, rethink what we mean by priority; second, work out how to serialize a priority queue, so that we can move one around between app domains or even PCs. But that will have to wait until next time.

The current, up-to-date, code as at the end of this article (can be found here).