Recently, my brother-in-law, who programs in VB.NET, needed a priority queue that also had some behaviors of a normal queue: dequeue should remove the highest priority item as usual, but if there were two or more objects with the same priority, they should be removed in arrival order (also known as FIFO or first-in-first-out order). The normal heap implementation of a priority queue could not do that since it takes no account of arrival order; the only ordering is by priority and if there were several items with the same priority, they'd be returned in some non-deterministic order.

I pressed him a little more on what behaviors he was looking to fulfill (he'd already downloaded my C# implementation of a heap-based priority queue and found out that it didn't work in the way he wanted). It seems that the application he was writing had only a few priorities, say half a dozen or so, maybe less.

I thought a little. From his description, it seemed as if he wanted to marry a normal queue with a priority queue. But a single queue wouldn't do it; essentially he needed a queue per priority. And that gave me the insight to propose a solution.

Suppose we had an array of queues, one queue for each priority value. To enqueue an item, you would find the queue for the item's priority and then enqueue the item on that queue. To dequeue an item you would visit each queue in the array in priority order until you found one that contained at least one item, and then you'd dequeue the item from that queue.

Let's analyze this a little and see why it suits his situation but is not applicable in every case (and why the heap is still valuable).

For the enqueue operation, finding the right queue happens in constant time (it doesn't matter how many priority values there are; finding the right queue is a simple index into the array of queues). Enqueuing the item on that queue is also a constant time operation, on average. (I say on average, since the queue may occasionally have to be grown. That's an O(n) operation in the number of items in that queue, since they all have to be copied over to the new, larger queue. However, it happens rarely enough to be noise in the run-time statistics, especially in normal use.)

So, all in all, enqueuing an item in this kind of structure is a constant time operation. Wow, we're beating the heap implementation already (for a heap, enqueuing is an O(log(n)) operation).

Unfortunately, as usual, it's the dequeue operation that takes the most time. The first thing we must do is look sequentially through the array of queues for one that contains at least one item. This is an O(n) operation, where n here is the number of queues (in other words, the number of priorities): the time taken is proportional to the number of queues. Dequeuing the item from the queue that's found is a simple constant time operation, since it doesn't matter how many items are in that queue.

So, overall, dequeuing an item is O(n) in the number of priorities. This looks worse than the heap, where dequeuing is O(log(n)) in the number of items.

But note one thing that is hidden by use of this notation. The big-Oh notation is a measure of the performance characteristics of a method as the number of items gets very large. It's essentially a proportionality measure. For small numbers of items the notation can break down.

For example, suppose that our dequeue operation takes time k * n, where k is some constant (that's what O(n) means), and let's assume that the heap's dequeue operation takes m * log(n), where log is the base-10 logarithm. If k is small compared to m, for example it's 1 to m's 10, then for 5 items the new container's dequeue would take 5 units of time to the heap's dequeue's roughly 7 units of time (that is, 10 * log(5)). It's faster, in other words. Note though that if there are 10 items, each method takes 10 units of time, and thereafter, for increasing numbers of items, the heap pulls further and further ahead.
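To make the crossover concrete, here's a tiny sketch that evaluates both cost functions with the illustrative constants above (k = 1, m = 10, base-10 logarithm). The constants are purely for illustration, not measurements of any real implementation:

```csharp
using System;

class CrossoverDemo
{
    // Illustrative cost models only, using the constants from the text:
    // k = 1 for the array-of-queues dequeue, m = 10 for the heap dequeue,
    // with a base-10 logarithm (to match the worked figures above).
    static double ArrayOfQueuesCost(int n) { return 1.0 * n; }
    static double HeapCost(int n) { return 10.0 * Math.Log10(n); }

    static void Main()
    {
        foreach (int n in new[] { 5, 10, 100 })
        {
            Console.WriteLine("n={0}: array-of-queues={1}, heap={2:F1}",
                              n, ArrayOfQueuesCost(n), HeapCost(n));
        }
        // n=5:    5 vs ~7   (array of queues wins)
        // n=10:  10 vs 10   (the crossover point)
        // n=100: 100 vs 20  (heap wins comfortably)
    }
}
```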

Hence, for a small number of priorities, the sequential search through the queues is actually faster than performing the operations on the heap (the search is much less complex than all the shenanigans you go through in removing an item from a heap). If you like, the constant of proportionality for the former dequeue is smaller than the constant for the latter (you can view the constant as a measure of the complexity of the method). As the number of priorities grows, though, the smaller constant gets swamped by the linear cost of the search. So, at some point, it makes sense to switch to a heap-based approach.

But for my brother-in-law, this data structure suited him perfectly. In fact it was such a simple algorithm that he was able to write it in VB.NET for his specific need before I could (I'm just not very conversant with VB).

The biggest problem with the algorithm I proposed to him is this: it's a piece of cake to write if you know the values of your priorities up front and have, if necessary, a way to convert them to an integer value from 0 to the number of priorities minus one.

But what if you don't? What if you wanted to write a reusable priority queue class that used this algorithm?

As I see it there are three unknowns here. What is a priority (in other words, of what type is it)? How many actual priority values are there? How do you map a priority value onto an integer, such that the highest priority value maps to 0 and the lowest maps to the total number of priorities minus one?

My eventual solution (I took an iterative approach to getting this answer using my favorite methodology, test-driven development, colloquially known as TDD) was to create a generic interface that encapsulated this behavior. The interface has two members: the first, `PriorityCount`, returns the number of possible priorities; the second, `Convert`, converts a priority value to an integer between 0 and `PriorityCount - 1`, with 0 reserved for the highest priority, of course.

Here's the interface and a simple implementation of it: it uses the string priorities "high", "medium", and "low" (mapping these values semantically onto 0, 1, and 2).

```csharp
using System;
using System.Globalization;

public interface IPriorityConverter<P>
{
    int Convert(P priority);
    int PriorityCount { get; }
}

public class StringPriorityConverter : IPriorityConverter<string>
{
    int IPriorityConverter<string>.Convert(string priority)
    {
        if (priority == null)
        {
            throw new ArgumentNullException("priority", "Priority value should be set");
        }
        switch (priority.ToLower(CultureInfo.InvariantCulture))
        {
            case "high":
                return 0;
            case "medium":
                return 1;
            default:
                return 2;
        }
    }

    int IPriorityConverter<string>.PriorityCount
    {
        get { return 3; }
    }
}
```
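A quick sanity check of the mapping (this sketch assumes the interface and converter above are compiled in). One wrinkle worth noting: because the interface is implemented explicitly, `Convert` is only reachable through an `IPriorityConverter<string>` reference, not through a `StringPriorityConverter` variable directly:

```csharp
using System;

class ConverterDemo
{
    static void Main()
    {
        // Explicit interface implementation means the members are only
        // visible via the interface type.
        IPriorityConverter<string> converter = new StringPriorityConverter();

        Console.WriteLine(converter.Convert("High"));   // 0 (ToLower makes the match case-insensitive)
        Console.WriteLine(converter.Convert("medium")); // 1
        Console.WriteLine(converter.Convert("urgent")); // 2 (anything unrecognized maps to the lowest priority)
    }
}
```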

Having this interface, it's now simple enough to write the priority queue, and it too lends itself to a generic implementation.

```csharp
using System.Collections.Generic;

public class LimitedPriorityQueue<T, P>
{
    private IPriorityConverter<P> converter;
    private Queue<T>[] queueList;

    public LimitedPriorityQueue(IPriorityConverter<P> converter)
    {
        this.converter = converter;
        this.queueList = new Queue<T>[converter.PriorityCount];
        for (int i = 0; i < queueList.Length; i++)
        {
            queueList[i] = new Queue<T>();
        }
    }

    public void Enqueue(T item, P priority)
    {
        this.queueList[converter.Convert(priority)].Enqueue(item);
    }

    public T Dequeue()
    {
        foreach (Queue<T> q in queueList)
        {
            if (q.Count != 0)
            {
                return q.Dequeue();
            }
        }
        return default(T);
    }
}
```

There's nothing too difficult here: when the priority queue is created, the constructor accepts an `IPriorityConverter<P>` instance and queries it to find out how big the internal array of generic queues should be. Once the array has been created, each queue is created as an element of the array.

And that, believe it or not, is the extent of the difficult code in this class. Enqueue queries the `IPriorityConverter<P>` instance to find the numerical equivalent of the incoming priority and enqueues the new item on the relevant queue in the array. Dequeue searches for the first queue with anything in it and dequeues the front item there. If there are no items at all, the default value for the item type is returned (this could be an error condition instead, of course).
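To see the FIFO-within-priority behavior that started this whole exercise, here's a quick usage sketch (it assumes the `StringPriorityConverter` and `LimitedPriorityQueue<T, P>` classes above are in scope):

```csharp
using System;

class PriorityQueueDemo
{
    static void Main()
    {
        var queue = new LimitedPriorityQueue<string, string>(new StringPriorityConverter());

        // Two items share the "high" priority; they should come out
        // in arrival order, and both ahead of the "medium" item.
        queue.Enqueue("first high", "high");
        queue.Enqueue("a medium", "medium");
        queue.Enqueue("second high", "high");

        Console.WriteLine(queue.Dequeue()); // first high
        Console.WriteLine(queue.Dequeue()); // second high
        Console.WriteLine(queue.Dequeue()); // a medium
    }
}
```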

Having written this extremely trivial class, I was struck with a thought: making it lock-free for a multithreaded environment wouldn't be that difficult.

But you'll have to wait 'til next time for that.