Why Redis for queuing?

I'm under the impression that Redis makes a good candidate for implementing a queuing system. Up to this point we've been using either our MySQL database with polling or RabbitMQ. We've had many problems with RabbitMQ: the client libraries are poor and buggy, and we'd rather not invest many developer-hours in fixing them; there are also a few problems with the server management console. For the time being, at least, we're not grasping for milliseconds or seriously pushing performance, so as long as a system has an architecture that supports a queue intelligently we are probably in good shape.

Okay, so that's the background. Essentially I have a very classic, simple queue model - several producers producing work and several consumers consuming work, and both producers and consumers need to be able to scale intelligently. It turns out a naive PUBSUB doesn't work, since I don't want all subscribers to consume work, I just want one subscriber to receive the work. At first pass, it looks to me like BRPOPLPUSH is an intelligent design.

Can we use BRPOPLPUSH?

The basic design with BRPOPLPUSH is that you have one work queue and one progress queue. When a consumer receives work, the item is atomically pushed onto the progress queue, and when the consumer completes the work it removes the item with LREM. This prevents blackholing of work if clients die and makes monitoring pretty effortless: for instance, we can tell whether something is causing consumers to take a long time to perform tasks (items linger in the progress queue), in addition to telling whether there is a large volume of tasks (the work queue grows).

It ensures

  • work is delivered to exactly one consumer
  • work winds up in a progress queue, so it can't blackhole if a consumer dies
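
The work-queue / progress-queue handoff described above could look roughly like the sketch below. It assumes a hypothetical promise-based `client` whose methods are named after the Redis commands (node-redis method names and calling conventions differ between versions, so adapt accordingly); the key names and the `consumeOne` helper are made up for illustration.

```javascript
// Sketch only: `client` is a hypothetical promise-based Redis client whose
// methods are named after the Redis commands. Key names are illustrative.
const WORK = "queue:work";
const PROGRESS = "queue:progress";

// Claim one item, run the handler, then acknowledge it.
// Resolves to true if an item was processed, false if the pop timed out.
async function consumeOne(client, handler, timeoutSeconds = 5) {
  // BRPOPLPUSH atomically moves the item from WORK to PROGRESS, so it is
  // always in exactly one of the two lists and only one consumer receives it.
  const item = await client.brpoplpush(WORK, PROGRESS, timeoutSeconds);
  if (item === null) return false; // timed out, nothing to do

  // If the handler throws (or the process dies here), the item stays in
  // PROGRESS, where a monitor can see it.
  await handler(item);

  // Acknowledge: remove one matching entry from the progress list.
  await client.lrem(PROGRESS, 1, item);
  return true;
}
```

Note that LREM matches by value, so this sketch assumes payloads are distinguishable (for example, each carries its own ID); two identical payloads in the progress list cannot be told apart.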

The drawbacks

  • It seems rather strange to me that the best design I've found doesn't actually use PUBSUB, since that is what most blog posts about queuing over Redis focus on. So I feel like I'm missing something obvious. The only way I see to use PUBSUB without consuming tasks twice is to publish only a notification that work has arrived, which consumers then act on with a non-blocking RPOPLPUSH.
  • It's impossible to request more than one work item at a time, which looks like a performance problem. It's not a huge one for our situation, but it suggests rather strongly that this operation was not designed for high throughput or for this use.
  • In short: am I missing anything stupid?

I'm also adding the node.js tag, because that's the language I'm mostly dealing with. Node may offer some simplifications in the implementation, given its single-threaded and non-blocking nature. I'm also using the node-redis library, so solutions can (and perhaps should) be sensitive to its strengths and weaknesses.


1 Answer

I've run into some difficulties thus far I'd like to document here.

How do you handle reconnect logic?

This is a hard problem in general, and an especially hard one when designing and implementing a message queue. Messages must be able to queue up somewhere while consumers are offline, so a simple pub-sub is not strong enough, and consumers need to come back from a reconnect in a listening state. A blocking pop is a difficult state to maintain, because it is a non-idempotent way of listening. Listening should be an idempotent operation, yet when a disconnect interrupts a blocking pop you have the pleasure of thinking very hard about whether the disconnect happened just after the operation succeeded or just before it failed. This isn't insurmountable, but it's undesirable.
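
To make the ambiguity concrete, here is one possible way to resolve it after a reconnect. It rests on an assumption that is not part of the design above: each consumer owns its own progress list (the key would be something like `queue:progress:<consumerId>`), and payloads are unique. The `client` is again a hypothetical promise-based client with command-named methods, and `recoverLostClaims` is an illustrative name.

```javascript
// Sketch only. Assumes a per-consumer progress list and unique payloads;
// neither is guaranteed by the single shared progress queue described above.
// `inFlight` is a Set of items this consumer knows it is currently handling.
async function recoverLostClaims(client, progressKey, inFlight) {
  // Everything this consumer has claimed and not yet acknowledged.
  const claimed = await client.lrange(progressKey, 0, -1);
  // Items we hold on the server but never heard about locally were moved
  // by a blocking pop whose reply was lost in the disconnect.
  return claimed.filter((item) => !inFlight.has(item));
}
```

An empty result means the interrupted pop did not succeed, so it is safe to simply listen again; a non-empty result is work that was claimed but never delivered, and can be processed directly.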

Furthermore, the listening operation should be as simple as possible. Ideally it should have these properties:

  • Listening is idempotent.
  • The consumer is always listening, and throttling is handled outside of the listening code. RabbitMQ encapsulates this by letting the consumer bound the number of unacked messages it can have.
    In particular, I initially went with a poor design in which re-entering a blocking pop was contingent on the success of previous operations. That was brittle and required thinking hard about every failure path.
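
A minimal sketch of those two properties, under stated assumptions: `claim` and `handle` are hypothetical async callbacks supplied by the caller (`claim` resolves to the next item or `null`, `handle` processes and acknowledges one item), and `maxUnacked` plays the role of RabbitMQ's bound on unacked messages. The wake-up function can be called any number of times, from any event, without changing the outcome.

```javascript
// Sketch only: claim() and handle() are hypothetical caller-supplied callbacks.
function makePump(claim, handle, maxUnacked) {
  let unacked = 0;     // items claimed but not yet finished
  let pumping = false; // true while the claim loop is running
  let again = false;   // a wake-up arrived while the loop was running

  const kick = () => pump().catch((err) => console.error("pump failed:", err));

  async function pump() {
    // Idempotent: wake-ups that arrive mid-loop collapse into one extra pass.
    if (pumping) { again = true; return; }
    pumping = true;
    try {
      do {
        again = false;
        // Throttling lives here, not in whatever triggers the wake-up.
        while (unacked < maxUnacked) {
          const item = await claim();
          if (item === null) break; // queue is empty
          unacked += 1;
          // Not awaited: handlers run concurrently, at most maxUnacked at once.
          handle(item)
            .catch((err) => console.error("handler failed:", err))
            .finally(() => { unacked -= 1; kick(); });
        }
      } while (again);
    } finally {
      pumping = false;
    }
  }
  return pump;
}
```

The listener then reduces to "call `pump()` whenever anything suggests there might be work": a notification, a reconnect, or a timer. None of those callers needs to know whether a previous operation succeeded.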

I'm now favoring a Redis PUBSUB + RPOPLPUSH solution. This decouples notification of work from consumption of work, which lets us factor out a clean listening solution. The PUBSUB is only responsible for notification of work. The atomic nature of RPOPLPUSH is responsible for consumption, and delegating work to exactly one consumer. At first this solution seemed needlessly complicated compared to a blocking pop, but now I see that the complication was not needless at all; it was solving a hard problem.
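
A rough sketch of that split, assuming the same hypothetical promise-based `client` with command-named methods; the key names, the channel name, and the `enqueue` and `drain` helpers are illustrative, not part of any library.

```javascript
// Sketch only: hypothetical client, illustrative key and channel names.
const WORK = "queue:work";
const PROGRESS = "queue:progress";
const CHANNEL = "queue:work:notify";

// Producer: enqueue first, then announce. The message carries no payload;
// it only tells consumers to go and look at the list.
async function enqueue(client, payload) {
  await client.lpush(WORK, payload);
  await client.publish(CHANNEL, "work");
}

// Consumer: called whenever a notification arrives. RPOPLPUSH is atomic and
// non-blocking, so when several consumers react to the same notification
// each item still goes to exactly one of them.
async function drain(client, handler) {
  let handled = 0;
  for (;;) {
    const item = await client.rpoplpush(WORK, PROGRESS);
    if (item === null) return handled; // empty, or another consumer got there first
    await handler(item);
    await client.lrem(PROGRESS, 1, item); // acknowledge
    handled += 1;
  }
}
```

The subscription callback would simply call `drain`. Because a Redis connection in subscriber mode cannot issue ordinary commands such as RPOPLPUSH, the subscription needs its own connection, separate from the one passed to `drain`.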

However, this solution isn't quite trivial:

  • Consumers should also check for work on reconnect, since notifications published while they were disconnected are lost.
  • Consumers may want to poll for new work anyway, for redundancy. If a poll actually finds work, a warning should be emitted: that should only happen in the short window between a notification arriving on the PUBSUB and the RPOPLPUSH it triggers. Many successful polls therefore indicate a broken subscription system.
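
The two points above might be wired up along these lines. This is a sketch under assumptions: `drain` is a hypothetical async function that claims and processes whatever is queued and resolves to the number of items it handled, and `makeSafetyNet` and its method names are invented for illustration.

```javascript
// Sketch only: `drain` is a hypothetical caller-supplied function that
// resolves to the number of items it claimed and processed.
function makeSafetyNet(drain, warn = console.warn) {
  let pollHits = 0; // how many items were only ever found by polling

  return {
    // Call from the client's reconnect hook. Finding work here is expected,
    // so no warning is emitted.
    async onReconnect() {
      return drain();
    },

    // Call on a timer. Finding work here means a notification was missed
    // or is running late.
    async poll() {
      const found = await drain();
      if (found > 0) {
        pollHits += found;
        warn(`poll found ${found} item(s) without a notification (${pollHits} total)`);
      }
      return found;
    },

    get pollHits() {
      return pollHits;
    },
  };
}
```

A steadily growing `pollHits` counter is the signal that the subscription side is broken and the consumers are effectively running on polling alone.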

Note that the PUBSUB/RPOPLPUSH design also has scaling problems. Every consumer receives a lightweight notification for every message, so notification traffic grows with the number of consumers even though only one of them gets each item. That is an unnecessary bottleneck. I suspect it's possible to use channels to shard the work, but this is probably a tricky design to work out well.
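
As a starting point only, sharding could mean giving each shard its own work list and its own notification channel, with the producer choosing a shard per job. Everything in this sketch is an assumption made for illustration: the shard count, the key naming, the `shardFor` helper, and the choice of hashing a job ID.

```javascript
// Sketch only: shard count and key names are illustrative.
const SHARDS = 4;

// Small deterministic string hash (a djb2 variant); any stable hash would do.
function hash(str) {
  let h = 5381;
  for (let i = 0; i < str.length; i++) {
    h = ((h * 33) ^ str.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return h;
}

// Map a job ID to the list it is pushed onto and the channel it is announced on.
function shardFor(jobId, shards = SHARDS) {
  const n = hash(String(jobId)) % shards;
  return { list: `queue:work:${n}`, channel: `queue:work:notify:${n}` };
}
```

A consumer would then subscribe only to the channels of the shards it serves, and drain only the matching lists. The tricky part remains unsolved here: a consumer pinned to a quiet shard sits idle while another shard backs up, so some rebalancing or overlapping assignment would probably be needed.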
