RabbitMQ: Automatic retries with increasing delay

Aug 29, 2024

I have finally implemented a solution to automatic retries for a RabbitMQ message queue. I have pointed out the problem of retrying failed messages with a delay in a previous post.

I am still sad there is no ready-to-use solution and you have to write code yourself to make this work, but well. At least it is a solution.

You can find an example project in python here: https://github.com/andreas-mausch/rabbitmq-with-retry-and-delay

The concept is heavily inspired by https://devcorner.digitalpress.blog/rabbitmq-retries-the-new-full-story/.

1. Breakdown

So we have a bunch of extra exchanges and queues for this approach.

The main difference to Cyril's solution is that we don't re-send the message to a queue directly, but rather to it's original exchange with it's routing key.

One good thing though is: We can use this retry-block for as my queues as we like. It doesn't have to be repeated for each queue. Phew.

So we can set up a number of retries in our code, each having a delay time. When a message still couldn't be delivered after the last delay, it is finally sent to the retry-dead-letter-exchange. There, an admin can look regulary which messages had problems to be consumed.

In the example code the retry delays are:

5 seconds
2 minutes
30 minutes
6 hours
2 days

My example also covers the case of having a quorum queue, and here it is a bit disappointing: Even though you can still use the retry mechanism via x-delivery-limit, the retries will still be immediate. If you want to have increasing delay here as well, you must skip the automatic quorum queue mechanism by setting requeue=False in case of an error, so the dead-letter-exchange (retry-error-exchange) will be used instead.

2. Why do we need a delay exchange?

First, I was surprised we need that second exchange in the retry-block. Why not just have a single one?

The reason is: In order for x-delay to work, we need an exchange of the type x-delayed-message. So we have the choice of either making our normal exchanges all of this type, or rather have only a single place for this exception.

And the retry-error-exchange cannot be the exchange to be x-delay-message, because at this point the message doesn't have the x-delay header yet.

3. Error handling

So we have covered the case something goes wrong inside our consumer.

In case there is a problem in the consumers of the retry-error-queue or the retry-delay-queue themselves, we define a dead letter exchange for them as well, and it is also the retry-dead-letter-exchange.

Now we keep all non-consumable messages for analysis: even if all retries failed and even if there was a processing problem inside the retry handlers.

4. RabbitMQ Delay Plugin

One more downside to this solution: We cannot just use the basic RabbitMQ docker image anymore, because we need to have the delayed-message-exchange enabled.

See Scheduling Messages with RabbitMQ.

I use heidiks/rabbitmq-delayed-message-exchange:3.13.3-management for now, but we'll see how well this is maintained over time.