Kafka-Based Design for a Delayed Message Queue

Translation wujiuye 216 0 2021-08-15

This article is a translation of the original text, which can be found at the following link: https://www.wujiuye.com/article/a43f90d2332049008d5753395d5b83cd

Author: wujiuye
Link: https://www.wujiuye.com/article/a43f90d2332049008d5753395d5b83cd
Source: 吴就业的网络日记
This article is an original work by the blogger and is not allowed to be reproduced without the blogger's permission.

Since Kafka does not support delayed messages, and our company’s technology stack uses Kafka as the message middleware, the business side wants to use RocketMQ to meet the delayed message scenario. However, introducing another message middleware just for delayed messaging would increase operational and maintenance costs. Against this background, we hope to extend the Kafka client to support delayed messages.

This article will introduce four delayed message implementation schemes, analyzing their advantages and disadvantages.

Scheme One: Time Wheel Algorithm

Each producer holds a time wheel delayed message queue, with messages stored in memory.

Screenshot 2021-08-14 20.48.25.png

Disadvantage analysis:

Scheme Two: Single-Round Time Wheel Algorithm + File Storage (Improved Version of Scheme One)

Using a single-round time wheel algorithm, with each slot pointing to a file.

Screenshot 2021-08-14 21.13.21.png

Disadvantage analysis:

Scheme Three: Multi-Level Partitioning + Automatic Downgrading

Divide messages into multiple partitions based on remaining delay time (expected send time - current time), downgrading messages to a specific partition until reaching the actual topic.

Screenshot 2021-08-14 21.03.19.png

Disadvantage analysis:

For instance, if the delay times in a queue are 50, 40, 36, and 56 minutes, each message must be consumed and rewritten to the same level partition to avoid affecting subsequent delayed messages. In the worst case, a high-delay message might require over 100 sends and subscriptions.

Scheme Four: Multi-Level Delay, Without Supporting Arbitrary Time Precision (Improved Version of Scheme Three)

Referencing RocketMQ’s delayed message design, which does not support arbitrary time precision, but instead supports specific delay levels. Divide delay levels into 1s, 5s, 10s, 30s, 1m, 2m, 3m, 4m, 5m, 6m, 7m, 8m, 9m, 10m, 20m, 30m, 1h, and 2h, with 18 levels in total. Create a delayed topic with 18 partitions, each corresponding to a different delay level.

Screenshot 2021-08-14 20.48.59.png

Example:

Advantages:

We ultimately adopted Scheme Four. In our implementation, we started a KafkaConsumer for each process, using regular expressions to subscribe to topics ending with ‘.delay’, reducing thread resource consumption. When sending messages to the delayed topic, we used the delay level as the message key and stored the original message key in the message header. When sending to the actual topic, we retrieved the real key and real topic from the delayed message header.