Monitoring ChaosSlack

Are you ready to get a taste of what it’s like to work on an event processing project that can be adapted to many real-life scenarios? Apart from a few guidelines, you are free to choose your own toolset to create ChaosStack’s Slackbot. Grab a coffee, and let’s get down to business.

The popularity of microservices is constantly rising, as this pattern has many favorable properties that make it attractive for startups and larger companies alike. To name a few: applications become easier to build and maintain; you gain flexibility in your choice of technologies and in scalability; services can be organized around business capabilities; productivity and speed can improve; and you can have autonomous, cross-functional teams working together even when they are geographically distributed. Microservices are often event-driven: they only do work when there is an incoming payload to process, then become dormant and listen for new events. With the rise of DevOps, where software development practices are combined with operations, the need for versatile engineers is higher than ever. In this challenge, we’ll explore many of the features of this new era, and you can show that you’re up to the task.

Another team in another timezone created an API library which you can find here:

The competitor credentials are shared in the ChaosStack Slack at the start of the day.

These endpoints are our source of events and also the destination to which we send back the processed messages. At the core of our data pipeline, we’ll use Kafka as our message queue: producers feed Kafka topics, and consumers read from them. Using an Rdkafka implementation in your preferred programming language is recommended. You might need to implement guard logic at the beginning of your pipeline to filter out already seen or sent events before they enter the queue; otherwise, messages you send back can return as new events and be processed again. Redis can come in handy to avoid such infinite loops.
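The guard logic described above could look roughly like the following minimal sketch. It is only an illustration under assumptions: the event field name `id`, the class `SeenEventGuard`, and the function `filter_new` are hypothetical, and an in-memory dictionary stands in for Redis so that the example runs on its own. With Redis, the same check is commonly done with a single `SET key value NX EX ttl` command, which only succeeds the first time a key is written.

```python
import time
from typing import Any, Callable, Dict, Iterable, List


class SeenEventGuard:
    """In-memory stand-in for a Redis-backed set of seen event ids with expiry."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # One hour by default, matching how long the gateway caches events.
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at: Dict[str, float] = {}

    def remember(self, event_id: str) -> None:
        """Record an id, e.g. for a message we have just sent ourselves."""
        self._expires_at[event_id] = self._clock() + self._ttl

    def first_time(self, event_id: str) -> bool:
        """Return True only the first time an id is seen within the TTL."""
        expiry = self._expires_at.get(event_id)
        if expiry is not None and expiry > self._clock():
            return False  # duplicate: already seen or sent
        self.remember(event_id)
        return True


def filter_new(events: Iterable[Dict[str, Any]],
               guard: SeenEventGuard) -> List[Dict[str, Any]]:
    """Keep only events whose (hypothetical) 'id' field has not been seen yet."""
    fresh = []
    for event in events:
        if guard.first_time(str(event["id"])):
            fresh.append(event)
    return fresh


if __name__ == "__main__":
    guard = SeenEventGuard()
    guard.remember("bot-reply-1")  # id of a message the bot sent earlier
    batch = [{"id": "e1"}, {"id": "bot-reply-1"}, {"id": "e1"}, {"id": "e2"}]
    print([e["id"] for e in filter_new(batch, guard)])
```

Running the filter before producing to Kafka keeps duplicates out of the topics entirely, so consumers do not each need their own deduplication.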

For the sake of simplicity, we are going to monitor the ChaosStack Slack and its custom-built gateway infrastructure and send messages back, so in effect we are creating a chatbot (Slackbot) with limited functionality. In real life, the source and the destination could be anything: when a new order is packed in a webshop, you could send the parcel data to a courier service; when an error is thrown in an app, you could create a ticket in an issue tracking system; with multiple producers and consumer groups working on the same topics, you can create event buses that trigger multiple actions; a consumer and a producer packed into the same service, posting events to other Kafka topics based on conditions, form an event router; and so on. The possibilities are limitless.
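To make the event router idea more concrete, here is a small sketch of condition-based routing. Everything specific in it is an assumption for illustration: the event fields `type` and `text`, the topic names, and the helper names `pick_topic` and `route` are hypothetical, and `publish` is a plain callable standing in for whatever Kafka producer call your chosen client provides.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Event = Dict[str, Any]
Rule = Tuple[Callable[[Event], bool], str]

# Hypothetical routing rules: the first matching predicate decides the topic.
RULES: List[Rule] = [
    (lambda e: e.get("type") == "metrics", "gateway-metrics"),
    (lambda e: e.get("type") == "message"
        and "error" in str(e.get("text", "")).lower(), "slack-alerts"),
    (lambda e: e.get("type") == "message", "slack-messages"),
]
DEFAULT_TOPIC = "unrouted"


def pick_topic(event: Event, rules: List[Rule] = RULES,
               default: str = DEFAULT_TOPIC) -> str:
    """Return the topic of the first rule that matches, else the default."""
    for predicate, topic in rules:
        if predicate(event):
            return topic
    return default


def route(events: Iterable[Event],
          publish: Callable[[str, Event], None]) -> Dict[str, int]:
    """Publish each event to its topic and return a per-topic count."""
    counts: Dict[str, int] = {}
    for event in events:
        topic = pick_topic(event)
        publish(topic, event)
        counts[topic] = counts.get(topic, 0) + 1
    return counts


if __name__ == "__main__":
    sample = [
        {"type": "message", "text": "Deploy finished"},
        {"type": "message", "text": "ERROR: disk full"},
        {"type": "metrics", "load": 0.7},
        {"type": "reaction"},
    ]
    print(route(sample, lambda topic, event: None))
```

Keeping the rules in an ordered list means more specific conditions can be placed before general ones, and adding a new topic does not require touching the routing loop.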

All events coming from Slack are cached in the gateway for up to an hour to give you a decent amount of user-generated data to work on. There is also a metrics endpoint sharing statistics about the VM hosting the gateway, which you can use, e.g., to alert on high load or memory consumption. Other teams are querying the API besides you, and although the API can handle up to 80 simultaneous requests, please don’t brute force it deliberately. Polling the events/metrics endpoints every 5-10 seconds should be sufficient.
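A polite polling loop with a simple alert check might be sketched as follows. The metric field names `load` and `memory_used_ratio` and the threshold values are assumptions, since the actual response format of the metrics endpoint is not described here; `fetch` is a hypothetical callable that would wrap the real HTTP request.

```python
import time
from typing import Callable, Dict, List, Optional

Metrics = Dict[str, float]


def check_metrics(metrics: Metrics, load_limit: float = 4.0,
                  memory_limit: float = 0.9) -> List[str]:
    """Return alert texts for every (hypothetical) metric above its limit."""
    alerts: List[str] = []
    load = metrics.get("load")
    if load is not None and load > load_limit:
        alerts.append(f"high load: {load:.2f} > {load_limit:.2f}")
    memory = metrics.get("memory_used_ratio")
    if memory is not None and memory > memory_limit:
        alerts.append(f"high memory use: {memory:.0%} > {memory_limit:.0%}")
    return alerts


def poll(fetch: Callable[[], Metrics],
         handle: Callable[[Metrics], None],
         interval_seconds: float = 5.0,
         max_polls: Optional[int] = None,
         sleep: Callable[[float], None] = time.sleep) -> int:
    """Call fetch/handle repeatedly, waiting interval_seconds between polls.

    max_polls=None polls forever; a number makes the loop finite.
    """
    done = 0
    while max_polls is None or done < max_polls:
        handle(fetch())
        done += 1
        if max_polls is None or done < max_polls:
            sleep(interval_seconds)  # stay within the 5-10 second guideline
    return done


if __name__ == "__main__":
    fake = iter([{"load": 0.5, "memory_used_ratio": 0.40},
                 {"load": 6.2, "memory_used_ratio": 0.95}])
    poll(lambda: next(fake),
         lambda m: print(check_metrics(m)),
         interval_seconds=5.0, max_polls=2,
         sleep=lambda seconds: None)  # skip real waiting in the demo
```

Passing `sleep` and `fetch` in as parameters keeps the loop easy to test without waiting or touching the network; in a real service the handler would produce the alert to a Kafka topic instead of printing it.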



  • An architecture diagram of the implemented data pipeline is presented: 1 point
  • 1 or more Kafka producers are implemented to scrape data from the gateway: 1 point
  • Filtering logic is used to only process relevant new events: 1 point
  • 1 or more Kafka consumers are implemented to process events and send meaningful messages back to the gateway: 1 point
  • 1 or more Kafka topics are used with meaningful separation of events: 2 points
  • 2 or more programming languages are used in producer/consumer implementations: 1 point
  • Microservices are dockerized: 1 point
  • Slack messages are handcrafted, and either useful or funny: 1 point


When you are ready, you should show the elements of your running data pipeline to the person responsible for this task (Immánuel Fodor).

A submission can earn partial points if some of the services are configured and running and fulfill their purpose. Check the criteria above for the points available, up to 10 in total.