PyCon Nigeria Annual Conference

Real-Time Analytics with Change Data Capture: A Python, PostgreSQL, Debezium, and RabbitMQ Approach

Nyior Clement

Nyior is a Developer Advocate at 84codes, the company behind CloudAMQP. Since joining the company, he has been creating educational content on RabbitMQ, message queues in general, and their practical application in distributed architectures. Before 84codes, Nyior worked as a full-time Python Software Engineer, creating open-source packages like django-rest-cli and django-rest-paystack.

Description

In today's business environment, the ability to access and analyze data in real time has become crucial. Change Data Capture (CDC) stands out as a powerful strategy for making data from various sources available for analysis as soon as it changes, eliminating delays in insight generation. This presentation delves into CDC's core principles and its vital role in supporting real-time analytics.

Abstract

In this talk, we'll focus on a hands-on example that illustrates setting up a CDC pipeline. Specifically, we'll demonstrate how to capture the changes an upstream Python application makes to a PostgreSQL database, then stream these changes to a downstream Python application for immediate processing, using Debezium for change detection and RabbitMQ for message queuing.
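As a rough illustration of the downstream side, the sketch below decodes one change event and folds it into a running aggregate. It relies on Debezium's JSON change-event envelope (`op`, `before`, `after`). The `RevenueTracker` class, the orders table, and its integer `amount` column are hypothetical and not taken from the talk's demo. It also assumes the table uses `REPLICA IDENTITY FULL`, so that `before` carries the complete old row.

```python
import json
from dataclasses import dataclass


@dataclass
class RevenueTracker:
    """Hypothetical running aggregate kept by the downstream application."""

    order_count: int = 0
    total_amount: int = 0

    def apply(self, raw_message: bytes) -> None:
        """Fold one Debezium-style change event into the aggregate."""
        event = json.loads(raw_message)
        # With schemas enabled, the JSON converter nests the change under "payload".
        payload = event.get("payload", event)
        op = payload["op"]  # "c"=create, "u"=update, "d"=delete, "r"=snapshot read
        # Assumption: REPLICA IDENTITY FULL, so "before" holds the full old row.
        before = payload.get("before")
        after = payload.get("after")

        if op in ("c", "r"):
            self.order_count += 1
            self.total_amount += after["amount"]
        elif op == "u":
            self.total_amount += after["amount"] - before["amount"]
        elif op == "d":
            self.order_count -= 1
            self.total_amount -= before["amount"]
```

Keeping the aggregate incremental means each event is processed in constant time, which is one way a downstream application could keep its figures current without re-querying the source database.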

Attendees will leave with a practical guide on leveraging Python, PostgreSQL, Debezium, and RabbitMQ to create a robust CDC pipeline, empowering their analytics platforms to deliver insights with minimal latency.

This is what the outline would look like:

1. Introduction

  • Briefly introduce the concept of real-time analytics and its importance in modern business operations.
  • Highlight the role of Change Data Capture (CDC) in achieving real-time data processing.

2. Understanding Change Data Capture (CDC)

  • Define CDC and discuss some of its benefits and practical use cases.
  • Explain how CDC works at a high level and demonstrate what a typical CDC pipeline would look like.
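To make the high-level idea concrete, here is a toy, in-memory model of log-based CDC: every write is appended to a change log, and a separate capture process reads the log from its last offset and publishes whatever is new. `ToyDatabase` and `ToyCapture` are invented for illustration. Real tools such as Debezium read the database's own transaction log and are far more involved.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyDatabase:
    """In-memory stand-in for a database with an append-only change log."""

    rows: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def upsert(self, key, row: dict) -> None:
        before = self.rows.get(key)
        self.rows[key] = row
        op = "u" if before is not None else "c"
        self.log.append({"op": op, "before": before, "after": row})

    def delete(self, key) -> None:
        before = self.rows.pop(key)
        self.log.append({"op": "d", "before": before, "after": None})


@dataclass
class ToyCapture:
    """Reads the log from the last seen offset and publishes new entries."""

    db: ToyDatabase
    publish: Callable[[dict], None]
    offset: int = 0

    def poll(self) -> int:
        new_entries = self.db.log[self.offset:]
        for entry in new_entries:
            self.publish(entry)
        # Remember the position so the next poll emits only later changes.
        self.offset += len(new_entries)
        return len(new_entries)


# Usage: a plain list stands in for the message queue.
queue = []
db = ToyDatabase()
capture = ToyCapture(db, publish=queue.append)
db.upsert(1, {"name": "Ada"})
db.upsert(1, {"name": "Ada L."})
db.delete(1)
capture.poll()
```

The application code that writes to the database never calls the capture process. That separation is the main appeal of this style of CDC.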

3. Introducing the Demo CDC Pipeline

  • Introduce the demo Python application we will be working with in our CDC pipeline.
  • Show a complete illustration of our CDC pipeline with all the different components pieced together: Python applications, PostgreSQL, Debezium server, and RabbitMQ.
  • Briefly introduce Debezium server and RabbitMQ, and the roles they play in our CDC pipeline.
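As a sketch of how the downstream Python application might sit at the end of this pipeline, the snippet below consumes change events from a RabbitMQ queue. It assumes the `pika` client library (the talk does not name a client), a broker on `localhost`, and a hypothetical queue named `cdc.orders` to which Debezium server's output has already been routed.

```python
import json


def handle_change(ch, method, properties, body) -> None:
    """pika callback: process one change event, then acknowledge it."""
    event = json.loads(body)
    # With schemas enabled, the JSON converter nests the change under "payload".
    payload = event.get("payload", event)
    print(f"op={payload['op']} after={payload.get('after')}")
    # Ack only after processing, so an unprocessed message is redelivered
    # if this consumer crashes first.
    ch.basic_ack(delivery_tag=method.delivery_tag)


def consume(queue_name: str = "cdc.orders", host: str = "localhost") -> None:
    """Block and process change events from the (hypothetical) CDC queue."""
    import pika  # imported here so handle_change can be exercised without pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    # passive=True only checks that the queue exists; it does not create or alter it.
    channel.queue_declare(queue=queue_name, passive=True)
    channel.basic_consume(queue=queue_name, on_message_callback=handle_change)
    try:
        channel.start_consuming()
    finally:
        connection.close()
```

Calling `consume()` starts the blocking loop. In this arrangement Debezium server is the producer and RabbitMQ the buffer between it and the consumer, so the downstream application can be restarted without the capture side noticing.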

4. Running the CDC pipeline

  • Here, participants get to see CDC in action as we run the demo pipeline.

5. Conclusion

  • A recap of everything

Audience level: Intermediate or Advanced