Using Apache Beam to process CDC Streams

Speaker(s): Pablo Estrada & Dylan Hercher

In this talk, we will share our experience building Apache Beam pipelines that process data from change data capture (CDC) systems.

We will review the features these pipelines support, including dynamically adding new tables, managing schema evolution, applying user-provided map functions, dead-letter queue designs, and running DML to ensure consistency.

We will also share the lessons we have learned and pointers on how to try out our solution.