PyFlink on Beam: How does it actually work?

(Aug-27 15:30 UTC)
-----

Beam’s portability framework introduces well-defined, language-neutral data structures and protocols between the SDK and runner. It ensures that SDKs and runners can work with each other uniformly. At the execution layer, the Fn API is provided which is for language-specific user-defined function execution. The Fn API is highly abstract and it includes several generic components such as control service, data service, state service, logging service, etc which make it not only available for Beam, but also third part projects which require multi-language support. PyFlink is such a project which is built on top of Beam’s portability framework that aims to provide Python language support for Apache Flink. So, I would like to talk about how does PyFlink on Beam actually work.

Jincheng Sun
ASF Member, Flink PMC, Beam Committer and Staff Engineer at Alibaba