Scaling Python Portable Pipelines at LinkedIn

Speaker(s): Daniel Chen

Stream processing has always been a core component of LinkedIn's infrastructure, with over a trillion messages processed each day. In this talk, we will share how we expanded our stream processing capabilities by adopting the Beam Python portability framework with the Beam Samza Runner, bringing stream processing in Python to a variety of new use cases. We will start with an overview of the Beam Samza Runner and the progress we have made so far in integrating Beam portable pipelines. We will then present highlights and challenges from the dozens of new production use cases for Python stream processing on the Samza Runner. Finally, we will conclude with the work we have planned to enrich the Python stream processing experience at LinkedIn.