Workshop: Implement a streaming data pipeline with Google Cloud Dataflow

(Aug-25 16:00 UTC)

Join us for a workshop that applies Google Cloud Dataflow to a real-life application, using a retail demo!

The first part of the workshop will provide an overview of Google Cloud Dataflow (Google’s fully managed, scalable Apache Beam runner), followed by an in-depth simulation of a retail application that showcases Dataflow's features.

The second part of the workshop will consist of hands-on labs where participants will work in a real Google Cloud project and implement a Google Cloud Dataflow pipeline. You will build a batch Extract-Transform-Load (ETL) pipeline in Apache Beam from scratch, which reads data from Google Cloud Storage and writes it to Google BigQuery. You will also build and run the pipeline on Google Cloud Dataflow. Using a weblog scenario, the pipeline will introduce the Beam concepts of ParDo, Beam Schemas, PCollections, and IO transforms.
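To give a feel for the transform step in a weblog ETL pipeline like this, here is a minimal sketch in plain Java of the kind of per-element logic a ParDo applies: turn one raw log line into a structured record, and drop lines that do not parse. It deliberately does not use the Beam SDK, so it runs with only the JDK. The class name, the `PageView` fields, and the comma-separated line format are all assumptions made for illustration; they are not the lab's actual schema or code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class WeblogParseSketch {

    // Hypothetical structured record for one weblog entry.
    // Field names are illustrative assumptions, not the lab's schema.
    public static final class PageView {
        public final String userId;
        public final String ip;
        public final int httpStatus;
        public final long bytes;

        public PageView(String userId, String ip, int httpStatus, long bytes) {
            this.userId = userId;
            this.ip = ip;
            this.httpStatus = httpStatus;
            this.bytes = bytes;
        }
    }

    // Parses one line of an assumed "userId,ip,httpStatus,bytes" format.
    // Returns Optional.empty() for malformed lines instead of throwing,
    // so a single bad record does not fail the whole batch.
    public static Optional<PageView> parse(String line) {
        String[] parts = line.split(",", -1);
        if (parts.length != 4) {
            return Optional.empty();
        }
        try {
            int status = Integer.parseInt(parts[2].trim());
            long bytes = Long.parseLong(parts[3].trim());
            return Optional.of(new PageView(parts[0].trim(), parts[1].trim(), status, bytes));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Stand-in for lines read from a file; the values are made up.
        List<String> lines = Arrays.asList(
            "user-1,203.0.113.7,200,512",
            "this line is malformed",
            "user-2,198.51.100.4,404,0");

        for (String line : lines) {
            Optional<PageView> parsed = parse(line);
            if (parsed.isPresent()) {
                PageView v = parsed.get();
                System.out.println(v.userId + " status=" + v.httpStatus + " bytes=" + v.bytes);
            } else {
                System.out.println("skipped: " + line);
            }
        }
    }
}
```

In a Beam pipeline, logic of this shape would typically sit inside a DoFn applied with ParDo, with the input lines coming from a text-reading IO transform and the resulting records written out by a BigQuery IO transform.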


  • Level: Beginner/Intermediate.
  • Labs will be written in Java. Attendees should have rudimentary knowledge of Java or a similar language, of build tools such as Maven, and of cloud platforms.
  • We will be using Qwiklabs for running the labs. Please sign up at .


This workshop has a limited capacity of 60 attendees. If you are committed to participating, please register at

Slack channel

If you have any questions about this workshop or need assistance, please join the #beam-summit-gcp channel in the ASF Slack workspace.

Reza Rokni
Dev advocate Google Dataflow
David Sabater Dinter
EMEA Solutions Lead for Streaming Analytics at Google
Wei Hsia
Smart Analytics Specialist & Customer Engineer at Google Cloud