Count-distinct using HLL++ algorithm

(Aug-28 17:40 UTC)
-----

This talk walks through how to write an Apache Beam pipeline that efficiently estimates the number of distinct elements in a massive data set using the HyperLogLog++ algorithm.
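
To give a feel for the idea, here is a minimal sketch in plain Python of classic HyperLogLog, the algorithm that HyperLogLog++ refines. It is not the talk's code and it does not use any Beam API. It also leaves out the HLL++ improvements (sparse representation and empirical bias correction), and the function names `new_sketch`, `add`, `merge`, and `estimate` are hypothetical, chosen for illustration only. What it does show is why the approach suits a distributed pipeline: each worker can keep a small fixed-size sketch, and sketches merge with an element-wise maximum.

```python
import hashlib
import math

P = 12  # precision (assumed for this sketch): 2**12 = 4096 registers


def new_sketch(p=P):
    """Return an empty sketch: one small counter per register."""
    return [0] * (1 << p)


def add(sketch, item, p=P):
    """Hash the item and update a single register."""
    digest = hashlib.sha256(str(item).encode("utf-8")).digest()
    h = int.from_bytes(digest[:8], "big")      # 64-bit hash value
    idx = h >> (64 - p)                        # first p bits pick the register
    rest = h & ((1 << (64 - p)) - 1)           # remaining 64 - p bits
    # Rank = position of the leftmost 1-bit in the remaining bits.
    rank = (64 - p) - rest.bit_length() + 1
    sketch[idx] = max(sketch[idx], rank)


def merge(a, b):
    """Combine two sketches built with the same precision."""
    return [max(x, y) for x, y in zip(a, b)]


def estimate(sketch):
    """Estimate the number of distinct items added to the sketch."""
    m = len(sketch)
    alpha = 0.7213 / (1 + 1.079 / m)           # constant valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in sketch)
    zeros = sketch.count(0)
    if raw <= 2.5 * m and zeros:
        # Small cardinalities: fall back to linear counting.
        return m * math.log(m / zeros)
    return raw


# Two "workers" see overlapping data; the merged sketch estimates the union.
shard_a, shard_b = new_sketch(), new_sketch()
for i in range(6000):
    add(shard_a, f"user-{i}")
for i in range(4000, 10000):
    add(shard_b, f"user-{i}")
approx_distinct = estimate(merge(shard_a, shard_b))
```

With 4096 registers the expected relative error is on the order of 1.6% (roughly 1.04 divided by the square root of the register count), so `approx_distinct` should land close to the 10,000 distinct values in the union while the sketch itself stays the same size however much data is added.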

Robin Qiu
Software Engineer @ Google