Distributed Distinct Elements

Emin Karayel 📧

April 3, 2023


This entry formalizes a randomized cardinality estimation data structure with asymptotically optimal space usage. It is inspired by the streaming algorithm presented by Błasiok in 2018. His work closed the gap between the best-known lower bound and upper bound after a long line of research started by Flajolet and Martin in 1984 and was to first to apply expander graphs (in addition to hash families) to the problem. The formalized algorithm has two improvements compared to the algorithm by Błasiok. It supports operation in parallel mode, and it relies on a simpler pseudo-random construction avoiding the use of code based extractors.


BSD License


Session Distributed_Distinct_Elements