Distributed Distinct Elements

Emin Karayel 📧

April 3, 2023

Abstract

This entry formalizes a randomized cardinality estimation data structure with asymptotically optimal space usage. It is inspired by the streaming algorithm presented by Błasiok in 2018. His work closed the gap between the best-known lower bound and upper bound after a long line of research started by Flajolet and Martin in 1984 and was to first to apply expander graphs (in addition to hash families) to the problem. The formalized algorithm has two improvements compared to the algorithm by Błasiok. It supports operation in parallel mode, and it relies on a simpler pseudo-random construction avoiding the use of code based extractors.

License

BSD License

Topics

Session Distributed_Distinct_Elements