The Beautiful Math of Bloom Filters

Probabilistic functions can model many algorithms and procedures, and they can help us tune those procedures for the best results. Experienced software engineers know that almost any software eventually reaches some level of non-determinism, where a solution is not exact but approaches the best result under an optimal configuration. Mathematically, finding such a configuration usually boils down to finding the minima, maxima, or limits of some probabilistic function. In this article, we will explore the beautiful math behind Bloom filters. We will examine their accuracy and trade-offs and see why a Bloom filter can be an excellent choice in some cases, especially in Big Data and OLAP systems that deal with huge and fairly static datasets.