Bloom filters

This article by Broder and Mitzenmacher gives a good description of how bloom filters work and what they can do for you. The bloom filter basically replaces a dataset with a filter that can tell you if an item is a member of that set or not. It will not give false negatives, but it might give false positives. In practise, this is a negative property that can be outweighted by the space savings a bloom filter introduces; after all, you do not need to query the dataset to determine membership. The most important and summarizing quote you should remember from the article:

The Bloom filter principle: Wherever a list or set is used, and space is at a premium, consider using a Bloom filter if the effect of false positives can be mitigated.

The article also gives a number of examples in which bloom filters are used. E.g. to aid resource location in P2P and cache systems.

This entry was posted on Tuesday, March 3rd, 2009 at 11:19 pm and is filed under Research. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

Leave a reply

Name (*)
Mail (will not be published) (*)
URI
Comment