Archive

Archive for March, 2009

Bloom filters

This article by Broder and Mitzenmacher gives a good description of how bloom filters work and what they can do for you. The bloom filter basically replaces a dataset with a filter that can tell you if an item is a member of that set or not. It will not give false negatives, but it might give false positives. In practise, this is a negative property that can be outweighted by the space savings a bloom filter introduces; after all, you do not need to query the dataset to determine membership. The most important and summarizing quote you should remember from the article:

The Bloom filter principle: Wherever a list or set is used, and space is at a premium, consider using a Bloom filter if the effect of false positives can be mitigated.

The article also gives a number of examples in which bloom filters are used. E.g. to aid resource location in P2P and cache systems.

Map-Reduce in the browser

Someone had to do it: a Map-Reduce system build around the browser. Just point your browser to a URL and you are instantly helping someone to solve large problems by taking part in the process and running a number of jobs. If you think about this, it can even be used to replace advertisements. Instead of looking at flashy ads, a site can load a few tasks in the background (a frame would be best) and use some of your CPU power :-) That would probably even be cheaper for the visitor than running the CPU power drain called “Abobe Flash” to show the usual “OMG you just won an iPod!!!” ads.