Note to self: randomly drop lines in a text file

If you ever need to drop lines from a stream of text randomly, you can use this simple and short awk command:

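Example: cat file | awk 'BEGIN {srand()} {if (int(rand()*100) < 10) print $0;}'

This example keeps roughly 10% of the lines and drops the rest. Each line is decided independently, so the exact number of lines kept varies from run to run. Change the 10 to keep a different percentage. The BEGIN {srand()} part seeds the random generator; without it, many awk implementations pick the same lines on every run.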
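If the percentage changes often, the one-liner can be wrapped in a small shell function. This is only a sketch of one way to do it: the name sample_lines and its optional seed argument are made up for illustration and are not part of the original command. Passing a seed makes the selection repeatable, which helps when comparing runs.

```shell
# sample_lines PERCENT [SEED]
# Hypothetical helper: keeps roughly PERCENT% of the lines read from stdin.
# With SEED the same lines are picked every time; without it, awk seeds
# the generator from the current time.
sample_lines() {
    awk -v pct="$1" -v seed="${2:-}" '
        BEGIN { if (seed != "") srand(seed); else srand() }
        # rand() returns a value in [0, 1), so this is true for about pct% of lines
        rand() * 100 < pct
    '
}

# Usage: keep about 10% of a file, repeatably (seed 42 is arbitrary)
# cat file | sample_lines 10 42
```

A percentage of 0 drops every line and 100 keeps every line, since rand() never returns 1.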

As an example, I use this to warm up my MediaWiki installation before doing a real WikiBench benchmark:

cat benchmarks/1pct.trace | head -n 100000 | grep "\-$" | \
awk '{if (int(rand()*100) < 10) print $0;}' | ./ -verbose
