Timings so far (on a 350 MB web log, on a 3 GHz Core 2 Duo):

  Implementation          Mem/MB  Time
  C++/Boost                  381  7.36s user 1.55s system 99% cpu 8.921 total
  C++ (g++ 4.3.2)            390  11.30s user 0.35s system 100% cpu 11.653 total
  C (gcc 4.3.2)              171  12.78s user 0.24s system 99% cpu 13.022 total
  PHP 5.2.7                  455  13.36s user 0.39s system 99% cpu 13.768 total
- Java 1.6.0_11               56  14.88s user 0.18s system 100% cpu 14.963 total
  Perl 5.10.0                578  14.87s user 0.49s system 100% cpu 15.360 total
  OCaml 3.11.0               305  23.43s user 0.40s system 99% cpu 23.886 total
  C# (mono 2.0.1)            498  25.53s user 3.12s system 99% cpu 28.749 total
  sh                              25.63s user 1.43s system 100% cpu 27.011 total
  Python 2.6.1               529  27.47s user 0.38s system 100% cpu 27.849 total
  Lua 5.1.4                 1293  30.44s user 1.44s system 99% cpu 31.900 total
  Tcl 8.5.5                 1917  41.31s user 2.07s system 99% cpu 43.388 total
  Ruby 1.9.1preview1         537  54.10s user 0.32s system 99% cpu 54.439 total
  Scheme (MZ 4.1.3)          948  60.50s user 1.91s system 99% cpu 1:02.46 total
- Java (gcj 4.3.2)                72.98s user 0.39s system 99% cpu 1:13.38 total
  Ruby 1.8.7pl72             643  80.57s user 0.39s system 99% cpu 1:21.02 total
  Haskell (ghc 6.10.1)      1450  50.34s user 0.74s system 99% cpu 51.112 total
  mawk 1.3.3                 225  135.40s user 0.67s system 100% cpu 2:16.05 total

Please note that this is not a fair contest because I left the definition of "word" vague. For comparison, implementations that use a definition other than "characters between whitespace" are marked with a - here. Currently only Java.
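
To illustrate why the definition of "word" matters, here is a minimal Python sketch that tokenizes one line in two ways. The sample log line is made up for illustration, and the second definition (runs of letters, digits, and underscores) is only an assumed example of an alternative; it is not necessarily what the marked implementations actually do.

```python
import re

# Made-up sample line in the style of a web log (not taken from the benchmark input).
line = '127.0.0.1 - - [10/Oct/2008:13:55:36] "GET /index.html HTTP/1.0" 200'

# Definition 1: a word is any run of characters between whitespace.
by_whitespace = line.split()

# Definition 2 (assumed alternative): a word is a run of letters, digits, or underscores.
by_word_chars = re.findall(r"\w+", line)

print(len(by_whitespace), len(by_word_chars))  # prints: 8 17
print(by_whitespace[3])  # [10/Oct/2008:13:55:36]
print(by_word_chars[4:10])  # ['10', 'Oct', '2008', '13', '55', '36']
```

The same line yields more than twice as many tokens under the second definition, so implementations that disagree on the definition do different amounts of work and their timings are not directly comparable.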