Wadler's Blog: Haskell in Production: Bdellium

6.6.15

At Medium, Fredrik (@ique) describes using Haskell in anger.

At the start of the product’s life we mostly analyzed small retirement plans with 100 to 500 plan participants. As time went on we started seeing plans with 2,000 participants and even 10,000 participants. At these numbers the execution time for the processing started going outside of acceptable bounds. Luckily, every participant could be analyzed in isolation from the others and so the problem was embarrassingly parallel.

I changed one line of code from

    map outputParticipant parts

to

    map outputParticipant parts `using` parListChunk 10 rdeepseq

and execution times were now about 3.7x faster on our 4-core server. That was enough to entirely fix the issue for another 6 months, during which time we could focus on further enhancing the product instead of worrying about performance. Later, we did do extensive profiling and optimization of the code to further reduce execution times significantly, but we didn’t want to prematurely optimize anything.
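
The one-line change quoted above can be tried in isolation. What follows is only a minimal sketch, not Bdellium's code: the `Participant` type, its fields, and the body of `outputParticipant` are hypothetical stand-ins, since the post shows neither. The only parts taken from the quote are the names `outputParticipant` and `parts` and the expression using `parListChunk 10 rdeepseq`, which come from `Control.Parallel.Strategies` in the `parallel` package.

```haskell
import Control.Parallel.Strategies (parListChunk, rdeepseq, using)

-- Hypothetical stand-in for a plan participant; the real type is not shown in the post.
data Participant = Participant
  { participantId :: Int
  , balance       :: Double
  }

-- Hypothetical per-participant analysis. What matters for the technique is only
-- that it is pure and looks at one participant at a time.
outputParticipant :: Participant -> Double
outputParticipant p =
  sum [ balance p * 1.05 ^ year | year <- [1 .. 30 :: Int] ]

-- The original, sequential version of the line.
analyzeSequential :: [Participant] -> [Double]
analyzeSequential parts = map outputParticipant parts

-- The parallel version: the same list, evaluated under a strategy that splits it
-- into chunks of 10 elements, sparks each chunk, and fully evaluates every
-- element (rdeepseq) inside its spark.
analyzeParallel :: [Participant] -> [Double]
analyzeParallel parts =
  map outputParticipant parts `using` parListChunk 10 rdeepseq
```

Because `using` only changes how the list is evaluated, not what it contains, `analyzeParallel` returns the same values as `analyzeSequential`. To actually run on several cores, the program has to be compiled with GHC's `-threaded` flag and started with `+RTS -N`; otherwise the sparks are still created but evaluated on a single core. The chunk size of 10 is the value from the quote; a different workload may well prefer a different one.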

Spotted via Manuel Chakravarty @TacticalGrace.

3 comments:

David said...: Chak's twitter is @TacticalGrace. You have the URL right but the label in your article says "TechnicalGrace". Feel free to delete this comment once updated.; 7/7/15 10:33 AM
Philip Wadler said...: Fixed, thanks!; 20/7/15 8:22 AM
Michał said...: Parallel parsing was added to hPDB over a weekend, after the reviewer's request, and has made it the fastest PDB parser among a wide array of languages.
http://www.biomedcentral.com/1756-0500/6/483/table/T3; 23/8/15 6:32 AM