Wednesday, 8 April 2009

Hash table woes

Recent performance studies have shown that Haskell's defacto-standard compiler, GHC, still bundles an extremely inefficient hash table implementation. Specifically, the benchmark showed optimized Haskell compiled to native code running even slower than interpreted Python and 32× slower than F#.

Surprisingly, the poor performance turned out to be due to a design flaw in the implementation of GHC's garbage collector that renders it incapable of handling updates to arrays of boxed types efficiently.

Contrary to a statement in the book "Real World Haskell", trees are incapable of providing comparable performance for the majority of use cases. Therefore, it is extremely important that this deficiency is addressed because, without a decent hash table implementation, almost all dictionary data structures will be over an order of magnitude slower in Haskell than other functional languages such as OCaml and F#. Suffice to say, hash tables are an extremely valuable data structure and failing to provide a decent implementation is a serious shortcoming.

Although Haskell programmers have been complaining about the poor performance of its hash tables for at least three years, Donald Stewart has just promised to address the issue by implementing a new and reasonably-performant hash table for GHC.


Lennart Augustsson releases his embedded BASIC

Lennart Augustsson, author of the world's first Haskell compiler, has released the much-anticipated source code to his BASIC DSL written in Haskell that uses LLVM for JIT compilation.

Lennart has been teasing us with a series of fascinating articles describing the elegant DSL capabilities offered by his latest work and it is great to finally have something to play with!

This project demonstrates how the Haskell programming language combined with FFI bindings to LLVM's C API can be used to construct new language implementations quickly and easily.

Functional languages excel at metaprogramming and there are several similar compilers written in functional languages that are buildinng upon LLVM including our own HLVM project that is written in OCaml and provides a complete high-performance garbage collected virtual machine in under 1,000 lines of OCaml code.