Monday was something of a landmark for the
Haskell community, as the 500th Haskell package was added to the Arch Linux distribution. You can
see all
the Haskell packages here (excepting some packages in the core
system, like xmonad and
ghc). All these packages are built from source hosted on http://hackage.haskell.org.
Similar efforts to comprehensively package Haskell natively are underway
on Debian/Ubuntu, Gentoo and Fedora.
And the benefit of automation is stark. Instead of a developer
maintaining, say, 5 or even 50 packages, with automation and a
declarative build specification such as Cabal provides, one person can
construct and maintain 500 packages with relative ease. That's an order
of magnitude improvement in productivity!
Hackage is kicking along. In total, 257
Haskell developers have uploaded 723
Haskell applications, tools and libraries to Hackage since January
2007 when Hackage went live. That's 36 new packages a month on average.
In July 2008, 160 packages were updated, and there have been 120 already
in August.
Some choice statistics about Hackage:
- 723 unique packages
- Over a million lines of Haskell code
- 2600 pages of API documentation
(And to get a sense of what an achievement it is to have 500 Haskell
packages in Arch Linux native format, Arch has, in contrast, only 6
Erlang packages, 21
OCaml packages, 28
Lisp packages, 101
Ruby packages, and 495
Python packages).
Open source works.
In this post, I want to give a bit of an overview of the variety of
Haskell software now available in the native package system, how the
process of native packaging was automated, and future directions for the
Haskell platform.
So what do we get to play with?
Here are ten cool Haskell packages, in no particular order, that I like
(and clearly I'm biased towards developer-oriented stuff), that you
might not have known about, all now available in the Arch package
system. Later, we'll look at how all this was produced, and how we connect
Cabal and Hackage to a distro package system.
0. WWW: feed-cli
feed-cli is
a great little command line tool, written in Haskell, for generating, or
appending to, RSS feeds. I use it to generate RSS feeds of cron job
events (such as package uploads). It uses the comprehensive RSS and Atom
feed
generation and parsing library.
1. Development: ghc-core
ghc-core is a
pager-like program for displaying GHC's intermediate structures
and generated assembly in a human-readable way. If you're working on
low-level numerics libraries, or high-performance code, ghc-core makes
it easy to get a sense for the code GHC generates. It uses the PCRE regex
bindings and hscolour, a
syntax colouriser for Haskell, to generate nice output.
2. Science: haskell-blas, haskell-hmatrix and haskell-fftw
Numerics libraries have long been a missing link for Haskell, and in the
past few months we've seen several emerge, in particular haskell-blas,
a binding to BLAS and LAPACK, and haskell-hmatrix,
efficient bindings to the GSL (Fortran!) code. There's also a haskell-fftw
binding now. And Cabal takes care of all the tedium of linking against C
and Fortran, and you can just use the nice high level interface.
3. Data structures: haskell-bloomfilter
Bloom filters
are unusual data structures. They're set-like, and are highly efficient
in their use of space, and they only support two operations: insert and
membership querying. And unlike normal data structures, Bloom filters can
give incorrect answers: a membership query may report a false positive
(though never a false negative). The false positive rate is low, however,
which is tolerable for some applications (say, traffic
shaping). The haskell-bloomfilter package gives us fast and efficient
Bloom filters for Haskell, with both pure and impure interfaces.
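To make the idea concrete, here is a minimal toy Bloom filter using only base. It is a sketch for illustration, not the haskell-bloomfilter API: the names (Bloom, emptyBloom, insert, member) and the simple salted hash are my own assumptions, and a real implementation would use a packed bit array and better hash functions.

```haskell
import Data.Bits (setBit, testBit, xor)
import Data.Char (ord)
import Data.List (foldl')

-- | A toy Bloom filter (hypothetical type, not from haskell-bloomfilter):
-- a bit array stored in an Integer, its size in bits, and the number of
-- hash functions applied to each element.
data Bloom = Bloom
  { bloomBits   :: Integer
  , bloomSize   :: Int
  , bloomHashes :: Int
  }

-- | An empty filter with the given size (in bits) and hash count.
emptyBloom :: Int -> Int -> Bloom
emptyBloom size hashes = Bloom 0 size hashes

-- | A simple salted string hash, for illustration only.
-- Values stay below 2^32, so this assumes a 64-bit Int.
hashWith :: Int -> String -> Int
hashWith salt = foldl' step (2166136261 + salt)
  where
    step h c = ((h `xor` ord c) * 16777619) `mod` 4294967296

-- | The bit positions an element maps to, one per hash function.
positions :: Bloom -> String -> [Int]
positions b x =
  [ hashWith salt x `mod` bloomSize b | salt <- [1 .. bloomHashes b] ]

-- | Insert an element by setting all of its bit positions.
insert :: String -> Bloom -> Bloom
insert x b = b { bloomBits = foldl' setBit (bloomBits b) (positions b x) }

-- | Membership test: True if every bit position is set. This can be a
-- false positive, but an inserted element is never reported as absent.
member :: String -> Bloom -> Bool
member x b = all (testBit (bloomBits b)) (positions b x)
```

Loaded into GHCi, `member "xmonad" (insert "xmonad" (emptyBloom 1024 3))` is True, while a query for a string that was never inserted is usually, but not always, False. Note there is no delete operation: clearing a bit could break membership for other elements that share it.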
4. Types: haskell-dimensional
No more mixing up yards and metres, or cubic centimetres and
millilitres: the dimensional
library encodes the standard units of measurement in the Haskell type
system, so that it becomes a compile time error to mix up units. The
library encourages best practices for unit usage, and shows how an
entire class of bugs can be eliminated via an expressive type system.
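The underlying trick can be sketched in a few lines with phantom types. This is not the dimensional library's API (which is far more general, tracking full physical dimensions); the Quantity type and the helper names below are hypothetical, chosen just to show how a unit mix-up becomes a type error.

```haskell
-- | A quantity tagged with a phantom unit type. The tag exists only at
-- compile time; at run time this is just a Double.
newtype Quantity unit = Quantity Double
  deriving (Show, Eq, Ord)

-- Empty types used purely as unit tags (hypothetical names).
data Metre
data Yard

metres :: Double -> Quantity Metre
metres = Quantity

yards :: Double -> Quantity Yard
yards = Quantity

-- | Addition is only defined for two quantities of the same unit.
add :: Quantity u -> Quantity u -> Quantity u
add (Quantity a) (Quantity b) = Quantity (a + b)

-- | Conversions are explicit functions between differently tagged types.
-- One yard is exactly 0.9144 metres.
yardsToMetres :: Quantity Yard -> Quantity Metre
yardsToMetres (Quantity y) = Quantity (y * 0.9144)

-- | Type checks: both operands are in metres after the conversion.
trackLength :: Quantity Metre
trackLength = metres 100 `add` yardsToMetres (yards 10)

-- Rejected by the type checker, because Metre and Yard do not unify:
-- broken = metres 100 `add` yards 10
```

Uncommenting the last line turns the unit bug into a compile-time error rather than a wrong number at run time.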
5. Network: haskell-download
For a long time there was no convenient way to use arbitrary network
resources from Haskell. While Ruby had convenient openURI functions, the
Haskell community had no such equivalent. haskell-download-curl
and haskell-download
answer that, providing a single function, openURI,
for opening network resources, getting the result back as a bytestring.
It also provides convenient wrappers for lazy downloading, or treating
the content as XML, Atom or RSS, or unstructured HTML tags.
6. Graphics: haskell-googlechart
Everyone likes charts and graphs, right? Now there's a Haskell interface
to Google's charts API, so you can construct those pretty graphs
from Haskell, like this one, for the proportion of packages in each
category:
7. Physics: haskell-hipmunk
2D physics engines are just plain awesome. A new Haskell interface, hipmunk, gives
you high level, efficient access to the C chipmunk 2D
physics engine. Hours of fun building engines, gears and levers, and
watching the physics at work.
8. Languages: haskell-lua, haskell-perl5, haskell-language-c
Comprehensive support for interacting with other languages is another
sign of a robust community. Haskell should play well with others, and
there's now, alongside the standard C API, new bindings to Lua and Perl, as well
as the rather stunning Language.C, a
library for parsing, analysis and generation of C with GCC extensions.
(Want to hunt for bugs in the Linux kernel? This is how you'd do it).
You can even generate Flash code, if
you like. Or if you love assembly, script LLVM from
Haskell, or just generate x86
code directly.
9. Games: hback
Finally, and I'm not sure why, but there's been a bit of small
game development in Haskell recently (possibly due to having good
OpenGL tutorials?). Anyway, one of the cooler games, in my opinion, is hback, a "dual
N-back" memory training game, complete with audio and bells. A recent
research paper claimed that following the puzzle protocol implemented in
this game would improve your fluid intelligence. It's fun, but hard.
And there are 490 other Haskell packages in Arch (and on Hackage!) to play with.
So how did we get all these into the system?
Arch Linux + Haskell
The effort to modernise Haskell on Arch began in early June, when half a
dozen Haskell/Arch people set up an IRC channel and a mailing list, and
started a wiki page to plan and coordinate efforts. A key decision was
made to automate as much as possible, after watching other distros
struggle to keep up with the pace of change.
Several factors came into play, making it actually feasible to
automatically package Haskell source:
- Central package hosting on Hackage.
- A single build system, Cabal.
- And the use of a declarative dependency specification.
Central hosting meant that all packages can be found from one place,
making it easy to track lots of packages. Having a single shared build
system means that tools can rely on a common API for bundling software
(no need to teach the tool about lots of crazy build strategies). All
that helps.
The most important factor by far though, is that Cabal declares
dependency information in a purely declarative way. Dependencies are
stated explicitly, and can be analysed statically. This is in stark
contrast to tools like autoconf, which require initialisation on the
target machine to determine the actual build dependencies.
All an automated package tool needs to do for Cabal is translate the
names of Haskell and C dependencies into the names of native packages on
the system, and then spit those results into the native package format.
For Arch, we wrote cabal2arch
to do just this.
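The name translation step can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the cabal2arch source: the Dep type, the list of packages treated as shipping with GHC, and the C library table are all hypothetical (only the curses to ncurses mapping and the haskell- prefix are taken from the hmp3 example in this post).

```haskell
import Data.Char (toLower)
import Data.List (sort)
import Data.Maybe (fromMaybe, mapMaybe)

-- | A Cabal dependency: package name plus an optional version
-- constraint, e.g. Dep "zlib" (Just ">=0.4"). (Hypothetical type.)
data Dep = Dep String (Maybe String)

-- | Packages assumed to ship with the compiler, so they need no native
-- package of their own. (Illustrative list, not cabal2arch's own.)
corePackages :: [String]
corePackages =
  [ "base", "unix", "bytestring", "containers"
  , "array", "old-time", "directory", "process" ]

-- | C libraries whose Arch package is named differently.
-- (Illustrative table with a single entry.)
cLibraryNames :: [(String, String)]
cLibraryNames = [("curses", "ncurses")]

-- | Translate a Haskell dependency into an Arch dependency string,
-- dropping anything that comes with the compiler.
haskellDep :: Dep -> Maybe String
haskellDep (Dep name constraint)
  | name `elem` corePackages = Nothing
  | otherwise =
      Just ("haskell-" ++ map toLower name ++ fromMaybe "" constraint)

-- | Translate a C library name, falling back to the name unchanged.
cDep :: String -> String
cDep lib = fromMaybe lib (lookup lib cLibraryNames)

-- | Build-time dependencies: the toolchain, then the translated Haskell
-- packages in sorted order, then the translated C libraries.
makeDepends :: [Dep] -> [String] -> [String]
makeDepends hs cs =
  ["ghc", "haskell-cabal"] ++ sort (mapMaybe haskellDep hs) ++ map cDep cs

-- | Render a list as a single-line shell array assignment.
renderArray :: String -> [String] -> String
renderArray field xs =
  field ++ "=(" ++ unwords [ "'" ++ x ++ "'" | x <- xs ] ++ ")"
```

Because every input here is plain data read from the .cabal file, the whole translation is a pure function: nothing needs to be configured or run on the target machine to find out what a package depends on.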
cabal2arch
Given the URL of a Haskell package's .cabal file, cabal2arch spits out a
ready-to-use Arch package, like so:
$ cabal2arch http://hackage.haskell.org/packages/archive/hmp3/1.5.2.1/hmp3.cabal
Using /tmp/tmp.YyRxcmyxTX/hmp3.cabal
Fetching http://hackage.haskell.org/packages/archive/hmp3/1.5.2.1/hmp3-1.5.2.1.tar.gz
Created /tmp/hmp3.tar.gz
And that's it. The result, hmp3.tar.gz, is a package we can then upload
into the Arch
Linux repository.
The input .cabal file contains the following relevant information:
executable hmp3
    build-depends:   unix,
                     zlib >= 0.4,
                     binary >= 0.4,
                     pcre-light >= 0.3,
                     mersenne-random >= 0.1
    if flag(small_base)
        build-depends: base >= 3,
                       bytestring >= 0.9,
                       containers,
                       array,
                       old-time,
                       directory,
                       process
    else
        build-depends: base < 3
    extra-libraries: curses
Which cabal2arch analyses, translating the name of each Haskell package
to its Arch equivalent, and looking up the correct names for the C
dependencies, yielding a native package specification of the following
form:
# Contributor: Arch Haskell Team
# Package generated by cabal2arch 0.3.8.2
pkgname=hmp3
pkgrel=1
pkgver=1.5.2.1
pkgdesc="An ncurses mp3 player written in Haskell"
url="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/hmp3"
license=('GPL')
arch=('i686' 'x86_64')
makedepends=('ghc'
             'haskell-cabal'
             'haskell-binary>=0.4'
             'haskell-mersenne-random>=0.1'
             'haskell-pcre-light>=0.3'
             'haskell-zlib>=0.4'
             'ncurses')
depends=('gmp' 'ncurses')
options=('strip')
source=(http://hackage.haskell.org/packages/archive/hmp3/1.5.2.1/hmp3-1.5.2.1.tar.gz)
md5sums=('4f72ab118929a9137ae1339c740b4581')
build() {
    cd $startdir/src/hmp3-1.5.2.1
    runhaskell Setup configure --prefix=/usr || return 1
    runhaskell Setup build || return 1
    runhaskell Setup copy --destdir=$startdir/pkg || return 1
}
All the hard work is done for us, and the translation itself was
pretty straightforward to write: a few
hundred lines of Haskell to download cabal files, parse them,
resolve dependencies, construct a valid Arch package spec, write that to
disk, and tar up the results into a bundle.
"The same thing we do every day, Pinky: try to take over the world"
Not really, but there are two clear steps forward from here. The first is automation
tools for other distributions (cabal-debian,
for example, will be crucial for that platform). The other big step is
to standardise a set of all these packages, to give us a comprehensive,
trusted, high quality base, for future applications. That is, Haskell:
Batteries Included.
If we do this right, and the large
Haskell community continues to work efficiently, scaling up the
benefits of pure, polymorphic components to larger and larger
collections of systems, who knows? An open source, purely functional,
well-typed lambda for every child? :-)
2008-07-29