(A really quick post today). Following a recent post, Run
length encoding in Python, I thought it would be nice to look at
this simple encoding in Haskell. The RLE idea is simple, given some input:
"aaaabbaaa"
compress it by taking the length of each run of characters:
"4a2b3a"
Running this in GHCi:
> encode "aaaabbaaa"
[(4,'a'),(2,'b'),(3,'a')]
> decode (encode "aaaabbaaa")
"aaaabbaaa"
So there's a simple QuickCheck property we'll want to establish.
First, let's look at the Python implementations:
def encode(l):
return [(len(list(group)),name) for name, group in groupby(l)]
def decode(l):
return sum([length * [item] for length,item in l],[])
Ok, so group each run into a list, then pair up the length, with the
list item, and to decode, replicate the character by its length, and concatenate.
Easy in Haskell:
encode = map (length &&& head) . group
decode = concatMap (uncurry replicate)
Who says static typing implies verbosity?
We also get a generic implementation for free, given that Haskell infers
that the code never inspects the elements in the list directly, our
`encode' function will just as happily encode Strings as lists of
numbers or any other list. That is, 'encode' will work on any type `a'
that looks like it can do (==).
encode :: Eq a => [a] -> [(Int, a)]
So:
> encode [1,1,1,1,2,2,2,7,7]
[(4,1),(3,2),(2,7)]
> encode ["foo", "foo", "bar", "foo", "foo", "foo"]
[(2,"foo"),(1,"bar"),(3,"foo")]
Polymorphism rocks.
The original poster used a couple of unit tests, but in Haskell we'd
usually specify generic properties the function must satisfy:
prop_id :: String -> Bool
prop_id xs = decode (encode xs) == xs
QuickCheck then generates input for us:
> quickCheck prop_id
OK, passed 100 tests.
So go out and sketch your little algorithms in (shorter, faster,
safer) pseudo-Python :-)
Thanks to psykotic for the idea!
/home ::
/haskell ::
permalink ::
rss