DData is a library of efficient data structures and algorithms for Haskell. I noticed that I used these data structures (and algorithms) a lot but surprisingly, there are not many "off the shelf" libraries available – so, I documented and extended my personal libraries and made them available. The modules are implemented to be easy to use (just import it), efficient (use balanced trees) and fairly complete (lots of operations) and I hope you find them useful for your own projects too.
| Generic collections: | Map, Set, and MultiSet. The implementations are based on size balanced trees. |
| Integer collections: | IntMap, IntSet, and IntBag. The implementations are based on big-endian patricia trees. |
| Generic sequences: | Queue and Seq (catenable sequences). |
| Algorithms: | Scc (calculate the strongest-connected-components). |
IntSet and IntMap
modules use low-level bit operations on integers and not all haskell systems
handle that correctly. Because of this, I made three different versions of
those modules:
IntMap.union that sometimes behaved right-biased instead of left-biased.Word to Nat in the IntMap and IntSet modules, to avoid a name-clash on the newest Hugs distribution (dec 2001).
Map/Set modules compare to the Data.FiniteMap/Set
modules in GHC?Data.FiniteMap/Set
modules is that they are thorougly tested and tuned for performance. On the
other hand, the operations are fairly constrained and the code is riddled
with GHC extensions. I tried to
implement a more complete and consistent set of operations like partition
and adjust. Furthermore, the library uses the module system to
distinguish names (ie. empty instead of emptyFM),
it implements the more efficient
hedge union algorithm, and the Set module is directly
implemented instead of using a finite map with unit elements.
IntMap versus Edison.PatriciaLoMap?IntMap module was actually implemented with Edison.PatriciaLoMap
as an example. However, the IntMap module is based on a big-endian
tree instead of a little-endian tree which is generally more
efficient, especially on ordered data. Big endian trees were not often used
in practice since it was fairly expensive to calculate the highest bit position in an
integer. The IntMap module uses a clever algoritm from
Jorg Arndts bit
wizardry page to determine the highest bit position efficiently. "Oct 18 2005"