Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /home/zhenxiangba/zhenxiangba.com/public_html/phproxy-improved-master/index.php on line 456
DData
[go: Go Back, main page]

DData

DData is a library of efficient data structures and algorithms for Haskell. I noticed that I used these data structures (and algorithms) a lot but surprisingly, there are not many "off the shelf"  libraries available – so, I documented and extended my personal libraries and made them available. The modules are implemented to be easy to use (just import it), efficient (use balanced trees) and fairly complete (lots of operations) and I hope you find them useful for your own projects too.

Generic collections:   Map, Set, and MultiSet. The implementations are based on size balanced trees.
Integer collections: IntMap, IntSet, and IntBag. The implementations are based on big-endian patricia trees.
Generic sequences: Queue and Seq (catenable sequences).
Algorithms: Scc (calculate the strongest-connected-components).

Download

Documentation [html, zip (~75k)]
Haddock'ed documentation of the libraries. (Other haddock users may find the haddock interface file for the libraries useful).
Sources [std.zip, hugs.zip, ghc.zip (~50k, 18 Nov 2002)]
The sources of the libraries. The IntSet and IntMap modules use low-level bit operations on integers and not all haskell systems handle that correctly. Because of this, I made three different versions of those modules:
  • The ghc.zip sources are optimized for use with GHC (these are also the root sources for the other versions).
  • The hugs.zip sources are specialised for (older) Hugs versions and circumvent integer conversion bugs.
  • The std.zip sources are specialised for standard Haskell98 systems.

History

18 nov 2002
Fixed bug in IntMap.union that sometimes behaved right-biased instead of left-biased.
Changed type definition of Word to Nat in the IntMap and IntSet modules, to avoid a name-clash on the newest Hugs distribution (dec 2001).
2 sep 2002
First release

Faq

Shouldn't these modules be part of Edison?
Yes, they should. Unfortunately, I can't find time right now to adapt them for Edison, but it should be fairly easy to do. The only disadvantage that I can see, is that it will be somewhat harder to use these modules (you need to import the Edison prelude for example).
How do the Map/Set modules compare to the Data.FiniteMap/Set modules in GHC?
Both implementations are based on size-balanced trees and will probably have the same performance. The advantage of the Data.FiniteMap/Set modules is that they are thorougly tested and tuned for performance. On the other hand, the operations are fairly constrained and the code is riddled with GHC extensions. I tried to implement a more complete and consistent set of operations like partition and adjust. Furthermore, the library uses the module system to distinguish names (ie. empty instead of emptyFM), it implements the more efficient hedge union algorithm, and the Set module is directly implemented instead of using a finite map with unit elements.
And what about the IntMap versus Edison.PatriciaLoMap?
The IntMap module was actually implemented with Edison.PatriciaLoMap as an example. However, the IntMap module is based on a big-endian tree instead of a little-endian tree which is generally more efficient, especially on ordered data. Big endian trees were not often used in practice since it was fairly expensive to calculate the highest bit position in an integer. The IntMap module uses a clever algoritm from Jorg Arndts bit wizardry page to determine the highest bit position efficiently.

"Oct 18 2005"