Arroyuelo et al., 2006 - Google Patents
Reducing the space requirement of LZ-indexArroyuelo et al., 2006
View PDF- Document ID
- 5907190644828799347
- Author
- Arroyuelo D
- Navarro G
- Sadakane K
- Publication year
- Publication venue
- Annual Symposium on Combinatorial Pattern Matching
External Links
Snippet
The LZ-index is a compressed full-text self-index able to represent a text P 1... m, over an alphabet of size and with k-th order empirical entropy H k (T), using 4 uH k (T)+ o (u log σ) bits for any k= o (log σ u). It can report all the occ occurrences of a pattern P 1... m in T in O …
- 230000000875 corresponding 0 description 21
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
- G06F17/30625—Trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30964—Querying
- G06F17/30979—Query processing
- G06F17/30985—Query processing by using string matching techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
- G06F17/30961—Trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
- G06F17/30321—Indexing structures
- G06F17/30324—Vectors, bitmaps or matrices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30908—Information retrieval; Database structures therefor; File system structures therefor of semistructured data, the undelying structure being taken into account, e.g. mark-up language structure data
- G06F17/30914—Mapping or conversion
-
- H—ELECTRICITY
- H03—BASIC ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3084—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
- H03M7/3086—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing a sliding window, e.g. LZ77
-
- H—ELECTRICITY
- H03—BASIC ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/42—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code using table look-up for the coding or decoding process, e.g. using read-only memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Claude et al. | Improved grammar-based compressed indexes | |
| Arroyuelo et al. | Reducing the space requirement of LZ-index | |
| Nishimoto et al. | Optimal-time queries on BWT-runs compressed indexes | |
| Jansson et al. | Ultra-succinct representation of ordered trees with applications | |
| Jansson et al. | Ultra-succinct representation of ordered trees | |
| Munro et al. | Space efficient suffix trees | |
| Gagie et al. | Colored range queries and document retrieval | |
| Bille et al. | Random access to grammar-compressed strings | |
| Mäkinen et al. | Implicit compression boosting with applications to self-indexing | |
| Kreft et al. | Self-indexing based on LZ77 | |
| Golynski et al. | On the size of succinct indices | |
| Arroyuelo et al. | Stronger Lempel-Ziv based compressed text indexing | |
| Belazzougui | Succinct dictionary matching with no slowdown | |
| Prezza | Optimal rank and select queries on dictionary-compressed text | |
| Díaz-Domínguez et al. | An LMS-based grammar self-index with local consistency properties | |
| Claude et al. | Self-indexed text compression using straight-line programs | |
| Jansson et al. | Linked dynamic tries with applications to LZ-compression in sublinear time and space | |
| Prezza | On locating paths in compressed tries | |
| Golynski et al. | On the redundancy of succinct data structures | |
| Tsuruta et al. | Grammar-compressed self-index with Lyndon words | |
| Navarro et al. | Faster top-k document retrieval in optimal space | |
| Välimäki et al. | Engineering a compressed suffix tree implementation | |
| Gupta et al. | A framework for dynamizing succinct data structures | |
| Arroyuelo et al. | Space-efficient construction of Lempel–Ziv compressed text indexes | |
| Navarro | Indexing text using the Ziv-Lempel trie |