Claude et al., 2011 - Google Patents
Self-indexed grammar-based compressionClaude et al., 2011
- Document ID
- 10524438809448201225
- Author
- Claude F
- Navarro G
- Publication year
- Publication venue
- Fundamenta Informaticae
External Links
Snippet
Self-indexes aim at representing text collections in a compressed format that allows extracting arbitrary portions and also offers indexed searching on the collection. Current self- indexes are unable of fully exploiting the redundancy of highly repetitive text collections that …
- 238000007906 compression 0 title abstract description 48
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
- G06F17/30321—Indexing structures
- G06F17/3033—Hash tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30613—Indexing
- G06F17/30619—Indexing indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
- G06F17/30961—Trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30533—Other types of queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30964—Querying
- G06F17/30979—Query processing
- G06F17/30985—Query processing by using string matching techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30908—Information retrieval; Database structures therefor; File system structures therefor of semistructured data, the undelying structure being taken into account, e.g. mark-up language structure data
- G06F17/30914—Mapping or conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- H—ELECTRICITY
- H03—BASIC ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Claude et al. | Self-indexed grammar-based compression | |
| Kreft et al. | On compressing and indexing repetitive sequences | |
| Claude et al. | Improved grammar-based compressed indexes | |
| Navarro et al. | Compressed full-text indexes | |
| Munro et al. | Space-efficient construction of compressed indexes in deterministic linear time | |
| Kociumaka et al. | Toward a definitive compressibility measure for repetitive sequences | |
| Mäkinen et al. | Rank and select revisited and extended | |
| Navarro | Compact data structures: A practical approach | |
| Mäkinen et al. | Dynamic entropy-compressed sequences and full-text indexes | |
| Fischer et al. | Faster entropy-bounded compressed suffix trees | |
| Babenko et al. | Wavelet trees meet suffix trees | |
| Kreft et al. | Self-indexing based on LZ77 | |
| Bille et al. | Random access to grammar-compressed strings | |
| Barbay et al. | Efficient fully-compressed sequence representations | |
| Belazzougui et al. | Access, rank, and select in grammar-compressed strings | |
| Barbay et al. | Alphabet partitioning for compressed rank/select and applications | |
| Navarro et al. | Faster compressed suffix trees for repetitive collections | |
| Belazzougui et al. | Representing the suffix tree with the CDAWG | |
| Arroyuelo et al. | Stronger Lempel-Ziv based compressed text indexing | |
| Claude et al. | Self-indexed text compression using straight-line programs | |
| Russo et al. | Fully-compressed suffix trees | |
| Sirén | Compressed Full-Text Indexes for Highly Repetitive Collections. | |
| Navarro et al. | Practical Indexing of Repetitive Collections Using Relative Lempel-Ziv. | |
| Munro et al. | Text indexing and searching in sublinear time | |
| Cenzato et al. | Suffixient Arrays: a New Efficient Suffix Array Compression Technique |