Kruus et al., 2010 - Google Patents
Bimodal content defined chunking for backup streams. Kruus et al., 2010
- Document ID
- 6130450628438444633
- Author
- Kruus E
- Ungureanu C
- Dubnicki C
- Publication year
- 2010
- Publication venue
- FAST
Snippet
Data deduplication has become a popular technology for reducing the amount of storage space necessary for backup and archival data. Content defined chunking (CDC) techniques are well established methods of separating a data stream into variable-size chunks such that …
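The snippet is cut off before it describes the bimodal scheme, so the following is only a minimal sketch of baseline content defined chunking, not the authors' algorithm. It assumes a polynomial rolling hash over a fixed window and cuts a chunk wherever the low bits of the hash are zero, subject to minimum and maximum sizes. The function name `cdc_chunks` and every constant are hypothetical choices made for illustration.

```python
from typing import List

# All constants below are illustrative assumptions, not values from the paper.
WINDOW = 16            # bytes covered by the rolling hash
MASK = (1 << 8) - 1    # cut when the low 8 hash bits are zero
MIN_SIZE = 64          # never cut a chunk shorter than this
MAX_SIZE = 1024        # always cut once a chunk reaches this length
BASE = 257
MOD = (1 << 31) - 1
TOP = pow(BASE, WINDOW - 1, MOD)  # weight of the byte that leaves the window


def cdc_chunks(data: bytes) -> List[bytes]:
    """Split data at positions chosen by a rolling hash of the last WINDOW bytes."""
    chunks: List[bytes] = []
    start = 0  # index where the current chunk begins
    h = 0      # polynomial hash of the current window
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Drop the oldest byte so the hash covers exactly WINDOW bytes.
            h = (h - data[i - WINDOW] * TOP) % MOD
        h = (h * BASE + byte) % MOD
        size = i + 1 - start
        # Cut on a content-defined boundary, or force a cut at MAX_SIZE.
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks
```

Because a boundary depends only on the bytes inside the window, inserting or deleting a few bytes early in a stream tends to shift nearby boundaries while later chunks usually stay identical, which is what lets a deduplicating store recognize them again.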
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
- G06F17/30129—Details of further file system functionalities
- G06F17/3015—Redundancy elimination performed by the file system
- G06F17/30153—Redundancy elimination performed by the file system using compression, e.g. sparse files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
- G06F17/30129—Details of further file system functionalities
- G06F17/3015—Redundancy elimination performed by the file system
- G06F17/30156—De-duplication implemented within the file system, e.g. based on file segments
- G06F17/30159—De-duplication implemented within the file system, e.g. based on file segments based on file chunks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
- G06F17/30129—Details of further file system functionalities
- G06F17/30144—Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from or digital output to record carriers, e.g. RAID, emulated record carriers, networked record carriers
- G06F3/0601—Dedicated interfaces to storage systems
- G06F3/0628—Dedicated interfaces to storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/064—Management of blocks
- G06F3/0641—De-duplication techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30067—File systems; File servers
- G06F17/30129—Details of further file system functionalities
- G06F17/30138—Details of free space management performed by the file system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30017—Multimedia data retrieval; Retrieval of more than one type of audiovisual media
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from or digital output to record carriers, e.g. RAID, emulated record carriers, networked record carriers
- G06F3/0601—Dedicated interfaces to storage systems
- G06F3/0602—Dedicated interfaces to storage systems specifically adapted to achieve a particular effect
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
-
- H—ELECTRICITY
- H03—BASIC ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3084—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
- H03M7/3086—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing a sliding window, e.g. LZ77
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/40—Specific encoding of data in memory or cache
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Kruus et al. | Bimodal content defined chunking for backup streams. | |
| US11640256B2 (en) | Methods and systems for object level de-duplication for data storage system | |
| You et al. | Deep Store: An archival storage system architecture | |
| Xia et al. | A comprehensive study of the past, present, and future of data deduplication | |
| US9880746B1 (en) | Method to increase random I/O performance with low memory overheads | |
| EP2256934B1 (en) | Method and apparatus for content-aware and adaptive deduplication | |
| Xia et al. | Ddelta: A deduplication-inspired fast delta compression approach | |
| US8639669B1 (en) | Method and apparatus for determining optimal chunk sizes of a deduplicated storage system | |
| US8712963B1 (en) | Method and apparatus for content-aware resizing of data chunks for replication | |
| CA2670400C (en) | Methods and systems for quick and efficient data management and/or processing | |
| US9268783B1 (en) | Preferential selection of candidates for delta compression | |
| Lin et al. | Migratory compression: Coarse-grained data reordering to improve compressibility | |
| EP2940598B1 (en) | Data object processing method and device | |
| US8972672B1 (en) | Method for cleaning a delta storage system | |
| US10135462B1 (en) | Deduplication using sub-chunk fingerprints | |
| US9400610B1 (en) | Method for cleaning a delta storage system | |
| US9026740B1 (en) | Prefetch data needed in the near future for delta compression | |
| Park et al. | Characterizing datasets for data deduplication in backup applications | |
| US10915260B1 (en) | Dual-mode deduplication based on backup history | |
| US9116902B1 (en) | Preferential selection of candidates for delta compression | |
| EP3432168B1 (en) | Metadata separated container format | |
| Vikraman et al. | A study on various data de-duplication systems | |
| Tolic et al. | Deduplication in unstructured-data storage systems | |
| Abdulsalam et al. | Evaluation of Two Thresholds Two Divisor Chunking Algorithm Using Rabin Finger print, Adler, and SHA1 Hashing Algorithms | |
| Majed et al. | Cloud based industrial file handling and duplication removal using source based deduplication technique |