Crate datafusion_row
An implementation of Row backed by raw bytes
Each tuple consists of up to three parts: “null bit set”, “values” and “var length data”.
The null bit set is used for null tracking and is aligned to 1-byte. It stores one bit per field.
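The null tracking described above can be sketched as a few bit operations. This is a minimal illustration, not the crate's actual API: the helper names `null_width`, `is_valid` and `set_valid` are hypothetical, and the convention that a set bit means "valid" with field 0 in the least significant bit is an assumption read off the worked example's bitmask `0b00001011`.

```rust
/// Bytes needed for a null bit set covering `num_fields` fields:
/// one bit per field, rounded up to a whole byte.
fn null_width(num_fields: usize) -> usize {
    (num_fields + 7) / 8
}

/// Returns true if field `idx` is marked valid (non-null).
/// Assumption: a set bit means "valid"; bit 0 of byte 0 is field 0.
fn is_valid(bits: &[u8], idx: usize) -> bool {
    (bits[idx / 8] & (1u8 << (idx % 8))) != 0
}

/// Marks field `idx` as valid by setting its bit.
fn set_valid(bits: &mut [u8], idx: usize) {
    bits[idx / 8] |= 1u8 << (idx % 8);
}
```

With four fields, `null_width(4)` is 1, and marking fields 0, 1 and 3 as valid leaves the single byte equal to `0b00001011`.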
In the region of the values, we store the fields in the order they are defined in the schema.
- For fixed-length, sequential access fields, we store them directly. E.g., 4 bytes for int and 1 byte for bool.
- For fixed-length fields that are updated often, we store one 8-byte word per field.
- For fields of non-primitive or variable-length types, we append their actual content to the end of the var length region and store their offset relative to row base and their length, packed into an 8-byte word.
┌────────────────┬──────────────────────────┬───────────────────────┐        ┌───────────────────────┬────────────┐
│Validity Bitmask│    Fixed Width Field     │ Variable Width Field  │   …    │     vardata area      │  padding   │
│ (byte aligned) │   (native type width)    │(vardata offset + len) │        │   (variable length)   │   bytes    │
└────────────────┴──────────────────────────┴───────────────────────┘        └───────────────────────┴────────────┘
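The placement rules for the values region can be sketched as an offset calculation. This is an illustration under stated assumptions rather than the crate's implementation: `FieldKind`, `field_width`, `field_offsets`, `pack_varlen` and `unpack_varlen` are hypothetical names, only the compact case (native width for fixed fields) is modelled, and placing the offset in the high 32 bits of the packed word is an assumption, since the text only says both values share one 8-byte word.

```rust
/// Hypothetical field categories for the compact layout.
#[derive(Clone, Copy)]
enum FieldKind {
    /// Fixed-length value stored inline at its native width (in bytes).
    Fixed(usize),
    /// Variable-length value: an inline 8-byte word holds offset and length.
    VarLen,
}

/// Bytes a field occupies inside the values region.
fn field_width(kind: FieldKind) -> usize {
    match kind {
        FieldKind::Fixed(width) => width,
        FieldKind::VarLen => 8,
    }
}

/// Returns each field's offset from the row base, plus the offset at which
/// the values region ends (where the var length data would begin).
fn field_offsets(fields: &[FieldKind]) -> (Vec<usize>, usize) {
    // The values region starts right after the byte-aligned null bit set.
    let mut offset = (fields.len() + 7) / 8;
    let mut offsets = Vec::with_capacity(fields.len());
    for &kind in fields {
        offsets.push(offset);
        offset += field_width(kind);
    }
    (offsets, offset)
}

/// Packs an offset and a length into one 8-byte word.
/// Assumption: offset in the high 32 bits, length in the low 32 bits.
fn pack_varlen(offset: u32, len: u32) -> u64 {
    ((offset as u64) << 32) | (len as u64)
}

/// Inverse of `pack_varlen`: returns `(offset, len)`.
fn unpack_varlen(word: u64) -> (u32, u32) {
    ((word >> 32) as u32, word as u32)
}
```

For a schema of a 1-byte field, a variable-length field, a 4-byte field and another variable-length field, `field_offsets` yields offsets 1, 2, 10 and 14, with the values region ending at byte 22.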
For example, given the schema (Int8, Utf8, Float32, Utf8), encoding the tuple (1, “FooBar”, NULL, “baz”) requires 32 bytes (31 bytes of payload and 1 byte of padding to keep each tuple 8-byte aligned):
┌──────────┬──────────┬──────────────────────┬──────────────┬──────────────────────┬───────────────────────┬──────────┐
│0b00001011│   0x01   │0x00000016  0x00000006│  0x00000000  │0x0000001C  0x00000003│       FooBarbaz       │   0x00   │
└──────────┴──────────┴──────────────────────┴──────────────┴──────────────────────┴───────────────────────┴──────────┘
0          1          2                      10             14                     22                      31         32
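The worked example can be reproduced with a small standalone encoder. This is a sketch under assumptions, not the crate's writer: the `Value` enum and `encode_row` function are hypothetical, the offset and length are each written as a 4-byte little-endian integer with the offset first (the byte order is not stated in the text), and a null field simply leaves its slot zeroed.

```rust
/// Hypothetical value type covering only the example schema.
enum Value<'a> {
    Int8(Option<i8>),
    Float32(Option<f32>),
    Utf8(Option<&'a str>),
}

impl Value<'_> {
    /// Bytes this field occupies in the values region.
    fn width(&self) -> usize {
        match self {
            Value::Int8(_) => 1,
            Value::Float32(_) => 4,
            Value::Utf8(_) => 8, // 4-byte offset + 4-byte length
        }
    }
}

/// Encodes one tuple as: null bit set, values, var length data, padding.
fn encode_row(values: &[Value<'_>]) -> Vec<u8> {
    let null_width = (values.len() + 7) / 8;
    let values_width: usize = values.iter().map(|v| v.width()).sum();
    // Null bit set and values region start zeroed; null fields stay zero.
    let mut row = vec![0u8; null_width + values_width];
    let mut pos = null_width;
    for (idx, value) in values.iter().enumerate() {
        let valid = match value {
            Value::Int8(Some(v)) => {
                row[pos] = *v as u8;
                true
            }
            Value::Float32(Some(v)) => {
                row[pos..pos + 4].copy_from_slice(&v.to_le_bytes());
                true
            }
            Value::Utf8(Some(s)) => {
                // Var length data is appended at the current end of the row;
                // the offset is relative to the row base.
                let offset = row.len() as u32;
                let len = s.len() as u32;
                row[pos..pos + 4].copy_from_slice(&offset.to_le_bytes());
                row[pos + 4..pos + 8].copy_from_slice(&len.to_le_bytes());
                row.extend_from_slice(s.as_bytes());
                true
            }
            _ => false, // null: leave the slot zeroed and the bit clear
        };
        if valid {
            row[idx / 8] |= 1u8 << (idx % 8);
        }
        pos += value.width();
    }
    // Pad so the whole tuple is 8-byte aligned.
    let padded = (row.len() + 7) / 8 * 8;
    row.resize(padded, 0);
    row
}
```

Encoding `[Int8(Some(1)), Utf8(Some("FooBar")), Float32(None), Utf8(Some("baz"))]` with this sketch gives a 32-byte row whose first byte is `0b00001011`, whose string offsets are 22 and 28, and whose bytes 22 to 30 spell `FooBarbaz`, matching the diagram.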
Re-exports
pub use layout::row_supported;
pub use layout::RowType;
Modules
Setter/Getter for row with all fixed-sized fields.
Various row layouts for different use cases
Accessing row from raw bytes
Reusable row writer backed by Vec<u8>
Macros
Structs
Columnar Batch buffer