Struct datafusion_common::config::ExecutionOptions
#[non_exhaustive] pub struct ExecutionOptions {
pub batch_size: usize,
pub coalesce_batches: bool,
pub collect_statistics: bool,
pub target_partitions: usize,
pub time_zone: Option<String>,
pub parquet: ParquetOptions,
pub aggregate: AggregateOptions,
pub planning_concurrency: usize,
pub sort_spill_reservation_bytes: usize,
pub sort_in_place_threshold_bytes: usize,
pub meta_fetch_concurrency: usize,
pub minimum_parallel_output_files: usize,
pub soft_max_rows_per_output_file: usize,
pub max_buffered_batches_per_output_file: usize,
pub listing_table_ignore_subdirectory: bool,
pub enable_recursive_ctes: bool,
}
Options related to query execution
See also: SessionConfig
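Because the struct is marked `#[non_exhaustive]`, code in other crates cannot build it with a struct literal. A common pattern for such types is to start from a default value and then assign individual fields. The sketch below illustrates that pattern with a hypothetical stand-in type, `ExecOpts`, so that it compiles without any dependencies. It assumes the real struct also provides a default value, and the defaults shown are placeholders rather than documented values.

```rust
// Hypothetical stand-in for a non-exhaustive options struct. Only two of the
// real field names are mirrored, and the default values are placeholders.
#[non_exhaustive]
#[derive(Clone, Debug)]
pub struct ExecOpts {
    pub batch_size: usize,
    pub coalesce_batches: bool,
}

impl Default for ExecOpts {
    fn default() -> Self {
        Self {
            batch_size: 8192,
            coalesce_batches: true,
        }
    }
}

/// Builds options by starting from the defaults and overriding one field.
/// From another crate this is the only route, since a struct literal and
/// struct update syntax are both rejected for non-exhaustive structs.
pub fn with_batch_size(batch_size: usize) -> ExecOpts {
    let mut opts = ExecOpts::default();
    opts.batch_size = batch_size;
    opts
}
```

Calling `with_batch_size(1024)` yields options whose `batch_size` is 1024 while every other field keeps its default.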
Fields (Non-exhaustive)§
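This struct is marked as non-exhaustive. Non-exhaustive structs could have additional fields added in future. Therefore, non-exhaustive structs cannot be constructed in external crates using the traditional Struct { .. } syntax; cannot be matched against without a wildcard ..; and struct update syntax will not work.
batch_size: usize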
Default batch size when creating new batches. This is especially useful when buffering batches in memory, since creating tiny batches would consume too much memory on metadata.
coalesce_batches: bool
When set to true, record batches will be examined between each operator and small batches will be coalesced into larger batches. This is helpful when there are highly selective filters or joins that could produce tiny output batches. The target batch size is determined by the configuration setting
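The coalescing idea can be sketched without Arrow types. The function below is a simplified model, not DataFusion's operator: batches are plain `Vec<i32>` values, and `target` is a hypothetical parameter standing in for the target batch size.

```rust
/// Merges consecutive small batches until each emitted batch holds at least
/// `target` rows. The final batch may be smaller than `target`.
pub fn coalesce(batches: Vec<Vec<i32>>, target: usize) -> Vec<Vec<i32>> {
    let mut out = Vec::new();
    let mut buffer: Vec<i32> = Vec::new();
    for batch in batches {
        // Append the incoming rows to the pending buffer.
        buffer.extend(batch);
        if buffer.len() >= target {
            // Emit the buffer and leave an empty one in its place.
            out.push(std::mem::take(&mut buffer));
        }
    }
    // Flush whatever is left once the input is exhausted.
    if !buffer.is_empty() {
        out.push(buffer);
    }
    out
}
```

With a target of 3, the input batches `[1]`, `[2, 3]`, `[4]`, `[5]` come out as `[1, 2, 3]` and `[4, 5]`, which is the effect a highly selective filter followed by coalescing is meant to have.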
collect_statistics: bool
Whether DataFusion should collect statistics after listing files.
target_partitions: usize
Number of partitions for query execution. Increasing partitions can increase concurrency.
Defaults to the number of CPU cores on the system
time_zone: Option<String>
The default time zone.
Some functions, e.g. EXTRACT(HOUR from SOME_TIME), shift the underlying datetime according to this time zone and then extract the hour.
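The shift-then-extract behaviour can be illustrated with plain integer arithmetic. This is a minimal sketch under a simplifying assumption: the time zone is represented as a fixed offset in seconds (`offset_secs`, a hypothetical parameter), whereas named zones with daylight saving rules would need a time zone database.

```rust
/// Returns the hour of day (0..=23) for a UTC timestamp, given in seconds
/// since the Unix epoch, after shifting it by a fixed offset in seconds.
pub fn hour_in_zone(utc_epoch_secs: i64, offset_secs: i64) -> i64 {
    // Shift the instant into the local time of the target zone.
    let local = utc_epoch_secs + offset_secs;
    // rem_euclid keeps the result non-negative for pre-epoch or negative shifts.
    local.rem_euclid(86_400) / 3_600
}
```

For the epoch itself (midnight UTC), an offset of `+02:00` gives hour 2 and an offset of `-05:00` gives hour 19 of the previous day.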
parquet: ParquetOptions
Parquet options
aggregate: AggregateOptions
Aggregate options
planning_concurrency: usize
Fan-out during initial physical planning.
This is mostly used to plan UNION children in parallel.
Defaults to the number of CPU cores on the system
sort_spill_reservation_bytes: usize
Specifies the reserved memory for each spillable sort operation to facilitate an in-memory merge.
When a sort operation spills to disk, the in-memory data must be sorted and merged before being written to a file. This setting reserves a specific amount of memory for that in-memory sort/merge process.
Note: This setting is irrelevant if the sort operation cannot spill (i.e., if there’s no DiskManager configured).
sort_in_place_threshold_bytes: usize
When sorting, the size threshold below which data is concatenated and sorted in a single RecordBatch rather than sorted in batches and merged.
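The two strategies that this threshold chooses between can be sketched as follows. This is a simplified model, not DataFusion's sort implementation: batches are `Vec<i64>` values, the byte size is estimated from the row count, and `threshold_bytes` is a hypothetical parameter playing the role of the setting.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Sorts `batches` into one ascending sequence. Inputs smaller than
/// `threshold_bytes` are concatenated and sorted at once; larger inputs are
/// sorted batch by batch and then combined with a k-way merge.
pub fn sort_batches(mut batches: Vec<Vec<i64>>, threshold_bytes: usize) -> Vec<i64> {
    let total_rows: usize = batches.iter().map(Vec::len).sum();
    let total_bytes = total_rows * std::mem::size_of::<i64>();

    if total_bytes < threshold_bytes {
        // Small input: concatenate everything and sort in a single pass.
        let mut all: Vec<i64> = batches.into_iter().flatten().collect();
        all.sort_unstable();
        return all;
    }

    // Large input: sort each batch independently.
    for batch in &mut batches {
        batch.sort_unstable();
    }

    // Merge the sorted batches using a min-heap of (value, batch, position).
    let mut heap = BinaryHeap::new();
    for (i, batch) in batches.iter().enumerate() {
        if let Some(&v) = batch.first() {
            heap.push(Reverse((v, i, 0usize)));
        }
    }
    let mut out = Vec::with_capacity(total_rows);
    while let Some(Reverse((v, i, pos))) = heap.pop() {
        out.push(v);
        if let Some(&next) = batches[i].get(pos + 1) {
            heap.push(Reverse((next, i, pos + 1)));
        }
    }
    out
}
```

Both paths produce the same ordering; the threshold only decides which one is likely cheaper for a given input size.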
meta_fetch_concurrency: usize
Number of files to read in parallel when inferring schema and statistics
minimum_parallel_output_files: usize
Guarantees a minimum number of output files being written in parallel. RecordBatches are distributed in round-robin fashion to each parallel writer. Each writer is closed and a new file opened once soft_max_rows_per_output_file is reached.
soft_max_rows_per_output_file: usize
Target number of rows per output file when writing multiple files. This is a soft maximum, so it can be exceeded slightly. One file will also be smaller than the limit if the total number of rows written does not divide roughly evenly by the soft max.
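The interaction between round-robin distribution and the soft row limit can be modelled with row counts alone. The sketch below is a simplified illustration, not the actual writer: `writers` and `soft_max_rows` are hypothetical parameters standing in for the two settings, and a file is rolled over only after a whole batch has been written to it.

```rust
/// Distributes batches (given as row counts) across `writers` in round-robin
/// order and returns the row count of every file produced. A writer closes its
/// file and starts a new one once the file reaches `soft_max_rows`.
pub fn distribute(batch_rows: &[usize], writers: usize, soft_max_rows: usize) -> Vec<usize> {
    assert!(writers > 0, "at least one writer is required");
    let mut open = vec![0usize; writers]; // rows in each writer's current file
    let mut closed = Vec::new(); // row counts of finished files
    for (i, &rows) in batch_rows.iter().enumerate() {
        let w = i % writers;
        open[w] += rows;
        if open[w] >= soft_max_rows {
            // The limit is checked after the batch is written, so a file can
            // end up slightly above the soft maximum.
            closed.push(open[w]);
            open[w] = 0;
        }
    }
    // Close the remaining non-empty files at the end of the input.
    closed.extend(open.into_iter().filter(|&r| r > 0));
    closed
}
```

With batches of 60, 60, 60, 60 and 10 rows, two writers and a soft maximum of 100, the result is two files of 120 rows and one file of 10 rows, showing both the overshoot and the smaller trailing file.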
max_buffered_batches_per_output_file: usize
The maximum number of RecordBatches buffered for each output file being written. Higher values can potentially give faster write performance at the cost of higher peak memory consumption.
listing_table_ignore_subdirectory: bool
Whether subdirectories should be ignored when scanning directories for data files. Defaults to true (ignores subdirectories), consistent with Hive. Note that this setting does not affect reading partitioned tables (e.g. /table/year=2021/month=01/data.parquet).
enable_recursive_ctes: bool
Whether DataFusion should support recursive CTEs. Defaults to false since this feature is a work in progress and may not behave as expected.
Trait Implementations§
impl Clone for ExecutionOptions
fn clone(&self) -> ExecutionOptions
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.