Module datafusion::physical_optimizer
Optimizer that rewrites ExecutionPlans.
These rules take advantage of physical plan properties, such as “Repartition” or “Sortedness”.
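To illustrate how a rule can exploit a plan property such as sortedness, here is a minimal, self-contained sketch. It does not use DataFusion: `ToyPlan`, `ToyRule`, `RemoveRedundantSort`, and `run_rules` are hypothetical stand-ins invented for this example, and the real `ExecutionPlan` and `PhysicalOptimizerRule` interfaces differ. The sketch only shows the general shape of the idea: a rule takes a plan, inspects a property of its inputs, and returns a rewritten plan.

```rust
// Toy stand-ins for illustration only; these are NOT DataFusion types.

/// A tiny physical plan tree carrying a "sortedness" property.
#[derive(Debug, Clone, PartialEq)]
enum ToyPlan {
    /// Leaf source; `sorted` records whether its output is already ordered.
    Scan { sorted: bool },
    /// Row filter; assumed to preserve the order of its input.
    Filter(Box<ToyPlan>),
    /// Sort operator; its output is always ordered.
    Sort(Box<ToyPlan>),
}

impl ToyPlan {
    /// Derive the sortedness property bottom-up from the plan shape.
    fn is_sorted(&self) -> bool {
        match self {
            ToyPlan::Scan { sorted } => *sorted,
            ToyPlan::Filter(input) => input.is_sorted(),
            ToyPlan::Sort(_) => true,
        }
    }
}

/// Hypothetical rule interface: consume a plan, return a rewritten plan.
trait ToyRule {
    fn name(&self) -> &str;
    fn optimize(&self, plan: ToyPlan) -> ToyPlan;
}

/// Drops a `Sort` whose (already optimized) input is known to be sorted.
struct RemoveRedundantSort;

impl ToyRule for RemoveRedundantSort {
    fn name(&self) -> &str {
        "remove_redundant_sort"
    }

    fn optimize(&self, plan: ToyPlan) -> ToyPlan {
        match plan {
            ToyPlan::Sort(input) => {
                // Rewrite the child first, then decide whether the sort is needed.
                let input = self.optimize(*input);
                if input.is_sorted() {
                    input
                } else {
                    ToyPlan::Sort(Box::new(input))
                }
            }
            ToyPlan::Filter(input) => ToyPlan::Filter(Box::new(self.optimize(*input))),
            scan @ ToyPlan::Scan { .. } => scan,
        }
    }
}

/// Apply each rule in order, feeding the output of one into the next.
fn run_rules(rules: &[Box<dyn ToyRule>], plan: ToyPlan) -> ToyPlan {
    rules.iter().fold(plan, |current, rule| rule.optimize(current))
}
```

In this sketch, passing `Sort(Filter(Scan { sorted: true }))` through `run_rules` with `RemoveRedundantSort` drops the sort, while `Sort(Scan { sorted: false })` is returned unchanged because the property does not hold.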
Re-exports
pub use optimizer::PhysicalOptimizerRule;
Modules
- Utilizing exact statistics from sources to avoid scanning data
- CoalesceBatches optimizer that groups rows together into bigger batches to avoid the overhead of small batches
- CombinePartialFinalAggregate optimizer rule checks adjacent Partial and Final AggregateExecs and tries to combine them if necessary
- EnforceDistribution optimizer rule inspects the physical plan with respect to distribution requirements and adds RepartitionExecs to satisfy them when necessary.
- The JoinSelection rule tries to modify a given plan so that it can accommodate infinite sources and utilize statistical information (if there is any) to obtain more performant plans. To achieve the first goal, it tries to transform a non-runnable query (with the given infinite sources) into a runnable query by replacing pipeline-breaking join operations with pipeline-friendly ones. To achieve the second goal, it selects the proper PartitionMode and the build side using the available statistics for hash joins.
- Physical optimizer traits
- The PipelineChecker rule ensures that a given plan can accommodate its infinite sources, if there are any. It will reject non-runnable query plans that use pipeline-breaking operators on infinite input(s).
- This module contains code to prune “containers” of row groups based on statistics prior to execution. This can lead to significant performance improvements by avoiding the need to evaluate a plan on entire containers (e.g. an entire file)
- Repartition optimizer that introduces repartition nodes to increase the level of parallelism available
- Optimizer rule that replaces executors that lose ordering with their order-preserving variants when it is helpful, either in terms of performance or to accommodate unbounded streams by fixing the pipeline.
- EnforceSorting optimizer rule inspects the physical plan with respect to local sorting requirements and does the following: