Module datafusion::physical_plan
source · Expand description
re-export of datafusion_physical_plan
crate
Modules
- Aggregates functionalities
- Defines the ANALYZE operator
- CoalesceBatchesExec combines small batches into larger batches for more efficient use of vectorized processing by upstream operators.
- Defines the merge plan for executing partitions in parallel and then merging the results into a single partition
- Defines common code used in execution plans
- Implementation of physical plan display. See
crate::displayable
for examples of how to format - EmptyRelation execution plan
- Defines the EXPLAIN operator
- Defines physical expressions that can evaluated at runtime during query execution
- FilterExec evaluates a boolean predicate against all input batches to determine which rows to include in its output batches.
- Declaration of built-in (scalar) functions. This module contains built-in functions’ enumeration and metadata.
- Functionality used both on logical and physical plans
- Execution plan for writing data to
DataSink
s - DataFusion Join implementations
- Defines the LIMIT plan
- Execution plan for reading in-memory batches of data
- Metrics for recording information about execution
- Defines the projection execution plan. A projection determines which columns or expressions are returned from a query. The SQL statement
SELECT a, b, a+b FROM t1
is an example of a projection on tablet1
where the expressionsa
,b
, anda+b
are the projection expressions.SELECT
withoutFROM
will only evaluate expressions. - This file implements the
RepartitionExec
operator, which maps N input partitions to M output partitions based on a partitioning scheme, optionally maintaining the order of the input rows in the output. - Sort functionalities
- Stream wrappers for physical operators
- Generic plans for deferred execution:
StreamingTableExec
andPartitionStream
- This module provides common traits for visiting or rewriting tree nodes easily.
- This module contains functions and structs supporting user-defined aggregate functions.
- UDF support
- The Union operator combines multiple inputs with the same schema
- Defines the unnest column plan for unnesting values in a column that contains a list type, conceptually is like joining each row with all the values in the list column.
- Values execution plan
- Physical expressions for window functions
Macros
- Macro wraps Err(
$ERR
) to add backtrace feature
Structs
- Statistics for a column within a relation
- A newtype wrapper to display
T
implementingDisplayAs
using theDefault
mode - EmptyRecordBatchStream can be used to create a RecordBatchStream that will produce no results
- Something that tracks a value of interest (metric) of a DataFusion
ExecutionPlan
execution. - Statistics for a relation Fields are optional and can be inexact because the sources sometimes provide approximate estimates for performance reasons and the transformations output are not always predictable.
- Global TopK
- A newtype wrapper to display
T
implementingDisplayAs
using theVerbose
mode
Enums
- Represents the result of evaluating an expression: either a single
ScalarValue
or anArrayRef
. - Options for controlling how each
ExecutionPlan
should format itself - How data is distributed amongst partitions. See
Partitioning
for more details. - Output partitioning supported by
ExecutionPlan
s.
Traits
- Describes an aggregate functions’s state.
- An aggregate expression that:
- Trait for types which could have additional details when formatted in
Verbose
mode - Represent nodes in the DataFusion Physical Plan.
- Trait that implements the Visitor pattern for a depth first walk of
ExecutionPlan
nodes.pre_visit
is called before any children are visited, and thenpost_visit
is called after all children have been visited. - Expression that can be evaluated against a RecordBatch A Physical expression knows its type, nullability and how to evaluate itself.
- Trait for types that stream arrow::record_batch::RecordBatch
- Common trait for window function implementations
Functions
- Visit all children of this plan, according to the order defined on
ExecutionPlanVisitor
. - Execute the ExecutionPlan and collect the results in memory
- Execute the ExecutionPlan and collect the results in memory
- Return a wrapper around an
ExecutionPlan
which can be displayed in various easier to understand ways. - Execute the ExecutionPlan and return a single stream of results
- Execute the ExecutionPlan and return a vec with one stream per output partition
- Indicate whether a data exchange is needed for the input of
plan
, which will be very helpful especially for the distributed engine to judge whether need to deal with shuffling. Currently there are 3 kinds of execution plan which needs data exchange 1. RepartitionExec for changing the partition number between twoExecutionPlan
s 2. CoalescePartitionsExec for collapsing all of the partitions into one without ordering guarantee 3. SortPreservingMergeExec for collapsing all of the sorted partitions into one with ordering guarantee - Applies an optional projection to a
SchemaRef
, returning the projected schema - Recursively calls
pre_visit
andpost_visit
for this node and all of its children, as described onExecutionPlanVisitor
- Returns a copy of this plan if we change any child according to the pointer comparison. The size of
children
must be equal to the size ofExecutionPlan::children()
.
Type Aliases
- Trait for a [
Stream
] ofRecordBatch
es