pub struct ColumnStats {
pub col_idx: usize,
pub dtype: ScalarType,
pub null_count: u64,
pub distinct_estimate: u64,
pub min_value: Option<i64>,
pub max_value: Option<i64>,
pub avg_width: Option<f32>,
}
Per-column statistics for optimizer cost estimation.
Tracks null counts, distinct value estimates, and value ranges for columns. These statistics enable the optimizer to estimate filter selectivity and join cardinalities.
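As an illustration of how distinct-value estimates can feed a join cardinality estimate, the sketch below applies the textbook equi-join formula `|R| * |S| / max(d_R, d_S)`. The `JoinSideStats` type and the `estimate_equi_join_rows` function are hypothetical names introduced for this example, and the formula is the classic one under a uniformity assumption, not necessarily the one this optimizer uses.

```rust
/// Hypothetical, trimmed-down stand-in for the statistics a join estimate needs.
/// It is not the crate's `ColumnStats`; only the fields used below are mirrored.
struct JoinSideStats {
    /// Total rows in the relation (assumed to be tracked elsewhere).
    row_count: u64,
    /// Distinct value estimate for the join column.
    distinct_estimate: u64,
}

/// Textbook equi-join estimate: |R| * |S| / max(d_R, d_S).
/// Assumes values are uniformly distributed and the smaller key set is
/// contained in the larger one.
fn estimate_equi_join_rows(left: &JoinSideStats, right: &JoinSideStats) -> f64 {
    // Clamp to 1 so that missing (zero) distinct estimates do not divide by zero.
    let max_distinct = left
        .distinct_estimate
        .max(right.distinct_estimate)
        .max(1);
    (left.row_count as f64) * (right.row_count as f64) / (max_distinct as f64)
}
```

With 1,000 rows and 100 distinct keys on one side and 50 rows with 50 distinct keys on the other, this yields an estimate of 500 output rows.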
Fields§
col_idx: usize
Column index within the relation
dtype: ScalarType
Data type of the column
null_count: u64
Count of null values (for nullable columns)
distinct_estimate: u64
HyperLogLog-style distinct value estimate
min_value: Option<i64>
Minimum value (for orderable types, stored as i64)
max_value: Option<i64>
Maximum value (for orderable types, stored as i64)
avg_width: Option<f32>
Average value length for variable-length types (e.g., symbols)
Implementations§
impl ColumnStats
pub fn update_distinct(&mut self, estimate: u64)
Updates the distinct value estimate.
The estimate is expected to come from HyperLogLog or a similar cardinality estimation algorithm running on the GPU.
§Arguments
estimate - The new distinct value estimate
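To make the source of such an estimate concrete, here is a minimal CPU-side HyperLogLog sketch using only the standard library. The `HllSketch` type, its methods, and the choice of 1024 registers are assumptions made for this example; the GPU implementation referred to above is not shown in this documentation and may differ substantially.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Number of index bits; 2^10 = 1024 registers (an arbitrary choice here).
const P: u32 = 10;
const M: usize = 1 << P;

/// Hypothetical minimal HyperLogLog sketch, for illustration only.
struct HllSketch {
    registers: Vec<u8>,
}

impl HllSketch {
    fn new() -> Self {
        Self { registers: vec![0; M] }
    }

    /// Hashes the value, picks a register from the top P bits, and records
    /// the position of the first 1-bit in the remaining bits.
    fn insert<T: Hash>(&mut self, value: &T) {
        let mut hasher = DefaultHasher::new();
        value.hash(&mut hasher);
        let x = hasher.finish();
        let idx = (x >> (64 - P)) as usize;
        let rest = x << P; // remaining 54 bits, left-aligned
        let rank = (rest.leading_zeros().min(64 - P) + 1) as u8;
        if rank > self.registers[idx] {
            self.registers[idx] = rank;
        }
    }

    /// Standard HyperLogLog estimate with the small-range (linear counting)
    /// correction; the large-range correction is unnecessary with 64-bit hashes.
    fn estimate(&self) -> u64 {
        let m = M as f64;
        let alpha = 0.7213 / (1.0 + 1.079 / m);
        let sum: f64 = self
            .registers
            .iter()
            .map(|&r| 2f64.powi(-(r as i32)))
            .sum();
        let raw = alpha * m * m / sum;
        let zeros = self.registers.iter().filter(|&&r| r == 0).count();
        if raw <= 2.5 * m && zeros > 0 {
            (m * (m / zeros as f64).ln()).round() as u64
        } else {
            raw.round() as u64
        }
    }
}
```

The value returned by `estimate()` is the kind of number that would then be passed to `update_distinct`. With 1024 registers the expected relative error is roughly 3%.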
pub fn update_range(&mut self, min: i64, max: i64)
Updates the value range for this column.
§Arguments
min - The minimum value (encoded as i64)
max - The maximum value (encoded as i64)
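The documentation does not say how non-integer orderable types are encoded as i64. As one possible approach, the sketch below maps `f64` values to `i64` so that integer ordering matches floating-point ordering, then computes a `(min, max)` pair suitable for passing to `update_range`. Both `encode_f64_ordered` and `encoded_range` are hypothetical helpers, not part of this crate.

```rust
/// Maps an f64 to an i64 such that `a < b` implies `encode(a) < encode(b)`
/// for all non-NaN inputs. Negative floats have their lower 63 bits flipped
/// so that larger magnitudes sort lower.
fn encode_f64_ordered(x: f64) -> i64 {
    let bits = x.to_bits() as i64;
    if bits < 0 {
        bits ^ i64::MAX
    } else {
        bits
    }
}

/// Computes the encoded (min, max) of a slice, skipping NaN values.
/// Returns `None` when no orderable value is present.
fn encoded_range(values: &[f64]) -> Option<(i64, i64)> {
    let mut encoded = values
        .iter()
        .copied()
        .filter(|v| !v.is_nan())
        .map(encode_f64_ordered);
    let first = encoded.next()?;
    Some(encoded.fold((first, first), |(lo, hi), v| (lo.min(v), hi.max(v))))
}
```

Because the mapping preserves order, range predicates on the original values can be compared against the stored bounds using plain integer comparisons.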
pub fn update_null_count(&mut self, count: u64)
pub fn update_avg_width(&mut self, width: f32)
Updates the average width for variable-length columns.
§Arguments
width - The average value width in bytes
pub fn equality_selectivity(&self, total_rows: u64) -> f64
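This method is undocumented here, so its exact behavior is unknown. A common way to estimate equality selectivity from these statistics is to assume non-null values are spread uniformly over the distinct values, giving `(non-null fraction) / distinct_estimate`. The sketch below illustrates that approach; `EqStats`, `equality_selectivity_sketch`, and the fallback value of 0.1 are assumptions for this example only.

```rust
/// Hypothetical stand-in holding only the fields this estimate needs.
struct EqStats {
    null_count: u64,
    distinct_estimate: u64,
}

/// Estimates the fraction of rows matching `col = <constant>`, assuming
/// non-null values are uniformly distributed across the distinct values.
fn equality_selectivity_sketch(stats: &EqStats, total_rows: u64) -> f64 {
    // Arbitrary fallback used when statistics are missing (an assumption).
    const DEFAULT_SELECTIVITY: f64 = 0.1;
    if total_rows == 0 || stats.distinct_estimate == 0 {
        return DEFAULT_SELECTIVITY;
    }
    // Nulls never satisfy an equality predicate, so exclude them first.
    let non_null_rows = total_rows.saturating_sub(stats.null_count);
    let non_null_fraction = non_null_rows as f64 / total_rows as f64;
    (non_null_fraction / stats.distinct_estimate as f64).clamp(0.0, 1.0)
}
```

For a column with 1,000 rows, 200 nulls, and 40 distinct values, this gives 0.8 / 40 = 0.02.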
pub fn range_selectivity(&self, low: i64, high: i64) -> f64
Estimates selectivity for a range predicate.
Uses min/max values to estimate what fraction of the range is covered. Returns a default estimate if range statistics are unavailable.
§Arguments
low - The lower bound of the range (inclusive)
high - The upper bound of the range (inclusive)
§Returns
The estimated selectivity (0.0 to 1.0)
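The description above can be sketched as follows: clip the predicate range to the column's `[min, max]`, then divide the covered width by the column's width, assuming values are uniformly distributed. `RangeStats`, `range_selectivity_sketch`, and the default of one third are assumptions for illustration; the actual default used by `range_selectivity` is not stated in this documentation.

```rust
/// Hypothetical stand-in holding only the range fields of the statistics.
struct RangeStats {
    min_value: Option<i64>,
    max_value: Option<i64>,
}

/// Fallback when min/max are unavailable (an assumed value, not the crate's).
const DEFAULT_RANGE_SELECTIVITY: f64 = 1.0 / 3.0;

/// Estimates the fraction of rows with `low <= col <= high`, assuming a
/// uniform distribution of values between the recorded min and max.
fn range_selectivity_sketch(stats: &RangeStats, low: i64, high: i64) -> f64 {
    let (min, max) = match (stats.min_value, stats.max_value) {
        (Some(min), Some(max)) if min <= max => (min, max),
        _ => return DEFAULT_RANGE_SELECTIVITY,
    };
    // Clip the predicate range to the column's observed range.
    let lo = low.max(min);
    let hi = high.min(max);
    if lo > hi {
        return 0.0; // no overlap, or an empty predicate range
    }
    // Widen to i128 so extreme bounds cannot overflow; +1 because both
    // ends are inclusive.
    let covered = (hi as i128 - lo as i128 + 1) as f64;
    let domain = (max as i128 - min as i128 + 1) as f64;
    (covered / domain).clamp(0.0, 1.0)
}
```

For a column spanning 0 to 99, the predicate range 10 to 59 covers 50 of 100 possible values, so the estimate is 0.5.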
pub fn value_size_bytes(&self) -> usize
Returns the storage size per value for this column type.
Trait Implementations§
impl Clone for ColumnStats
fn clone(&self) -> ColumnStats
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.