Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I am profiling ClickBench query 26 with predicate pushdown enabled, as part of enabling filter_pushdown by default (datafusion#3463):
samply record -- /Users/andrewlamb/Software/datafusion2/target/profiling/datafusion-cli -f q.sql > /dev/null 2>&1
The query is:
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "EventTime", "SearchPhrase" LIMIT 10;
While looking at the profile, I noticed that 3% of the time is spent concatenating arrays in the cached array reader.
I believe the call is here, in arrow-rs/parquet/src/arrow/array_reader/cached_array_reader.rs, line 333 in 814ee42:
_ => Ok(arrow_select::concat::concat(
Describe the solution you'd like
I would like to make this concatenation faster.
Describe alternatives you've considered
I think we can use the BatchCoalescer for this task and potentially save at least one copy.
Additional context
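To illustrate where a copy might be saved, here is a rough, std-only model. It is not arrow-rs code: `Chunk`, `filter_then_concat`, and `coalesce_directly` are hypothetical names, plain `Vec<i32>` stands in for Arrow arrays, and it assumes the extra copy comes from materializing selected rows into temporaries before concatenating them. Whether the cached array reader behaves this way would need to be confirmed against the actual code.

```rust
/// Hypothetical stand-in for a cached Arrow array: plain values only.
type Chunk = Vec<i32>;

/// Filter each chunk into a temporary, then concatenate the temporaries.
/// Every selected value is copied twice. Returns (output, values copied).
fn filter_then_concat(chunks: &[Chunk], masks: &[Vec<bool>]) -> (Vec<i32>, usize) {
    let mut copies = 0;
    let mut filtered: Vec<Vec<i32>> = Vec::with_capacity(chunks.len());
    for (chunk, mask) in chunks.iter().zip(masks) {
        let kept: Vec<i32> = chunk
            .iter()
            .zip(mask)
            .filter(|(_, keep)| **keep)
            .map(|(v, _)| *v)
            .collect();
        copies += kept.len(); // first copy: into the temporary
        filtered.push(kept);
    }
    let out = filtered.concat();
    copies += out.len(); // second copy: the concatenation itself
    (out, copies)
}

/// Append selected values straight into one preallocated output buffer,
/// in the spirit of a coalescer. Every selected value is copied once.
fn coalesce_directly(chunks: &[Chunk], masks: &[Vec<bool>]) -> (Vec<i32>, usize) {
    let selected = masks.iter().flatten().filter(|keep| **keep).count();
    let mut out = Vec::with_capacity(selected);
    for (chunk, mask) in chunks.iter().zip(masks) {
        for (v, keep) in chunk.iter().zip(mask) {
            if *keep {
                out.push(*v);
            }
        }
    }
    let copies = out.len();
    (out, copies)
}

fn main() {
    let chunks = vec![vec![1, 2, 3], vec![4, 5], vec![6, 7, 8, 9]];
    let masks = vec![
        vec![true, false, true],
        vec![false, true],
        vec![true, true, false, true],
    ];
    let (a, copies_a) = filter_then_concat(&chunks, &masks);
    let (b, copies_b) = coalesce_directly(&chunks, &masks);
    assert_eq!(a, b); // both strategies produce the same rows
    println!("filter+concat copied {copies_a} values, direct coalesce copied {copies_b}");
}
```

In this toy model both paths return the same six values, but the direct path copies each one once instead of twice. The real BatchCoalescer works on Arrow data rather than `Vec<i32>`, so any actual saving would have to be measured by re-running the profile.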