Is your feature request related to a problem or challenge? Please describe what you are trying to do.
This came up in the context of a PR in DataFusion (apache/datafusion#18360).
In that case we are applying some operations to a PrimitiveArray and would like to reuse the allocation if possible.
However, the current API of PrimitiveArray::unary_mut and similar functions makes this awkward, because the caller must handle the case where the allocation cannot be reused:
// want to apply an operation to arr, reusing the allocation if possible
let arr: PrimitiveArray<UInt64Type> = ...
// to do so we call unary_mut, but must also handle the case where the allocation is shared
let new_arr = match arr.unary_mut(|a| a + 1) {
    Ok(arr) => arr,
    Err(old_arr) => old_arr.unary(|a| a + 1),
};
This can be done, but it is awkward to use.
I proposed the following function in DataFusion:
/// Applies the unary operation in place if possible, or clones the array if not
fn try_unary_mut_or_clone<F>(
    array: PrimitiveArray<Int64Type>,
    op: F,
) -> Result<PrimitiveArray<Int64Type>>
where
    F: Fn(i64) -> Result<i64>,
{
    match array.try_unary_mut(&op) {
        Ok(result) => result,
        // the buffer is shared and cannot be mutated in place, so make a new array
        Err(array) => array.try_unary(op),
    }
}
but quoting @findepi on https://github.com/apache/datafusion/pull/18360/files#r2475557450:
can this be made more flexible with a more generous use of generics?
perhaps it could even be in arrow-rs. it makes try_unary_mut significantly more approachable
Describe the solution you'd like
I would like it to be easier to apply unary and binary operations on PrimitiveArrays and reuse the allocation if possible.
Describe alternatives you've considered
One alternative would be to follow the API of Arc::unwrap_or_clone.
That would mean adding functions something like:
PrimitiveArray::unary_mut_or_clone
PrimitiveArray::try_unary_mut_or_clone
PrimitiveArray::binary_mut_or_clone
PrimitiveArray::try_binary_mut_or_clone
These would be implemented like the function above.
I think this would make these APIs much easier to use.
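To make the Arc::unwrap_or_clone analogy concrete, here is a rough std-only sketch of the "mutate in place if uniquely owned, otherwise copy" pattern. ToyArray and its methods are hypothetical stand-ins written for illustration; they are not arrow-rs types, and the real PrimitiveArray buffer handling is more involved. The sketch only assumes that a shared buffer behaves like an Arc<Vec<T>>.

```rust
use std::sync::Arc;

/// Toy stand-in for an array with a possibly shared buffer.
/// Hypothetical type for illustration only; it is not part of arrow-rs.
#[derive(Debug, Clone)]
struct ToyArray<T> {
    values: Arc<Vec<T>>,
}

impl<T: Copy> ToyArray<T> {
    fn new(values: Vec<T>) -> Self {
        Self { values: Arc::new(values) }
    }

    fn values(&self) -> &[T] {
        &self.values
    }

    /// Same shape as `unary_mut`: `Ok` if the buffer was uniquely owned and
    /// was updated in place, `Err(self)` if the buffer is shared.
    fn unary_mut<F: Fn(T) -> T>(mut self, op: F) -> Result<Self, Self> {
        if let Some(values) = Arc::get_mut(&mut self.values) {
            for v in values.iter_mut() {
                *v = op(*v);
            }
            return Ok(self);
        }
        Err(self)
    }

    /// Same shape as `unary`: always allocates a new buffer.
    fn unary<F: Fn(T) -> T>(&self, op: F) -> Self {
        Self::new(self.values.iter().map(|v| op(*v)).collect())
    }

    /// The proposed convenience: reuse the allocation if possible,
    /// otherwise fall back to the allocating version.
    fn unary_mut_or_clone<F: Fn(T) -> T>(self, op: F) -> Self {
        match self.unary_mut(&op) {
            Ok(updated) => updated,
            Err(shared) => shared.unary(op),
        }
    }
}
```

With this shape the caller writes a single call such as arr.unary_mut_or_clone(|v| v + 1) and no longer needs to match on the shared-buffer case; whether the allocation was reused becomes an implementation detail.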