Is your feature request related to a problem or challenge? Please describe what you are trying to do.
See original issue comment: https://github.com/apache/arrow-rs/pull/8912/changes#r2553987033
Describe the solution you'd like
Implement fast paths for casting from LargeUtf8/LargeBinary to Utf8View/BinaryView. These need to check that the 64-bit offsets fit in the 32-bit offsets used by views.
Implement fast paths for the cross-type casts Utf8 -> BinaryView and Binary -> Utf8View. The Binary -> Utf8View case may need to validate that the binary data is valid UTF-8.
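The two open questions above (do the offsets fit, and is the binary data valid UTF-8) could be checked roughly as follows. This is only a std-only sketch, not arrow-rs code: `offsets_fit_in_u32` and `values_are_valid_utf8` are hypothetical helper names, and the raw offset/data slices stand in for the array buffers.

```rust
/// Hypothetical helper (not an arrow-rs API): returns true if every offset of a
/// LargeUtf8/LargeBinary array can be represented as a `u32`.
/// Offsets are non-decreasing, so checking the last one is enough.
fn offsets_fit_in_u32(offsets: &[i64]) -> bool {
    match offsets.last() {
        Some(&last) => last >= 0 && last <= u32::MAX as i64,
        None => true,
    }
}

/// Hypothetical helper (not an arrow-rs API): returns true if each value
/// delimited by `offsets` is valid UTF-8 on its own.
/// Validating the whole buffer at once is not sufficient, because a value
/// boundary could fall in the middle of a multi-byte character.
fn values_are_valid_utf8(offsets: &[i32], data: &[u8]) -> bool {
    offsets.windows(2).all(|w| {
        let (start, end) = (w[0] as usize, w[1] as usize);
        data.get(start..end)
            .map_or(false, |value| std::str::from_utf8(value).is_ok())
    })
}
```

If either check fails, the cast would presumably fall back to the existing slower path (or return an error in the UTF-8 case) rather than reuse the values buffer.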
See arrow-rs/arrow-cast/src/cast/dictionary.rs, lines 37 to 49 in 08dcc0b:

    // `unpack_dictionary` can handle Utf8View/BinaryView types, but incurs unnecessary data
    // copy of the value buffer. Fast path which avoids copying underlying values buffer.
    // TODO: handle LargeUtf8/LargeBinary -> View (need to check offsets can fit)
    // TODO: handle cross types (String -> BinaryView, Binary -> StringView)
    // (need to validate utf8?)
    (Utf8, Utf8View) => view_from_dict_values::<K, Utf8Type, StringViewType>(
        array.keys(),
        array.values().as_string::<i32>(),
    ),
    (Binary, BinaryView) => view_from_dict_values::<K, BinaryType, BinaryViewType>(
        array.keys(),
        array.values().as_binary::<i32>(),
    ),
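For context on why the offsets matter, here is a rough std-only sketch of how a single 16-byte view can be encoded under the Arrow variable-size binary view layout. `make_view_sketch` is a hypothetical name used for illustration, and this is not a claim about how `view_from_dict_values` is implemented. The point is that a long value is referenced by a 32-bit buffer index and a 32-bit offset, so an existing values buffer can be reused without copying only if its offsets fit in `u32`.

```rust
/// Hypothetical sketch (not the arrow-rs implementation): encode one view.
/// Layout, little-endian:
///   bytes 0..4   length of the value
///   len <= 12:   bytes 4..16 hold the value inline, zero padded
///   len  > 12:   bytes 4..8 prefix, 8..12 buffer index, 12..16 offset
fn make_view_sketch(value: &[u8], buffer_index: u32, offset: u32) -> u128 {
    // Assumption: the caller has already checked that the length fits in u32.
    let len = value.len() as u32;
    let mut bytes = [0u8; 16];
    bytes[0..4].copy_from_slice(&len.to_le_bytes());
    if value.len() <= 12 {
        // Short values are stored inline; no buffer reference is needed.
        bytes[4..4 + value.len()].copy_from_slice(value);
    } else {
        // Long values keep a 4-byte prefix and point into a data buffer.
        bytes[4..8].copy_from_slice(&value[..4]);
        bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
        bytes[12..16].copy_from_slice(&offset.to_le_bytes());
    }
    u128::from_le_bytes(bytes)
}
```

Because the length prefix and the inline bytes are identical for string and binary views, the cross-type casts differ only in whether UTF-8 validation is required.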
Describe alternatives you've considered
Additional context