BUG: Preserve nullable index dtype in _transform_index #65316

Open
gautamvarmadatla wants to merge 3 commits into pandas-dev:main from gautamvarmadatla:fix/rename-nullable-index-dtype

Conversation

@gautamvarmadatla
Contributor

Fixed Index._transform_index() to use _cast_pointwise_result when rebuilding the index. This preserves nullable extension dtypes instead of losing them through plain list inference.
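
To illustrate the symptom described above, here is a minimal sketch using only public pandas API. It does not call `_cast_pointwise_result` (a private method); the explicit `dtype=` argument is a stand-in, assumed here for illustration, for what the fix achieves internally.

```python
import pandas as pd

# An index backed by the nullable Int64 extension dtype.
idx = pd.Index([1, 2, 3], dtype="Int64")

# Stand-in for the labels produced by applying a function pointwise.
items = [10, 20, 30]

# Plain list inference: the nullable dtype is lost.
inferred = pd.Index(items, name=idx.name)

# Passing the original dtype explicitly keeps the nullable dtype.
preserved = pd.Index(items, dtype=idx.dtype, name=idx.name)

print(inferred.dtype)   # int64
print(preserved.dtype)  # Int64
```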

@gautamvarmadatla force-pushed the fix/rename-nullable-index-dtype branch from 35f1026 to 03b066d on April 21, 2026 02:40
Member

@jorisvandenbossche left a comment

Thanks a lot for the PR!

Comment thread pandas/core/indexes/base.py Outdated
Comment on lines +6898 to +6899
arr = self[:0].array
new_values = arr._cast_pointwise_result(items)
Member

Suggested change
- arr = self[:0].array
- new_values = arr._cast_pointwise_result(items)
+ new_values = self.array._cast_pointwise_result(items)

The slicing should not be needed I think?

Contributor Author

yup, you're right! removed it
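
A small sketch of why the slice is likely redundant, assuming (as the suggestion above implies) that `_cast_pointwise_result` only depends on the calling array's dtype and not on its values. It uses only the public `.array` accessor and does not call the private method itself.

```python
import pandas as pd

idx = pd.Index([1, 2, None], dtype="Int64")

# The empty slice and the full index expose arrays of the same dtype,
# so slicing first adds nothing if only the dtype matters.
empty_arr = idx[:0].array
full_arr = idx.array

print(len(empty_arr))                     # 0
print(empty_arr.dtype == full_arr.dtype)  # True
```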

Comment thread pandas/core/indexes/base.py Outdated
Comment on lines +6896 to +6897
if isinstance(items[0], tuple):
    return Index(items, name=self.name, tupleize_cols=False)
Member

What is the reason for this check? (I first thought it might be to ensure we create a MultiIndex when having tuples, so without specifying the dtype as in the path below, but since tupleize_cols=False, that should already not happen?)

Member

I would expect that when _cast_pointwise_result gets something it cannot convert (like tuples), it gives you back the original input as an object-dtype ndarray, so the code path below should still work as well?

(and if that is not the case, I think that we should fix _cast_pointwise_result to ensure that, so the usage here can be simpler)

Member

With #65318 merged, it might now work to just call _cast_pointwise_result without the check for tuples

Contributor Author

Yes! The check was guarding against a ValueError in masked._cast_pointwise_result where tuple inputs caused np.asarray to return a 2D array. With #65318 pushed it's just dead code now, so removed it.
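
The NumPy behaviour mentioned above can be reproduced in isolation. This is a sketch of the general pitfall, not the actual code in `masked._cast_pointwise_result`; the element-wise fill is just one common way to get a 1-D object array of tuples and is an assumption for illustration.

```python
import numpy as np

items = [(1, 2), (3, 4)]

# np.asarray treats each tuple as a row, so the result is 2-D.
as_2d = np.asarray(items)
print(as_2d.shape)  # (2, 2)

# To keep one tuple per element, allocate a 1-D object array
# and fill it item by item.
as_1d = np.empty(len(items), dtype=object)
for i, item in enumerate(items):
    as_1d[i] = item
print(as_1d.shape)  # (2,)
```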

@jorisvandenbossche added the ExtensionArray and Index labels Apr 21, 2026
Comment on lines +6892 to +6895
if not items:
    return Index(
        items, dtype=self.dtype, name=self.name, tupleize_cols=False
    )
Member

Is this one still needed? I would also expect that _cast_pointwise_result should essentially do that as well

Member

But I see that in practice this is certainly not the case right now, so ignore my comment

(we might want to tighten that aspect in the interface, and then this can be simplified later)

Member

cc @jbrockmendel what do you think about expecting that _cast_pointwise_result defaults to the caller's dtype in case of empty input?

(generally, at that level, the code cannot know what the "correct" dtype would be. But at the moment our different EAs have varying behaviour, and it might be good to at least be consistent, and then using the calling dtype seems to be an obvious choice)

Member

Eventually I'd like to get to Void Dtype, but I'm fine with same-dtype for the interim.
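
For readers following the empty-input discussion, a minimal sketch of why the explicit empty-input branch matters. It uses only the public `Index` constructor; passing `dtype=idx.dtype` mirrors the "same dtype as the caller" behaviour being proposed and is not a call into `_cast_pointwise_result`.

```python
import pandas as pd

idx = pd.Index([1, 2, 3], dtype="Int64")
items = []  # nothing to infer a dtype from

# With no values and no dtype, inference cannot recover Int64.
without_dtype = pd.Index(items, name=idx.name)

# Falling back to the caller's dtype keeps the nullable dtype.
with_dtype = pd.Index(items, dtype=idx.dtype, name=idx.name)

print(without_dtype.dtype)  # not Int64
print(with_dtype.dtype)     # Int64
```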

@gautamvarmadatla
Contributor Author

The CI failures look unrelated to this PR. All failures are from np.datetime64("NaT") (without a unit specifier) triggering a NumPy deprecation warning on the numpy-nightly build.

Labels

ExtensionArray: Extending pandas with custom dtypes or arrays.
Index: Related to the Index class or subclasses

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: DataFrame/Series.rename silently downcasts nullable index/column dtypes to NumPy dtypes

3 participants