Breaking this off from the discussion on setting a float scalar into a float32 Series (#55679 (comment)), because it's a more general issue that should be clarified.
PDEP 6 bans upcasting in setitem-like operations, so the simple first rule is that the dtype never changes in such operations. The second aspect is less clear: when do we coerce the value being set to the target dtype, and when do we decide that an upcast would be needed (and thus raise an error in the future)?
In hindsight, I think the PDEP should have been more explicit on this. Currently, the text says (https://pandas.pydata.org/pdeps/0006-ban-upcasting.html):
- If a setitem-like operation would previously have changed a Series' dtype, it would now raise.
This essentially ties the decision to the current behaviour, which has two problems: 1) the current behaviour is not always correct (or can upcast too liberally for a world where upcasting in setitem is banned), and 2) it makes it very strange to explain in the future when something will raise. Suppose someone using pandas 3.0 wonders whether a certain setitem operation will raise; the answer would be "well, check whether pandas 2.1 upcasted, and if so it will now raise", which is of course not a good answer for future users.
(I know it is of course a very logical way to phrase the impact of the change for current users)
So can we define a more general rule for when the value is cast to the target dtype, one that does not depend on the details of current behaviour?
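To make the question concrete, here is a rough sketch of two shapes such a rule could take for the float-scalar-into-float32 case. Neither is pandas' actual behaviour nor something PDEP 6 specifies; the function names are hypothetical, and float32 is emulated with the standard-library `struct` module so the snippet runs without pandas.

```python
import math
import struct


def accept_by_kind(value) -> bool:
    """Hypothetical dtype-based rule: any Python float is cast to float32,
    even if some precision is lost in the cast."""
    return isinstance(value, float)


def accept_if_lossless(value) -> bool:
    """Hypothetical value-based rule: only accept floats that survive a
    round trip through float32 unchanged."""
    if not isinstance(value, float):
        return False
    if math.isnan(value):
        # NaN never compares equal to itself, so treat it as representable.
        return True
    try:
        # Pack as IEEE 754 single precision, then read it back as a Python float.
        roundtripped = struct.unpack("f", struct.pack("f", value))[0]
    except OverflowError:
        # Finite value too large for float32.
        return False
    return roundtripped == value


for candidate in (0.5, 0.1, 1e40):
    print(candidate, accept_by_kind(candidate), accept_if_lossless(candidate))
```

Under the first rule all three values would be set and the Series stays float32; under the second, `0.1` and `1e40` would count as needing an upcast and would therefore raise. Which of these (or something in between) we want is exactly what needs deciding.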