BUG: describe() reports mean as NaN or inf after converting float64 to float32 or float16 #48757

@ruifeng96150

Description

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

df = pd.DataFrame({"a": range(1000)})
df.a = 1.2345678 * df.a
df["b"] = df.a.astype("float16")
df.describe()

                 a          b
count  1000.000000  1000.0000
mean    616.666616        inf
std     356.567176        inf
min       0.000000     0.0000
25%     308.333308   308.4375
50%     616.666616   616.7500
75%     924.999924   924.8750
max    1233.333232  1233.0000

Issue Description

When a column is converted to a lower-precision float dtype, describe() computes the mean or std incorrectly. In the example above, the mean changes from 616.67 to inf. In some cases the result is NaN, even though the column contains no NaN values, for example:
count 674522.000000
mean NaN
std 0.000000
min -17.359375
25% -1.610352
50% -0.280029
75% 1.049805
max 19.984375
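
A likely explanation for the inf in the example, offered as an assumption rather than something this report confirms, is float16 overflow: the largest finite float16 value is 65504, while the 1000 values sum to roughly 616,667. The sketch below uses NumPy only and rebuilds the same data as columns a and b. The variable names are illustrative.

```python
import numpy as np

# Same data as columns "a" (float64) and "b" (float16) in the example.
values64 = 1.2345678 * np.arange(1000)
values16 = values64.astype(np.float16)

# Largest finite value a float16 can hold.
float16_max = float(np.finfo(np.float16).max)

# The exact sum is far above that limit.
true_sum = float(values64.sum())

# A sum returned as float16 therefore cannot be finite.
with np.errstate(over="ignore"):
    sum16 = values16.sum()

# Accumulating in float64 keeps the result finite.
mean_upcast = float(values16.astype(np.float64).mean())

print("float16 max:", float16_max)
print("true sum:", true_sum)
print("float16 sum:", sum16)
print("mean after upcast:", mean_upcast)
```

This only accounts for the inf case. The NaN output shown above may have a different cause and is not reproduced here.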

Expected Behavior

The mean and std of the lower-precision column should match those of the float64 column.
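
A possible workaround, sketched here and not proposed in the report itself, is to upcast to float64 only for the summary. The stored column keeps its float16 dtype.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the reproducible example.
df = pd.DataFrame({"a": range(1000)})
df["a"] = 1.2345678 * df["a"]
df["b"] = df["a"].astype("float16")

# Upcast a copy for the statistics; df itself is not modified.
summary = df.astype("float64").describe()

print(summary.loc[["mean", "std"]])
print(df.dtypes)
```

The two columns will still differ slightly because the float16 values were rounded when they were stored.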

Installed Versions

/home/terry/.local/lib/python3.8/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS

commit : 87cfe4e
python : 3.8.10.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.0-46-generic
Version : #49~20.04.1-Ubuntu SMP Thu Aug 4 19:15:44 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.5.0
numpy : 1.22.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 62.1.0
pip : 22.2.2
Cython : 0.29.30
pytest : 7.1.2
hypothesis : None
...
xlrd : 2.0.1
xlwt : None
zstandard : None
tzdata : 2022.1

Metadata

    Labels

    Bug
    Describe/info/etc: obj.describe, obj.info, requests for methods that look similar
    Dtype Conversions: Unexpected or buggy dtype conversions
    Reduction Operations: sum, mean, min, max, etc.
