Merged
avoid some upcasting when it's not the purpose of the test
MarcoGorelli committed Dec 30, 2022
commit 96e4dc90ae7234075c6b9716b2b7e313db3b13d0
4 changes: 2 additions & 2 deletions pandas/tests/frame/test_query_eval.py
@@ -448,7 +448,7 @@ def test_date_index_query(self):
     def test_date_index_query_with_NaT(self):
         engine, parser = self.engine, self.parser
         n = 10
-        df = DataFrame(np.random.randn(n, 3))
+        df = DataFrame(np.random.randn(n, 3)).astype({0: object})
         df["dates1"] = date_range("1/1/2012", periods=n)
         df["dates3"] = date_range("1/1/2014", periods=n)
         df.iloc[0, 0] = pd.NaT
Comment on lines +452 to 455
Member Author

Here, df[0] is first created with float64 dtype (np.random.randn returns floats), and then df.iloc[0, 0] = pd.NaT upcasts it to object. Might as well create it with dtype object in the first place, as the purpose of this test comes in the lines below (df.query(...)).
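
As a rough illustration of the point above, the sketch below builds the same kind of setup frame and casts column 0 to object before the NaT assignment, so the assignment itself never has to change the column's dtype. The helper name `make_frame_with_nat` is hypothetical and not part of the pandas test suite.

```python
import numpy as np
import pandas as pd


def make_frame_with_nat(n: int = 10) -> pd.DataFrame:
    """Hypothetical helper: column 0 is object dtype before NaT is assigned."""
    # np.random.randn returns float64 values, so every column starts as float64.
    df = pd.DataFrame(np.random.randn(n, 3))
    assert df[0].dtype == np.float64

    # Cast only column 0 to object; columns 1 and 2 keep their float64 dtype.
    df = df.astype({0: object})

    # Column 0 is already object, so this assignment does not need an upcast.
    df.iloc[0, 0] = pd.NaT
    return df


frame = make_frame_with_nat()
```

Passing a dict to `astype` limits the cast to the named column, which keeps the rest of the setup data as it was.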

@@ -808,7 +808,7 @@ def test_date_index_query(self):
     def test_date_index_query_with_NaT(self):
         engine, parser = self.engine, self.parser
         n = 10
-        df = DataFrame(np.random.randn(n, 3))
+        df = DataFrame(np.random.randn(n, 3)).astype({0: object})
         df["dates1"] = date_range("1/1/2012", periods=n)
         df["dates3"] = date_range("1/1/2014", periods=n)
         df.iloc[0, 0] = pd.NaT
Member Author

Same as above: the NaT assignment is only setup for the query, so column 0 may as well be object from the start.

8 changes: 6 additions & 2 deletions pandas/tests/frame/test_reductions.py
@@ -448,11 +448,15 @@ def test_var_std(self, datetime_frame):
     @pytest.mark.parametrize("meth", ["sem", "var", "std"])
     def test_numeric_only_flag(self, meth):
         # GH 9201
-        df1 = DataFrame(np.random.randn(5, 3), columns=["foo", "bar", "baz"])
+        df1 = DataFrame(np.random.randn(5, 3), columns=["foo", "bar", "baz"]).astype(
+            {"foo": object}
+        )
         # set one entry to a number in str format
         df1.loc[0, "foo"] = "100"

-        df2 = DataFrame(np.random.randn(5, 3), columns=["foo", "bar", "baz"])
+        df2 = DataFrame(np.random.randn(5, 3), columns=["foo", "bar", "baz"]).astype(
+            {"foo": object}
+        )
         # set one entry to a non-number str
         df2.loc[0, "foo"] = "a"
Member Author

In both df1 and df2, column 'foo' is initially created with float64 dtype, and setting an element to a string then upcasts it to object. The object column is there on purpose, to exercise the numeric_only flag below.

Might as well declare column 'foo' as dtype object from the start, then.
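
To make the role of the object column concrete, here is a minimal sketch (not the test itself) that uses `std` as one of the three parametrized reductions. It assumes only that `numeric_only=True` restricts the reduction to numeric columns, so the object column 'foo' is left out of the result.

```python
import numpy as np
import pandas as pd

# Declare 'foo' as object up front instead of relying on an upcast later.
df1 = pd.DataFrame(
    np.random.randn(5, 3), columns=["foo", "bar", "baz"]
).astype({"foo": object})

# A number stored as a string: the column is already object, so no upcast occurs.
df1.loc[0, "foo"] = "100"

# With numeric_only=True, only the float64 columns take part in the reduction.
numeric_std = df1.std(numeric_only=True)
```

The result is indexed by the remaining numeric columns only, which is the behaviour the flag is meant to provide.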


2 changes: 1 addition & 1 deletion pandas/tests/groupby/test_timegrouper.py
@@ -102,7 +102,7 @@ def test_groupby_with_timegrouper(self):
     index=date_range(
         "20130901", "20131205", freq="5D", name="Date", inclusive="left"
     ),
-)
+).astype({"Buyer": object})
Member Author

Column 'Buyer' is originally set to 0 (so it starts out as an integer column), and then some strings are set, thus upcasting it to object. Might as well construct it as object from the start. Furthermore, this is the expected DataFrame, so the change doesn't alter what's being tested.
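
A small sketch of the same idea follows. It is a simplification under stated assumptions: the second column name 'Quantity' and the scalar-0 construction are guesses at the part of the constructor not shown in this hunk, and `periods=19` stands in for the end date with `inclusive="left"` so that the index is long enough for the row 18 assignment.

```python
import pandas as pd

# Assumed stand-in for the index in the diff: 19 dates, 5 days apart.
index = pd.date_range("20130901", periods=19, freq="5D", name="Date")

# Both columns are filled with the scalar 0, so both start as integer columns.
# ('Quantity' is an assumed column name, used here for illustration only.)
expected = pd.DataFrame({"Buyer": 0, "Quantity": 0}, index=index)
buyer_dtype_before = expected["Buyer"].dtype

# Cast only 'Buyer' to object; 'Quantity' keeps its integer dtype.
expected = expected.astype({"Buyer": object})

# These string assignments now land in an object column, so nothing is upcast.
expected.iloc[0, 0] = "CarlCarlCarl"
expected.iloc[6, 0] = "CarlCarl"
expected.iloc[18, 0] = "Joe"
```

Since this frame is only the expected value for a comparison, the cast changes how it is built but not what it contains.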

expected.iloc[0, 0] = "CarlCarlCarl"
expected.iloc[6, 0] = "CarlCarl"
expected.iloc[18, 0] = "Joe"
2 changes: 1 addition & 1 deletion pandas/tests/series/methods/test_replace.py
@@ -16,7 +16,7 @@ def test_replace_explicit_none(self):
         expected = pd.Series([0, 0, None], dtype=object)
         tm.assert_series_equal(result, expected)

-        df = pd.DataFrame(np.zeros((3, 3)))
+        df = pd.DataFrame(np.zeros((3, 3))).astype({2: object})
         df.iloc[2, 2] = ""
Member Author

The third column, df[2], is initially made up of floats (np.zeros defaults to float64), and then an element is set to "" (thus upcasting the column to object).

Might as well set it to object from the start, as the purpose of the test is on the line below: df.replace(...).
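
The setup described above can be sketched as follows. This is illustrative only: it mirrors the lines in the diff, records the dtype of column 2 before and after the cast, and makes no claim about the exact values `replace` returns, since that is what the real test checks.

```python
import numpy as np
import pandas as pd

# np.zeros defaults to float64, so all three columns start as float64.
df = pd.DataFrame(np.zeros((3, 3)))
dtype_before = df[2].dtype

# Cast only column 2 to object so that it can hold a string without an upcast.
df = df.astype({2: object})
df.iloc[2, 2] = ""

# The operation actually under test: replace the empty string with None.
result = df.replace("", None)
```

Keeping the cast explicit separates the setup (getting an empty string into the frame) from the behaviour under test (what `replace` does with it).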

result = df.replace("", None)
expected = pd.DataFrame(