Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
c3064e6
BUG:Sanity check on merge parameters for correct exception #26824
harshit-py Jun 14, 2019
c862ec0
added tests
harshit-py Jun 15, 2019
0df85f0
updated tests for better coverage
harshit-py Jun 15, 2019
dea59fa
requested changes on the test
harshit-py Jun 15, 2019
53aee36
updated whatsnew v0250rst
harshit-py Jun 15, 2019
6396947
requested changes
harshit-py Jun 15, 2019
11b1894
further changes
harshit-py Jun 15, 2019
7b4aa22
updated test
harshit-py Jun 15, 2019
bfb7984
Merge branch 'master' into merge_py_bug
harshit-py Jun 17, 2019
1e2b276
BUG:Sanity check on merge parameters for correct exception #26824
harshit-py Jun 14, 2019
0f88c32
added tests
harshit-py Jun 15, 2019
a9905bd
updated tests for better coverage
harshit-py Jun 15, 2019
a939d8e
requested changes on the test
harshit-py Jun 15, 2019
83f8cda
updated whatsnew v0250rst
harshit-py Jun 15, 2019
962882d
requested changes
harshit-py Jun 15, 2019
139d696
further changes
harshit-py Jun 15, 2019
4e6bd89
updated test
harshit-py Jun 15, 2019
a0680e0
Trying original patch
harshit-py Jun 17, 2019
0a894a9
Revert "Trying original patch"
harshit-py Jun 19, 2019
0cb8843
Trying original patch
harshit-py Jun 19, 2019
4e45c75
Revert "Trying original patch"
harshit-py Jun 19, 2019
500e374
Trying original patch
harshit-py Jun 19, 2019
773dec2
Original patch
harshit-py Jun 19, 2019
7770b1d
further changes
harshit-py Jun 19, 2019
efb0761
Merge remote-tracking branch 'upstream/master' into harshit-py-merge_…
TomAugspurger Jun 25, 2019
6b7f5a2
re-revert
TomAugspurger Jun 25, 2019
b7c482e
Merge pull request #1 from TomAugspurger/harshit-py-merge_py_bug
harshit-py Jun 28, 2019
e4e486c
Merge branch 'master' into PR_TOOL_MERGE_PR_26855
jreback Jun 28, 2019
File filter

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion doc/source/whatsnew/v0.25.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -130,7 +130,6 @@ than :attr:`options.display.max_seq_items` (default: 100 items). Horizontally,
the output will truncate, if it's wider than :attr:`options.display.width`
(default: 80 characters).


.. _whatsnew_0250.enhancements.other:

Other enhancements
Expand All @@ -156,6 +155,7 @@ Other enhancements
- Error message for missing required imports now includes the original import error's text (:issue:`23868`)
- :class:`DatetimeIndex` and :class:`TimedeltaIndex` now have a ``mean`` method (:issue:`24757`)
- :meth:`DataFrame.describe` now formats integer percentiles without decimal point (:issue:`26660`)
- :func:`pandas.merge` now raises ``ValueError`` when either of ``left_on`` or ``right_on`` is not provided (:issue:`26855`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you elaborate slightly; this only matters if on itself is not provided. Do we have a check if on is provide and either of left_on or right_on is not None?

- Added support for reading SPSS .sav files using :func:`read_spss` (:issue:`26537`)
- Added new option ``plotting.backend`` to be able to select a plotting backend different than the existing ``matplotlib`` one. Use ``pandas.set_option('plotting.backend', '<backend-module>')`` where ``<backend-module`` is a library implementing the pandas plotting API (:issue:`14130`)
- :class:`pandas.offsets.BusinessHour` supports multiple opening hours intervals (:issue:`15481`)
Expand Down Expand Up @@ -604,6 +604,7 @@ Other deprecations
Instead, use :meth:`Series.dtype` and :meth:`DataFrame.dtypes` (:issue:`26705`).
- :meth:`Timedelta.resolution` is deprecated and replaced with :meth:`Timedelta.resolution_string`. In a future version, :meth:`Timedelta.resolution` will be changed to behave like the standard library :attr:`timedelta.resolution` (:issue:`21344`)


.. _whatsnew_0250.prior_deprecations:

Removal of prior version deprecations/changes
Expand Down
6 changes: 6 additions & 0 deletions pandas/core/reshape/merge.py
Original file line number Diff line number Diff line change
Expand Up @@ -1091,13 +1091,19 @@ def _validate_specification(self):
raise ValueError('len(left_on) must equal the number '
'of levels in the index of "right"')
self.right_on = [None] * n
if not self.right_on:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we instead try to set

self.left_on = self.left_on or [None] * len( self.right_on)
and same for right_on

before these if clauses; that way don’t need to add additional checks

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

raise ValueError('both "right_on" and "left_on" '
'should be passed')
elif self.right_on is not None:
n = len(self.right_on)
if self.left_index:
if len(self.right_on) != self.left.index.nlevels:
raise ValueError('len(right_on) must equal the number '
'of levels in the index of "left"')
self.left_on = [None] * n
if not self.left_on:
raise ValueError('both "right_on" and "left_on" '
'should be passed')
if len(self.right_on) != len(self.left_on):
raise ValueError("len(right_on) must equal len(left_on)")

Expand Down
33 changes: 33 additions & 0 deletions pandas/tests/reshape/merge/test_merge.py
Original file line number Diff line number Diff line change
Expand Up @@ -1026,6 +1026,22 @@ def test_validation(self):
result = merge(left, right, on=['a', 'b'], validate='1:1')
assert_frame_equal(result, expected_multi)

@pytest.mark.parametrize('merge_type', ['left_on', 'right_on'])
def test_missing_on_raises(self, merge_type):
# GH26824
left = DataFrame({
'A': [1, 2, 3, 4, 5, 6],
'B': ['P', 'Q', 'R', 'S', 'T', 'U']
})
right = DataFrame({
'A': [1, 2, 4, 5, 7, 8],
'C': ['L', 'M', 'N', 'O', 'P', 'Q']
})
msg = 'must equal'
kwargs = {merge_type: 'A'}
with pytest.raises(ValueError, match=msg):
pd.merge(left, right, how='left', **kwargs)

def test_merge_two_empty_df_no_division_error(self):
# GH17776, PR #17846
a = pd.DataFrame({'a': [], 'b': [], 'c': []})
Expand Down Expand Up @@ -1763,3 +1779,20 @@ def test_merge_equal_cat_dtypes2():

# Categorical is unordered, so don't check ordering.
tm.assert_frame_equal(result, expected, check_categorical=False)


@pytest.mark.parametrize('merge_type', ['left_on', 'right_on'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this near the test_validation test

def test_missing_on_raises(merge_type):
# GH26824
df1 = DataFrame({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call these left & right

'A': [1, 2, 3, 4, 5, 6],
'B': ['P', 'Q', 'R', 'S', 'T', 'U']
})
df2 = DataFrame({
'A': [1, 2, 4, 5, 7, 8],
'C': ['L', 'M', 'N', 'O', 'P', 'Q']
})
msg = 'must equal'
kwargs = {merge_type: 'A'}
with pytest.raises(ValueError, match=msg):
pd.merge(df1, df2, how='left', **kwargs)