@kylestahl

.explode() can now take a list of column names and explode them all at the same time, provided that, in every row, the list-like elements of those columns all have the same length.
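
To make the intended behaviour concrete, here is a minimal sketch of exploding several columns in lockstep. It is not the code from this PR: explode_together is a hypothetical helper name, and the sketch assumes every cell in the listed columns is a list.

```python
import numpy as np
import pandas as pd


def explode_together(df, columns):
    """Hypothetical helper (not pandas API): explode list columns in lockstep."""
    # Count the elements in every cell of the columns to explode.
    lengths = df[columns].apply(lambda col: col.map(len))
    first = lengths[columns[0]]
    # Every column must agree with the first one, row by row.
    if not lengths.eq(first, axis=0).all().all():
        raise ValueError("exploded columns must have the same length in every row")
    # Series.explode turns an empty list into one NaN row, so repeat at least once.
    repeats = first.clip(lower=1).to_numpy()
    positions = np.repeat(np.arange(len(df)), repeats)
    result = df.iloc[positions].copy()
    # Overwrite each list column with its flattened values.
    for name in columns:
        result[name] = df[name].explode().to_numpy()
    return result


df = pd.DataFrame({"id": [1, 2], "A": [[1, 2], [3]], "B": [["x", "y"], ["z"]]})
out = explode_together(df, ["A", "B"])
print(out)
```

With this input the result has three rows: the first original row contributes two and the second contributes one, and the original index labels are repeated rather than reset.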

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
Kyle Stahl added 11 commits September 14, 2019 13:28
explode multiple columns at same time
If you pass a list of column names to .explode(), all of those columns are exploded, as long as the list lengths are consistent across the columns for each record.
ENH: DataFrame.explode() allow for multiple columns
explode() can now also take a list of columns and explode them all, provided that, for every record in the DataFrame, the elements of the exploded columns all have the same length.
ENH: DataFrame.explode() multiple columns
@pep8speaks

pep8speaks commented Sep 16, 2019

Hello @stahl085! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-09-16 16:48:53 UTC
Contributor

@jreback jreback left a comment

always add tests first

@WillAyd
Member

WillAyd commented Sep 16, 2019

Just a heads up - this was part of the original implementation in #27267 so you can check there for inspiration on tests and implementation. The big blocker there though was how to handle duplicate values, i.e. whether we should generate a cartesian product or not. Do you know how other similar tools would handle that?

@kylestahl
Author

Thanks for the info, I like that #27267 implementation much better!
By duplicate, do you mean the user passing in the same column name twice, e.g. ['A', 'A']?

I haven't seen this implemented elsewhere, but returning a cartesian product seems unnatural to me. Would it make sense to make that an optional argument, cartesian=False?
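
For reference, the two candidate behaviours can be compared side by side in a small sketch. explode_pair and its cartesian flag are hypothetical and not part of pandas; the helper only mirrors the argument suggested above, handles exactly two list columns, and resets the index for simplicity.

```python
import itertools

import pandas as pd


def explode_pair(df, left, right, cartesian=False):
    """Hypothetical helper (not pandas API): explode two list columns."""
    # zip pairs elements position by position (and silently truncates if the
    # lengths differ); itertools.product yields every combination per row.
    combine = itertools.product if cartesian else zip
    rows = [
        pair
        for lefts, rights in zip(df[left], df[right])
        for pair in combine(lefts, rights)
    ]
    return pd.DataFrame(rows, columns=[left, right])


df = pd.DataFrame({"A": [[1, 2], [3]], "B": [["x", "y"], ["z"]]})

paired = explode_pair(df, "A", "B")                   # element-wise pairing
crossed = explode_pair(df, "A", "B", cartesian=True)  # per-row cartesian product

# Chaining two single-column explodes also produces the per-row product.
chained = (
    df.explode("A")
    .reset_index(drop=True)
    .explode("B")
    .reset_index(drop=True)
)

print(len(paired), len(crossed), len(chained))
```

For this input the element-wise version yields three rows, while the cartesian version and the chained single-column explodes both yield five, which is why the choice of default matters.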

Contributor

@jreback jreback left a comment

This would need a number of tests; the implementation will be very non-performant, so it needs updating.

@kylestahl kylestahl closed this Sep 18, 2019