Skip to content

Conversation

@MaxVanDeursen
Copy link
Contributor

In this pull request, the documentation of the rename function of NDFrame is generalized over all its children using the function: DataFrame, Series and Panel. These three classes now use the Substitute decorator to add the generic part of the documentation to their own, using the same decorator to replace class specific parameters into the generic documentation.

I have tried my best to create as little changes to the documentation of each individual function. Although I have tried to get rid of all style errors, I do still receive some errors from the validate_docstrings.py:

Series 2 Errors found:	Parameters {**kwargs} not documented	Unknown parameters {errors, copy, inplace, level} 
Panel 2 Errors found:	Parameters {minor_axis, items, **kwargs, major_axis} not documented	Unknown parameters {items, major_axis, minor_axis, copy, errors, inplace, level} 

Help is appreciated!

In this commit, the documentation of the rename function of the NDFrame, is generalized over all its children using the function: DataFrame, Series and Panel. These three classes now use the Substitute decorator to add the generic part of the documentation to their own, using the same decorator to replace class specific parameters into the generic documentation.
@WillAyd WillAyd added the Docs label Mar 6, 2019
@codecov
Copy link

codecov bot commented Mar 6, 2019

Codecov Report

Merging #25577 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@ Coverage Diff @@ ## master #25577 +/- ## ========================================== + Coverage 91.26% 91.26% +<.01%  ========================================== Files 173 173 Lines 52966 52970 +4 ========================================== + Hits 48337 48342 +5  + Misses 4629 4628 -1
Flag Coverage Δ
#multiple 89.83% <100%> (ø) ⬆️
#single 41.71% <100%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 93.65% <ø> (ø) ⬆️
pandas/core/frame.py 96.79% <100%> (ø) ⬆️
pandas/core/series.py 93.69% <100%> (+0.01%) ⬆️
pandas/core/panel.py 38.56% <100%> (ø) ⬆️
pandas/util/testing.py 87.66% <0%> (+0.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3e652ac...6b489e4. Read the comment docs.

1 similar comment
@codecov
Copy link

codecov bot commented Mar 6, 2019

Codecov Report

Merging #25577 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@ Coverage Diff @@ ## master #25577 +/- ## ========================================== + Coverage 91.26% 91.26% +<.01%  ========================================== Files 173 173 Lines 52966 52970 +4 ========================================== + Hits 48337 48342 +5  + Misses 4629 4628 -1
Flag Coverage Δ
#multiple 89.83% <100%> (ø) ⬆️
#single 41.71% <100%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 93.65% <ø> (ø) ⬆️
pandas/core/frame.py 96.79% <100%> (ø) ⬆️
pandas/core/series.py 93.69% <100%> (+0.01%) ⬆️
pandas/core/panel.py 38.56% <100%> (ø) ⬆️
pandas/util/testing.py 87.66% <0%> (+0.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3e652ac...6b489e4. Read the comment docs.

@codecov
Copy link

codecov bot commented Mar 6, 2019

Codecov Report

Merging #25577 into master will increase coverage by 0.03%.
The diff coverage is 100%.

Impacted file tree graph

@@ Coverage Diff @@ ## master #25577 +/- ## ========================================== + Coverage 91.26% 91.29% +0.03%  ========================================== Files 173 173 Lines 52968 52965 -3 ========================================== + Hits 48340 48354 +14  + Misses 4628 4611 -17
Flag Coverage Δ
#multiple 89.87% <100%> (+0.03%) ⬆️
#single 41.73% <100%> (+0.02%) ⬆️
Impacted Files Coverage Δ
pandas/core/frame.py 96.79% <100%> (ø) ⬆️
pandas/core/series.py 93.69% <100%> (+0.01%) ⬆️
pandas/core/generic.py 93.65% <100%> (ø) ⬆️
pandas/core/panel.py 38.56% <100%> (ø) ⬆️
pandas/core/sorting.py 98.29% <0%> (+0.02%) ⬆️
pandas/core/ops.py 91.74% <0%> (+0.1%) ⬆️
pandas/util/testing.py 89.08% <0%> (+1.41%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f886139...9053bac. Read the comment docs.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Per our conversation on Gitter I think you can actually move a lot more into the generic docstring and reduce the amount of substitution going on here. I think this is confused by the current Series docstring which may not be properly documenting all of the allowed parameters.

If you don't mind looking a little further and seeing if we can simplify by moving more into the generic docstring would be greatly appreciated

@MaxVanDeursen
Copy link
Contributor Author

From our conversation yesterday and from the code, you would think that indeed the mapper, columns, and axis parameters are allowed as well.

Upon further inspection, I did notice that they are indeed not disallowed. However, they do not seem to have the same behavior as the DataFrame counterparts:

s = pd.Series([1,2,3,4]) df = pd.DataFrame([1,2,3,4]) s 0 1 1 2 2 3 3 4 dtype: int64 df 0 0 1 1 2 2 3 3 4 s.rename(mapper={0:'x'}, axis=0) 0 1 1 2 2 3 3 4 dtype: int64 df.rename(mapper={0:'x'}, axis=0) 0 x 1 1 2 2 3 3 4 s.rename(columns={"NONEXISTENT": 'a'}, errors="raise") 0 1 1 2 2 3 3 4 dtype: int64 df.rename(colums={"NONEXISTENT": 'a'}, errors="raise") Traceback (most recent call last): ... KeyError: "['NONEXISTENT'] not found in axis"

From this, I do not think it is a good decision to generalize the documentation over these arguments, as they do not provide the same functionality for Series and DataFrame. Generalizing the Series documentation to essentially the one of DataFrame, leads to a misleading documentation for Series.

@WillAyd
Copy link
Member

WillAyd commented Mar 8, 2019

Can you take a look at reindex? I think that has a similar issue from an API perspective but handles the docstring substitution in a way that requires less duplication

@MaxVanDeursen
Copy link
Contributor Author

MaxVanDeursen commented Mar 8, 2019

@WillAyd I have now tried to use the _shared_doc_kwargs together with a few additions:

  • altered: To specify what is being altered
  • axes_alt_types: To specify the alternative types of the axes, which are different for Series since it also includes the rename use of Series.name
  • alternative_use: A string describing the above alternative use.

I have not found a way to effectively generalize the mapper parameter for DataFrame, as there is no such parameter for Series. I have thought of three options so far:

  1. Include another addition to the substitute, which describes the following: Instead, ``mapper`` may be used together with ``axis`` to apply the function to the axis values.'. Notice that this would be only for the DataFrame, and left empty for Series (and Panel). This is similar to alternative_use and axes_alt_types
  2. Include the same description of 1, only in alternative_use. However, alternative_use is used in the function description as well, hence making a bit of an illogical description. A solution to this would be to remove alternative_use from this description, however, this would make the description of Series only include the Series.name explanation in it's index parameter description.

Note that in the new commit, mapper is not included in either of those ways. To be honest, I do not think axis and mapper are even needed, as the same can be accomplished with index or columns. Then again, removing these arguments will of course break backwards compatibility, so this is not an option.

@MaxVanDeursen
Copy link
Contributor Author

@WillAyd Any ideas?

@TomAugspurger
Copy link
Contributor

It sounds like there are potentially two issues here:

  1. Reduce the duplication of docs between Series.rename and DataFrame.rename (lets ignore panel, it's going away)
  2. Differences in behavior between Series.rename and DataFrame.rename

I would isolate number 2 first. For example

s.rename(columns={"NONEXISTENT": 'a'}, errors="raise") 

Seems buggy. I think providing non-N�one columns to Series.rename should raise.

I think once we agree on the correct behavior it'll be easier to document and determine whether or not the docstrings should be shared.

@MaxVanDeursen
Copy link
Contributor Author

MaxVanDeursen commented Mar 14, 2019

I see what you mean. I do not think I am the one to decide on the behavior of Series.rename!

Seems buggy. I think providing non-N�one columns to Series.rename should raise.

Even None columns would not make sense though, in the context of Series, since Series does not have columns.

To be honest, I would say to either force the API of Series.rename (i.e. throw on any non-documented argument) or make sure all possible arguments of NDFrame.rename make sense to Series as well, and provide behaviour.

@WillAyd
Copy link
Member

WillAyd commented Mar 14, 2019

Thanks for the input @TomAugspurger. Mentioned above but we have a similar handling of the reindex method. I think we can just mimic that substitution behavior for this PR.

Agreed we can also change up some things with implementation but I don't think that's a blocker for this and could be subsequent PR

('inplace', False),
('level', None),
('errors', 'ignore')])
@Substitution(**dict(_shared_doc_kwargs,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this rather difficult to read - can you just make multiple Substitution decorations instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think that that would be better; however, when splitting this into multiple Substitution decorations, the "top" Substitution is not substituted into the NDFrame.rename documentation.
For example; if we use the following Substitution decorations:

 @Substitution(**_shared_doc_kwargs) @Substitution(altered='axes input function or functions',  alternative_use='',  axes_alt_types='') @Substitution(generic_documentation=NDFrame.rename.__doc__)

I get the following error when building the documentation:

 File "make.py", line 341, in <module> sys.exit(main()) File "make.py", line 329, in main globals()['pandas'] = importlib.import_module('pandas') File "/opt/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "<frozen importlib._bootstrap>", line 1006, in _gcd_import File "<frozen importlib._bootstrap>", line 983, in _find_and_load File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked File "<frozen importlib._bootstrap>", line 677, in _load_unlocked File "<frozen importlib._bootstrap_external>", line 728, in exec_module File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed File "/home/max/git/pandas/pandas/__init__.py", line 42, in <module> from pandas.core.api import * File "/home/max/git/pandas/pandas/core/api.py", line 40, in <module> from pandas.core.panel import Panel File "/home/max/git/pandas/pandas/core/panel.py", line 109, in <module> class Panel(NDFrame): File "/home/max/git/pandas/pandas/core/panel.py", line 1257, in Panel def rename(self, items=None, major_axis=None, minor_axis=None, **kwargs): File "/home/max/git/pandas/pandas/util/_decorators.py", line 259, in __call__ func.__doc__ = func.__doc__ and func.__doc__ % self.params KeyError: 'axes' 

This is what I encountered before, and I found out the pushed code fixes this issue.


@Substitution(**_shared_doc_kwargs)
@Appender(NDFrame.rename.__doc__)
@Substitution(**dict(_shared_doc_kwargs,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned don't worry about Panel - we'll be removing this anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand; however, it still had to be made compliant with the NDFrame.generic substitutions in order for the documentation to be built (without errors).

@jreback
Copy link
Contributor

jreback commented Mar 20, 2019

@jreback
Copy link
Contributor

jreback commented May 12, 2019

good idea, but closing as stale

@jreback jreback closed this May 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

4 participants