DOC: Improve description of axis parameter for np.median#25228
mdhaber merged 3 commits into numpy:main
The description of the axis parameter of numpy.median does not explain in enough detail what happens when a sequence of axes is passed, which can lead to confusion about the expected result. Specifically, the current description does not mention that the array is flattened along the given sequence of axes before the median is computed. This commit changes the description to state that the array is first flattened along the given axes and the median is then computed. Additionally, a .. versionadded:: directive replaces the written version acknowledgement. See numpy#25174.
The docstring for numpy.median does not contain an example of passing a sequence of ints to the axis parameter. Including one helps clarify how the extended axis feature behaves. This change adds a new example demonstrating the median with an extended axis.
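To illustrate the "flatten, then take the median" description in the commit messages above, here is a minimal sketch. The array shape and the names `joint`, `merged`, and `by_hand` are chosen for illustration only and are not taken from the PR. It handles only the simple case where the reduced axes are the leading ones, so a plain reshape is enough to merge them.

```python
import numpy as np

# Illustrative 3D array; the shape is an arbitrary choice for this sketch.
a = np.arange(24).reshape(2, 3, 4)

# Median over axes 0 and 1 in a single call.
joint = np.median(a, axis=(0, 1))

# The same thing by hand: merge axes 0 and 1 into one axis of length 6,
# then take the median along that merged axis.
merged = a.reshape(-1, a.shape[2])
by_hand = np.median(merged, axis=0)

print(joint)    # one median per position along the remaining axis
print(by_hand)  # matches joint
```

Both results have shape `(4,)`, because only the last axis survives the reduction.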
numpy/lib/_function_base_impl.py
>>> np.median(a, axis=1)
array([7., 2.])
>>> np.median(b, axis=(0,1), overwrite_input=True)
3.5
These examples in the docstrings should be runnable. Using b here when it hasn't been defined above doesn't make sense. I would also not use overwrite_input, since that confuses things.
In principle doctests should catch issues like b not being defined in the docstring yet, but I'm not sure we run doctests on the full numpy codebase...
Instead, I think it makes more sense to just rewrite all of these examples to use a small 3D array, so this example where you take the median successively over two dimensions makes sense. Something like:
>>> import numpy as np
>>> a = np.arange(27)
>>> a.shape = (3, 3, 3)
>>> a
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])
>>> np.median(a)
np.float64(13.0)
>>> np.median(a, axis=0)
array([[ 9., 10., 11.],
       [12., 13., 14.],
       [15., 16., 17.]])
>>> np.median(a, axis=(0, 1))
array([12., 13., 14.])
And then, if you didn't feel like editing the rest of this, you could redefine a to its original value below this example.
Just a thought, though! There are probably other ways to edit this that would make sense.
Also it would be nice if you could make sure this docstring is actually fully runnable and updated, since it looks like it hasn't been updated for the fallout from NEP 50 (ping @seberg - there's probably a lot of other docstrings that need to be updated right?).
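As a rough illustration of the point about doctests catching an undefined name, the sketch below runs two toy docstrings through the standard-library `doctest` module. The functions `broken` and `runnable` and the helper `count_failures` are hypothetical names invented for this example. This is not how NumPy's own test suite runs doctests.

```python
import doctest


def broken():
    """
    >>> b + 1
    2
    """


def runnable():
    """
    >>> b = 1
    >>> b + 1
    2
    """


def count_failures(func):
    # Parse the docstring into a DocTest with an empty namespace, so any
    # name the examples use must be defined inside the docstring itself.
    parser = doctest.DocTestParser()
    test = parser.get_doctest(func.__doc__, {}, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts are needed here.
    results = runner.run(test, out=lambda text: None)
    return results.failed


print(count_failures(broken))    # the NameError on `b` counts as a failure
print(count_failures(runnable))  # every name is defined before use
```

An example that uses `b` before defining it raises `NameError` when run, which doctest reports as a failed example.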
Err, not NEP 50, I guess just the new reprs for scalars?
Yeah, that is on my not-done list. I had always hoped to use scientific-python/pytest-doctestplus#227 although that is a bit tedious until we better support doctestplus.
OTOH, a few failures (and misses) shouldn't matter too much.
At this time, I think just don't worry about whether or not you write np.float64(1.0) or 1.0, keeping things consistent may be better...
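For context on the `np.float64(1.0)` versus `1.0` question, here is a small sketch. It assumes only that the printed form of a NumPy scalar depends on the installed NumPy version. The variable name `result` is illustrative.

```python
import numpy as np

result = np.median(np.arange(27).reshape(3, 3, 3))

# The repr depends on the NumPy version: newer releases print the scalar
# with its type, e.g. np.float64(13.0), while older ones print 13.0.
print(repr(result))

# The value and type are the same either way, so comparisons that do not
# rely on the printed form behave identically across versions.
print(isinstance(result, np.floating), float(result) == 13.0)
```

Only the printed form differs between versions, which is why a docstring's expected output can go stale while the function's behaviour stays the same.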
[skip azp] [skip actions] [skip cirrus]
I went ahead and made the example runnable and removed
Thank you for fixing up the docstring and for the comments. If there ends up being interest in making further improvements to all functions that use the axis parameter, I'd be happy to contribute.
Sure. If you're interested, I'd suggest surveying what is currently documented about the
This PR includes an updated docstring for numpy.median, which clarifies the functionality of the extended axis feature introduced in 1.9.0.
Specifically, the description in the docstring is changed and an example is added. Together these make clear that, when a sequence of axes is passed, the array is flattened along the given axes and the median is then computed over the resulting flattened axis. Without this change, passing multiple axes could be misunderstood as computing medians sequentially over the given axes (without flattening); see #25174.
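The difference between the two readings can be seen on a small array. This is a minimal sketch; the values are chosen for illustration so that the two interpretations disagree, and they are not taken from the PR or the issue.

```python
import numpy as np

# Illustrative data; the outlier 100 makes the two readings diverge.
a = np.array([[1, 2, 100],
              [3, 4, 5]])

# Documented behaviour: all six values are pooled, then one median is taken.
flattened = np.median(a, axis=(0, 1))

# The possible misreading: a median over axis 0, then a median of those medians.
sequential = np.median(np.median(a, axis=0), axis=0)

print(flattened)   # median of 1, 2, 3, 4, 5, 100
print(sequential)  # median of 2.0, 3.0, 52.5
```

For evenly spaced data such as `np.arange(27).reshape(3, 3, 3)` the two computations happen to agree, so an array like this one is needed to tell them apart.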
Resolves #25174