od to use when the desired percentile lies between two indexes ``i`` and
``j = i + 1``. In that case, we first determine ``i + g``, a virtual index
that lies between ``i`` and ``j``, where ``i`` is the floor and ``g`` is the
fractional part of the index. The final result is, then, an interpolation
of ``a[i]`` and ``a[j]`` based on ``g``. During the computation of ``g``,
``i`` and ``j`` are modified using correction constants ``alpha`` and
``beta`` whose choices depend on the ``method`` used. Finally, note that
since Python uses 0-based indexing, the code subtracts another 1 from the
index internally.

The following formula determines the virtual index ``i + g``, the location
of the percentile in the sorted sample:

.. math::
    i + g = (q / 100) * ( n - alpha - beta + 1 ) + alpha

The different methods then work as follows:

inverted_cdf:
    method 1 of H&F [1]_.
    This method gives discontinuous results:

    * if g > 0 ; then take j
    * if g = 0 ; then take i

averaged_inverted_cdf:
    method 2 of H&F [1]_.
    This method gives discontinuous results:

    * if g > 0 ; then take j
    * if g = 0 ; then average between bounds

closest_observation:
    method 3 of H&F [1]_.
    This method gives discontinuous results:

    * if g > 0 ; then take j
    * if g = 0 and index is odd ; then take j
    * if g = 0 and index is even ; then take i

interpolated_inverted_cdf:
    method 4 of H&F [1]_.
    This method gives continuous results using:

    * alpha = 0
    * beta = 1

hazen:
    method 5 of H&F [1]_.
    This method gives continuous results using:

    * alpha = 1/2
    * beta = 1/2

weibull:
    method 6 of H&F [1]_.
    This method gives continuous results using:

    * alpha = 0
    * beta = 0

linear:
    method 7 of H&F [1]_.
    This method gives continuous results using:

    * alpha = 1
    * beta = 1

median_unbiased:
    method 8 of H&F [1]_.
    This method is probably the best method if the sample
    distribution function is unknown (see reference).
    This method gives continuous results using:

    * alpha = 1/3
    * beta = 1/3

normal_unbiased:
    method 9 of H&F [1]_.
    This method is probably the best method if the sample
    distribution function is known to be normal.
    This method gives continuous results using:

    * alpha = 3/8
    * beta = 3/8

lower:
    NumPy method kept for backwards compatibility.
    Takes ``i`` as the interpolation point.

higher:
    NumPy method kept for backwards compatibility.
    Takes ``j`` as the interpolation point.

nearest:
    NumPy method kept for backwards compatibility.
    Takes ``i`` or ``j``, whichever is nearest.

midpoint:
    NumPy method kept for backwards compatibility.
    Uses ``(i + j) / 2``.

Examples
--------
>>> a = np.array([[10, 7, 4], [3, 2, 1]])
>>> a
array([[10,  7,  4],
       [ 3,  2,  1]])
>>> np.percentile(a, 50)
3.5
>>> np.percentile(a, 50, axis=0)
array([6.5, 4.5, 2.5])
>>> np.percentile(a, 50, axis=1)
array([7., 2.])
>>> np.percentile(a, 50, axis=1, keepdims=True)
array([[7.],
       [2.]])

>>> m = np.percentile(a, 50, axis=0)
>>> out = np.zeros_like(m)
>>> np.percentile(a, 50, axis=0, out=out)
array([6.5, 4.5, 2.5])
>>> m
array([6.5, 4.5, 2.5])

>>> b = a.copy()
>>> np.percentile(b, 50, axis=1, overwrite_input=True)
array([7., 2.])
>>> assert not np.all(a == b)

The different methods can be visualized graphically:

.. plot::

    import matplotlib.pyplot as plt
    import numpy as np

    a = np.arange(4)
    p = np.linspace(0, 100, 6001)
    ax = plt.gca()
    # (method name, line style, color) for each curve
    lines = [
        ('linear', '-', 'C0'),
        ('inverted_cdf', ':', 'C1'),
        # Almost the same as `inverted_cdf`:
        ('averaged_inverted_cdf', '-.', 'C1'),
        ('closest_observation', ':', 'C2'),
        ('interpolated_inverted_cdf', '--', 'C1'),
        ('hazen', '--', 'C3'),
        ('weibull', '-.', 'C4'),
        ('median_unbiased', '--', 'C5'),
        ('normal_unbiased', '-.', 'C6'),
        ]
    for method, style, color in lines:
        ax.plot(
            p, np.percentile(a, p, method=method),
            label=method, linestyle=style, color=color)
    ax.set(
        title='Percentiles for different methods and data: ' + str(a),
        xlabel='Percentile',
        ylabel='Estimated percentile value',
        yticks=a)
    ax.legend(bbox_to_anchor=(1.03, 1))
    plt.tight_layout()
    plt.show()

References
----------
.. [1] R. J. Hyndman and Y. Fan,
   "Sample quantiles in statistical packages,"
   The American Statistician, 50(4), pp. 361-365, 1996
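
As a rough illustration of the virtual-index formula for the continuous methods (4 through 9), the sketch below recomputes a percentile by hand for a scalar ``q`` and 1-D data. The names ``ALPHA_BETA`` and ``percentile_sketch`` are hypothetical and are not part of NumPy. Clamping the index to the ends of the sample is an assumption made here to keep ``i`` and ``j`` in range, and this is not meant to mirror NumPy's internal implementation.

```python
import math

import numpy as np

# (alpha, beta) pairs for the continuous methods listed in the notes.
# Hypothetical lookup table for this sketch only.
ALPHA_BETA = {
    "interpolated_inverted_cdf": (0.0, 1.0),
    "hazen": (0.5, 0.5),
    "weibull": (0.0, 0.0),
    "linear": (1.0, 1.0),
    "median_unbiased": (1 / 3, 1 / 3),
    "normal_unbiased": (3 / 8, 3 / 8),
}


def percentile_sketch(a, q, method="linear"):
    """Hypothetical helper: scalar q in [0, 100], data flattened to 1-D."""
    alpha, beta = ALPHA_BETA[method]
    x = np.sort(np.asarray(a, dtype=float).ravel())
    n = x.size
    # Virtual index i + g from the formula, minus 1 for 0-based indexing.
    virtual = (q / 100) * (n - alpha - beta + 1) + alpha - 1
    # Assumption: clamp so that i and j = i + 1 stay inside the sample.
    virtual = min(max(virtual, 0.0), n - 1.0)
    i = math.floor(virtual)      # floor of the virtual index
    g = virtual - i              # fractional part
    j = min(i + 1, n - 1)
    # Interpolate between a[i] and a[j] based on g.
    return (1 - g) * x[i] + g * x[j]
```

For the data used in the examples, ``percentile_sketch([10, 7, 4, 3, 2, 1], 50)`` gives ``3.5``, which agrees with ``np.percentile(a, 50)``.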