oTestResult An object with attributes: statistic : float or ndarray The test statistic of the observed `data`. pvalue : float or ndarray The p-value for the given alternative. null_distribution : ndarray The values of the test statistic generated under the null hypothesis. .. warning:: The p-value is calculated by counting the elements of the null distribution that are as extreme or more extreme than the observed value of the statistic. Due to the use of finite precision arithmetic, some statistic functions return numerically distinct values when the theoretical values would be exactly equal. In some cases, this could lead to a large error in the calculated p-value. `monte_carlo_test` guards against this by considering elements in the null distribution that are "close" (within a relative tolerance of 100 times the floating point epsilon of inexact dtypes) to the observed value of the test statistic as equal to the observed value of the test statistic. However, the user is advised to inspect the null distribution to assess whether this method of comparison is appropriate, and if not, calculate the p-value manually. References ---------- .. [1] B. Phipson and G. K. Smyth. "Permutation P-values Should Never Be Zero: Calculating Exact P-values When Permutations Are Randomly Drawn." Statistical Applications in Genetics and Molecular Biology 9.1 (2010). Examples -------- Suppose we wish to test whether a small sample has been drawn from a normal distribution. We decide that we will use the skew of the sample as a test statistic, and we will consider a p-value of 0.05 to be statistically significant. >>> import numpy as np >>> from scipy import stats >>> def statistic(x, axis): ... return stats.skew(x, axis) After collecting our data, we calculate the observed value of the test statistic. >>> rng = np.random.default_rng() >>> x = stats.skewnorm.rvs(a=1, size=50, random_state=rng) >>> statistic(x, axis=0) 0.12457412450240658 To determine the probability of observing such an extreme value of the skewness by chance if the sample were drawn from the normal distribution, we can perform a Monte Carlo hypothesis test. The test will draw many samples at random from their normal distribution, calculate the skewness of each sample, and compare our original skewness against this distribution to determine an approximate p-value. >>> from scipy.stats import monte_carlo_test >>> # because our statistic is vectorized, we pass `vectorized=True` >>> rvs = lambda size: stats.norm.rvs(size=size, random_state=rng) >>> res = monte_carlo_test(x, rvs, statistic, vectorized=True) >>> print(res.statistic) 0.12457412450240658 >>> print(res.pvalue) 0.7012 The probability of obtaining a test statistic less than or equal to the observed value under the null hypothesis is ~70%. This is greater than our chosen threshold of 5%, so we cannot consider this to be significant evidence against the null hypothesis. Note that this p-value essentially matches that of `scipy.stats.skewtest`, which relies on an asymptotic distribution of a test statistic based on the sample skewness. >>> stats.skewtest(x).pvalue 0.6892046027110614 This asymptotic approximation is not valid for small sample sizes, but `monte_carlo_test` can be used with samples of any size. >>> x = stats.skewnorm.rvs(a=1, size=7, random_state=rng) >>> # stats.skewtest(x) would produce an error due to small sample >>> res = monte_carlo_test(x, rvs, statistic, vectorized=True) The Monte Carlo distribution of the test statistic is provided for further investigation. >>> import matplotlib.pyplot as plt >>> fig, ax = plt.subplots() >>> ax.hist(res.null_distribution, bins=50) >>> ax.set_title("Monte Carlo distribution of test statistic") >>> ax.set_xlabel("Value of Statistic") >>> ax.set_ylabel("Frequency") >>> plt.show() r