Hypothesis
Statistical inference consists of establishing tentative hypotheses about the parameters of a population based on statistics calculated from a sample, and then testing whether to accept or reject those hypotheses. In the test stage, the sample statistic that serves as the basis for judgment is called the test statistic. The probability of obtaining a statistic at least as extreme as the observed test statistic is called the p-value. By comparing the p-value with the significance level, the hypothesis is accepted or rejected.
- p-value < significance level: reject the hypothesis assumed to be true
- p-value ≥ significance level: fail to reject the hypothesis assumed to be true
Power is the probability of rejecting a false null hypothesis. For example, a power of 90% means there is a 10% chance of failing to reject a false hypothesis; this failure is the Type II error shown in Table 1.
This power increases as the sample size increases. Therefore, in order to obtain the desired power, it is necessary to have an appropriate number of samples.
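The relationship between sample size and power can be illustrated with a small simulation; this is a minimal sketch not taken from the text, using a one-sided z-test with an assumed true mean of 0.5, σ = 1, and α = 0.05.

import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)
alpha, true_mu, sigma = 0.05, 0.5, 1.0      # assumed values for illustration
z_crit = stats.norm.ppf(1 - alpha)          # one-sided critical value
for n in (10, 30, 100):
    # draw 10,000 samples of size n from the true (alternative) distribution
    xbar = rng.normal(true_mu, sigma, size=(10000, n)).mean(axis=1)
    z = xbar / (sigma / np.sqrt(n))         # test statistic under H0: mu = 0
    print(f'n={n:3d}, estimated power: {np.mean(z > z_crit):.3f}')

The estimated power approaches 1 as n grows, which is why sample-size planning is tied to the desired power.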
Null and Alternative hypotheses
An analyst can hypothesize that the mean of the sample means serves as an estimate of the population mean and test the statistical validity of this hypothesis. The analyst does not expect this hypothesis to be rejected, because it does not claim a statistically significant difference; such a hypothesis is called the null hypothesis (H0). The hypothesis set up in opposition to the null hypothesis is called the alternative hypothesis (Ha). The test of the null hypothesis is based on information from the sample, that is, on the test statistic. Therefore, it admits the following kinds of error:
Table 1. Types of errors in hypothesis testing

| | H0 True | Ha True |
|---|---|---|
| Accept H0 | Right decision | Type II error, $\beta$ |
| Reject H0 | Type I error, $\alpha$ | Right decision |
The significance level is the probability of making a Type I error, that is, of rejecting the null hypothesis when it is actually true. By setting the significance level explicitly, the analyst can control the probability of a Type I error: raising the significance level narrows the acceptance interval, so the null hypothesis is accepted only in more clear-cut cases. In contrast, the Type II error, which is related to power, cannot be adjusted directly by the analyst. A hypothesis test that controls only the Type I error in this way is called a significance test.
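A companion sketch (again illustrative, with assumed values) shows that when the null hypothesis is true, the rejection rate of a two-sided z-test matches the chosen significance level, which is exactly the Type I error control described above.

import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
alpha, sigma, n = 0.05, 1.0, 30             # assumed values for illustration
z_crit = stats.norm.ppf(1 - alpha/2)        # two-sided critical value
# H0: mu = 0 is true here, so every rejection is a Type I error
xbar = rng.normal(0, sigma, size=(100000, n)).mean(axis=1)
z = xbar / (sigma / np.sqrt(n))
print(f'empirical Type I error rate: {np.mean(np.abs(z) > z_crit):.4f}')  # close to alpha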
In summary, hypothesis testing is the process of establishing a hypothesis about the parameter being estimated, using the statistics of the sample, and determining whether that hypothesis is appropriate. This method consists of the following steps:
- Establishing a hypothesis.
- A hypothesis consists of a null hypothesis (H0) and an alternative hypothesis (H1 or Ha).
- The alternative hypothesis is the claim that requires proof; the claim that opposes it, or the existing claim, is called the null hypothesis.
- For example, the claim that the population mean ($\mu$) "will be greater than the sample mean" requires proof. In this case, this hypothesis is the alternative hypothesis. Conversely, "the population mean is less than or equal to the sample mean" is the null hypothesis.
$$H_0: \mu \le \bar{X}, \qquad H_1: \mu > \bar{X}$$
- Calculating the basic statistics of the sample, that is, the obtained data: sample mean, standard deviation, etc.
- Calculating the test statistic
- This is the value on which the hypothesis is tested.
- For example, if the population can be assumed to be normally distributed, the standard score z can be used as the basis for the test. In this case, z is the test statistic. $$Z=\frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \quad \text{or} \quad Z=\frac{\bar{X}-\mu}{s/\sqrt{n}}$$
- Setting the region in which the null hypothesis is rejected
- The x value at which this region starts is called the critical value (threshold for rejection), and the region itself is called the critical region.
- For example, if the test is performed at a 95% confidence level, the confidence interval for the test statistic is expressed as follows. $$\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$$ If the value lies outside this interval, the null hypothesis can be rejected.
- Drawing a conclusion (a worked sketch of these steps follows below)
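Below is a minimal worked sketch of these steps; all numbers (sample size, sample mean, standard deviation, hypothesized mean) are hypothetical values chosen only for illustration. It performs a one-sided z-test of H0: μ ≤ 100 against H1: μ > 100.

import numpy as np
from scipy import stats

n, xbar, s = 36, 103.2, 9.0              # hypothetical sample size, mean, std
mu0, alpha = 100.0, 0.05                 # hypothesized mean and significance level
z = (xbar - mu0) / (s / np.sqrt(n))      # test statistic
z_crit = stats.norm.ppf(1 - alpha)       # rejection region: z > z_crit
p_value = stats.norm.sf(z)               # one-sided p-value
print(f'z = {z:.3f}, critical value = {z_crit:.3f}, p value = {p_value:.4f}')
print('reject H0' if p_value < alpha else 'fail to reject H0')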
Example 1)
The probability that the daily rate of change between a stock's opening and closing prices shows an increase is 0.53. In one sample, it is said that such increases occurred 10 times over 30 days. Can this result be generalized?
This problem can be modeled with a negative binomial distribution, that is, the distribution describing the probabilities of outcomes up to the r-th success while repeating Bernoulli trials. In this formulation, the total number of trials is the random variable x, and the number of successes (r) and the success probability (p) are the parameters.
$$x \sim NB(10,\; 0.53)$$
The mean and variance of this negative binomial distribution are
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
r, p = 10, 0.53
mu, var = stats.nbinom.stats(r, p, moments='mv')
print(f'mean: {np.round(mu, 3)}, variance: {np.round(var, 3)}')
mean: 8.868, variance: 16.732
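For reference, scipy.stats.nbinom parameterizes the distribution by the number of failures k before the r-th success, so the values above follow from
$$P(X=k)=\binom{k+r-1}{k}p^r(1-p)^k, \qquad E[X]=\frac{r(1-p)}{p}, \qquad \operatorname{Var}[X]=\frac{r(1-p)}{p^2}$$
With r = 10 and p = 0.53, E[X] = 10(0.47)/0.53 ≈ 8.868 and Var[X] = 8.868/0.53 ≈ 16.732, matching the output above.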
The null hypothesis for this problem is:
$$H_0: x = 30$$
The interval() method, which calculates the confidence interval for each distribution class in the scipy.stats module, returns lower and upper bounds for a two-sided test. This is reasonable when the reference distribution is symmetric. However, it cannot be used for a one-sided test on an asymmetric distribution such as the negative binomial, because only an upper or a lower bound is needed. Instead, use the ppf() method, which returns the variable value corresponding to a given cumulative probability. The critical value at the significance level α = 0.05 is therefore calculated as
cv = stats.nbinom.ppf(0.95, r, p)
print(f'critical value: {round(cv, 3)}')
critical value: 16.0
At the significance level of 0.05, the critical value of 16 failures corresponds to a threshold of 16 + 10 = 26 total trials. The null hypothesis posits an x value of 30, which lies outside this confidence interval. That is, it is difficult to accept the null hypothesis.
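As a quick check (reusing r and p from above), 16 is indeed the smallest number of failures whose cumulative probability reaches 0.95, which is what ppf(0.95, r, p) returns for a discrete distribution.

print(stats.nbinom.cdf(15, r, p) < 0.95 <= stats.nbinom.cdf(16, r, p))  # True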
The above result can be confirmed using the significance probability (p-value). This is the probability of an outcome more extreme than the hypothesized one, and it can be calculated with the survival function, which is equivalent to subtracting from the whole the cumulative probability up to the variable x. The following code applies the sf() method, which is available on each distribution class in the stats module.
k = 30 - r
pVal = stats.nbinom.sf(k, r, p)
print(f'p value: {np.round(pVal, 4)}')
p value: 0.0093
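As a consistency check (reusing k, r, and p from above), the survival function is the complement of the cumulative distribution function at the same point.

print(np.isclose(stats.nbinom.sf(k, r, p), 1 - stats.nbinom.cdf(k, r, p)))  # True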
Compared with the significance level of 0.05, the above significance probability is very small. That is, the null hypothesis can be rejected.
A visualization of the above results is shown in Figure 1.
point = stats.nbinom.ppf(1 - pVal, r, p)
point
20.0
plt.figure(figsize=(6, 3))
x = np.arange(41)                            # an array (not range) is needed for the where masks
y = [stats.nbinom.pmf(i, r, p) for i in x]   # pmf of NB(10, 0.53) over the number of failures
plt.plot(x, y, label="NB(10, 0.53)")
plt.fill_between(x, 0, y, where=(x <= cv), facecolor="skyblue", label="1-α")
plt.fill_between(x, 0, y, where=(x >= cv), facecolor="silver", label="α")
plt.fill_between(x, 0, y, where=(x >= point), facecolor="teal", label="p value", alpha=0.1)
plt.axvline(k, linestyle="--", color='red', label="k(20)")
plt.legend(loc="best")
plt.xlabel("x", size="13", weight="bold")
plt.ylabel("pmf", size="13", weight="bold")
plt.text(8, 0.02, '0.95', size="13", weight="bold")
plt.text(18, 0.02, '0.05', size="13", weight="bold")
plt.text(20, 0.01, '0.0093', size="13", weight="bold", color="teal")
plt.show()
One-sided and two-sided tests
Here are some recent data from a stock. From these data, hypotheses for estimating the population mean can be written as
$$\begin{align} \text{Hypothesis 1:} \quad & H_0: \mu=\bar{X}, \; H_1: \mu \neq \bar{X}\\ \text{Hypothesis 2:} \quad & H_0: \mu\ge \bar{X}, \; H_1: \mu < \bar{X} \end{align}$$
Hypothesis 1 tests whether the population mean agrees with the sample mean, and the direction is irrelevant: it does not matter whether a deviation lies to the left or to the right of the mean of the normal distribution. This case is called a two-tailed (two-sided) test. In contrast, Hypothesis 2 is directional: it tests whether the population mean is at least as large as the sample mean. This is called a one-sided test.
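The placement of the significance level in the two cases can be sketched with standard normal critical values; α = 0.05 is assumed here.

from scipy import stats

alpha = 0.05
two_sided = stats.norm.ppf([alpha/2, 1 - alpha/2])  # alpha split across both tails: about ±1.96
one_sided = stats.norm.ppf(1 - alpha)               # alpha entirely in one tail: about 1.645
print(f'two-sided bounds: {two_sided.round(3)}, one-sided bound: {one_sided.round(3)}')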
Example 2)
The following is the closing-price change data of the Philadelphia Semiconductor Index ETF (SOXX). Using these data as the population, two-sided and one-sided tests are performed to determine whether the sample mean of the sampling distribution can be used as an unbiased estimate of the population mean.
import FinanceDataReader as fdr

st = pd.Timestamp(2021, 1, 1)
et = pd.Timestamp(2021, 12, 17)
da = fdr.DataReader('SOXX', st, et)["Close"]
da1 = da.pct_change()[1:]*100
da1.index = range(len(da1))
da1.head(2)
0    2.044546
1   -0.324414
Name: Close, dtype: float64
mu = da1.mean()
std = da1.std()
print(f'Pop.mean: {round(mu, 4)}, Pop.std: {round(std, 4)}')
Pop.mean: 0.1469, Pop.std: 1.9027
From the above data, 100 repeated samples of size 10 are drawn; this process applies the pandas object's sample() method, and the mean of each sample is stored. The mean of these sample means is then calculated, and the confidence interval is obtained by applying stats.norm.interval() at a significance level of 0.05.
smplData = np.array([da1.sample(n=10).mean()])
for i in range(100):
    smplData = np.append(smplData, da1.sample(n=10).mean())
smplData[:3]
array([ 0.59807416, -0.94670731, 0.60892061])
BarX = smplData.mean()
round(BarX, 4)
0.0883
lb, ub = stats.norm.interval(0.95, mu, std)
np.around(pd.DataFrame([lb, ub], index=['Lower', 'Upper']), 4)
| | 0 |
|---|---|
| Lower | -3.5824 |
| Upper | 3.8763 |
The interval() function performs a two-sided test, as shown in Figure 2. According to the result, the sample mean falls within the confidence interval of the normal distribution whose parameters are the population mean and the population standard deviation. That is, the null hypothesis cannot be rejected at the significance level of 0.05.
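The same decision can be checked directly against the bounds computed above (reusing lb, ub, and BarX).

print('reject H0' if (BarX < lb) or (BarX > ub) else 'fail to reject H0')
# fail to reject H0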
plt.figure(figsize=(6, 3))
x = np.linspace(-10, 10.01, 1000)
y = [stats.norm.pdf(i, mu, std) for i in x]
plt.plot(x, y, label=f"Norm({round(mu, 2)}, {round(std, 2)})")
plt.fill_between(x, 0, y, where=(x >= lb) & (x <= ub), facecolor="skyblue", label=r"1-$\alpha$")
plt.fill_between(x, 0, y, where=(x <= lb) | (x >= ub), facecolor="red", alpha=0.3, label=r"$\alpha$")
plt.legend(loc="best")
plt.xlabel("x", size="13", weight='bold')
plt.ylabel("pdf", size="13", weight='bold')
plt.ylim(0, 0.22)
plt.xticks([])
plt.text(-0.5, 0.050, 0.95, size="13", weight='bold')
plt.text(-8, 0.025, 0.025, size="13", weight='bold', color="red")
plt.text(5.1, 0.025, 0.025, size="13", weight='bold', color="red")
plt.text(lb-1, -0.015, round(lb, 2), size="12", weight="bold", color="blue")
plt.text(ub-1, -0.015, round(ub, 2), size="12", weight="bold", color="blue")
plt.show()
For a one-sided test, the significance level lies entirely on one side. If the significance level is 0.05, as shown in Figure 3, the threshold in standard scores is $z_{0.05}$ or $z_{1-0.05}$. In this example, the null hypothesis (H0) is $\mu \ge \bar{X}$, so $z_{1-0.05}$ is the threshold.
In a one-sided test, the critical value can be calculated using the stats.norm.ppf() method.
CP = stats.norm.ppf(1-0.05, mu, std)
print(f'Critical Point: {round(CP, 4)}')
Critical Point: 3.2767
As shown in Figure 3, the sample mean 0.0883 is included in the acceptance region x ≤ 3.2767. Therefore, the null hypothesis cannot be rejected. The significance probability (p-value) for the sample mean is calculated as follows.
pval = stats.norm.sf(BarX, mu, std)
print(f'p value: {round(pval, 4)}')
p value: 0.5162
plt.figure(figsize=(6, 3))
x = np.linspace(-10, 10.01, 1000)
y = [stats.norm.pdf(i, mu, std) for i in x]
plt.plot(x, y, label=f"Norm({round(mu, 2)}, {round(std, 2)})")
plt.fill_between(x, 0, y, where=(x <= CP), facecolor="skyblue", label=r"1-$\alpha$")
plt.fill_between(x, 0, y, where=(x >= CP), facecolor="red", alpha=0.3, label=r"$\alpha$")
plt.legend(loc="best")
plt.xlabel("x", size="13", weight='bold')
plt.ylabel("pdf", size="13", weight='bold')
plt.ylim(0, 0.22)
plt.xticks([])
plt.text(-0.5, 0.050, 0.95, size="13", weight='bold')
plt.text(4.1, 0.025, 0.05, size="13", weight='bold', color="red")
plt.text(CP-1, -0.015, round(CP, 2), size="12", weight="bold", color="blue")
plt.show()
As the result above shows, the p-value is large compared with the significance level, so the null hypothesis cannot be rejected; this is the same conclusion as that based on the confidence interval.