Probability Inequalities & Moment Generating Functions
In statistics it is often necessary to bound the range of probabilities containing the value(s) of interest in a distribution estimated from a statistic, and to establish a confidence interval indicating the degree of confidence in the results of a statistical analysis. The Markov and Chebyshev inequalities are the mathematical expressions underlying the rationale for such probability or confidence intervals.
Probability Inequalities
Markov's inequality
If X is a random variable and g(x) is a nonnegative real-valued function, then Equation 1 holds for any positive real c.
$$\begin{equation} \tag{1} P[g(X) \ge c] \le \frac{E[g(X)]}{c} \end{equation}$$If the event is $A=\{x\,|\,g(x) \ge c\}$, the above expression is proved as follows.
$$ \begin{aligned} E[g(X)]&=\int^\infty_{-\infty} g(x)f(x)\, dx\\&=\int_{A} g(x)f(x)\, dx+\int_{A^c} g(x)f(x)\, dx\\& \ge \int_{A} g(x)f(x)\, dx\\&\ge \int_{A} cf(x)\, dx\\&=cP[X \in A]\\&=cP[g(X) \ge c] \end{aligned}$$
Example 1)
The random variable X follows a binomial distribution with mean np and variance np(1-p). Apply the Markov inequality to determine the upper bound of the probability that satisfies the expression below.
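The specific expression for this example is not reproduced in this text. Assuming, as in Example 2 below, that the event of interest is $P(X \ge \alpha n)$ with $p=\frac{1}{2}$ and $\alpha=\frac{3}{4}$, a minimal sketch of the Markov bound $P(X \ge \alpha n) \le \frac{E(X)}{\alpha n}=\frac{p}{\alpha}$ is:

from sympy import symbols, Rational, simplify
n, p, alpha=symbols("n p alpha", positive=True)
# Markov: P(X >= alpha*n) <= E(X)/(alpha*n), with E(X)=n*p for the binomial
bound=(n*p)/(alpha*n)
simplify(bound.subs({p: Rational(1, 2), alpha: Rational(3, 4)}))

2/3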
Chebyshev's inequality
From the random variable X, define another non-negative random variable $Y=(X-E(X))^2$. Markov inequality can be applied to this variable.
$$\begin{aligned}&P(Y \ge b^2) \le \frac{E(Y)}{b^2}\\ &\begin{aligned} E(Y)&=E[(X-E(X))^2]\\&=Var(X)\end{aligned}\\ &\begin{aligned} P(Y \ge b^2)&=P((X-E(X))^2 \ge b^2)\\&=P(|X-E(X)| \ge b)\end{aligned} \\ &\therefore \; P(|X-E(X)| \ge b) \le \frac{Var(X)}{b^2}\\ &b: \text{positive real number } \end{aligned}$$The above result is called Chebyshev's inequality and is stated generally as Equation 2.
$$\begin{equation} \tag{2} P(|X-E(X)| \ge b) \le \frac{Var(X)}{b^2} \end{equation}$$This inequality means that the probability that a random variable X deviates from its mean E(X) by at least b is bounded by Var(X)/b². Even if the probability distribution of X is unknown, the deviation from the mean can still be bounded by Equation 2.
Example 2)
The random variable X follows a binomial distribution with mean np and variance np(1-p). Apply Chebyshev's inequality to determine an upper bound for the probability below.
$$\begin{aligned} &P(X \ge \alpha n)\\ & p=\frac{1}{2}, \quad \alpha = \frac{3}{4} \end{aligned}$$
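One way to apply Chebyshev's inequality here is the following sketch: since $\alpha > p$, the event $\{X \ge \alpha n\}$ is contained in $\{|X-np| \ge (\alpha-p)n\}$, so Equation 2 gives

$$\begin{aligned} P(X \ge \alpha n) &\le P(|X-np| \ge (\alpha-p)n)\\ &\le \frac{np(1-p)}{(\alpha-p)^2n^2}=\frac{p(1-p)}{(\alpha-p)^2n}\\ &=\frac{(1/2)(1/2)}{(1/4)^2\,n}=\frac{4}{n} \end{aligned}$$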
Example 3)
If the PDF of the random variable X is $f(x) = e^{-x}$ for $x>0$, what are E(X) and Var(X)?
Also, apply the Chebyshev inequality to bound the probability that the difference between the variable and its mean is greater than twice the standard deviation.
- The mean and variance are:
import numpy as np
import pandas as pd
from sympy import *

x=symbols("x")
f=exp(-x)
E=integrate(x*f, (x, 0, oo))
E
1
var=integrate(x**2*f, (x, 0, oo))-E**2
var
1
- Determine the probability of the event $|X-1| \ge 2$ (twice the standard deviation), which for $X>0$ reduces to $X \ge 3$:
N(1-integrate(f, (x, 0, 3)), 3)
0.0498
Applying the above case to the Chebyshev inequality, the upper bound is:
$$P(|X-1| \ge 2) \le \frac{Var(X)}{2^2}=\frac{1}{4}$$Therefore, the exact probability of this event (≈ 0.0498) is indeed smaller than the Chebyshev upper bound.
Moment generating function
A moment is a characteristic quantity from which other statistics can be calculated, and the expected value of a function that generates these moments is called the moment generating function (MGF). Statistics that characterize a probability distribution can be calculated using the moments defined as follows.
The first moment is the expected value $E[X]$, and the second central moment, $E[(X-\mu)^2]$, is the variance of X.
Example 4)
If the PMF of the discrete random variable X is as follows, the moment generating function (MGF) is calculated.
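The PMF for this example is not reproduced in this text. As a purely illustrative sketch, assume a hypothetical PMF $p(0)=\frac{1}{4},\; p(1)=\frac{1}{2},\; p(2)=\frac{1}{4}$; the MGF $M_X(t)=E[e^{tX}]$ is then a finite sum that can be evaluated directly.

from sympy import symbols, exp, Rational
t=symbols("t")
# hypothetical PMF, used only for illustration: p(0)=1/4, p(1)=1/2, p(2)=1/4
pmf={0: Rational(1, 4), 1: Rational(1, 2), 2: Rational(1, 4)}
M=sum(exp(t*k)*pk for k, pk in pmf.items())   # M_X(t)=E[e^(tX)]=1/4+e^t/2+e^(2t)/4
M.diff(t).subs(t, 0)                          # first moment E(X)

1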
The MGF encodes all the moments of a random variable, so the distribution can be determined from this function: for example, if two random variables have the same MGF, they have the same distribution.
The function $e^{tX}$ in the MGF defined above, $M_X(t)=E\left[e^{tX}\right]$, can be expanded as a Taylor series, as in Equation 4.
$$\begin{align}\tag{4}e^x&=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+ \cdots\\ &=\sum^\infty_{k=0} \frac{x^k}{k!} \end{align}$$
A Taylor series can be generated with the series() method of the sympy module or with the fps() function. The O() term in the following output (the letter O) is Landau big-O notation: it indicates the order of the truncated remainder around the expansion point (here $x=0$). Use removeO() to delete this part.
exp(x).series(x)

$$\color{blue}{\quad 1 + x + \frac{x^{2}}{2} + \frac{x^{3}}{6} + \frac{x^{4}}{24} + \frac{x^{5}}{120} + O\left(x^{6}\right)}$$
x=symbols("x")
fps(exp(x))

$$\color{blue}{\quad \left(\sum_{k=1}^{\infty} \begin{cases} \frac{x^{k}}{k!} & \text{for}\: k\bmod{1} = 0 \\0 & \text{otherwise} \end{cases}\right) + 1}$$
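For example, removeO() can be chained onto the truncated series to drop the order term (a minimal illustration):

exp(x).series(x).removeO()   # 1 + x + x**2/2 + x**3/6 + x**4/24 + x**5/120, without the O(x**6) term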
Apply the above Taylor equation to the expansion of the moment generating function (MGF).
$$\begin{aligned} M_X(t)&=E(e^{tX})\\&=E(1+tX+\frac{t^2X^2}{2!}+\frac{t^3X^3}{3!}+\cdots)\\ &=1+tE(X)+\frac{t^2}{2!}E(X^2)+\frac{t^3}{3!}E(X^3)+\cdots \end{aligned}$$Perform the first-order derivative of the expanded MGF with respect to t and substitute zero.
$$ \begin{aligned} \frac{d M_X(t)}{dt}&=E(X)+tE(X^2)+\frac{t^2}{2!}E(X^3)+\cdots\\ \left.\frac{d M_X(t)}{dt}\right|_{t=0}&=E(X) \end{aligned}$$Differentiating the MGF once and substituting t = 0 gives the first moment of the random variable, that is, the expected value. The second derivative works the same way.
$$ \begin{aligned} \frac{d^2 M_X(t)}{dt^2}&=E(X^2)+tE(X^3)+\cdots\\ \left.\frac{d^2 M_X(t)}{dt^2}\right|_{t=0}&=E(X^2) \end{aligned}$$The result is the second moment. Generalizing this process gives Equation 5: the n-th derivative of the moment generating function evaluated at t = 0 yields the n-th moment. Therefore, statistics such as the expected value and the variance can be calculated from the MGF.
$$\begin{equation}\tag{5}\left.\frac{d^n M_X(t)}{dt^n}\right|_{t=0}=E(X^n) \end{equation}$$
Example 5)
A distribution that assigns equal probability density to every value of the random variable is called a uniform distribution. For a random variable X uniform on the interval $a \le x \le b$, the probability density function (pdf) used in the code below is $f(x)=\frac{1}{b-a}$.
Determine the moment generating function and calculate the expected value (the first moment) from it.
For this calculation, the series() method of the sympy module is used to expand the moment generating function as a Taylor series, and integrate() and diff() are used for integration and differentiation, respectively.
a, b, x, t=symbols("a, b, x, t", real=True)
f=1/(b-a)
M=integrate(exp(t*x)*f, (x, a, b))
M

$$\color{blue}{\begin{cases} \frac{e^{a t}}{a t - b t} - \frac{e^{b t}}{a t - b t} & \text{for}\: a t - b t \neq 0 \\\frac{a}{a - b} - \frac{b}{a - b} & \text{otherwise} \end{cases}}$$

M1=M.args[0][0]   # take the branch for a*t - b*t != 0
M1

$$\color{blue}{\displaystyle \frac{e^{a t}}{a t - b t} - \frac{e^{b t}}{a t - b t}}$$

M2=M1.series(t, 0, 4).removeO()
M2

$$\color{blue}{\displaystyle \begin{aligned}&\frac{a}{a - b} - \frac{b}{a - b} \\&+ t^{3} \left(\frac{a^{4}}{24 \left(a - b\right)} - \frac{b^{4}}{24 \left(a - b\right)}\right) \\&+ t^{2} \left(\frac{a^{3}}{6 \left(a - b\right)} - \frac{b^{3}}{6 \left(a - b\right)}\right) \\&+ t \left(\frac{a^{2}}{2 \left(a - b\right)} - \frac{b^{2}}{2 \left(a - b\right)}\right)\end{aligned}}$$

E1=(M2.diff(t)).subs(t, 0)
simplify(E1)

$$\color{blue}{\displaystyle \frac{a}{2} + \frac{b}{2}}$$
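As a follow-up sketch (not in the original text), the second moment can be read from the same truncated expansion M2, and combining it with E1 above recovers the familiar variance of the uniform distribution, $\frac{(b-a)^2}{12}$:

E2=(M2.diff(t, 2)).subs(t, 0)   # second moment E(X^2)
simplify(E2 - E1**2)            # Var(X)=E(X^2)-E(X)^2, which reduces to (a - b)**2/12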
Combination of random variables
In actual data analysis, the relationship between two or more variables is often the subject of analysis; for example, the relationship between cancer and tobacco, or between stock prices and interest rates. In this multivariate situation, the process of calculating probabilities and various statistics is similar to the univariate case introduced in Section 3.3, Probability and Statistics.
Example 6)
Of the 12 students in class A, 3 are soccer players and 4 are baseball players. If three students are selected to play a certain sport against another class, what is the probability that all of them are athletes?
If X is the number of soccer players selected, Y the number of baseball players, and Z the number of the remaining students, the probability for this distribution is calculated as follows.
$$\begin{aligned} &P(X=x, Y=y, Z=z)=\frac{\binom{3}{x} \binom{4}{y} \binom{5}{z}}{\binom{12}{3}}\\ &x+y+z=3 \end{aligned}$$

from scipy import special
total=special.comb(12, 3)
total
220.0
p=pd.DataFrame([[]])
for i in range(4):
    for j in range(5):
        for k in range(5):
            if i+j+k==3:
                x=pd.DataFrame([[i, j, k, special.comb(3, i)*special.comb(4, j)*special.comb(5, k)/total]])
                p=pd.concat([p, x])
p=np.around(p.iloc[1:,:], 3)
p.columns=['x','y','z','P']
p
|   | x | y | z | P |
|---|---|---|---|---|
| 0 | 0.0 | 0.0 | 3.0 | 0.045 |
| 0 | 0.0 | 1.0 | 2.0 | 0.182 |
| 0 | 0.0 | 2.0 | 1.0 | 0.136 |
| ⁞ | ⁞ | ⁞ | ⁞ | ⁞ |
| 0 | 3.0 | 0.0 | 0.0 | 0.005 |
Since the question asks for selections consisting entirely of athletes, a cross table of only the variables x and y is helpful in understanding the probabilities. This cross table (pivot table) can be created with the pandas pivot_table() method.
p.pivot_table('P','x' , 'y', aggfunc="sum", margins=True )
y      0.0    1.0    2.0    3.0    All
x
0.0  0.045  0.182  0.136  0.018  0.381
1.0  0.136  0.273  0.082    NaN  0.491
2.0  0.068  0.055    NaN    NaN  0.123
3.0  0.005    NaN    NaN    NaN  0.005
All  0.254  0.510  0.218  0.018  1.000
As in the above process, each probability of the joint random variables is the probability of the intersection of the corresponding events. In the cross table above, when both x and y are 0, that is, the case P(0, 0, 3), the probability under the assumption that all random variables are independent would be calculated as follows.
px0=special.comb(3, 0)/special.factorial(3)
round(px0, 3)
0.167
py0=special.comb(4, 0)/special.factorial(4)
round(py0, 3)
0.042
pz3=special.comb(5, 3)/special.factorial(5)
round(pz3, 3)
0.083
p003=px0*py0*pz3
round(p003, 3)
0.001
This result differs from the value of 0.045 shown in the cross table. Therefore, the random variables X, Y, and Z are not independent, and the joint probability must be computed directly from the dependent sampling structure, as in the following formula.
$$\begin{aligned}&f(x, y, z)=\frac{\binom{X}{x}\binom{Y}{y}\binom{Z}{z}}{\binom{X+Y+Z}{x+y+z}}\\ &X, Y, Z: \text{total number in each category}\\&x, y, z: \text{number selected from each category} \end{aligned}$$The above probability mass function is the same as the hypergeometric distribution, which will be discussed later. Applying this function, P(X=0, Y=0, Z=3) is calculated as
p003=special.comb(3, 0)*special.comb(4, 0)*special.comb(5, 3)/special.comb(12, 3)
round(p003, 3)
0.045
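As a cross-check, the same probability can be obtained from SciPy's multivariate hypergeometric distribution (assuming SciPy ≥ 1.4, which provides stats.multivariate_hypergeom):

from scipy.stats import multivariate_hypergeom
# group sizes: 3 soccer, 4 baseball, 5 others; draw 3 without replacement
round(multivariate_hypergeom.pmf([0, 0, 3], [3, 4, 5], 3), 3)

0.045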
For x = 0 in this example, the marginal probability is calculated as
$$P(X=0)=\sum_{j+k=3}\frac{\binom{3}{0}\binom{4}{j}\binom{5}{k}}{\binom{12}{3}}$$The above calculations apply to discrete random variables; for continuous variables, integration is used instead.
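A minimal sketch of this marginal computation in code, summing the joint pmf over all pairs $(j, k)$ with $j+k=3$:

from scipy import special
# P(X=0): sum over y=j, z=3-j for j=0,...,3
pX0=sum(special.comb(3, 0)*special.comb(4, j)*special.comb(5, 3-j) for j in range(4))/special.comb(12, 3)
round(pX0, 3)

0.382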
Example 7)
If the joint probability density function of two continuous random variables X and Y is $f(x, y)=2e^{-x}e^{-2y}$ for $x>0,\; y>0$ (the form used in the code below),
Let's calculate:
- P{X > 1, Y < 1}
x, y=symbols('x y')
f=2*exp(-x)*exp(-2*y)
p=integrate(f, (x, 1, oo), (y, 0, 1))
p

$\quad \color{blue}{\displaystyle - \frac{1}{e^{3}} + e^{-1}}$
N(p, 3)
0.318
- P{X < a}, a > 0
a=symbols("a", positive=True)
p=integrate(f, (x, 0, a), (y, 0, oo))
p

$\quad \color{blue}{\displaystyle 1 - e^{- a}}$
Example 8)
If two random variables X and Y are independent and their probability density functions are $e^{-x}$ and $e^{-y}$ for $x,\; y>0$ (the forms used below), what is the probability density function of the random variable X/Y?
Independence means $f(x, y)=f_X(x)f_Y(y)$. Therefore, the joint probability density function is
$$f(x, y) = \exp(-x)\exp(-y), \quad x,\; y >0$$To find the probability density function of the random variable $\frac{X}{Y}$, write its value as a; this density can be obtained from the joint probability density of X and Y.
$$\begin{aligned}&\frac{X}{Y}\le a \;\Leftrightarrow\; X \le aY,\quad a>0\\ &0 < x < ay, \; 0 < y \end{aligned}$$The probability density function is the derivative of the cumulative distribution function.
To determine the probability density function expressed only in terms of the new variable a, the cumulative distribution function of f(x, y) is calculated first and then differentiated. The cumulative distribution function is:
$$F_{X/Y}(a)=\int^\infty_0 \int^{ay}_0 \exp(-x)\exp(-y)\, dxdy$$

a=symbols("a", positive=True)
x=symbols("x", positive=True)
y=symbols("y", positive=True)
f=exp(-x)*exp(-y)
F=integrate(f, (x, 0, a*y), (y, 0, oo))   # cumulative distribution function of X/Y
fa=F.diff(a)                              # differentiate to obtain the density
simplify(fa)

$\color{blue}{\displaystyle \frac{1}{\left(a + 1\right)^{2}}}$
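As a quick check (not part of the original text), the resulting density integrates to 1 over $a>0$, as a probability density function must (using the symbol a defined above):

integrate(1/(a + 1)**2, (a, 0, oo))

1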
Example 9)
The families in a village are distributed by number of children as follows:
| # of children in the family | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| Rate | 15% | 20% | 35% | 30% |
Each child has an equal chance of being a boy or a girl, independently of the other children. If a family is selected at random from this village, what is the probability of each possible combination of the number of boys (B) and the number of girls (G) in that family?
$$\text{PMF} = \text{P(selected family)} \;\times\; \text{P(boy or girl)}$$Here, whether each child is a boy or a girl can be treated as a Bernoulli trial with success probability $\displaystyle \frac{1}{2}$. Expressing each pair as (boys, girls), the total sample space (S) is:
$$S=\{(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0), (0, 3), (1, 2), (2, 1), (3, 0)\}$$Above (1,1) means (B=1,G=1 | Child=2). Therefore, it is calculated as:
$$\begin{aligned}&P(B=1, G=1|C=2)=\frac{P(B=1, G=1)}{P(C=2)}\\ &\begin{aligned}P(B=1, G=1)&=P(B=1, G=1|C=2) \cdot P(C=2)\\ &=\binom{2}{1}\left(\frac{1}{2}\right)^{1}\left(\frac{1}{2}\right)^{2-1}P(C=2)\end{aligned}\\ & \begin{cases}\text{C}:&\text{Child}\\\text{G} :& \text{Girl}\\\text{B} :& \text{Boy}\end{cases} \end{aligned}$$In the above case, both (B,G) and (G, B) cases are possible, so the permutation of each case must be considered as in the above calculation. The probability mass function that generalizes the above case to this problem is as follows.
$$\begin{aligned}&f(g)=\binom{C}{g}\left(\frac{1}{2}\right)^{C-g}\left(\frac{1}{2}\right)^{g}p(C)\\ & C: \text{number of children} = 0, 1, 2, 3\\ &g: \text{number of girls},\; g= 0, 1, \dots, C \end{aligned} $$Of course, replacing the number of girls (g) with the number of boys (b) gives the same result.
pc={0:0.15, 1:0.2, 2:0.35, 3:0.3}
re=pd.DataFrame()
for i in pc.keys():
    for j in range(i+1):
        re1=pd.DataFrame([[i, j, i-j, special.comb(i, j)*(1/2)**j*(1/2)**(i-j)*pc[i]]])
        re=pd.concat([re, re1])
re.columns=['C', 'G', 'B', 'P']
re
|   | C | G | B | P |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0.1500 |
| 0 | 1 | 0 | 1 | 0.1000 |
| ⁞ | ⁞ | ⁞ | ⁞ | ⁞ |
| 0 | 3 | 2 | 1 | 0.1125 |
| 0 | 3 | 3 | 0 | 0.0375 |
rePivot=re.pivot_table('P','G','B', aggfunc="sum", margins=True)
rePivot
B        0       1       2       3     All
G
0   0.1500  0.1000  0.0875  0.0375  0.3750
1   0.1000  0.1750  0.1125     NaN  0.3875
2   0.0875  0.1125     NaN     NaN  0.2000
3   0.0375     NaN     NaN     NaN  0.0375
All 0.3750  0.3875  0.2000  0.0375  1.0000

$$\begin{aligned}P(G=1)&=P(G=1, B=0)+P(G=1, B=1)\\&+P(G=1, B=2)+P(G=1, B=3)\end{aligned}$$
rePivot.iloc[1, :4].sum()
0.3875

$$P(B=1 | G=1)=\frac{P(B=1, G=1)}{P(G=1)}$$
rePivot.iloc[1,1]/rePivot.iloc[1, :4].sum()
0.4516129032258064
Example 10)
The joint probability density function of the random variables X and Y is $f(x, y)=\frac{12}{5}x(2-x-y)$ for $0<x<1,\; 0<y<1$ (the form used in the code below).
The marginal density of Y, f(y), is the integral of the joint density over x with y held fixed; dividing the joint density by it gives the conditional density of X given Y = y. In other words,
$$\int^1_0f(x, y)\, dx=\int^1_0 \frac{12}{5}x(2-x-y)\, dx$$

x, y=symbols('x y')
f=12/5*x*(2-x-y)
py=f.integrate((x, 0, 1))
py

$\quad \color{blue}{\displaystyle 1.6 -1.2y}$
px_y=f/py   # conditional density of X given Y=y
px_y

$\quad \color{blue}{\displaystyle \frac{2.4 x \left(- x - y + 2\right)}{1.6 - 1.2 y}}$
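As a check (not part of the original text), this conditional density integrates to 1 over x for any fixed y in the support:

simplify(integrate(px_y, (x, 0, 1)))   # 1, up to the floating-point coefficients above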