Quick Review of Probabilities & Statistics
Basic Statistics
- Skewness: \(\mathbb{S}[X]=\frac{1}{\sigma^3}\mathbb{E}[(X-\mu)^3]\)
- a measure of the asymmetry of the distribution
- skew of a normal distribution is always 0
- skew = 0 does not mean the variable must be normally distributed
- positive skew: right tail is longer than the left
- negative skew: left tail is longer than the right
- Kurtosis: \(\mathbb{K}[X]=\frac{1}{\sigma^4}\mathbb{E}[(X-\mu)^4]\)
- a measure of the heavy-tailedness of the distribution
- kurt of a normal distribution is always 3
- kurt = 3 does not mean the variable must be normally distributed
- kurt > 3: the tails are heavier (fatter) than those of a normal distribution; outcomes far from the mean are more likely than under a normal
- kurt < 3: the tails are thinner than those of a normal distribution; outcomes far from the mean are less likely than under a normal (both definitions are checked numerically after this list)
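A quick numerical check of both definitions (a minimal sketch with numpy/scipy; note `scipy.stats.kurtosis` returns *excess* kurtosis by default, so `fisher=False` is passed to get the raw fourth-moment ratio above):

```python
import numpy as np
from scipy.stats import skew, kurtosis, t

rng = np.random.default_rng(0)

# Normal draws: skewness ~ 0 and kurtosis ~ 3
x = rng.standard_normal(1_000_000)
print(skew(x), kurtosis(x, fisher=False))      # ~0.0, ~3.0

# Student-t with 5 dof: symmetric (skew ~ 0) but heavy-tailed (kurt > 3)
y = t.rvs(df=5, size=1_000_000, random_state=1)
print(skew(y), kurtosis(y, fisher=False))      # ~0.0, well above 3
```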
Law of Large Numbers:
For i.i.d. \(X_1, \dots, X_n\) with common mean \(\mu\): \(\lim_{n\rightarrow\infty}\frac{1}{n}\sum_{k=1}^n X_k=\mu\) (almost surely, by the strong law)
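A minimal simulation sketch of the statement (exponential draws with mean 2 are an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=100_000)   # i.i.d. with mean mu = 2

# Running mean (1/n) * sum_{k=1..n} X_k for n = 1..N
running_mean = np.cumsum(x) / np.arange(1, len(x) + 1)
print(running_mean[[9, 999, 99_999]])          # drifts toward 2.0
```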
Central Limit Theorem:
For i.i.d. \(X_1, \dots, X_n\), each with expected value \(\mu\) and variance \(\sigma^2 < \infty\), as \(n \rightarrow \infty\): \(\sqrt{n}(\overline{X}-\mu) \xrightarrow{d} \mathcal{N}(0, \sigma^2)\), where \(\overline{X}=\frac{1}{n}\sum_{i=1}^n X_i\)
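A simulation sketch using Uniform(0, 1) draws, for which \(\mu = 1/2\) and \(\sigma^2 = 1/12\):

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 500, 20_000
mu, sigma = 0.5, np.sqrt(1 / 12)               # mean/std of Uniform(0, 1)

# sqrt(n) * (sample mean - mu) over many independent experiments
x = rng.uniform(size=(trials, n))
z = np.sqrt(n) * (x.mean(axis=1) - mu)
print(z.mean(), z.std())                       # ~0.0, ~sigma = 0.2887
```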
CLT (independent but not identically distributed):
If \(X_1, \dots, X_n\) are independent but not identically distributed, with expected values \(\mu_1, \dots, \mu_n\) and variances \(\sigma_1^2, \dots, \sigma_n^2\), let \(s_n^2=\sum_{i=1}^n\sigma_i^2\); then, provided a regularity condition such as Lindeberg's holds, \(\frac{1}{s_n}\sum_{i=1}^{n}(X_i-\mu_i) \xrightarrow{d} \mathcal{N}(0, 1)\)
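A sketch with independent but differently-scaled uniform variables (bounded, so Lindeberg's condition holds as \(s_n\rightarrow\infty\)):

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials = 1_000, 10_000

# X_i ~ Uniform(-a_i, a_i): mu_i = 0, sigma_i^2 = a_i^2 / 3
a = np.linspace(0.5, 3.0, n)
s_n = np.sqrt(np.sum(a ** 2 / 3))

x = rng.uniform(low=-a, high=a, size=(trials, n))
z = x.sum(axis=1) / s_n                        # (1/s_n) * sum (X_i - mu_i)
print(z.mean(), z.std())                       # ~0.0, ~1.0
```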
Sample estimates
- Mean: \(\overline{\mu}=\frac{1}{n}\sum_{i=1}^nX_i\)
- Variance: \(\overline{\sigma}^2=\frac{1}{n-1}\sum_{i=1}^n(X_i-\overline{\mu})^2\)
- Skew: \(\overline{s}=\frac{1}{n\overline{\sigma}^3}\sum_{i=1}^n(X_i-\overline{\mu})^3\)
- Kurt: \(\overline{k}=\frac{1}{n\overline{\sigma}^4}\sum_{i=1}^n(X_i-\overline{\mu})^4\) (all four estimators are implemented in the sketch after this list)
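A direct implementation of these estimators (a minimal sketch; the variance uses the \(n-1\) correction while skew/kurt use the plain \(1/n\) moments, matching the formulas above):

```python
import numpy as np

def sample_moments(x):
    """Sample mean, variance (n-1), skewness, and kurtosis."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mean = x.mean()
    var = ((x - mean) ** 2).sum() / (n - 1)
    std = np.sqrt(var)
    skew = ((x - mean) ** 3).sum() / (n * std ** 3)
    kurt = ((x - mean) ** 4).sum() / (n * std ** 4)
    return mean, var, skew, kurt

rng = np.random.default_rng(4)
print(sample_moments(rng.standard_normal(1_000_000)))  # ~(0, 1, 0, 3)
```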
Lognormal
- \(\ln(X) \sim \mathcal{N}(\mu, \sigma^2)\) then \(X \sim \Lambda(\mu, \sigma^2)\); in the original scale its mean is \(\mathbb{E}[X]=e^{\mu+\sigma^2/2}\) and its variance is \(\mathbb{V}(X)=(e^{\sigma^2}-1)e^{2\mu+\sigma^2}\)
- if \(X\sim \mathcal{N}(\mu_x, \sigma_x^2)\) then:
- \[a + X \sim \mathcal{N}(a+\mu_x, \sigma_x^2)\]
- \[bX \sim \mathcal{N}(b\mu_x, b^2\sigma_x^2)\]
- \[X_1+X_2 \sim \mathcal{N}(\mu_1+\mu_2, \sigma_1^2+\sigma_2^2+2\sigma_{1,2})\] where \(\sigma_{1,2}=\mathbb{C}ov(X_1, X_2)\) and \((X_1, X_2)\) are jointly normal
- if \(Y\sim \Lambda(\mu_y, \sigma_y^2)\) then:
- \[bY \sim \Lambda(\mu_y + \ln(b), \sigma_y^2)\] for \(b > 0\)
- \[\frac{1}{Y} \sim \Lambda(-\mu_y, \sigma_y^2)\]
- \[Y^b \sim \Lambda(b\mu_y, b^2\sigma_y^2)\]
- \(Y_1+Y_2\) has no closed-form distribution, but it can be approximated by moment matching: let \(S=Y_1+Y_2\); then \(\mathbb{E}[S]=\mathbb{E}[Y_1]+\mathbb{E}[Y_2]\) and \(\mathbb{V}(S)=\mathbb{V}(Y_1)+\mathbb{V}(Y_2)+2\mathbb{C}ov(Y_1, Y_2)\)
- Normal: \(S \sim\mathcal{N}(\mathbb{E}[S], \mathbb{V}(S))\). This approximation is poor because \(Y_1+Y_2\) is positively skewed; when the standard deviations of \(Y_1, Y_2\) are small the approximation is acceptable, otherwise it underestimates tail risk.
- Lognormal: \(\sigma_S^2=\ln\left(\frac{\mathbb{V}(S)}{\mathbb{E}[S]^2}+1\right)\) and \(\mu_S=\ln(\mathbb{E}[S])-\frac{\sigma_S^2}{2}\), so that \(S \sim \Lambda(\mu_S, \sigma_S^2)\) (see the sketch after this list).
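A sketch of the lognormal moment-matching step (often called a Fenton–Wilkinson approximation; the lognormal mean/variance formulas are the ones stated earlier, and independence of \(Y_1, Y_2\) is assumed here so the covariance term is zero):

```python
import numpy as np

def lognormal_moments(mu, sig2):
    """Mean and variance of X where ln(X) ~ N(mu, sig2)."""
    m = np.exp(mu + sig2 / 2)
    v = (np.exp(sig2) - 1) * np.exp(2 * mu + sig2)
    return m, v

def match_lognormal(mean_s, var_s):
    """Choose (mu_S, sig2_S) so Lambda(mu_S, sig2_S) matches a mean/variance."""
    sig2_s = np.log(var_s / mean_s ** 2 + 1)
    mu_s = np.log(mean_s) - sig2_s / 2
    return mu_s, sig2_s

# Two independent lognormals (independence assumed, so Cov(Y1, Y2) = 0)
m1, v1 = lognormal_moments(0.0, 0.25)
m2, v2 = lognormal_moments(0.3, 0.16)
mu_s, sig2_s = match_lognormal(m1 + m2, v1 + v2)

# Monte Carlo check: the fitted lognormal reproduces E[S] and V(S)
rng = np.random.default_rng(5)
s = rng.lognormal(0.0, 0.5, 1_000_000) + rng.lognormal(0.3, 0.4, 1_000_000)
fit = rng.lognormal(mu_s, np.sqrt(sig2_s), 1_000_000)
print(s.mean(), fit.mean())   # both ~ E[S]
print(s.var(), fit.var())     # both ~ V(S)
```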
Conditional Distribution
\(\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim \mathcal{N} \left( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} , \begin{pmatrix} \Sigma_{1,1} & \Sigma_{1,2} \\ \Sigma_{2,1} & \Sigma_{2,2} \end{pmatrix} \right)\)
the distribution of \(X_1\) conditional on \(X_2 = x_2\) is
\[(X_1 | X_2=x_2) \sim \mathcal{N} \left( \mu_1 + \Sigma_{1,2}\Sigma_{2,2}^{-1}(x_2 - \mu_2),\ \Sigma_{1,1} - \Sigma_{1,2}\Sigma_{2,2}^{-1}\Sigma_{2,1} \right)\]
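A direct implementation of this formula for a bivariate partition (a minimal numpy sketch; the numbers are arbitrary):

```python
import numpy as np

# Joint: (X1, X2) ~ N(mu, Sigma), partitioned as above
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

x2 = 3.0                                       # observed value of X2
S11, S12, S21, S22 = Sigma[0, 0], Sigma[0, 1], Sigma[1, 0], Sigma[1, 1]

cond_mean = mu[0] + S12 / S22 * (x2 - mu[1])   # mu_1 + S12 S22^{-1} (x2 - mu_2)
cond_var = S11 - S12 / S22 * S21               # Schur complement
print(cond_mean, cond_var)                     # 1.8, 1.36
```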
Positive Semi-Definiteness
- \(x^TMx \geq 0\) for all \(x\), where \(M\) is a symmetric real matrix
- All eigenvalues of \(M\) are non-negative
- A covariance matrix \(C=\sum_{k=1}^n y_k y_k^T\) (with \(y_k\) the centered observation vectors) is p.s.d. (checked numerically below)
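A quick numerical confirmation via the eigenvalue criterion (a sketch with arbitrary \(y_k\)):

```python
import numpy as np

rng = np.random.default_rng(6)
Y = rng.standard_normal((100, 3))              # rows are the y_k

C = sum(np.outer(y, y) for y in Y)             # C = sum_k y_k y_k^T
print(np.linalg.eigvalsh(C))                   # all eigenvalues >= 0
```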