8 Appendix
8.1 Proofs
8.1.1 Theorem 1
8.1.1.1 Part 1
8.1.1.2 Part 2
8.1.1.3 Part 3
8.1.1.4 Part 4
The theorem Cesàro mean states that: If \(a_n \rightarrow a\) and \(b_n = \frac{1}{n}\sum_{i=1}^n a_i\), then \(b_n \rightarrow a\)
8.1.2 Theorem 2
8.1.2.1 Lower bound
Kraft’s inequality: \[ \sum_{i=1}^{n}|\mathcal{X}|^{-l_i} \leq 1 \]
Let \(\bar{L_k}\) be \(E[L_k]\)
\[\begin{align*} H(X) - \bar{L_k} &= \sum_{j=i}^n p_j\log_2 (1/p_j) - \sum_{j=i}^n p_jl_j \\ &= \sum_{j=i}^n (p_j\log_2 (1/p_j) - p_j\log_2(2^{l_i})) \\ &= \sum_{j=i}^n p_j\log_2 \frac{2^{-l_i}}{p_j} \end{align*}\]
knowing \(\ln{u} \leq u - 1\), or \(\log_2 u \leq (\log_2 e)(u-1)\)
when
\[\begin{align*} H(X) - \bar{L_k} &= \sum_{j=i}^n p_j\log_2 \frac{2^{-l_i}}{p_j} \\ &\leq (\log_2 e)\sum_{j=i}^n p_j(\frac{2^{-l_i}}{p_j} - 1) \\ &= (\log_2 e)\sum_{j=i}^n ({2^{-l_i}} - p_j) \\ &= (\log_2 e)\left(\sum_{j=i}^n {2^{-l_i}} - 1\right) \\ &\leq 0 \text{, due to Kraft equality} \end{align*}\]
Thus \(H(X) \leq \bar{L_k}\). This inequality is strict unless \(2^{-l_i} = p_j\)
\[\begin{align*} H(X) &\leq \bar{L_k} \\ H(X) &= H_1(X_1) = H_1 \\ nH_{\infty} &\leq nH_1 \text{ because } H_n \text{ is non-increasing} \\ nH_{\infty} &\leq \bar{L_k} \\ H_{\infty} &\leq \bar{L_k}/n \end{align*}\]
8.1.2.2 Lower bound
\[ l_j = \lceil -\log p_j \rceil \\ \]
\[ -\log p_j \leq l_j < - \log p_j +1 \\ \]
\[ \text{this satisfies Kraft inequality because the LHS is equivalent, taking RHS:} \\ \]
\[ \bar{L} = \sum_j p_j l_j < \sum_j p_j (- \log p_j +1) = H(X) + 1 \\ \]
\[ \bar{L}/n < H(X)/n + 1/n \]
When considering X as the block of variables as one variable with expanded alphabet, as n tends to infinity
\[ \bar{L}/n < H(X)/n + 1/n = H(X_1,...X_n)/n + 1/n = H_{\infty}(X) \] which holds for negligible \(\epsilon\)
\[ \bar{L}/n < H_{\infty}(X) + (1+\epsilon)/n \]
8.2 Files used
Name | Source | Description |
---|---|---|
world192 | large | The CIA world fact book |
pi | misc | The first million digits of pi |
random | artificl | 100,000 characters, randomly selected from [a-z|A-Z|0-9|!| ] (alphabet size 64) |
aaa | artificl | The letter ‘a,’ repeated 100,000 times. |
alice29.txt | Official Canterbury | English text |
asyoulik.txt | Official Canterbury | Shakespeare |
lcet10.txt | Official Canterbury | Technical writing |
plrabn12.txt | Official Canterbury | Poetry |
8.3 Files submitted
Rmarkdown
document as the source code for this report.py
file for implementation of new coding schemes and experimentsfiles/
for files downloaded from various copurs