8 Appendix

8.1 Proofs

8.1.1 Theorem 1

8.1.1.1 Part 1

8.1.1.2 Part 2

8.1.1.3 Part 3

8.1.1.4 Part 4

The theorem Cesàro mean states that: If \(a_n \rightarrow a\) and \(b_n = \frac{1}{n}\sum_{i=1}^n a_i\), then \(b_n \rightarrow a\)

8.1.2 Theorem 2

8.1.2.1 Lower bound

Kraft’s inequality: \[ \sum_{i=1}^{n}|\mathcal{X}|^{-l_i} \leq 1 \]

Let \(\bar{L_k}\) be \(E[L_k]\)

\[\begin{align*} H(X) - \bar{L_k} &= \sum_{j=i}^n p_j\log_2 (1/p_j) - \sum_{j=i}^n p_jl_j \\ &= \sum_{j=i}^n (p_j\log_2 (1/p_j) - p_j\log_2(2^{l_i})) \\ &= \sum_{j=i}^n p_j\log_2 \frac{2^{-l_i}}{p_j} \end{align*}\]

knowing \(\ln{u} \leq u - 1\), or \(\log_2 u \leq (\log_2 e)(u-1)\)

when

\[\begin{align*} H(X) - \bar{L_k} &= \sum_{j=i}^n p_j\log_2 \frac{2^{-l_i}}{p_j} \\ &\leq (\log_2 e)\sum_{j=i}^n p_j(\frac{2^{-l_i}}{p_j} - 1) \\ &= (\log_2 e)\sum_{j=i}^n ({2^{-l_i}} - p_j) \\ &= (\log_2 e)\left(\sum_{j=i}^n {2^{-l_i}} - 1\right) \\ &\leq 0 \text{, due to Kraft equality} \end{align*}\]

Thus \(H(X) \leq \bar{L_k}\). This inequality is strict unless \(2^{-l_i} = p_j\)

\[\begin{align*} H(X) &\leq \bar{L_k} \\ H(X) &= H_1(X_1) = H_1 \\ nH_{\infty} &\leq nH_1 \text{ because } H_n \text{ is non-increasing} \\ nH_{\infty} &\leq \bar{L_k} \\ H_{\infty} &\leq \bar{L_k}/n \end{align*}\]

8.1.2.2 Lower bound

\[ l_j = \lceil -\log p_j \rceil \\ \]

\[ -\log p_j \leq l_j < - \log p_j +1 \\ \]

\[ \text{this satisfies Kraft inequality because the LHS is equivalent, taking RHS:} \\ \]

\[ \bar{L} = \sum_j p_j l_j < \sum_j p_j (- \log p_j +1) = H(X) + 1 \\ \]

\[ \bar{L}/n < H(X)/n + 1/n \]

When considering X as the block of variables as one variable with expanded alphabet, as n tends to infinity

\[ \bar{L}/n < H(X)/n + 1/n = H(X_1,...X_n)/n + 1/n = H_{\infty}(X) \] which holds for negligible \(\epsilon\)

\[ \bar{L}/n < H_{\infty}(X) + (1+\epsilon)/n \]

8.2 Files used

Files used in this lab, all downloaded from various corpuses on https://corpus.canterbury.ac.nz/
Name Source Description
world192 large The CIA world fact book
pi misc The first million digits of pi
random artificl 100,000 characters, randomly selected from [a-z|A-Z|0-9|!| ] (alphabet size 64)
aaa artificl The letter ‘a,’ repeated 100,000 times.
alice29.txt Official Canterbury English text
asyoulik.txt Official Canterbury Shakespeare
lcet10.txt Official Canterbury Technical writing
plrabn12.txt Official Canterbury Poetry

8.3 Files submitted

  • Rmarkdown document as the source code for this report
  • .py file for implementation of new coding schemes and experiments
  • files/ for files downloaded from various copurs