Adaptive and Contextual Compression
30/11/2021
1 Introduction
Compression algorithms are a crucial cornerstone for our information age. The amount of data transmitted have been growing exponentially, and our requirement for network rate has been mounting. Better compression means lower cost of transmitting data and it can potentially democratize network usage. Currently, only 4% of the world population live on unlimited data plans, meaning that the quality of compression would influence the cost of living for the rest 96%.
In the short lab, the practical and theoretical aspects of compression algorithms were investigated. On the practical side, variable length codes like Shannon-Fano and Huffman coding as well as stream codes like arithmetic coding were implemented through cam(un)zip
for compression of computer files. On the theoretical side of variable codes, trees and code table structures were studied and Shannon’s classic 1947 paper was perused to understand its prefix-free property[1]. A relationship between the bits output from the compression algorithms and the bytes required for writing on computer was surveyed. It was found that arithmetic coding has the best compression ratio closet to the source entropy (\(H = 4.45\)), followed by Huffman and Shannon-Fano. On the side of corruptibility, Huffman coding limits the propagation of source error, while Arithmetic coding could not.
In the extended lab, the theory supporting the adaptive and contextual coding were studied and used to support the implmentation of these coding schemes. The results of the compressors were compared and further modifications for practical implementation were suggested. The implementation focuses on arithmetic coding algorithms but theories of adaptive Huffman coding was also discussed.