File Compressor and Decompressor written in Go
As part of my exploration of the Go language, I decided to write a simple file compressor, since it is the kind of project in which buffered reading, writing and seeking are staples (handled carelessly, the data may not be restored to its original state). Plus, I had to work with Go's pointers, errors and conditional structures.
I also played around with this neat progress bar package.
The compressor uses Huffman coding. This entropy encoding technique consists of building a binary tree representation of the input data that can be stored and rebuilt later on.
- A list containing the frequency of each symbol is built (for files, the symbols are bytes).
- The list is then sorted by frequency. A min-heap works well for this.
- The following steps are repeated until only one node, the root of the tree, remains in the list:
  - Take the two nodes with the lowest frequencies out of the list.
  - Create a tree with these two nodes as children.
  - Give the parent node a frequency equal to the sum of its two children's frequencies.
  - Add the parent node to the list, which must remain ordered after the insertion.
  - The two child nodes are no longer in the list; they are reachable only through their parent.
- A code word is then assigned to each symbol based on its path from the root.
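
The steps above can be sketched in Go roughly as follows. This is a minimal illustration and not the code used by this project: the names `node`, `minHeap`, `buildTree` and `assignCodes` are hypothetical, and it assumes that a left edge contributes a `0` and a right edge a `1` to the code word. The standard `container/heap` package keeps the list ordered.

```go
package main

import (
	"container/heap"
	"fmt"
)

// node is one vertex of the tree; only leaves carry a meaningful symbol.
type node struct {
	sym         byte
	freq        int
	left, right *node
}

// minHeap orders nodes by ascending frequency (implements heap.Interface).
type minHeap []*node

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].freq < h[j].freq }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(*node)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	n := old[len(old)-1]
	*h = old[:len(old)-1]
	return n
}

// buildTree counts byte frequencies, then repeatedly merges the two
// least frequent nodes until a single root remains.
func buildTree(data []byte) *node {
	var freq [256]int
	for _, b := range data {
		freq[b]++
	}
	h := &minHeap{}
	for s, f := range freq {
		if f > 0 {
			*h = append(*h, &node{sym: byte(s), freq: f})
		}
	}
	if h.Len() == 0 {
		return nil // empty input: no tree
	}
	heap.Init(h)
	for h.Len() > 1 {
		a := heap.Pop(h).(*node)
		b := heap.Pop(h).(*node)
		heap.Push(h, &node{freq: a.freq + b.freq, left: a, right: b})
	}
	return heap.Pop(h).(*node)
}

// assignCodes walks the tree and records the root-to-leaf path of each symbol.
// Assumption: left edge = '0', right edge = '1'.
func assignCodes(n *node, prefix string, codes map[byte]string) {
	if n == nil {
		return
	}
	if n.left == nil && n.right == nil {
		if prefix == "" {
			prefix = "0" // input with a single distinct symbol
		}
		codes[n.sym] = prefix
		return
	}
	assignCodes(n.left, prefix+"0", codes)
	assignCodes(n.right, prefix+"1", codes)
}

func main() {
	codes := map[byte]string{}
	assignCodes(buildTree([]byte("abracadabra")), "", codes)
	for sym, code := range codes {
		fmt.Printf("%q -> %s\n", sym, code)
	}
}
```

For the input `abracadabra`, the most frequent byte `a` ends up with a 1-bit code and the whole input encodes to 23 bits. The exact code words of the other symbols depend on how ties between equal frequencies are broken, so they may differ between implementations.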
$ go get
$ go build -o app.bin
$ ./app.bin -c uncompressed compressed
$ ./app.bin -x compressed uncompressed