Office hours (room 339):

- Monday 18-19
- Wednesday 12-13

Address:

Institute of Computer Science
University of Wrocław

ul. Joliot-Curie 15

PL-50-383 Wrocław

ANNOUNCEMENTS

4th October, 2012

The first meeting:

The first meeting will take place on Thursday, 11th October, 2012.

3rd January, 2013

The exam:

The exam at the end of semester will take place on Thursday, 24th January, 2013.

The data compression field has always been an important part of computer science, and it is becoming increasingly important today. Although computers keep getting faster and data storage keeps getting cheaper and more efficient, the vast storage and transmission requirements of modern data make at least some measure of compression a necessity. In many applications the question is no longer whether to compress data, but which compression method to apply.

Data compression is the process of converting an input data stream (the source stream or the original raw data) into another data stream (the output, the bitstream, or the compressed stream) that has a smaller size. A stream can be a file, a buffer in memory, or individual bits sent on a communications channel.

Compression can be either lossless or lossy. Lossless compression reduces the number of bits by identifying and eliminating statistical redundancy; no information is lost. Lossy compression reduces the number of bits by identifying marginally important information and removing it. The process of reducing the size of a data file is popularly referred to as data compression, although its formal name is source coding (coding done at the source of the data, before it is stored or transmitted).

Compression is useful because it reduces resource usage, such as data storage space or transmission capacity. Because compressed data must be decompressed before it can be used, this extra processing imposes computational or other costs, so the situation is far from a free lunch.
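A minimal illustration of the lossless idea, using run-length encoding (an example scheme chosen for simplicity, not one of the course's algorithms): the decoder reconstructs the original stream exactly, so no information is lost.

```python
def rle_encode(s):
    """Replace each run of equal characters with a (character, count) pair."""
    runs = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        runs.append((s[i], j - i))
        i = j
    return runs

def rle_decode(runs):
    """Expand each (character, count) pair back into a run."""
    return "".join(ch * count for ch, count in runs)

data = "aaaabbbcccccd"
packed = rle_encode(data)          # [('a', 4), ('b', 3), ('c', 5), ('d', 1)]
assert rle_decode(packed) == data  # lossless: the round trip is exact
```

Here 13 input symbols become 4 (symbol, count) pairs; whether that is a saving depends on how the counts are stored, which is exactly what the compression ratio measures.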

Students should know:

- a programming language such as *C++* or *Java*;
- algorithms and data structures.

Main handbooks (password: handbook):

- [1] Khalid Sayood: *Introduction to Data Compression. Third Edition.* Elsevier, 2006.
- [2] Adam Drozdek: *Elements of Data Compression.* Brooks/Cole, 2002.
- [3] Peter Wayner: *Data Compression for Real Programmers.* Morgan Kaufmann, 1999.

Other books:

- [4] David Salomon, Giovanni Motta: *Handbook of Data Compression. Fifth Edition.* Springer, 2010.
- [5] David Salomon: *A Concise Introduction to Data Compression.* Springer, 2008.
- [6] Ida Mengyi Pu: *Fundamental Data Compression.* Elsevier, 2006.

Links:

Meetings:

- Thursday 12-14, room 237.

Read the first section from the handbook [1]:

- 1.1. Compression Techniques
- 1.2. Modeling and Coding

1. Introduction

Questions and exercises:

- An example of simple data compression: *Morse code* and *Braille code* (find some information about these on the internet).
- How does a *compression algorithm* work?
- What is the main difference between *lossless and lossy compression*?
- When do we need lossless compression, and when can we use lossy compression?
- How do we calculate the *compression ratio*? Give an example.
- What is the *rate*? How do we calculate the rate?
- What is the *distortion*?
- What are *modeling* and *coding* in the context of data compression?
- What is the *residual*?
- Consider the following sequence of numbers:

  x_{n} = (3, 2, 3, 3, 4, 3, 5, 5, 4, 5, 6, 5, 7, 6, 7, 7) for n = 0, 1, 2, ...

  If we were to store the binary representation of these numbers, we would need 4 bits per sample. Characterize the structure of the data by an equation of the form x̂_{n} = ⌊a/b·n + c⌋ (use 2 bits for each of the coefficients a, b, and c). Calculate the difference between the data and the model (each residual x_{n} − x̂_{n} should be 0 or 1). We can obtain compression by storing the parameters of the model and the residual sequence. Perform the encoding. Calculate the compression ratio.
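The exercise above can be checked numerically. The coefficients a = 1, b = 3, c = 2 below are one candidate found by inspection, not necessarily the intended answer:

```python
x = [3, 2, 3, 3, 4, 3, 5, 5, 4, 5, 6, 5, 7, 6, 7, 7]
a, b, c = 1, 3, 2                       # candidate coefficients, 2 bits each

model = [a * n // b + c for n in range(len(x))]   # floor(a/b * n) + c
residual = [xn - mn for xn, mn in zip(x, model)]
assert all(r in (0, 1) for r in residual)         # 1 bit per residual suffices

raw_bits = 4 * len(x)                   # 64: 4 bits per sample, uncompressed
compressed_bits = 3 * 2 + len(x)        # 22: three 2-bit coefficients + residuals
```

Storing three 2-bit coefficients plus one residual bit per sample takes 22 bits instead of 64, a compression ratio of about 2.9:1.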
- Consider the following sequence of numbers:

  x_{n} = (24, 21, 20, 22, 25, 23, 26, 27, 29, 28, 25, 24, 21, 22, 19, 18) for n = 0, 1, 2, ...

  If we were to store the binary representation of these numbers, we would need 5 bits per sample. Note that each value is close to the previous value in the sequence (each difference x_{n} − x_{n−1} is between −3 and 3). We can obtain compression by storing the first value and then, in place of each subsequent value, the difference between it and the previous value. Perform the encoding. Calculate the compression ratio.
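The difference encoding described in the exercise above can be sketched as:

```python
x = [24, 21, 20, 22, 25, 23, 26, 27, 29, 28, 25, 24, 21, 22, 19, 18]
diffs = [x[i] - x[i - 1] for i in range(1, len(x))]
assert all(-3 <= d <= 3 for d in diffs)   # 7 possible values: 3 bits suffice

raw_bits = 5 * len(x)                     # 80: 5 bits per sample, uncompressed
compressed_bits = 5 + 3 * len(diffs)      # 50: first value + 3 bits per diff

# Decoding is a running sum, so the scheme is lossless.
decoded = [x[0]]
for d in diffs:
    decoded.append(decoded[-1] + d)
assert decoded == x
```

Storing the 5-bit first value plus fifteen 3-bit differences takes 50 bits instead of 80, a compression ratio of 1.6:1.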

Read the second section from the handbook [1]:

- 2.2 A Brief Introduction to Information Theory
- 2.3 Models
- 2.4 Coding

2. Mathematical preliminaries for Lossless Compression

Questions and exercises:
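For section 2.2, the first-order entropy of a source can be estimated from the relative frequencies of its symbols; a minimal sketch:

```python
from collections import Counter
from math import log2

def entropy(data):
    """First-order entropy in bits per symbol, estimated from the relative
    frequencies of the symbols actually seen in `data`."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * log2(c / n) for c in counts.values())

# A source with P(a) = P(b) = 1/2 needs exactly 1 bit per symbol;
# a constant source needs 0 bits per symbol.
assert entropy("abab") == 1.0
```

No lossless code can use fewer bits per symbol on average than the entropy, which is why it appears as the benchmark in the coding chapters that follow.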

Read the second section from the handbook [1]:

- 2.3 Models
- 2.4 Coding

2. Mathematical preliminaries for Lossless Compression

Questions and exercises:

Read the third section from the handbook [1]:

- 3.2 The Huffman Coding Algorithm
- 3.3 Nonbinary Huffman Codes

3. Huffman Coding

Questions and exercises:

Read the second section from the handbook [2]:

- 2.1 Shannon Coding
- 2.2 Shannon-Fano Coding

2. Shannon-Fano Coding

Questions and exercises:
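A sketch of the Shannon-Fano construction: sort symbols by frequency, split the group into two parts of nearly equal total weight, prepend 0 to one part and 1 to the other, and recurse. The split rule below (grow the first part until it holds at least half the weight) is one common choice; tie-breaking details vary between texts, so individual codewords may differ from the handbook's.

```python
def shannon_fano(freqs):
    """Shannon-Fano codes for a {symbol: frequency} dictionary."""
    symbols = sorted(freqs, key=freqs.get, reverse=True)
    codes = {s: "" for s in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(freqs[s] for s in group)
        acc, cut = 0, 1
        for i in range(1, len(group)):
            acc += freqs[group[i - 1]]
            cut = i
            if 2 * acc >= total:      # first part holds at least half the weight
                break
        for s in group[:cut]:
            codes[s] += "0"
        for s in group[cut:]:
            codes[s] += "1"
        split(group[:cut])
        split(group[cut:])

    split(symbols)
    return codes

codes = shannon_fano({"a": 15, "b": 7, "c": 6, "d": 6, "e": 5})
# one possible result: a -> 00, b -> 01, e -> 11, c -> 100, d -> 101
```

The result is always a prefix code, but unlike Huffman coding it is not guaranteed to be optimal.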

Read the third section from the handbook [1]:

- 3.5 Golomb Codes
- 3.6 Rice Codes
- 3.7 Tunstall Codes

3. Huffman Coding

Questions and exercises:
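A sketch of the Golomb code construction from section 3.5. The exact 0/1 roles are an assumption here, since bit conventions vary between texts; with m a power of two the remainder part is plain binary and the code reduces to a Rice code.

```python
def golomb_encode(n, m):
    """Golomb code of integer n >= 0 with parameter m >= 1: the quotient n // m
    in unary (q ones, then a terminating 0), then the remainder n % m in
    truncated binary."""
    q, r = divmod(n, m)
    bits = "1" * q + "0"
    k = (m - 1).bit_length()          # ceil(log2 m); 0 when m == 1
    t = (1 << k) - m                  # number of shorter, (k-1)-bit remainders
    if r < t:
        bits += format(r, f"0{k - 1}b")       # short remainder codeword
    elif k > 0:
        bits += format(r + t, f"0{k}b")       # long remainder codeword
    return bits

# m = 4 (a Rice code): 9 = 2*4 + 1 -> unary "110", remainder "01"
assert golomb_encode(9, 4) == "11001"
# m = 3: truncated binary remainders (0 -> "0", 1 -> "10", 2 -> "11")
assert golomb_encode(4, 3) == "1010"
```

Golomb codes are effective when the data follow a geometric distribution, such as run lengths of rare symbols.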

Implement *Huffman Coding*.
Write a program for encoding and decoding text files using Huffman coding.
Test your program on the lrrh.txt file.
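A minimal sketch of the core of the assignment (tree construction with a min-heap and codeword assignment; file I/O and bit packing are left to the student). Codeword bits depend on tie-breaking, but the code lengths are optimal either way.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Huffman codes for the symbols of `text`, built with a min-heap of
    frequencies; the two lightest subtrees are merged until one remains."""
    freq = Counter(text)
    if len(freq) == 1:                            # degenerate one-symbol input
        return {next(iter(freq)): "0"}
    # heap items: (frequency, unique tiebreak, {symbol: code-so-far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def huffman_decode(bits, codes):
    """Walk the bit string, emitting a symbol whenever a codeword matches;
    this works because Huffman codes are prefix-free."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

codes = huffman_codes("abracadabra")
encoded = "".join(codes[ch] for ch in "abracadabra")
assert huffman_decode(encoded, codes) == "abracadabra"
```

For "abracadabra" this yields a 1-bit code for the most frequent symbol and 23 bits in total, versus 33 bits at 3 bits per symbol for a 5-symbol alphabet.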

Read the fifth section from the handbook [1] and the handbook [2]:

- 5.4 Adaptive Dictionary
- 5.1 The LZ77 Technique

5. Dictionary Techniques [1]

5. Dictionary Techniques [2]

Questions and exercises:

Read the fifth section from the handbook [1] and the handbook [2]:

- 5.4 Adaptive Dictionary
- 5.2 The LZ78 Technique

5. Dictionary Techniques [1]

5. Dictionary Techniques [2]

Questions and exercises:

Read the fifth section from the handbook [2]:

- 5.1.1 The LZSS Technique
- 5.2.1 The LZW Technique

5. Dictionary Techniques [2]

Questions and exercises:

Implement the *LZSS* or *LZW* coding technique.
Write a program for encoding and decoding text files using LZSS or LZW coding.
Test your program on the lrrh.txt file.
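A minimal sketch of the LZW variant (dictionary initialized with the 256 single-byte strings, as in the usual textbook presentation; codes are emitted as plain integers, without bit packing):

```python
def lzw_encode(text):
    """LZW: grow a dictionary of seen strings; emit the index of the longest
    match, then add match + next character as a new entry."""
    table = {chr(i): i for i in range(256)}
    w, out = "", []
    for ch in text:
        if w + ch in table:
            w += ch
        else:
            out.append(table[w])
            table[w + ch] = len(table)
            w = ch
    if w:
        out.append(table[w])
    return out

def lzw_decode(codes):
    """Rebuild the same dictionary on the fly; the decoder runs one step
    behind the encoder, which causes the special KwKwK case below."""
    table = {i: chr(i) for i in range(256)}
    w = table[codes[0]]
    out = [w]
    for k in codes[1:]:
        entry = table[k] if k in table else w + w[0]   # the KwKwK case
        out.append(entry)
        table[len(table)] = w + entry[0]
        w = entry
    return "".join(out)

msg = "abababab"
assert lzw_decode(lzw_encode(msg)) == msg
```

Note that only the integer codes are transmitted; both sides reconstruct the dictionary identically, which is what makes the scheme adaptive.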

Read the fifth section from the handbook [3]:

- 5 Grammar Compression

5. Grammar Compression [3]

Questions and exercises:
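Grammar-based compression can be illustrated with a byte-pair-style scheme in the spirit of Re-Pair: repeatedly replace the most frequent adjacent pair with a fresh nonterminal and record the rule, so the output plus the rules form a small grammar that generates exactly the input. This is an illustrative sketch, not the algorithm from the handbook.

```python
from collections import Counter

def bpe_compress(seq):
    """Repeatedly replace the most frequent adjacent pair with a new
    nonterminal Rk; return the shrunken sequence and the grammar rules."""
    seq = list(seq)
    rules = {}
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:                 # no pair worth a rule
            break
        new = f"R{next_id}"
        next_id += 1
        rules[new] = pair
        out, i = [], 0
        while i < len(seq):           # greedy left-to-right replacement
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(new)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

def bpe_expand(seq, rules):
    """Recursively expand nonterminals back into the original symbols."""
    out = []
    for s in seq:
        if s in rules:
            out.extend(bpe_expand(rules[s], rules))
        else:
            out.append(s)
    return out
```

On "abababab" this produces the grammar R0 -> ab, R1 -> R0 R0 and the two-symbol sequence R1 R1, a hierarchical description of the repetition in the input.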

Read the fourth section from the handbook [1] and the handbook [2]:

- 4.2 Introduction
- 4.3 Coding a Sequence
- 4.4 Generating a Binary Code
- 4.1 Implementation of Arithmetic Coding
- 4.1.1 Integer Implementation

4. Arithmetic Coding [1]

4. Arithmetic Coding [2]

Questions and exercises:
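The core of arithmetic coding, before any of the integer implementation details of section 4.1, is the narrowing of an interval: each symbol shrinks [low, high) in proportion to its probability. A sketch using exact fractions (the probabilities below are just an example), so no rounding issues arise:

```python
from fractions import Fraction

def arithmetic_interval(message, probs):
    """Return the final subinterval [low, high) of [0, 1) for `message`;
    any number inside it identifies the message to the decoder."""
    # cumulative distribution: symbol -> (cum_low, cum_high)
    cum, acc = {}, Fraction(0)
    for s, p in probs.items():
        cum[s] = (acc, acc + p)
        acc += p
    low, high = Fraction(0), Fraction(1)
    for s in message:
        width = high - low
        c_lo, c_hi = cum[s]
        low, high = low + width * c_lo, low + width * c_hi
    return low, high

probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, high = arithmetic_interval("aab", probs)   # the interval [1/8, 3/16)
```

The width of the final interval equals the product of the symbol probabilities, so roughly −log2(width) bits suffice to name a point inside it; the integer implementation exists only to avoid the unbounded precision used here.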