Lossless Source Coding
Prof. Bikash Kumar Dey
Department of Electrical Engineering
IIT Bombay
January 24, 2019
Prof. B. K. Dey (IIT Bombay) Lossless Source Coding January 24, 2019 1 / 52
Outline
1 Introduction: Source Coding
2 Lossless Source Coding
3 Huffman Code
4 Kraft Inequality and Optimal Codes
5 Lempel-Ziv Code
6 Arithmetic Code
Source Coding
Objective: To represent source at a lower rate.
[Figure: a typical communication system; the source output x(t) or x[n] passes through a channel and reaches the destination as y(t) or y[n]]
Why source coding?
Source Coding
x(t)/x[n] may have redundancy or undesired/unimportant extra information
Example: Telephone-quality speech can be compressed to as little as 1-2 kbps.
Source Coding
x(t)/x[n] may have redundancy or undesired/unimportant extra information
Example 1: a band-limited continuous-time signal of bandwidth W
x(t) → sampling at rate 2W → x[n]
No loss of information
⇒ the intermediate values are redundant!
Source Coding
The channel may not have the "capacity" to communicate at the source information rate ⇒ need to represent the source at a lower rate, with some loss of information
Example: A real-valued sample carries infinite information; it must be quantized before transmission through a DMC
Non-uniform distribution of the source may allow efficient representation of the signal at a lower rate
Example: For binary transmission of English text, frequent letters may be represented by shorter bit sequences, e.g. Morse code.
Types of Source Coding
Lossless
- no loss of information
- used by various compression schemes for file storage (e.g. compress in Linux)
Lossy
- with loss of information, e.g. quantization, vector quantization, sub-band coding
- used by standards like JPEG, MPEG, etc.
Lossless Source Coding
for
Discrete Memoryless Sources
Discrete Memoryless Source
Source: X0, X1, X2, ... i.i.d.
Xi ∈ {x0, x1, ..., x(M−1)}
P{Xi = x0}, P{Xi = x1}, ..., P{Xi = x(M−1)}: known
Example: an analog source, sampled and quantized
If M ≤ 2^b, then we can represent the symbols by a binary fixed-length code:

Symbol xi   Code C(xi)   Length li   Probability Pi
x0          0···000      l0 = b      P0
x1          0···001      l1 = b      P1
x2          0···010      l2 = b      P2
...         ...          ...         ...

This requires b bits per symbol.
Something better is possible!
Example:

Symbol:           x0    x1    x2     x3
Probability:      0.5   0.25  0.125  0.125
Fixed length:     00    01    10     11
Variable length:  0     10    110    111

Average length for the fixed-length code = 2
Average length for the variable-length code = 1.75
Entropy H(X) = 1.75
Desired properties of source codes
Non-singular: xi ≠ xj ⇒ C(xi) ≠ C(xj)
Hence, may not be uniquely decodable
Example: {0, 010, 01, 10}
Uniquely decodable: for any two finite sequences,
x1 x2 ... xn ≠ x′1 x′2 ... x′m ⇒ C(x1)C(x2)...C(xn) ≠ C(x′1)C(x′2)...C(x′m)
Hence, the decoder may need to observe future bits
Desired properties of source codes
Prefix code or instantaneous code: no codeword is a prefix of any other codeword
- such codes can be decoded instantaneously!
Example: Classes of codes

X   Singular   Non-singular, but not UD   UD, but not prefix   Prefix
1   0          0                          10                   0
2   0          010                        00                   10
3   0          01                         11                   110
4   0          10                         110                  111
Huffman code
Code Design
Arrange the symbols in decreasing order of probability
Assign 0 and 1 as the last bit of the last two symbols
Combine the last two symbols into one and assign the sum of their probabilities to the new symbol
Repeat this process on the new set of symbols. While assigning a bit to a derived symbol, the bit is prepended to the codewords of all the contributing symbols.
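The steps above can be sketched in Python (the symbol names and probabilities below are illustrative, taken from the earlier four-symbol example):

```python
import heapq

def huffman_code(probs):
    """Binary Huffman code for a dict {symbol: probability}.

    Repeatedly merges the two least-probable groups; the merge bit is
    prepended to the codewords of all contributing symbols, exactly as
    in the design steps above."""
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)          # unique ints keep tuple comparison well-defined
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # least probable group
        p1, _, c1 = heapq.heappop(heap)   # second least probable group
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# The dyadic source from the earlier slide: average length 1.75 = H(X)
code = huffman_code({"x0": 0.5, "x1": 0.25, "x2": 0.125, "x3": 0.125})
```

For this dyadic distribution the codeword lengths come out as 1, 2, 3, 3, matching the variable-length code shown earlier.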
Example

X   Codeword   Probability (successive merges)
3   01         0.25   0.30   0.30   0.55   1
4   10         0.25   0.25   0.30   0.45
1   11         0.20   0.25   0.25
2   000        0.15   0.20
5   001        0.15
Kraft Inequality:
A prefix code with codeword lengths l1, l2, l3, ..., lM exists if and only if
∑_i 2^(−li) ≤ 1
Kraft Inequality: Proof

[Figure: a depth-4 binary tree with codewords such as 001, 01 and 101 marked at the corresponding nodes]

Necessity:
Codewords disqualified by 01: 0, 010, 011, 0100, 0101, 0110, 0111
Consider codeword C(xi) of length li, and let lmax = maxi{li}
No. of descendants of C(xi) at level lmax = 2^(lmax−li)
The descendant sets are disjoint for different i
Considering all codewords, ∑_i 2^(lmax−li) ≤ 2^lmax
⇒ ∑_i 2^(−li) ≤ 1
Kraft Inequality: Proof

[Figure: the same depth-4 binary tree, with free nodes picked at the required levels]

Sufficiency:
l1 ≤ l2 ≤ ... ≤ lM: given in increasing order.
To construct a prefix code:
For i = 1, 2, ..., M, take the first remaining node at level li as C(xi), and discard its descendants.
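The sufficiency argument is constructive; a small sketch can both check the inequality and build a prefix code from given lengths (the function names are mine, not from the slides):

```python
def kraft_sum(lengths):
    """Left-hand side of the Kraft inequality: sum_i 2^(-l_i)."""
    return sum(2.0 ** -l for l in lengths)

def prefix_code_from_lengths(lengths):
    """Build a prefix code for lengths satisfying Kraft, mirroring the
    proof: for each l_i in increasing order, take the first remaining
    node at level l_i and skip its whole subtree."""
    assert kraft_sum(lengths) <= 1.0 + 1e-12, "Kraft inequality violated"
    codes, nxt, prev = [], 0, 0   # nxt = first free node at depth `prev`
    for l in sorted(lengths):
        nxt <<= (l - prev)        # re-express the free boundary at depth l
        codes.append(format(nxt, "0{}b".format(l)))
        nxt += 1                  # consume this node (and hence its subtree)
        prev = l
    return codes
```

For lengths 1, 2, 3, 3 this yields 0, 10, 110, 111, the prefix code used in the earlier example.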
Kraft Inequality: Comments
Kraft inequality also holds for uniquely decodable codes
A UDC with lengths l1, l2, ..., lM exists ⇒ a prefix code with lengths l1, l2, ..., lM exists
Optimal code
A code for an alphabet x1, x2, ..., xM is optimal if there is no code for the same alphabet with smaller average length
Denote the average length of an optimal code by L.
Theorem
H(X) ≤ L < H(X) + 1
Part I: For any uniquely decodable code, the expected length L(C, X) ≥ H(X)

Define z = ∑_j 2^(−lj) and qi := 2^(−li)/z.
Known inequality: ∑_i pi log(1/qi) ≥ ∑_i pi log(1/pi), with equality iff pi = qi ∀i.

L(C, X) = ∑_i pi li = ∑_i pi log(1/qi) − log(z)
        ≥ ∑_i pi log(1/pi) − log(z) ≥ H(X)

(The last step uses z ≤ 1, i.e. −log(z) ≥ 0, by the Kraft inequality.)
Part I: For any uniquely decodable code, the expected length L(C, X) ≥ H(X)

Equality L(C, X) = H(X) holds iff
1. log(z) = 0, i.e. ∑_j 2^(−lj) = 1 (equality in the Kraft inequality)
2. li = log(1/pi)
These conditions suggest a good code design philosophy.
Part II: There exists a code with L(C, X) < H(X) + 1

For all i, define li = ⌈log2(1/pi)⌉.
The li's satisfy the Kraft inequality:
∑_i 2^(−li) = ∑_i 2^(−⌈log2(1/pi)⌉) ≤ ∑_i pi = 1
⇒ there is a prefix code with these li's.

L(C, X) = ∑_i pi ⌈log2(1/pi)⌉
        < ∑_i pi (log2(1/pi) + 1)
        = H(X) + 1
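The proof of Part II is constructive and easy to check numerically (the test distribution below is the one from the Huffman example; the function name is mine):

```python
import math

def shannon_lengths(probs):
    """The lengths l_i = ceil(log2(1/p_i)) used in the proof above."""
    return [math.ceil(math.log2(1.0 / p)) for p in probs]

probs = [0.25, 0.25, 0.20, 0.15, 0.15]   # the Huffman example distribution
ls = shannon_lengths(probs)
H = sum(p * math.log2(1.0 / p) for p in probs)
L = sum(p * l for p, l in zip(probs, ls))

assert sum(2.0 ** -l for l in ls) <= 1.0   # Kraft holds, so a prefix code exists
assert H <= L < H + 1                      # the bound proved above
```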
Need to combine many source symbols for source coding

Symbol   Probability   Best symbol-wise code
1        0.9           1
0        0.1           0

Average length = 1 bit
Entropy = −(0.9 log2(0.9) + 0.1 log2(0.1)) = 0.469 bits
Symbol-wise coding can assure only L < H(X) + 1.
The source coding theorem says: L → H(X)
How L → H(X ) can be achieved for any DMS:
Combine n samples (Xt , Xt+1 , · · · , Xt+n−1 ) and consider it as a single source
symbol
Entropy of (Xt , Xt+1 , · · · , Xt+n−1 ) is nH(x)
The optimal code length Ln for (Xt , Xt+1 , · · · , Xt+n−1 ) satisfies
nH(x) ≤ Ln < nH(x) + 1
The average code length L = Lnn per actual source symbol satisfies
H(x) ≤ L < H(x) + n1
By increasing n, we can have L as close to H(x) as desired.
Prof. B. K. Dey (IIT Bombay) Lossless Source Coding January 24, 2019 26 / 52
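A quick numerical illustration of this bound for the biased binary source of the previous slide, using the lengths ⌈log2(1/p)⌉ on n-blocks (a sketch; exact per-n values depend on the code used):

```python
import math
from itertools import product

def block_rate(p1, n):
    """Average bits per source symbol when n samples of a Bernoulli(p1)
    source are coded together with lengths ceil(log2(1/p_block))."""
    total = 0.0
    for block in product((0, 1), repeat=n):
        pb = math.prod(p1 if b else 1.0 - p1 for b in block)
        total += pb * math.ceil(math.log2(1.0 / pb))
    return total / n

H = -(0.9 * math.log2(0.9) + 0.1 * math.log2(0.1))   # about 0.469 bits
rates = {n: block_rate(0.9, n) for n in (1, 2, 4, 8)}
# Each rate satisfies H <= L_n/n < H + 1/n, so it approaches H as n grows.
```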
Huffman Code Revisited
Properties
Huffman code is a prefix code.
Huffman code is optimum: no code does better than the Huffman code for a given alphabet and symbol probabilities.
It can be used to encode blocks of n symbols, achieving L → H(X) as n → ∞.
Huffman Code Revisited
Disadvantages
Need to know/calculate the symbol probabilities explicitly ⇒ to combine n symbols of an M-ary source, need to compute M^n probabilities.
Cannot be applied to sources with unknown statistics.
Lempel-Ziv Coding
LZ Code
No explicit knowledge of the source statistics required: universal.
Asymptotically optimum for memoryless sources.
The compress utility in Linux uses the LZ algorithm.
LZ Code: How!
1. Maintain a list of substrings that have appeared before.
2. Store the output string from the source in a buffer.
3. Check if the present string in the buffer is there in the list.
   - If yes, wait for one more symbol to come into the buffer and then go back to step 3.
   - If no, then
     - find the substring (the buffer string excluding the last symbol) in the list and transmit its index,
     - transmit the last symbol,
     - empty the buffer.
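The steps above can be sketched as an encoder. This variant emits, for each new phrase, the index of its longest previously-seen prefix followed by the new symbol, with the index width growing with the list size (the function name and the exact width convention are my assumptions, chosen to match the worked example):

```python
import math

def lz_encode(bits):
    """Parse `bits` into phrases never seen before; for each phrase emit
    the prefix index (in just enough bits) plus the final symbol."""
    phrases = {"": 0}          # phrase -> index; index 0 is the empty phrase
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf not in phrases:                 # a new phrase is complete
            width = math.ceil(math.log2(len(phrases)))
            idx = format(phrases[buf[:-1]], "0{}b".format(width)) if width else ""
            out.append(idx + buf[-1])          # prefix index, then last symbol
            phrases[buf] = len(phrases)
            buf = ""
    return out
```

On the input of the worked example it reproduces the transmitted chunks shown there.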
Encoding: Example
Input (parsed into new phrases): 1 0 11 01 010 00 10

Phrase   Prefix index   Output
1        (list empty)   1
0        0              0,0
11       01             01,1
01       10             10,1
010      100            100,0
00       010            010,0
10       001            001,0

Output: 1 0,0 01,1 10,1 100,0 010,0 001,0
(The index width grows with the size of the list.)
Encoding: Example
1. There is no apparent compression!
2. Reason: the input string is too short, and 0, 1 are almost equiprobable.
3. There will be compression for long strings with unequal frequencies of 0 and 1.
Decoding: Example
Input: 1 00 011 101 1000 0100 0010

Received chunk   Prefix index   Decoded phrase
1                (list empty)   1
00               0              0
011              01             11
101              10             01
1000             100            010
0100             010            00
0010             001            10

Output: 1 0 11 01 010 00 10
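The matching decoder (a sketch, using the same assumed index-width convention as the encoder):

```python
import math

def lz_decode(chunks):
    """Invert the encoder: each chunk is a prefix index (width grows with
    the phrase list) followed by one symbol; rebuild the phrases."""
    phrases = [""]             # index 0 is the empty phrase
    out = []
    for chunk in chunks:
        width = math.ceil(math.log2(len(phrases)))
        idx = int(chunk[:width], 2) if width else 0
        phrase = phrases[idx] + chunk[width:]
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)
```

Note how the decoder rebuilds exactly the same phrase list as the encoder, one step behind it.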
Few Improvements
1. For any string S, if both S0 and S1 are in the list, then S can be removed from the list.
2. If S and S0 are in the list and the input is S1, then the bit 1 need not be transmitted, because it cannot be 0.
Shannon-Fano-Elias Code
or
Arithmetic Code
Shannon-Fano-Elias Code

[Figure: the CDF staircase F(x) = ∑_{a ≤ x} p(a) over x = 0, 1, ..., 7; the step at x has height p(x), with midpoint F̄(x) = F(x) − (1/2) p(x)]

F(x) = ∑_{a ≤ x} p(a)
F̄(x) = F(x) − (1/2) p(x)
l(x) = ⌈log(1/p(x))⌉ + 1

The codeword for x is the binary expansion of F̄(x) truncated to l(x) bits, e.g. if F̄(x) = 0.011001... (say) and l(x) = 6 (say), the codeword for x is 011001, representing the interval [0.011001, 0.011010).

Since 2^(−l(x)) ≤ p(x)/2, this interval lies entirely within the step of x, so the codeword identifies x and the code is prefix-free.
Performance of SFA code
1. L = ∑_i P(xi) l(xi) = ∑_i P(xi) (⌈log(1/P(xi))⌉ + 1)
     = ∑_i P(xi) ⌈log(1/P(xi))⌉ + 1
     < ∑_i P(xi) log(1/P(xi)) + 2
     = H(X) + 2 ⇒ suboptimal
2. SFA is asymptotically optimum.
For combined coding of n symbols,
L = Ln/n < H(X) + 2/n → H(X) as n → ∞.
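A compact sketch of the construction (the symbol order and the example distribution below are my choices for illustration):

```python
import math

def sfa_code(probs):
    """Codeword for each x: the binary expansion of
    Fbar(x) = F(x) - p(x)/2, truncated to l(x) = ceil(log2(1/p(x))) + 1 bits."""
    code, F = {}, 0.0
    for x, p in probs.items():       # iteration order fixes the ordering behind F
        Fbar = F + p / 2.0
        l = math.ceil(math.log2(1.0 / p)) + 1
        code[x] = format(int(Fbar * 2 ** l), "0{}b".format(l))
        F += p
    return code

code = sfa_code({"a": 0.25, "b": 0.5, "c": 0.125, "d": 0.125})
```

The extra bit in l(x) guarantees 2^(−l(x)) ≤ p(x)/2, so the truncated value stays inside the step of x; that is what makes the code prefix-free, at the cost of the +2 overhead analyzed above.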
Arithmetic Code
Arithmetic code: What is it!
1. A way of doing SFA encoding of large blocks of symbols sequentially.
2. Sequential encoding reduces delay.
3. The block length n can be increased without increasing the encoding/decoding delay.
Block-wise SFA
Lexicographic ordering of (X1, X2, ..., Xn)
Example: the pairs {(x, y) : x, y ∈ {1, 2, 3}} are ordered as
(1, 1) < (1, 2) < (1, 3) < (2, 1) < (2, 2) < (2, 3) < (3, 1) < (3, 2) < (3, 3)
(similar to alphabetic ordering in a dictionary).
Computing Fi(X1, X2, ..., Xi) iteratively
Denote (x1, x2, ..., xi) by x^(i). Then

Fi(x^(i)) = ∑_{y^(i) ≤ x^(i)} P(y^(i))
          = ∑_{y^(i−1) < x^(i−1)} P(y^(i−1)) + ∑_{yi ≤ xi} P(x^(i−1), yi)
          = F(i−1)(x^(i−1) − 1) + P(x^(i−1)) ∑_{yi ≤ xi} P(yi | x^(i−1))
          = F(i−1)(x^(i−1) − 1) + P(x^(i−1)) F(xi | x^(i−1))
Block-wise SFA
1. Need not compute all probability masses or the complete joint cumulative distribution function.
2. Arithmetic coding can handle sources with correlation, and thus gives better compression for such sources.
Encoding
Computing Fi(X1, X2, ..., Xi) iteratively
Fi(x^(i)) = F(i−1)(x^(i−1) − 1) + P(x^(i−1)) F(xi | x^(i−1))
(The subscript in F(·) is omitted for simplicity.)

[Figure: the interval (F(x^(i−1) − 1), F(x^(i−1))) of width P(x^(i−1)) is subdivided in proportion to the conditional CDF F(· | x^(i−1)); the sub-interval (F(x^(i) − 1), F(x^(i))) of width P(x^(i)) is selected]
Iterative encoding
1. The final encoded code-string will represent a sub-interval (a, b) in (Fi(x^(i) − 1), Fi(x^(i))). The code-string itself will be the binary representation of a.
2. If the binary representations of Fi(x^(i)) and Fi(x^(i) − 1) have the first t bits in common, then the final code-string will start with those t bits.
3. As i increases, the interval (Fi(x^(i) − 1), Fi(x^(i))) becomes narrower and narrower, so more and more bits become common to Fi(x^(i)) and Fi(x^(i) − 1).
4. So more and more bits can be encoded as i increases.
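The interval narrowing and the emission of decided bits can be sketched as follows (`cond_prob` is an assumed interface returning {symbol: P(symbol | context)}; the models of the later slides can be plugged in):

```python
def arithmetic_encode_step(symbols, cond_prob):
    """Narrow (F(x^(i)-1), F(x^(i))) one symbol at a time, and emit each
    leading bit as soon as it is common to both interval endpoints."""
    lo, hi = 0.0, 1.0
    context, emitted = [], ""
    for s in symbols:
        dist = cond_prob(tuple(context))
        f = 0.0
        for sym in sorted(dist):        # a fixed symbol order plays the role of <=
            if sym == s:
                break
            f += dist[sym]
        span = hi - lo
        lo, hi = lo + span * f, lo + span * (f + dist[s])
        context.append(s)
        while hi <= 0.5 or lo >= 0.5:   # the leading bit is decided: emit it
            if hi <= 0.5:
                emitted += "0"
                lo, hi = 2 * lo, 2 * hi
            else:
                emitted += "1"
                lo, hi = 2 * lo - 1, 2 * hi - 1
    return emitted, (lo, hi)
```

This sketch ignores the straddling case (an interval hugging 1/2 from both sides for many steps), which practical arithmetic coders handle with carry/underflow bookkeeping.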
Computing P(xi | x^(i−1))
Special Cases
1. I.I.D. source: P(xi | x^(i−1)) = P(xi)
2. Markov source: P(xi | x^(i−1)) = P(xi | xi−1)
Computing P(xi | x^(i−1))
Models for unknown source statistics
1. Laplace model:
P(Xi = a | x^(i−1)) = (Fa + 1) / ∑_b (Fb + 1)
where Fa is the number of occurrences of a in x^(i−1).
2. Dirichlet model:
P(Xi = a | x^(i−1)) = (Fa + α) / ∑_b (Fb + α)
A smaller α tries to learn the statistics faster.
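Both models are one-liners over symbol counts (a sketch; the function name is mine):

```python
def dirichlet_model(context, alphabet, alpha=1.0):
    """P(X_i = a | context) = (F_a + alpha) / sum_b (F_b + alpha);
    alpha = 1 recovers the Laplace model."""
    counts = {a: context.count(a) for a in alphabet}
    total = sum(counts[a] + alpha for a in alphabet)
    return {a: (counts[a] + alpha) / total for a in alphabet}
```

With no observations the model is uniform over the alphabet; a smaller alpha lets the empirical counts dominate sooner.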
Example: File Compression
Binary file ⇒ three source characters:
1. 0: will be denoted by a, to distinguish it from the encoded bit 0.
2. 1: will be denoted by b, to distinguish it from the encoded bit 1.
3. the end-of-file character, denoted by □.
Probability model
1. P(□ | x^(i−1)) = 0.15
2. P(xi = a | x^(i−1)) = 0.85 × (Fa + 1)/(Fa + Fb + 2)
3. P(xi = b | x^(i−1)) = 0.85 × (Fb + 1)/(Fa + Fb + 2)
We take a short file, bbba, for illustration.
Arithmetic encoding: Example
We need to compute the following conditional probabilities:

Context x^(i−1)   P(a | x^(i−1))   P(b | x^(i−1))   P(□ | x^(i−1))
(empty)           0.425            0.425            0.15
b                 0.28             0.57             0.15
bb                0.21             0.64             0.15
bbb               0.17             0.68             0.15
bbba              0.28             0.57             0.15
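These entries can be reproduced directly from the model (I use the string "EOF" as a stand-in for the end-of-file symbol; note the bbb row evaluates to 0.85 × 4/5 = 0.68 for b):

```python
def model_prob(context):
    """The model above: P(EOF | context) = 0.15, and the remaining 0.85
    split between a and b by Laplace counts over the context."""
    Fa, Fb = context.count("a"), context.count("b")
    return {
        "a": 0.85 * (Fa + 1) / (Fa + Fb + 2),
        "b": 0.85 * (Fb + 1) / (Fa + Fb + 2),
        "EOF": 0.15,
    }

# Recompute the conditional-probability table for the contexts of bbba
table = {ctx: model_prob(ctx) for ctx in ["", "b", "bb", "bbb", "bbba"]}
```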
File compression
Input: b b b a (followed by □)

[Figure: the unit interval is repeatedly subdivided according to the model; after each symbol the sub-interval of the current string is selected, and bits common to both endpoints are emitted]

Symbol read   Current string   Bits emitted so far
b             b                (none)
b             bb               1
b             bbb              1
a             bbba             1 001
□             bbba□            1 001 11101

Output: 1 001 11101 (the code-string 100111101)
File compression: Decoding
Input: 1 0 0 1 1 1 1 0 1

Each incoming bit halves the current interval; as soon as the interval lies entirely within the sub-interval of some source string, the corresponding symbols are output.

Bits read     Output so far
1             (none)
10            b
100           b
1001          b b
10011         b b
100111        b b b
1001111       b b b
10011110      b b b a
100111101     b b b a (□ detected: end of file)

Output: b b b a
Summary
1 Introduction: Source Coding
2 Lossless Source Coding
3 Huffman Code
4 Kraft Inequality and Optimal Codes
5 Lempel-Ziv Code
6 Arithmetic Code
Thank You