MU-MIT Arithmetic Coding GROUP MEMBERS 1.Gidey Leul 6/18/2017 1ECE /MIT/2009
CONTENTS 1. What is Source Coding general concept? 2. What are Commonly Used Data compression algorithms? 3. Problems of Huffman and the need of Arithmetic coding , general comparison? 4. Arithmetic Encoding. 5. Arithmetic Decoding. 6/18/2017 2ECE /MIT/2009
1.GENERAL CONCEPT OF SOURCE CODING Source Coding : encoding information using fewer bits than the original representation. Importance of Data Compression is : A technique to reduce the quantity of data . preserve quality of themultimedia data. Allows more bytes to be packed than uncompressed. Saving storage ,saving Bandwidth . Reduce file transfer time . quick encode & send . i.e,mp4->mp3. 6/18/2017 3ECE /MIT/2009
CONTND…. 6/18/2017 4ECE /MIT/2009
CONTND….. Compression can be either: I. LOSSY or II. lossles FIGURE 1.2 Comparision Loosy vs lossless 6/18/2017 5ECE /MIT/2009
CONTND….. Figure 1.3 Quality comparison for lossy system o What is Lossy Compression?  after compression ,file cannot recover.  reduce image size .  redundant information lost.  i.e -mp3,JPEG,wav 6/18/2017 6ECE /MIT/2009
Contnd… Reserved original , word press. text 6/18/2017 7ECE /MIT/2009
2.COMMONLY USED COMPRESSING ALGORITHMS • i.Huffman Coding • assign short codewords to those input blocks with high probabilities and long codewords to those with low probabilities. 6/18/2017 8ECE /MIT/2009
CONTND….. ii. Run Length Encoding Algorithm:  Run length encoding is simple form of data compression.  consecutive runs of data are stored as single data value .  Each count(indicating how many times that data is repeating) .  It is useful for compressing simple graphic images . EXAMPLE: Input: ZZZZZZZZZZZZCZZZZZZZZZZZZCCCZZZZZZZZZZZZZZZZZZZZZZZZC Output: 12ZC12Z3C24ZC 6/18/2017 9ECE /MIT/2009
III.ARITHMETIC CODING  Mathematical compression. Based on the coding of a input sequence using a rational number in ranges [0,1).  Doesn´t use a discrete number of bits for each. The main idea behind Arithmetic coding is to assign each symbol an interval. 6/18/2017 10ECE /MIT/2009
3.PROBLEMS OF HUFFMAN AND THE NEED OF ARITHMETIC CODING , GENERAL COMPARISON? ARITHMETIC CODING • Huffman coding  Less efficiency  less edible. ……… .....in contrast Faster Enough storage • Arithmetic Coding  redundancy much reduced. Can be used with any model conjunction , adaptiveness ,sharp in any model. …………………………Dis advantage AC  Too slow because mathematical operations . Significant amount of memory. 6/18/2017 11ECE /MIT/2009
CONTND…. 1.5 Figure comparison between Huffman And Arithmetic coding 6/18/2017 12ECE /MIT/2009
4.ARITHMETIC ENCODING STEPS • To code symbol s ,where symbols are numbered from 1 to n and symbol I has the probability pr[i]; • low bound = 𝑖=0𝑝𝑟 𝑠−1 [𝑖] • High bound = 𝑖=0𝑝𝑟 𝑠−1 [𝑖] • Range=high-low • Low=low + range *(low bound) • High=low + range *(high bound) 6/18/2017 13ECE /MIT/2009
CONTND---- • Consider encoding the name MIT CAMPAS Again, we need the frequency of all the characters in the text • char freq. • Space 0.1 • A 0.2 • C 0.1 • I 0.1 • M 0.2 • P 0.1 • S 0.1 • T 0.1 6/18/2017 14ECE /MIT/2009
CONTND---- • . character probability range • space 0.1 [0.00, 0.10) • A 0.2 [0.10, 0.30) • C 0.1 [0.30, 0.40) • I 0.1 [0.40, 0.50) • M 0.2 [0.50, 0.70) • P 0.1 [0.70, 0.80) • S 0.1 [0.80, 0.90) • T 0.1 [0.90, 1.00) • 6/18/2017 15ECE /MIT/2009
CONTND---- • . ENCODING THEWORD MIT CAMPAS • chr low high • 0.0 1.0 • M 0.5 0.7 • I 0.54 0.55 • T 0.549 0.550 • Space 0.5490 0.5491 • C 0.54903 0.54941 • A 0.549301 0.549033 • M 0.5493015 0.5493017 • P 0.54930164 0.54930166 A 0.549301643 0.549316466 S 0.5493016438 0.5493016439 6/18/2017 16ECE /MIT/2009
CONTND • .The final low value, 0.5493016438 will uniquely encode the name MIT CAMPAS. • which in binary is approximately [0.11010 00000, 0.11010 01100).We can uniquely identify this interval by outputting 1101000. • Another example Encoding suppose the alphabet is (a, e, i, O, u, !I, and a fixed model is used with probabilities shown in Table I. Imagine trans mitting the message eaii! .  Initially, both encoder and decoder know that the range is [0, 1).  After seeing the first symbol, e, the encoder narrows it to [0.2, 04, the range the model allocates to this symbol. The second symbol, a, will narrow this new range to the first one-fifth of it, 6/18/2017 17ECE /MIT/2009
CONTND… • . Figure 1.7 frequency of alphabets 6/18/2017 18ECE /MIT/2009
CONTND… • 1.8 Graphical representation of Arithmetic coding. 6/18/2017 19ECE /MIT/2009
HOW THE ARITHMETIC DECODER WORKS?  Decoder detects the last suffix o.23355. Relative to the fixed model of Table I, .The entropy of the five-symbol message eaii! Is taking logarithm of each term would be 4.22 . 6/18/2017 20ECE /MIT/2009
GENERALLY……..  Arithmetic coding typically has a better compression ratio than Huffman coding, as it produces a single symbol rather than several separate code words.  Arithmetic coding is a lossless coding technique. Few disadvantages of arithmetic coding. I . whole code word must be received to start decoding the symbols.  If corrupt bit in the code word, the entire message could become corrupt. II .There is a limit to the precision of the number which can be encoded, thus limiting the number of symbols to encode within a code word.  There also exists many patents upon arithmetic coding, so the use of some of the algorithms also call upon royalty fees. 6/18/2017 21ECE /MIT/2009
CONTND… 6/18/2017 22ECE /MIT/2009

Arithmetic coding

  • 1.
  • 2.
    CONTENTS 1. What isSource Coding general concept? 2. What are Commonly Used Data compression algorithms? 3. Problems of Huffman and the need of Arithmetic coding , general comparison? 4. Arithmetic Encoding. 5. Arithmetic Decoding. 6/18/2017 2ECE /MIT/2009
  • 3.
    1.GENERAL CONCEPT OF SOURCECODING Source Coding : encoding information using fewer bits than the original representation. Importance of Data Compression is : A technique to reduce the quantity of data . preserve quality of themultimedia data. Allows more bytes to be packed than uncompressed. Saving storage ,saving Bandwidth . Reduce file transfer time . quick encode & send . i.e,mp4->mp3. 6/18/2017 3ECE /MIT/2009
  • 4.
  • 5.
    CONTND….. Compression can be either: I.LOSSY or II. lossles FIGURE 1.2 Comparision Loosy vs lossless 6/18/2017 5ECE /MIT/2009
  • 6.
    CONTND….. Figure 1.3 Qualitycomparison for lossy system o What is Lossy Compression?  after compression ,file cannot recover.  reduce image size .  redundant information lost.  i.e -mp3,JPEG,wav 6/18/2017 6ECE /MIT/2009
  • 7.
    Contnd… Reserved original, word press. text 6/18/2017 7ECE /MIT/2009
  • 8.
    2.COMMONLY USED COMPRESSING ALGORITHMS •i.Huffman Coding • assign short codewords to those input blocks with high probabilities and long codewords to those with low probabilities. 6/18/2017 8ECE /MIT/2009
  • 9.
    CONTND….. ii. Run LengthEncoding Algorithm:  Run length encoding is simple form of data compression.  consecutive runs of data are stored as single data value .  Each count(indicating how many times that data is repeating) .  It is useful for compressing simple graphic images . EXAMPLE: Input: ZZZZZZZZZZZZCZZZZZZZZZZZZCCCZZZZZZZZZZZZZZZZZZZZZZZZC Output: 12ZC12Z3C24ZC 6/18/2017 9ECE /MIT/2009
  • 10.
    III.ARITHMETIC CODING  Mathematicalcompression. Based on the coding of a input sequence using a rational number in ranges [0,1).  Doesn´t use a discrete number of bits for each. The main idea behind Arithmetic coding is to assign each symbol an interval. 6/18/2017 10ECE /MIT/2009
  • 11.
    3.PROBLEMS OF HUFFMANAND THE NEED OF ARITHMETIC CODING , GENERAL COMPARISON? ARITHMETIC CODING • Huffman coding  Less efficiency  less edible. ……… .....in contrast Faster Enough storage • Arithmetic Coding  redundancy much reduced. Can be used with any model conjunction , adaptiveness ,sharp in any model. …………………………Dis advantage AC  Too slow because mathematical operations . Significant amount of memory. 6/18/2017 11ECE /MIT/2009
  • 12.
    CONTND…. 1.5 Figure comparisonbetween Huffman And Arithmetic coding 6/18/2017 12ECE /MIT/2009
  • 13.
    4.ARITHMETIC ENCODING STEPS •To code symbol s ,where symbols are numbered from 1 to n and symbol I has the probability pr[i]; • low bound = 𝑖=0𝑝𝑟 𝑠−1 [𝑖] • High bound = 𝑖=0𝑝𝑟 𝑠−1 [𝑖] • Range=high-low • Low=low + range *(low bound) • High=low + range *(high bound) 6/18/2017 13ECE /MIT/2009
  • 14.
    CONTND---- • Consider encodingthe name MIT CAMPAS Again, we need the frequency of all the characters in the text • char freq. • Space 0.1 • A 0.2 • C 0.1 • I 0.1 • M 0.2 • P 0.1 • S 0.1 • T 0.1 6/18/2017 14ECE /MIT/2009
  • 15.
    CONTND---- • . characterprobability range • space 0.1 [0.00, 0.10) • A 0.2 [0.10, 0.30) • C 0.1 [0.30, 0.40) • I 0.1 [0.40, 0.50) • M 0.2 [0.50, 0.70) • P 0.1 [0.70, 0.80) • S 0.1 [0.80, 0.90) • T 0.1 [0.90, 1.00) • 6/18/2017 15ECE /MIT/2009
  • 16.
    CONTND---- • . ENCODINGTHEWORD MIT CAMPAS • chr low high • 0.0 1.0 • M 0.5 0.7 • I 0.54 0.55 • T 0.549 0.550 • Space 0.5490 0.5491 • C 0.54903 0.54941 • A 0.549301 0.549033 • M 0.5493015 0.5493017 • P 0.54930164 0.54930166 A 0.549301643 0.549316466 S 0.5493016438 0.5493016439 6/18/2017 16ECE /MIT/2009
  • 17.
    CONTND • .The finallow value, 0.5493016438 will uniquely encode the name MIT CAMPAS. • which in binary is approximately [0.11010 00000, 0.11010 01100).We can uniquely identify this interval by outputting 1101000. • Another example Encoding suppose the alphabet is (a, e, i, O, u, !I, and a fixed model is used with probabilities shown in Table I. Imagine trans mitting the message eaii! .  Initially, both encoder and decoder know that the range is [0, 1).  After seeing the first symbol, e, the encoder narrows it to [0.2, 04, the range the model allocates to this symbol. The second symbol, a, will narrow this new range to the first one-fifth of it, 6/18/2017 17ECE /MIT/2009
  • 18.
    CONTND… • . Figure 1.7frequency of alphabets 6/18/2017 18ECE /MIT/2009
  • 19.
    CONTND… • 1.8 Graphicalrepresentation of Arithmetic coding. 6/18/2017 19ECE /MIT/2009
  • 20.
    HOW THE ARITHMETICDECODER WORKS?  Decoder detects the last suffix o.23355. Relative to the fixed model of Table I, .The entropy of the five-symbol message eaii! Is taking logarithm of each term would be 4.22 . 6/18/2017 20ECE /MIT/2009
  • 21.
    GENERALLY……..  Arithmetic codingtypically has a better compression ratio than Huffman coding, as it produces a single symbol rather than several separate code words.  Arithmetic coding is a lossless coding technique. Few disadvantages of arithmetic coding. I . whole code word must be received to start decoding the symbols.  If corrupt bit in the code word, the entire message could become corrupt. II .There is a limit to the precision of the number which can be encoded, thus limiting the number of symbols to encode within a code word.  There also exists many patents upon arithmetic coding, so the use of some of the algorithms also call upon royalty fees. 6/18/2017 21ECE /MIT/2009
  • 22.