LZW Compression Algorithm Presented by : keyvan moazami Multimedia course Shamsipour college Spring 2015
Probability base algorithms problems  First approach  analyze a few books and estimate the probabilities of different letters  Then apply Huffman coding o probability depend on context  Adaptive encoding  two pass process  Huffman code customized (letter and group letter) o strategy is expensive but workable o symbol grouping (larger or smaller) o We need different approach for adaptive algorithm 1/14
Lempel-Ziv-Welch (LZW) algorithm  Dictionary base  Used to compress image file , Unix ‘compress’ command , …  Lossless  * Dictionary table never transmitted 2/14
How work LZW  Example : ASCII code.  Typically, every character is stored with 8 binary bits, allowing up to 256 unique symbols for the data .  This algorithm tries to extend this library to 9 to 12 bits per character. The new unique symbols are made up of combinations of symbols that occurred previously in the string.  Does not always compress well , especially with short, diverse strings. But is good for compressing redundant data, and does not have to save the new dictionary with the data :this method can both compress and uncompressing data . 3/14
Example :  I want to compress the following string : Thisisthe By default using uncompressed ASCII , this is equal to 01110100011010000110100101110011011010010111001101110100011010000110010101110011 Or 116 104 105 115 105 115 116 104 101 115 * That’s 80 bits. We can do better 4/14
Example : LZW Algorithm  I want to compress the following string : Thisisthe  Start adding pairs of symbols to dictionary of words .  When we reach a pair that we’ve already placed in the dictionary , replace that pair with our new dictionary value. For this tiny string , we will only need 9-bits , allowing up to 512 unique symbols . 5/14
6/14
Example : LZW Algorithm  I want to compress the following string : Thisisthe current next output Add to dictionary t 116 h 104 t 116 th 256 h 104 i 105 h 104 hi 257 i 105 s 115 i 105 is 258 s 115 i 105 s 115 si 259 i 105 s 115 ‘is’ is in the dictionary ! Check if ‘ist’ is is 258 t 116 is 258 ist 260 t 116 h 104 ‘th’ is in the dictionary ! Check if ‘the’ is th 256 e 101 th 256 the 261 e 101 - e 101 - 7/14
Example : LZW Algorithm  I want to compress the following string : Thisisthe current next output Add to dictionary t 116 h 104 t 116 th 256 h 104 i 105 h 104 hi 257 i 105 s 115 i 105 is 258 s 115 i 105 s 115 si 259 i 105 s 115 ‘is’ is in the dictionary ! Check if ‘ist’ is is 258 t 116 is 258 ist 260 t 116 h 104 ‘th’ is in the dictionary ! Check if ‘the’ is th 256 e 101 th 256 the 261 e 101 - e 101 - 8/14
Example : LZW Algorithm  I want to compress the following string : Thisisthe 116 104 105 115 258 256 101 Or 001110100 001101000 001101001 001110011 100000010 100000000 001100101 Notice that our symbols are now made up of 9 bits each instead of 8 . This is 63 bits long , instead of the original 80 bits. That’s 78.75 % of what it used to be ! Dictionary table not Save ! How decompress data ? 9/14
Uncompressing:  To uncompressing ,we do need to know how many bits were used in the compression, the one new information that is kept with the data when saved to a file (else, the number of bits is specified by a program or a user).  We can read in the characters and build the new dictionary the same way as we did for compressing .When we reach a symbol outside the ASCII range (>255},then we should have that value ready to be seen in the dictionary. 10/14
Example : decompressing LZW  I want to decompress the following string : 116 104 105 115 258 256 101 current next output Add to dictionary 116 104 116 116 104 (256) 104 105 104 104 105 (257) 105 115 105 105 115 (258) 115 258 115 115 105 115 (259) 258 256 105 115 105 115 116 (260) 256 101 116 104 116 104 101 (261) 101 - 101 - 11/14
Example : decompressing LZW  I want to decompress the following string : 116 104 105 115 258 256 101 current next output Add to dictionary 116 104 116 116 104 (256) 104 105 104 104 105 (257) 105 115 105 105 115 (258) 115 258 115 115 105 115 (259) 258 256 105 115 105 115 116 (260) 256 101 116 104 116 104 101 (261) 101 - 101 - 116 104 105 115 105 115 116 104 101 Or “thisisthe” 12/14
Example : decompressing LZW  I want to decompress the following string : 116 104 105 115 258 256 101 115 258 115 115 105 115 (259) 258 256 105 115 105 115 116 (260) 256 101 116 104 116 104 101 (261) Why didn’t we add 258 256 ( 105 115 116 104 ) to the dictionary at (260) ? Weird case : remember , dictionary is made with combinations one character at a time , so you would only add the first symbol of the combination to the current combination when rebuilding the dictionary . 13/14
Thank you

Lzw algorithm

  • 1.
    LZW Compression Algorithm Presentedby : keyvan moazami Multimedia course Shamsipour college Spring 2015
  • 2.
    Probability base algorithmsproblems  First approach  analyze a few books and estimate the probabilities of different letters  Then apply Huffman coding o probability depend on context  Adaptive encoding  two pass process  Huffman code customized (letter and group letter) o strategy is expensive but workable o symbol grouping (larger or smaller) o We need different approach for adaptive algorithm 1/14
  • 3.
    Lempel-Ziv-Welch (LZW) algorithm Dictionary base  Used to compress image file , Unix ‘compress’ command , …  Lossless  * Dictionary table never transmitted 2/14
  • 4.
    How work LZW Example : ASCII code.  Typically, every character is stored with 8 binary bits, allowing up to 256 unique symbols for the data .  This algorithm tries to extend this library to 9 to 12 bits per character. The new unique symbols are made up of combinations of symbols that occurred previously in the string.  Does not always compress well , especially with short, diverse strings. But is good for compressing redundant data, and does not have to save the new dictionary with the data :this method can both compress and uncompressing data . 3/14
  • 5.
    Example :  Iwant to compress the following string : Thisisthe By default using uncompressed ASCII , this is equal to 01110100011010000110100101110011011010010111001101110100011010000110010101110011 Or 116 104 105 115 105 115 116 104 101 115 * That’s 80 bits. We can do better 4/14
  • 6.
    Example : LZWAlgorithm  I want to compress the following string : Thisisthe  Start adding pairs of symbols to dictionary of words .  When we reach a pair that we’ve already placed in the dictionary , replace that pair with our new dictionary value. For this tiny string , we will only need 9-bits , allowing up to 512 unique symbols . 5/14
  • 7.
  • 8.
    Example : LZWAlgorithm  I want to compress the following string : Thisisthe current next output Add to dictionary t 116 h 104 t 116 th 256 h 104 i 105 h 104 hi 257 i 105 s 115 i 105 is 258 s 115 i 105 s 115 si 259 i 105 s 115 ‘is’ is in the dictionary ! Check if ‘ist’ is is 258 t 116 is 258 ist 260 t 116 h 104 ‘th’ is in the dictionary ! Check if ‘the’ is th 256 e 101 th 256 the 261 e 101 - e 101 - 7/14
  • 9.
    Example : LZWAlgorithm  I want to compress the following string : Thisisthe current next output Add to dictionary t 116 h 104 t 116 th 256 h 104 i 105 h 104 hi 257 i 105 s 115 i 105 is 258 s 115 i 105 s 115 si 259 i 105 s 115 ‘is’ is in the dictionary ! Check if ‘ist’ is is 258 t 116 is 258 ist 260 t 116 h 104 ‘th’ is in the dictionary ! Check if ‘the’ is th 256 e 101 th 256 the 261 e 101 - e 101 - 8/14
  • 10.
    Example : LZWAlgorithm  I want to compress the following string : Thisisthe 116 104 105 115 258 256 101 Or 001110100 001101000 001101001 001110011 100000010 100000000 001100101 Notice that our symbols are now made up of 9 bits each instead of 8 . This is 63 bits long , instead of the original 80 bits. That’s 78.75 % of what it used to be ! Dictionary table not Save ! How decompress data ? 9/14
  • 11.
    Uncompressing:  To uncompressing,we do need to know how many bits were used in the compression, the one new information that is kept with the data when saved to a file (else, the number of bits is specified by a program or a user).  We can read in the characters and build the new dictionary the same way as we did for compressing .When we reach a symbol outside the ASCII range (>255},then we should have that value ready to be seen in the dictionary. 10/14
  • 12.
    Example : decompressingLZW  I want to decompress the following string : 116 104 105 115 258 256 101 current next output Add to dictionary 116 104 116 116 104 (256) 104 105 104 104 105 (257) 105 115 105 105 115 (258) 115 258 115 115 105 115 (259) 258 256 105 115 105 115 116 (260) 256 101 116 104 116 104 101 (261) 101 - 101 - 11/14
  • 13.
    Example : decompressingLZW  I want to decompress the following string : 116 104 105 115 258 256 101 current next output Add to dictionary 116 104 116 116 104 (256) 104 105 104 104 105 (257) 105 115 105 105 115 (258) 115 258 115 115 105 115 (259) 258 256 105 115 105 115 116 (260) 256 101 116 104 116 104 101 (261) 101 - 101 - 116 104 105 115 105 115 116 104 101 Or “thisisthe” 12/14
  • 14.
    Example : decompressingLZW  I want to decompress the following string : 116 104 105 115 258 256 101 115 258 115 115 105 115 (259) 258 256 105 115 105 115 116 (260) 256 101 116 104 116 104 101 (261) Why didn’t we add 258 256 ( 105 115 116 104 ) to the dictionary at (260) ? Weird case : remember , dictionary is made with combinations one character at a time , so you would only add the first symbol of the combination to the current combination when rebuilding the dictionary . 13/14
  • 15.