Algorithm Presentation Topic: Huffman Coding
Contents: 1. Definition 2. Simulation 3. Time Complexity 4. Application 5. Advantages/Disadvantages
Definition
Huffman Coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters; the lengths of the assigned codes are based on the frequencies of the corresponding characters. The most frequent character gets the shortest code and the least frequent character gets the longest code.
Message: BCCABBDDAECCBBAEDDCC    Length: 20

Letter   ASCII Code   Binary Form
A        65           01000001
B        66           01000010
C        67           01000011
D        68           01000100
E        69           01000101

• For 1 letter we need 8 bits
• For 20 letters we need 8 × 20 = 160 bits

In electronic communication the letters are sent as ASCII codes. The ASCII code for capital letter A is 65, and we need 8 bits of binary to represent 65.
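To make this baseline concrete, here is a tiny Python sketch (not part of the original slides) that prints each letter's ASCII code and 8-bit binary form and computes the 160-bit fixed-length cost of the message.

```python
message = "BCCABBDDAECCBBAEDDCC"

# Each letter's ASCII code and its 8-bit binary form, e.g. A -> 65 -> 01000001.
for letter in sorted(set(message)):
    print(letter, ord(letter), format(ord(letter), "08b"))

print(8 * len(message))   # 8 bits per letter x 20 letters = 160 bits
```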
Simulation
Message: BCCABBDDAECCBBAEDDCC
Letter counts: A 3, B 5, C 6, D 4, E 2
First of all, place the counts in increasing order: E-2, A-3, D-4, B-5, C-6. Then take the two minimum counts and add them: merging E (2) and A (3) gives a parent node with weight 5.
Simulation
Message: BCCABBDDAECCBBAEDDCC
Next, among the new node and the remaining counts, again take the two minimums and add them up. Here the node 5 and D's count 4 are the minimums, so we add them and make 9 the new parent node. Continue the process.
Simulation
Message: BCCABBDDAECCBBAEDDCC
Continuing the same way: merging B-5 and C-6 gives 11, and finally merging 9 and 11 gives the root node 20, which equals the length of the message.
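This repeated take-the-two-minimums-and-merge step is exactly what a priority queue does, so here is a minimal Python sketch of the construction using heapq. The node representation and tie-breaking are illustrative choices, not from the slides; with equal counts the exact bit patterns can differ from the tree drawn here, but the code lengths and the total cost come out the same.

```python
import heapq
from collections import Counter

message = "BCCABBDDAECCBBAEDDCC"
freq = Counter(message)   # {'B': 5, 'C': 6, 'A': 3, 'D': 4, 'E': 2}

# Heap entries are (count, tie_breaker, node); a node is either a single
# letter (a leaf) or a (left, right) pair of child nodes (an internal node).
heap = [(count, i, letter) for i, (letter, count) in enumerate(freq.items())]
heapq.heapify(heap)
next_id = len(heap)

while len(heap) > 1:
    c1, _, left = heapq.heappop(heap)    # smallest remaining count
    c2, _, right = heapq.heappop(heap)   # second smallest remaining count
    heapq.heappush(heap, (c1 + c2, next_id, (left, right)))
    next_id += 1

total, _, root = heap[0]                 # total is 20, the message length

def assign_codes(node, prefix="", codes=None):
    """Walk the tree: a left edge adds '0', a right edge adds '1'."""
    if codes is None:
        codes = {}
    if isinstance(node, str):            # leaf: a single letter
        codes[node] = prefix
    else:
        assign_codes(node[0], prefix + "0", codes)
        assign_codes(node[1], prefix + "1", codes)
    return codes

codes = assign_codes(root)
print(codes)  # code lengths match the slides: E and A get 3 bits; D, B, C get 2 bits
```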
Simulation
Message: BCCABBDDAECCBBAEDDCC
Mark the left-hand edges as 0 and the right-hand edges as 1, then traverse from the root node to each letter. For example, to reach A from the root the code is 001, for B it is 10, and so on.
Simulation
Message: BCCABBDDAECCBBAEDDCC
Codes: A = 001, B = 10, C = 11, D = 01, E = 000. So A needs 3 bits, B 2 bits, C 2 bits, D 2 bits, and E 3 bits. For A the count is 3, so A contributes 3 × 3 = 9 bits in total, and so on for the other letters.
Simulation
Message: BCCABBDDAECCBBAEDDCC
A: 001, 3 × 3 = 9 bits
B: 10, 5 × 2 = 10 bits
C: 11, 6 × 2 = 12 bits
D: 01, 4 × 2 = 8 bits
E: 000, 2 × 3 = 6 bits
Total = 45 bits
As we saw, at first we needed 160 bits; now we need only 45 bits, so we have compressed the size (and cost) of the message.
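As a quick sanity check of the numbers on this slide, the short Python snippet below recomputes the per-letter totals with the codes read off the tree (A = 001, B = 10, C = 11, D = 01, E = 000) and compares them with the 8-bit baseline.

```python
counts = {"A": 3, "B": 5, "C": 6, "D": 4, "E": 2}
codes  = {"A": "001", "B": "10", "C": "11", "D": "01", "E": "000"}

huffman_bits = sum(counts[ch] * len(codes[ch]) for ch in counts)
fixed_bits   = 8 * sum(counts.values())          # the 8-bit ASCII baseline

print(huffman_bits, fixed_bits)                  # 45 160

# Encoding the whole message with these codes gives a 45-bit string.
message = "BCCABBDDAECCBBAEDDCC"
encoded = "".join(codes[ch] for ch in message)
print(len(encoded))                              # 45
```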
Time Complexity
The time complexity of Huffman Coding is O(n log n), where n is the number of unique characters. With a min-heap (priority queue), each insertion and extraction costs O(log n), and building the tree takes n - 1 merges.
Application
Generic file compression: files (GZIP, BZIP, 7Z), archives (7z), file systems (NTFS, HFS+, ZFS)
Multimedia: images (GIF, JPEG), sound (MP3), video (MPEG, HDTV)
Communication: ITU-T T4 Group 3 Fax, V.42bis modem, Skype
Databases: Google, Facebook, ...
Advantages/Disadvantages
Advantages: Huffman Coding achieves the minimum average code length. It is easy to implement and fast.
Disadvantages: It requires two passes over the input (one to compute frequencies, one to encode), so encoding is slow. It also requires storing the Huffman codes (or at least the character frequencies) in the encoded file, which reduces the compression benefit obtained by encoding.
Thank You
