Basics of Audio Coding: Introduction, Basics and Building Blocks
INTRODUCTION
• High-quality audio compression has moved from research to widespread application within a couple of years.
• Early research from about 15 years ago was translated into ISO/IEC and ITU-R standardization efforts about 10 years ago. Since MPEG-1 was finalized in 1992, many applications have been devised.
• In the last couple of years, Internet audio delivery has emerged as a powerful category of applications.
• These techniques made headline news in many parts of the world because of their potential to change how the music industry does business.
TECHNIQUES
Among others, the following applications currently employ low bit-rate audio coding techniques:
• Digital Audio Broadcasting (EUREKA DAB, WorldSpace, ISDB, DRM)
• ISDN transmission of high-quality audio for broadcast contribution and distribution
• Archival storage for broadcasting
• Accompanying audio for digital TV (DVB, ATSC, Video CD, ARIB)
• Internet streaming (RealAudio, Microsoft NetShow, Apple QuickTime and others)
• Portable audio (mpman, mplayer3, Rio, Lyra, YEPP and others)
• Storage and exchange of music files on computers
THE BASICS
The basic task of a perceptual audio coding system is to compress digital audio data so that
• the compression is as efficient as possible, i.e. the compressed file is as small as possible, and
• the reconstructed (decoded) audio sounds exactly like, or as close as possible to, the original audio before compression.
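To put a number on "as small as possible", the short sketch below compares the bit rate of uncompressed CD-quality PCM with a coded bit rate. The 128 kbit/s target and the function names are illustrative assumptions, not values taken from the slides.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def compression_ratio(pcm_kbps, coded_kbps):
    """How many times smaller the coded stream is than the PCM stream."""
    return pcm_kbps / coded_kbps


# CD audio: 44.1 kHz sampling, 16 bits per sample, 2 channels.
cd_kbps = pcm_bitrate_kbps(44100, 16, 2)

# Assumed target bit rate for the coded stream (illustrative only).
coded_kbps = 128.0

ratio = compression_ratio(cd_kbps, coded_kbps)
print("PCM: %.1f kbit/s, coded: %.1f kbit/s, ratio: %.1f:1"
      % (cd_kbps, coded_kbps, ratio))
```

Under these assumptions the coder has to discard roughly ten out of every eleven bits, which is why the perceptual model described in the building blocks matters: the discarded information should be what the listener is least likely to hear.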
BUILDING BLOCKS-1
• Filter bank: A filter bank decomposes the input signal into subsampled spectral components (time/frequency domain). Together with the corresponding filter bank in the decoder, it forms an analysis/synthesis system.
• Perceptual model: Using the time-domain input signal, the output of the analysis filter bank, or both, an estimate of the actual (time- and frequency-dependent) masking threshold is computed using rules known from psychoacoustics. This is called the perceptual model of the perceptual encoding system.
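The analysis/synthesis idea can be sketched with the simplest possible case: a two-band Haar filter bank. This is a stand-in chosen for brevity; practical coders use many more bands (for example polyphase or MDCT-based filter banks), and the function names here are hypothetical. The sketch shows that the two subsampled bands together hold as many values as the input and that the synthesis stage rebuilds the input from them.

```python
import math

SQRT2 = math.sqrt(2.0)


def analysis(x):
    """Split an even-length signal into a low and a high band.

    Each band is subsampled by 2, so the two bands together contain
    exactly as many values as the input (critical sampling).
    """
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return low, high


def synthesis(low, high):
    """Rebuild the time-domain signal from the two subsampled bands."""
    x = []
    for l, h in zip(low, high):
        x.append((l + h) / SQRT2)  # even-indexed sample
        x.append((l - h) / SQRT2)  # odd-indexed sample
    return x


signal = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
low, high = analysis(signal)
rebuilt = synthesis(low, high)

max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print("band sizes:", len(low), len(high))
print("max reconstruction error:", max_error)
```

Without quantization, the decoder-side filter bank reproduces the input up to floating-point rounding. Any audible difference in a real codec is therefore introduced later, in the quantization step.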
BUILDING BLOCKS-2
• Quantization and coding: The spectral components are quantized and coded with the aim of keeping the noise introduced by quantization below the masking threshold. Depending on the algorithm, this step is done in very different ways, from simple block companding to analysis-by-synthesis systems using additional noiseless compression.
• Encoding of bitstream: A bitstream formatter assembles the bitstream, which typically consists of the quantized and coded spectral coefficients and some side information, e.g. bit allocation information.
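A minimal sketch of these two steps follows, under stated assumptions. The masking thresholds are given as input (hypothetical values standing in for the perceptual model's output), a uniform quantizer is used whose noise power is approximated by step²/12, and the frame layout is invented for illustration; it is not the format of any real codec.

```python
import math
import struct


def step_for_threshold(mask_power):
    """Uniform quantizer step whose noise power (step**2 / 12) equals
    the allowed masking-threshold power for the band."""
    return math.sqrt(12.0 * mask_power)


def quantize_band(coeffs, step):
    """Map each spectral coefficient to an integer index."""
    return [int(round(c / step)) for c in coeffs]


def dequantize_band(indices, step):
    """Decoder side: turn integer indices back into coefficients."""
    return [i * step for i in indices]


def pack_frame(bands, steps):
    """Hypothetical frame layout: band count, then for each band the
    step size (side information), the coefficient count, and the
    quantized indices as 16-bit integers."""
    out = struct.pack("<H", len(bands))
    for indices, step in zip(bands, steps):
        out += struct.pack("<dH", step, len(indices))
        out += struct.pack("<%dh" % len(indices), *indices)
    return out


def unpack_frame(data):
    """Inverse of pack_frame: recover indices and step sizes."""
    (n_bands,) = struct.unpack_from("<H", data, 0)
    offset = 2
    bands, steps = [], []
    for _ in range(n_bands):
        step, count = struct.unpack_from("<dH", data, offset)
        offset += struct.calcsize("<dH")
        bands.append(list(struct.unpack_from("<%dh" % count, data, offset)))
        offset += 2 * count
        steps.append(step)
    return bands, steps


# Two bands of spectral coefficients and assumed masking thresholds
# (noise power allowed per band); all values are made up.
spectrum = [[0.9, -0.4, 0.25], [0.05, -0.02, 0.01]]
mask_powers = [1e-4, 1e-2]

steps = [step_for_threshold(p) for p in mask_powers]
indices = [quantize_band(b, s) for b, s in zip(spectrum, steps)]
frame = pack_frame(indices, steps)

decoded_indices, decoded_steps = unpack_frame(frame)
decoded = [dequantize_band(i, s)
           for i, s in zip(decoded_indices, decoded_steps)]
print("frame size in bytes:", len(frame))
```

A band with a high masking threshold gets a coarse step, so its small coefficients collapse to zero and cost little to code. The step sizes travel in the frame as side information because the decoder cannot dequantize without them.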
Learn More in IIT Kharagpur's First Online Certificate Course on Image and Video Communication [Refer: http://goo.gl/hMyYWa ; courses@wiziq.com]
