Introduction
There is a lot of confusion surrounding the terms audio compression, audio
encoding, and audio decoding. This section gives you an overview of what
audio coding (another one of these terms...) is all about.
The purpose of audio compression
Up to the advent of audio compression, high-quality digital audio data took
a lot of hard disk space to store. Let us go through a short example.
You want to, say, sample your favorite 1-minute song and store it on your
hard disk. Because you want CD quality, you sample at 44.1 kHz, stereo,
with 16 bits per sample.
44100 Hz means that you have 44100 values per second coming in from your sound
card (or input file). Multiply that by two because you have two channels. Multiply
by another factor of two because you have two bytes per value (that's what 16
bit means). The song will take up
44100 samples/s · 2 channels · 2 bytes/sample · 60 s/min ~ 10 MBytes
of storage space on your hard disk.
If you wanted to download that over the internet, given an average 56k modem
connected at 44k (which is a typical case), it would take you (at least)
10000000 bytes · 8 bits/byte / (44000 bits/s) / (60 s/min) ~ 30 minutes
Just to download one minute of music!
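The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the download estimate uses the rounded 10 MByte figure from the text and ignores any protocol overhead.

```python
# Sketch of the storage and download arithmetic; all names are illustrative.

def pcm_size_bytes(sample_rate_hz, channels, bytes_per_sample, seconds):
    """Size of uncompressed PCM audio in bytes."""
    return sample_rate_hz * channels * bytes_per_sample * seconds


def download_minutes(size_bytes, link_bits_per_second):
    """Minimum transfer time in minutes, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second / 60


# One minute of CD-quality stereo: 44.1 kHz, 2 channels, 2 bytes per sample.
size = pcm_size_bytes(44100, 2, 2, 60)
print(size)  # 10584000 bytes, roughly 10 MBytes

# Rounded 10 MBytes over a modem connected at 44000 bits/s.
print(round(download_minutes(10_000_000, 44_000), 1))  # 30.3 minutes
```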
Digital audio coding, which in this context is synonymous with digital
audio compression, is the art of minimizing storage space (or channel
bandwidth) requirements for audio data. Modern perceptual audio coding techniques
(like MPEG Layer III) exploit the properties of the human ear (the perception
of sound) to achieve a size reduction by a factor of 11 with little or no perceptible
loss of quality.
Therefore, such schemes are the key technology for high-quality, low-bitrate
applications, like soundtracks for CD-ROM games, solid-state sound memories,
Internet audio, digital audio broadcasting systems, and the like.
The two parts of audio compression
Audio compression really consists of two parts. The first part, called
encoding, transforms the digital audio data that resides, say, in a WAVE
file, into a highly compressed form called a bitstream. To play the bitstream
on your soundcard, you need the second part, called decoding. Decoding takes
the bitstream and re-expands it to a WAVE file.
The program that performs the first part is called an audio encoder. LAME is
such an encoder. The program that does the second part is called an audio
decoder. One well-known MPEG Layer III decoder is Xmms, another is mpg123.
Both can be found on http://www.mp3-tech.org.
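As a rough illustration of the two parts, the sketch below builds one command line for each direction. It assumes a lame executable is installed and uses its -b (bitrate in kbps) and --decode options; the file names and helper functions are hypothetical, and nothing is executed unless you call run() yourself.

```python
import shutil
import subprocess


def encode_command(wav_path, mp3_path, bitrate_kbps=128):
    """Build a LAME command line that encodes a WAVE file into a bitstream."""
    return ["lame", "-b", str(bitrate_kbps), wav_path, mp3_path]


def decode_command(mp3_path, wav_path):
    """Build a LAME command line that re-expands a bitstream to a WAVE file."""
    return ["lame", "--decode", mp3_path, wav_path]


def run(command):
    """Run the command if its executable is on PATH; return True if it ran."""
    if shutil.which(command[0]) is None:
        return False
    subprocess.run(command, check=True)
    return True


# Hypothetical file names, for illustration only.
print(encode_command("song.wav", "song.mp3"))
print(decode_command("song.mp3", "decoded.wav"))
```

Note that decoded.wav would not be byte-identical to song.wav, because the encoding step is lossy.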
Compression ratios, bitrate and quality
It has not been explicitly mentioned up to now: what you end up with after
encoding and decoding is not the same sound file anymore. All superfluous
information has been squeezed out, so to speak. It is not the same file, but
it will sound the same, more or less, depending on how much compression was
applied to it.
Generally speaking, the lower the compression ratio achieved, the better the
sound quality will be in the end, and vice versa. Table 1.1 gives you an
overview of the quality achievable.
Because compression ratio is a somewhat unwieldy measure, experts use the
term bitrate when speaking of the strength of compression. Bitrate denotes
the average number of bits that one second of audio data will take up in your
compressed bitstream. The usual unit is kbps, which is kbits/s, or
1000 bits/s. To calculate the number of bytes per second of audio data, simply
divide the number of bits per second by eight.
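The conversion can be sketched in Python as follows. The function names are illustrative, and the compression ratio is measured against uncompressed CD audio (44.1 kHz, stereo, 16 bits per sample), which is an assumption carried over from the earlier example.

```python
# Uncompressed CD audio: 44100 samples/s * 2 channels * 16 bits/sample.
CD_BITRATE_BPS = 44100 * 2 * 16  # 1411200 bits/s


def bytes_per_second(bitrate_kbps):
    """Bytes used by one second of audio at the given bitrate (1 kbps = 1000 bits/s)."""
    return bitrate_kbps * 1000 / 8


def compression_ratio(bitrate_kbps):
    """How many times smaller the bitstream is than uncompressed CD audio."""
    return CD_BITRATE_BPS / (bitrate_kbps * 1000)


print(bytes_per_second(128))              # 16000.0
print(round(compression_ratio(128), 1))   # 11.0
```

At 128 kbps the ratio comes out at about 11, which matches the size reduction quoted for MPEG Layer III earlier in this section.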
Table 1.1: Bitrate versus sound quality

Bitrate                           Bandwidth   Quality comparable to or better than
16 kbps                           4.5 kHz     shortwave radio
32 kbps                           7.5 kHz     AM radio
96 kbps                           11 kHz      FM radio
128 kbps                          16 kHz      near CD
160-180 kbps (variable bitrate)   20 kHz      perceptual transparency
256 kbps                          22 kHz      studio
