5 - Audio compression (mp3 etc.)


Post by JohnG on Jul 5, 2011 13:55:32 GMT

MP3, MiniDisc, AAC, WMA and all that audio compression stuff.

Having taken a brief look in previous articles at MIDI and wav files, now it's time to take a simple look at some of the mechanisms used to compress audio files that have been digitised and I'll begin with mp3.

As usual I'll start with a simple definition. What is an mp3 file? It's a file containing digitised audio (0s and 1s) that represents something like the original wav file it was derived from. How closely it represents the original depends upon three factors: first, the bit rate we decide to encode at, e.g. 64 kbps (kilobits per second), 128 kbps or higher (the higher the bit rate, the better the quality); second, the quality of the piece of software or hardware that is doing the compression; and third, the complexity of the audio we are trying to encode.

As a general rule, from my experience, the longer the encoder takes to convert from wav to mp3 the better the resulting sound quality. In other words be patient when converting files. It's a very complex process.

What we end up with is a file (filename.mp3) that can be played on a computer (with the appropriate application) or, more likely, will end up on some sort of portable device with a big memory chip, or even a tiny hard disk, inside it and a set of earphones or miniature loudspeakers. It could also be burnt to CD or DVD and played on a player with mp3 capabilities, e.g. some Sony CD Walkmans.

Now for a definition: mp3 stands for MPEG-1 (Moving Picture Experts Group) Audio Layer III. It was originally developed by Fraunhofer IIS. This international standard defines how the sound track of a film is digitally encoded. It has, however, become a commonly used standard for recording compressed audio for home use.

So, we may start with a wav file sampled at 44,100 samples per second, with 16 bits per sample, in stereo, i.e. an audio CD quality track. We then put this file through an encoder (usually software) which splits the incoming audio signal into several frequency bands, analyses each band separately and throws away the information that the human ear is less likely to be able to hear (psychoacoustic encoding). As it processes the audio it puts little markers in the file (headers), each followed by a chunk (frame) of encoded audio. Each header tells the mp3 player how to process the frame of data that follows. The software assembles all these frames with their headers into the right sequence and we end up with an mp3 file.
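To make the idea of a frame header a little more concrete, here is a minimal Python sketch that decodes the four header bytes found at the start of an mp3 frame. It is an illustration only: the function name and the example bytes are my own, and it handles plain MPEG-1 Layer III headers, rejecting everything else.

```python
# Lookup tables for MPEG-1 Audio Layer III only (other versions/layers differ).
# Index 0 of the bit rate table means "free format" and index 15 is invalid.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96,
                 112, 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]


def parse_mp3_header(header: bytes) -> dict:
    """Decode a 4-byte MPEG-1 Layer III frame header (hypothetical helper)."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes")
    word = int.from_bytes(header, "big")

    if (word >> 21) & 0x7FF != 0x7FF:   # 11 sync bits, all set to 1
        raise ValueError("sync word not found")
    if (word >> 19) & 0x3 != 0b11:      # version field: 11 = MPEG-1
        raise ValueError("not an MPEG-1 header")
    if (word >> 17) & 0x3 != 0b01:      # layer field: 01 = Layer III
        raise ValueError("not a Layer III header")

    bitrate_kbps = BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate_hz = SAMPLE_RATES_HZ[(word >> 10) & 0x3]
    if bitrate_kbps is None or sample_rate_hz is None:
        raise ValueError("free-format or reserved value, not handled here")
    padding = (word >> 9) & 0x1
    channel_mode = CHANNEL_MODES[(word >> 6) & 0x3]

    # A Layer III frame holds 1152 samples, and 1152 / 8 = 144, so the
    # frame length in bytes is 144 * bit rate / sample rate (+ padding).
    frame_bytes = 144 * bitrate_kbps * 1000 // sample_rate_hz + padding

    return {
        "bitrate_kbps": bitrate_kbps,
        "sample_rate_hz": sample_rate_hz,
        "channel_mode": channel_mode,
        "frame_bytes": frame_bytes,
    }


# Example header bytes chosen for illustration: 128 kbps, 44.1 kHz, stereo.
info = parse_mp3_header(bytes([0xFF, 0xFB, 0x90, 0x00]))
print(info)
```

A player does essentially this for every frame: find the sync bits, read the bit rate and sample rate, work out how long the frame is, decode it, then move on to the next header.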

Yup, it throws away a large quantity of the original signal. That's how we get from the original 1,411,200 bits/sec (44,100 samples x 16 bits x 2 channels) to only 128,000 bits/sec, for example. It is therefore known as a "lossy" encoding technique. This discarded information can never be recovered. So if, at a later stage, we wanted to create a wav file from the mp3 it won't (repeat, will not) be the same as the original wav file. If we then reprocess that wav file to create another mp3 at a different bit rate, the process of throwing away data will be repeated. So once a file has been compressed using the mp3 process, don't reprocess it.
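For anyone who wants to check the arithmetic, here is a short Python sketch of the sums. The four-minute track length is just an assumed example, and the function names are my own.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def size_in_bytes(bit_rate_bps: int, seconds: int) -> int:
    """Size of the audio data at a constant bit rate (headers ignored)."""
    return bit_rate_bps * seconds // 8


cd_rate = pcm_bit_rate(44_100, 16, 2)   # audio CD quality
mp3_rate = 128_000                      # 128 kbps mp3

track_seconds = 4 * 60                  # assumed example: a four-minute track

print(cd_rate)                                 # 1411200 bits/sec
print(cd_rate / mp3_rate)                      # 11.025, about an 11:1 reduction
print(size_in_bytes(cd_rate, track_seconds))   # 42336000 bytes, roughly 42 MB
print(size_in_bytes(mp3_rate, track_seconds))  # 3840000 bytes, roughly 3.8 MB
```

So at 128 kbps roughly ten out of every eleven bits of the original have gone for good.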

Here are a couple of references for those who want to dig deeper:
en.wikipedia.org/wiki/Audio_data_compression and en.wikipedia.org/wiki/MP3

Okay. That'll do for mp3. What about the others, e.g. MiniDisc? Well, this one uses a similar psychoacoustic encoding system developed by Sony called ATRAC (Adaptive TRansform Acoustic Coding). It dates from about the same time as mp3 (MiniDisc was launched in 1992), and its later versions claim to offer better quality for the same bit rate. The more modern players use a system called ATRAC3. See en.wikipedia.org/wiki/Atrac.

We also have AAC (Advanced Audio Coding), a later (better?) encoding method than mp3, developed by a group of companies that included Fraunhofer IIS. Some mp3 players also support this. See en.wikipedia.org/wiki/Advanced_Audio_Coding. Last but not least there is WMA (Windows Media Audio), a later system developed by Microsoft, also claiming improved performance over mp3. See en.wikipedia.org/wiki/Windows_Media_Audio.

To my ears, at least, these later systems generally do sound better. For me, music mp3s encoded at less than 128 kbps don't sound too good, unless it's just a voice soundtrack. For classical music for my own use I encode at a minimum of 256 kbps! MiniDisc, to me, sounds pretty reasonable for most things but "Yer pays yer money and yer takes yer choice!"

What about lossless audio compression? Yes, it can be done. Audio, unlike regular data, cannot be massively compressed as there are very few repeating patterns of data within it. (Zip compression relies upon repeating patterns as one of several lossless techniques.) I know of three systems: Monkey's Audio, FLAC and Shorten. They are typically only able to shrink a file to somewhere around half its original size. Many audio engineers use one of these techniques for archiving recordings. I use Monkey's Audio. You can find more about lossless compression here: en.wikipedia.org/wiki/Audio_data_compression.
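The point about repeating patterns can be seen with a rough Python experiment. This is a sketch under stated assumptions, not a proper test: zlib (the method used in zip files) stands in for a general-purpose compressor, and a synthetic tone with added noise stands in very crudely for a real recording. Dedicated lossless audio coders such as FLAC use prediction and generally do better on real music than zlib does.

```python
import math
import random
import struct
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


def noisy_tone(seconds: float = 1.0, sample_rate: int = 44_100, seed: int = 0) -> bytes:
    """One channel of 16-bit PCM: a 440 Hz tone plus random noise.

    A synthetic stand-in for a recording, made up for this illustration.
    """
    rng = random.Random(seed)
    n = int(seconds * sample_rate)
    samples = [
        int(10_000 * math.sin(2 * math.pi * 440 * i / sample_rate))
        + rng.randint(-2_000, 2_000)
        for i in range(n)
    ]
    # "<h" = little-endian signed 16-bit, the sample format used in wav files.
    return struct.pack(f"<{n}h", *samples)


# Highly repetitive data of the same length as one second of mono audio.
repetitive = b"the quick brown fox " * 4_410

audio_ratio = compression_ratio(noisy_tone())
text_ratio = compression_ratio(repetitive)

print(f"noisy tone:      {audio_ratio:.1%} of original size")
print(f"repetitive text: {text_ratio:.1%} of original size")
```

The repetitive text shrinks to a tiny fraction of its size, while the noisy tone hardly shrinks at all, because the noise never repeats itself exactly.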

Well, I hope that's helped some of you. Any questions, fire away.

© John L. Garside, 2007.


MIDI tutorials by JohnG aka SysExJohn
