How bad is mp3 really?

I'm convinced that mp3 has gotten a bad reputation for quality because it's often misused by the rip-and-share crowd, and because there is some poor encoding software out there. When done right, it can be tolerable. And there's a reasonable way to evaluate exactly what mp3 is doing to your sound.

Bear in mind, mp3 is a data encoding standard (ISO-MPEG Audio Layer-3, IS 11172-3 and IS 13818-3), not a particular algorithm or piece of software. When it was being developed in the late 1980s and early 1990s, mp3-encoding software was limited by what was then practical in desktop computers or DSP chips, and this affected the sound. As computers and chips got more powerful, software designers could use more processing cycles to get better quality.

There are more powerful standards now, such as AAC, but mp3 survives for more than just casual music listening. In fact, it's often the default format used by broadcast and film professionals (including yours truly) for real-time audio transfers over ISDN lines, using multi-thousand-dollar dedicated hardware codecs.

Look what they've done to my song

Any psychoacoustic data reduction scheme, including mp3 and AAC, has to clobber some audio data; these aren't lossless compression schemes like PKZIP. In the most basic terms, mp3 analyzes the incoming audio by modeling human hearing, and attempts to remove the details that most people's nervous systems would never pass to their brains. It then applies a zip-like lossless algorithm to what remains. There's a more complete explanation at the Fraunhofer Institute, which invented the format. (AAC builds on mp3 by applying an additional set of rules about human hearing.)

The technology is scalable, in that you can tell the encoder how much detail you're willing to throw away. This is what the BITRATE and SAMPLE RATE settings in mp3 software do. As you'd expect, the lower those rates, the smaller the files and the greater the sacrifice to the audio. At extreme settings (8 kbps, 6 kHz sampling) the resulting file is around 1% of its original size or less: a little over 1% of a mono CD-quality file, and under 0.6% of a stereo one. Of course, nobody would pipe important sound through something that extreme. But portable music players often run at 64 or 128 kbps... a significant reduction, when you consider that normal music CDs have a data rate of 1,411 kbps (that's the audio alone; with error correction and other overhead, the gross rate on the disc is around 4 megabits per second).
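If you want to check those ratios yourself, here's a back-of-envelope sketch in Python. The only inputs are the standard CD audio figures (44.1 kHz, 16 bits, two channels); the function names are mine, made up for illustration. Note that the 8 kbps figure lands near 0.6% against a stereo CD stream, and a little over 1% against a mono one.

```python
# Back-of-envelope data rates. The CD figures are the standard
# 44.1 kHz / 16-bit values; function names are illustrative only.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16


def pcm_kbps(channels):
    """Uncompressed PCM data rate in kilobits per second."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * channels / 1000


def percent_of_original(mp3_kbps, channels=2):
    """Size of an mp3 stream as a percentage of the PCM it came from."""
    return 100 * mp3_kbps / pcm_kbps(channels)


if __name__ == "__main__":
    print(f"Stereo CD audio: {pcm_kbps(2):,.1f} kbps")
    for rate in (128, 64, 32, 8):
        pct = percent_of_original(rate)
        print(f"{rate:>3} kbps mp3 is {pct:.2f}% of stereo CD audio")
    mono_pct = percent_of_original(8, channels=1)
    print(f"  8 kbps mp3 is {mono_pct:.2f}% of mono CD-quality audio")
```

So a 128 kbps file is roughly one-eleventh the size of the CD audio it came from, and 64 kbps is roughly one twenty-second.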

How important is the data that mp3 takes away at those settings? Will having it missing hurt your soundtrack experience?

The Tests

To find out, I made a montage of male and female speech and singing, along with music in various genres. It's excerpted from the CD that comes with my book Audio Postproduction. The music is from DeWolfe Music, New York, protected by copyright, and used by permission.

I encoded the montage as an mp3 at various bitrates. I then compared the decoded results to the original mathematically, by subtracting them from the original. This is actually a fairly simple technique of inverting the polarity of one file and combining it with the other.
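For the programmers in the crowd, the same subtraction can be sketched in a few lines of Python. This is a toy illustration on short lists of integer samples, not the workflow I used; the sample values and function names are invented, and a real test would read the samples from audio files.

```python
# A toy null test on plain lists of integer samples.
# The sample values below are made up for illustration.

def invert_polarity(samples):
    """Flip the sign of every sample."""
    return [-s for s in samples]


def mix(a, b):
    """Sum two equal-length, already-aligned tracks sample by sample."""
    if len(a) != len(b):
        raise ValueError("tracks must be the same length and aligned")
    return [x + y for x, y in zip(a, b)]


def null_test(original, decoded):
    """Return original minus decoded: whatever the codec changed."""
    return mix(original, invert_polarity(decoded))


original = [0, 1200, -800, 3100, -2900, 450]
decoded = [0, 1190, -800, 3080, -2900, 460]  # pretend codec output

residue = null_test(original, decoded)
print(residue)                        # [0, 10, 0, 20, 0, -10]
print(null_test(original, original))  # [0, 0, 0, 0, 0, 0]
```

Wherever the two versions are identical, the result is silence; anything you can hear in the residue is something the encoding changed.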

You can understand the technique easily with a graphic example. Let's say my original file is a string of letters...

We run that file through data reduction software - in this case, an imaginary visual equivalent of mp3, set to remove all vowels:

In order to tell exactly what's been taken away, we flip the polarity of the compressed version - making black into white, and white into black:

Now it's just a simple matter of mixing them, so the white letters erase their black equivalents. What's left is what the compression took away:

Piece of cake.
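The same letter game can be played in a few lines of Python. Treat this as a rough stand-in for the graphic idea: the sample sentence and function names are my own inventions, and the "codec" blanks out vowels rather than deleting them so every letter stays in position for the comparison.

```python
# Text version of the vowel-removing "codec" analogy.
# The sentence and the function names are invented for illustration.
VOWELS = set("aeiouAEIOU")


def remove_vowels(text):
    """The imaginary codec: blank out vowels but keep every position."""
    return "".join(" " if ch in VOWELS else ch for ch in text)


def difference(original, reduced):
    """Keep only the characters the reduced version no longer has."""
    return "".join(o if o != r else " " for o, r in zip(original, reduced))


original = "LOOK WHAT THEY DID"
reduced = remove_vowels(original)
removed = difference(original, reduced)

print(original)
print(reduced)
print(removed)
```

The third line printed shows only the vowels, each sitting where it was in the original: exactly what the "compression" took away.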

In audio it's almost as simple, using a multi-track audio program that lets you flip the polarity of one channel. I did it in the cross-platform, open source Audacity.

Hear the results

As a baseline, listen to the original montage. (This is a high-bitrate mp3 version to save download time, but trust me: I did the experiments using the original, uncompressed 16-bit, 44.1 kHz CD file.)

I turned it into a 128 kbps mp3 file, decoded it, and compared the two. Here's what's left, or what the mp3 process took away from the original. You'll have to listen carefully: it's not much. (This and the following files use Apple Lossless encoding - which doesn't affect audio quality at all. It requires QuickTime 6.0 or higher, free for Mac or Windows from Apple.)

Even at 64 kbps, you don't lose much. Here's the result.

You can hear a difference when encoded at 32 kbps (like this), but notice that it's strictly high frequencies. That's because encoders typically lower the sample rate as well at 32 kbps, which throws away the top of the spectrum. But nobody recommends this bitrate for music.

Do it yourself

Feel free to try this with your own source material. If you want to replicate my tests exactly, use an encoder that uses the open source LAME library. (I did mine in Bias Peak.)

Also, it's absolutely critical that you align the original and decoded versions perfectly, to the sample. Audacity makes this easy by zooming and slipping one track against the other:
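If you'd rather find the offset numerically than by eye, here's a minimal sketch of one way to do it, assuming the decoded file is simply the original delayed by some whole number of samples (mp3 encoders and decoders commonly add a short delay). It tries every delay up to a limit and keeps the one that leaves the least residue. The function name, the sample values and the three-sample delay are all invented; this brute-force search is my illustration, not something Audacity does for you.

```python
# Brute-force search for the delay that best lines up two tracks.
# Assumes `decoded` is `original` shifted later by a whole number of samples.

def best_lag(original, decoded, max_lag):
    """Return the delay (in samples) of `decoded` relative to `original`."""
    best, best_err = 0, float("inf")
    for lag in range(max_lag + 1):
        n = min(len(original), len(decoded) - lag)
        if n <= 0:
            break
        # Mean squared difference over the overlapping region.
        err = sum((original[i] - decoded[i + lag]) ** 2 for i in range(n)) / n
        if err < best_err:
            best, best_err = lag, err
    return best


original = [0, 5, -3, 8, -7, 2, 9, -4, 6, -1, 3, -8]
decoded = [0, 0, 0] + original  # pretend the codec added 3 samples of delay

lag = best_lag(original, decoded, max_lag=8)
aligned = decoded[lag:lag + len(original)]
residue = [a - b for a, b in zip(original, aligned)]

print(lag)      # 3
print(residue)  # all zeros, because this toy "codec" changed nothing else
```

With a real mp3 the residue won't drop to zero at the correct offset, but it should be clearly smaller there than at any neighboring offset. On long files you'd want to run the search on a short excerpt, since this loop gets slow quickly.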

Once you've aligned the decoded mp3 track against the original, mix the two together and listen. Try different kinds of source material... you'll probably find that certain kinds of music, like overly processed and very loud dance pieces with vocals, lose more through mp3 than quiet acoustic pieces. If you think you've guessed the reason for this, drop me a line.

-- (c) 2005 Jay Rose. Posted 8/3/05
