File Compression 101

The following information was contributed by Christian T. Kelley-Madera — make sure to go check out his fantastic ensemble fantasy/comedy podcast, the Once and Future Nerd!

One note before we get started: file compression is not to be confused with audio compression (also called dynamic range compression), which is a tool for automatically making the quiet sections louder and the loud sections quieter.

What’s file compression?

File compression makes a file smaller, and therefore more portable. Some kinds do this by discarding “inessential” data; others just pack the same data more efficiently.

Imagine a big sheet of paper with a beautiful, high quality drawing on it. As it is, you’re seeing the drawing in all its glory, but it’s big and bulky and hard to get around. You could probably roll it up into one of those cardboard tubes to make it a little easier to get around without damaging the drawing, but you’ll have to be careful how you roll and unroll it, and it will still be kinda big. You could also fold it up a bunch of times or crumple it into a ball, which will make it much easier to carry around but you will also definitely damage the drawing, and not be able to get that stuff back when you un-crumple it.

“Lossless” compression is like the cardboard tube. Files will be smaller than when you recorded them, but still pretty big. You won’t lose any data. Common lossless codecs are FLAC (Free Lossless Audio Codec) and ALAC (Apple Lossless Audio Codec). Not all devices can easily play these file types.

“Lossy” compression is like folding it up. It’s easy to get around, but you’ll undeniably lose some quality. MP3 is the most common lossy compression codec. It makes very small files and basically any device that can connect to the internet these days can play them back. And the good news is – the engineers who designed MP3 were very clever about which bits of data they discarded. Most untrained ears can’t hear what is lost. But if you were to go compress the compressed version again, things could start to get gnarly.
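If it helps to see the difference in action, here is a minimal Python sketch. Everything in it is made up for illustration: the "recording" is a synthetic tone, zlib stands in for a lossless codec, and `toy_lossy` simply rounds samples to a coarser grid. That is not how MP3 decides what to throw away, but it shows the same two behaviours: lossless gives you the exact data back, and lossy does not.

```python
import math
import zlib

# A made-up "recording": one second of a 440 Hz tone, 16-bit samples.
SAMPLE_RATE = 8000  # deliberately low to keep the toy example small
samples = [round(20000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
           for n in range(SAMPLE_RATE)]


def to_bytes(values):
    """Pack signed 16-bit samples into little-endian bytes."""
    return b"".join(v.to_bytes(2, "little", signed=True) for v in values)


def toy_lossy(values, step):
    """Toy lossy codec: snap every sample to the nearest multiple of `step`."""
    return [round(v / step) * step for v in values]


def max_error(values):
    """Largest difference between the original samples and `values`."""
    return max(abs(a - b) for a, b in zip(samples, values))


original = to_bytes(samples)

# Lossless (the cardboard tube): smaller, and unpacking restores every byte.
packed = zlib.compress(original)
lossless_ok = zlib.decompress(packed) == original

# Lossy (the folded photocopy): the discarded detail is gone for good.
first_pass = toy_lossy(samples, 256)
lossy_ok = first_pass == samples

# Compressing the compressed version again, with different settings.
second_pass = toy_lossy(first_pass, 384)

print("lossless round trip identical:", lossless_ok)
print("lossy round trip identical:", lossy_ok)
print("sizes in bytes:", len(original), "->", len(packed))
print("max error after one pass:", max_error(first_pass))
print("max error after two passes:", max_error(second_pass))
```

The second lossy pass starts from already-damaged samples, so it cannot restore anything; in this toy case the worst-case error grows.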

If we go back to our drawing analogy, say you wanted your friend to enjoy the drawing. (Also, for the sake of this metaphor, let’s just assume scanners and camera phones don’t exist.) You might keep the original drawing yourself, stored somewhere safe. Then, you might make a photocopy for your friend, and fold up that photocopy to mail to them. Then if another friend asked to see it, you’d go back to the original and copy that again.

Likewise, with audio, it’s smart to keep a lossless or ideally uncompressed “master” file for anything you release. Then for every different version you have to make, you can make compressed copies of that master. (Common uncompressed audio file types are WAV and AIFF.) For example, you might keep a WAV of your final stereo mix somewhere safe. Then you might compress the WAV to MP3 for your regular podcast feed, and you might also compress the WAV to ALAC as a high-quality version for your Patreon subscribers or similar.

OK. That’s all well and good. But…what buttons should I click? I’ll spare you the nitty gritty of what all the following numbers mean – that’s probably a subject for another post. But if you just need to know what settings to choose for now, I recommend:

Record at 44.1 kHz/24-bit .WAV or .AIF. (If your microphone or audio interface can’t do 24-bit, 16-bit will be good enough for now.) There’s some argument to be made for recording at 48 kHz/24-bit, but there’s not really a good reason to go much higher than that: you’ll just have huge files with little benefit to sound quality.*
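To put rough numbers on “huge files”, here is a small Python sketch that estimates how much disk space an hour of uncompressed stereo audio takes. The function name is my own, it ignores the tiny file header, and 96 kHz is only there as an example of a higher sample rate.

```python
def wav_megabytes_per_hour(sample_rate_hz, bit_depth, channels=2):
    """Approximate size of one hour of uncompressed audio, in megabytes."""
    bytes_per_second = sample_rate_hz * bit_depth // 8 * channels
    return bytes_per_second * 3600 / 1_000_000


# (sample rate in Hz, bit depth) pairs to compare
settings = [(44_100, 16), (44_100, 24), (48_000, 24), (96_000, 24)]

for rate, depth in settings:
    size = wav_megabytes_per_hour(rate, depth)
    print(f"{rate / 1000:g} kHz / {depth}-bit stereo: about {size:.0f} MB per hour")
```

An hour at 44.1 kHz/24-bit comes to a bit under 1 GB, and 96 kHz/24-bit to a bit over 2 GB, so doubling the sample rate roughly doubles the storage.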

Then for your podcast feed: compress to a 44.1 kHz / 160 kbps stereo MP3. (If you’re just doing talk radio, you can save additional space and download time by doing a 128 kbps mono file.)

*One caveat here – if you know you’re going to be doing heavy FX work on a particular piece of audio, especially messing with speed and/or pitch – then it might be worth it to record at 96 kHz if your hardware allows it. But if not, don’t stress.
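If you would rather script the podcast-feed export than click through menus, one option is the command-line tool ffmpeg. The sketch below is not part of the original advice. It only builds the command for the feed settings recommended above, and it assumes ffmpeg is installed with the libmp3lame encoder. The function name and file names are made up for illustration.

```python
import shlex


def feed_mp3_command(master_path, output_path, music=True):
    """Build an ffmpeg argument list for the podcast-feed MP3.

    music=True  -> 160 kbps stereo
    music=False -> 128 kbps mono (talk only)
    """
    bitrate, channels = ("160k", "2") if music else ("128k", "1")
    return [
        "ffmpeg",
        "-i", master_path,     # the uncompressed master, e.g. a WAV
        "-ar", "44100",        # output sample rate: 44.1 kHz
        "-ac", channels,       # number of audio channels
        "-c:a", "libmp3lame",  # MP3 encoder (assumed to be available)
        "-b:a", bitrate,       # target audio bitrate
        output_path,
    ]


cmd = feed_mp3_command("episode_master.wav", "episode.mp3")
print(shlex.join(cmd))

# To actually run the conversion:
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list and printing it first lets you check it before anything touches your files. The master file is only read, never overwritten.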

But wait, I’m confused. 44.1/160 looks higher than 44.1/24. You made it sound like the second number should be smaller.

Okay, I didn’t want to do any math here, but you’ve forced my hand. The two second numbers measure different things. 160 kbps means 160 kilobits per second. 24 bits actually means 24 bits per sample, and the samples happen 44,100 times per second (that’s the 44.1 kHz). So 44.1 kHz/24-bit, without compression, maths out to about 1,058 kbps per channel, or roughly 2,117 kbps for a stereo file. That’s around thirteen times as much data as a 160 kbps stereo MP3. (This has been the entirety of what I remember from AP Physics.)
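For anyone who wants to check that arithmetic, here it is as a few lines of Python. The function name is my own, and the one thing to watch is the channel count: the bits-per-sample figure applies to each channel separately, so a stereo file carries twice the data of a mono one.

```python
def uncompressed_kbps(sample_rate_hz, bit_depth, channels=1):
    """Bitrate of uncompressed audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


MP3_KBPS = 160  # the podcast-feed bitrate recommended above

mono = uncompressed_kbps(44_100, 24)
stereo = uncompressed_kbps(44_100, 24, channels=2)

print(f"one channel: {mono:.1f} kbps")
print(f"stereo:      {stereo:.1f} kbps")
print(f"stereo master vs. {MP3_KBPS} kbps MP3: {stereo / MP3_KBPS:.1f}x the data")
```

One channel works out to 1,058.4 kbps and stereo to 2,116.8 kbps, which is a little over thirteen times the 160 kbps MP3.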
