Must-know audio file formats in 2023
Explore the major audio file formats you need to be aware of in audio production in this guide that covers their advantages and limitations, and highlights when to use which.
The discipline of music production is awash with confusing file formats. Understanding them is both bewildering and crucial, as different formats have unique characteristics that can significantly impact the quality, compatibility, and efficiency of your audio projects.
Fear not: in this comprehensive guide, we will explore the major audio file formats you need to be aware of, covering their advantages and limitations, and highlighting when to use which.
What are audio file formats?
At their core, audio file formats are digital containers that store and represent audio data. They encapsulate and describe the sonic characteristics of your individual audio tracks in a digital form, allowing them to be shared, played back, or manipulated in various digital environments.
Different audio file formats employ different encoding techniques to organize audio data, ultimately shaping the quality, file size, and compatibility of the audio file.
Why are there so many different audio file formats?
There are a few reasons: namely audio fidelity, storage efficiency, and proprietary protections.
Regarding the first of these three: different formats offer varying levels of audio fidelity. Uncompressed formats like WAV and AIFF prioritize the preservation of audio quality without loss of data. These formats are commonly used during professional audio recording, as well as editing, mixing, and mastering. However, fidelity takes up lots of space, which is what brings our second consideration into the mix: efficiency of storage.
See, digital space comes at a premium (usually around $20 a month if you’re into cloud storage on a professional level). Throw streaming over a mobile connection into the mix and you’re talking about a steep cost to the listener’s phone bill. This is why we have compressed formats like MP3 and AAC. They don’t exude fidelity, but they do take up less space. We’ll get into how in a minute.
The third major reason is good old-fashioned protection. Remember those pesky M4P files back in the days of the original iTunes? Remember how you couldn’t play them on more than five devices unless you burned them to a CD and re-imported them as MP3s?
This was nominally done as copyright protection, but obviously it also kept you chained to iTunes. These days you also see proprietary or semi-proprietary audio file formats. WMA files come to mind, as you have to go through extra steps to convert them for use in many audio environments.
Encoding methods and compression techniques
Common audio file formats utilize different methods to achieve a balance between audio quality and file size. Common uncompressed formats, such as WAV (Waveform Audio File Format) and AIFF (Audio Interchange File Format), use a technique called pulse-code modulation to preserve audio data in its original, uncompressed state. We’ll cover how PCM works a little later.
These uncompressed formats offer high fidelity. They are lossless—no data is lost in the making of an uncompressed WAV file.
On the other hand, lossy compression formats, such as MP3 (MPEG-1 Audio Layer-3) and AAC (Advanced Audio Coding), employ algorithms that discard audio data to achieve a significant reduction in file size. They often use psychoacoustic models to determine which parts of the audio can safely be discarded while still representing the original sound.
These files are lossy—data is discarded or lost in the encoding of an MP3.
Think of lossy files like the skim milk of file formats. Sure, skim milk might taste like the real thing on the initial swig, but the aftertaste will be weaker and less satisfying.
By way of an example, here’s a mix bounced down to a WAV file and viewed as a spectrogram in RX:
Here’s the same mix, but now I’m showing you an MP3 file:
To the untrained eye, this might look similar, but zoom in and you can see some pretty strange changes. Here’s the WAV up close:
And here’s an MP3 zoomed in:
See those jagged black spots I’ve circled? That’s some of the missing information the MP3 algorithm has decided to discard. Where did it go? It’s gone, gone, altogether gone! The MP3 algorithm decided you didn’t need it, and threw it away.
There is no way at this point in time to perfectly reconstitute that missing data.
Indeed, a lot of audio information has been thrown away on the journey from an uncompressed WAV file to a compressed one. I can actually show you the audible difference between the same mix’s WAV bounce and MP3 bounce:
That’s quite an audible difference, enough to identify the tune. It almost feels like a strange sound effect—like something you would do to a sound if you wanted to mangle it.
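A common way to isolate this kind of difference is a null test: line up the two versions and subtract one from the other, sample by sample. Whatever remains is what the encoder changed. The snippet below is a minimal sketch of that idea, not necessarily how the clip above was produced. The function name `null_test` and the sample values are made up for illustration, and real files would first need to be decoded and time-aligned.

```python
def null_test(original, decoded):
    """Subtract two equal-length sample lists; the residual is what changed."""
    if len(original) != len(decoded):
        raise ValueError("signals must be the same length and time-aligned")
    return [a - b for a, b in zip(original, decoded)]

original = [0.10, 0.50, -0.30, 0.80]
decoded = [0.10, 0.48, -0.30, 0.75]   # hypothetical values after lossy encoding
residual = null_test(original, decoded)
print(residual)
```

If the two versions were identical, every value in the residual would be zero and you would hear silence.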
One more bit of pertinent information for you: iZotope Ozone has a feature that allows you to preview what you’ll lose in the transfer from WAV file to MP3 or AAC. Press the “codec” button by the I/O meters, and you’ll get to this window:
When you solo the artifacts, you’ll be able to hear the projected difference between the uncompressed file output and MP3-style encoding in real-time. You’ll also be able to hear how your mix might sound as a low-res file. Just remember to turn this off when you’re exporting your mix!
Now let’s dive into various file formats, beginning with the uncompressed ones.
Uncompressed file formats
As we said before, uncompressed audio formats aim to preserve the fidelity of the audio, doing so without regard to the file size. Say your recording interface is set up to capture 48 kHz/24-bit audio. The resulting, uncompressed file will store all that data at 48 kHz and 24 bits. It won’t lose any of the information. Let’s take a look at some of these common audio file formats.
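First, a quick sense of scale. The arithmetic behind an uncompressed file is simple: sample rate times bytes per sample times channels times duration. Here is a rough back-of-the-envelope sketch in Python. The function name `pcm_size_bytes` is made up for illustration, and the figure ignores the small header a real WAV or AIFF file adds.

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Approximate size of raw, uncompressed PCM audio in bytes."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds

# One minute of stereo audio at 48 kHz / 24-bit.
one_minute = pcm_size_bytes(48_000, 24, channels=2, seconds=60)
print(f"{one_minute / 1_000_000:.2f} MB per stereo minute")
```

That works out to a little over 17 MB for every stereo minute, which is why long sessions eat drives so quickly.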
WAV (Waveform Audio File Format)
The WAV (Waveform Audio File Format) format is widely regarded as the standard for pro-audio quality and compatibility. WAV files are supported by most (if not all) digital audio workstations (DAWs) and media players.
The WAV file was introduced by Microsoft and IBM in 1991 as part of the Resource Interchange File Format (RIFF). Among common users, WAV files gained popularity due to their simplicity and compatibility across a wide array of platforms. Although WAVs were first designed by the company behind Windows, they were always compatible with both Windows and Mac operating systems.
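You can see the RIFF wrapper for yourself by peeking at the first few bytes of any WAV file. The sketch below uses Python’s standard `wave` module to write a second of silence into memory (so no file on disk is assumed) and then unpacks the 12-byte header: the `RIFF` tag, the size of what follows, and the `WAVE` form type.

```python
import io
import struct
import wave

# Write one second of 16-bit mono silence into an in-memory WAV file.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)        # 2 bytes = 16 bits per sample
    w.setframerate(44_100)
    w.writeframes(b"\x00\x00" * 44_100)

data = buf.getvalue()

# The first 12 bytes are the RIFF header: chunk ID, size, and form type.
chunk_id, chunk_size, form_type = struct.unpack("<4sI4s", data[:12])
print(chunk_id, chunk_size, form_type)
```

The size field counts everything after itself, so it should come out eight bytes smaller than the whole file.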
BWAV Files
BWAV (Broadcast Wave) files are an extension of the WAV file format, but they’re specifically designed for broadcasting applications where metadata and labeling are key for organizational purposes. BWAV files support tagging of project details such as timecode, track names, and more.
Even today BWAV files are used in post-production, which is why they are worth mentioning as distinct entities from common WAV files. If you work in film, television, or newer media such as podcasting, you’ll often work with BWAV files.
AIFF (Audio Interchange File Format)
Developed by Apple, the AIFF format shares many similarities with the WAV format. It is an uncompressed format that also offers high audio quality and compatibility, particularly within the Mac ecosystem.
There are some behind-the-scenes differences between WAV and AIFF files, particularly in how the files hold onto their information, but essentially they serve the same purpose.
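One of those behind-the-scenes differences is byte order: WAV stores its PCM samples little-endian, while standard AIFF stores them big-endian. A tiny Python sketch shows the same 16-bit sample value packed both ways. The value 1000 is arbitrary and chosen only for illustration.

```python
import struct

sample = 1000  # one 16-bit PCM sample value

wav_style = struct.pack("<h", sample)   # little-endian, as in WAV
aiff_style = struct.pack(">h", sample)  # big-endian, as in standard AIFF

print(wav_style.hex(), aiff_style.hex())
```

Same number, same two bytes, just stored in the opposite order. That is why the two formats sound identical even though you can’t simply rename one to the other.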
WAV files tend to be supported everywhere, but that isn’t the case with AIFF files. macOS and iOS platforms play nicely with AIFF, but you might find it hard to upload an AIFF file to various sites on the internet, as they’ll only take a WAV.
PCM (Pulse-code modulation)
I told you I’d cover the basic science behind common uncompressed formats, so here goes:
In file formats such as WAV and AIFF, an analog audio signal is sampled at regular intervals, and each sample is quantized to a specific numerical value, like a point plotted on a graph. This is a simplified explanation of the PCM process.
The waveform’s amplitude is measured at specific intervals of time (44,100 times a second for CD-quality files, 48,000 times a second for movies, higher for high-res files) and given a specific value corresponding to its dynamic range (usually at a 16- or 24-bit resolution).
These values are stored as binary data in the file, so the computer can easily call up and play the data without “creatively reinterpreting” the sound.
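The two steps described above, sampling and quantizing, can be sketched in a few lines of Python. This is a toy illustration rather than how any converter is actually built: it stands in a pure sine wave for the analog signal, and the function names `quantize` and `sample_sine` are made up for the example.

```python
import math

def quantize(value, bit_depth):
    """Map a float in [-1.0, 1.0] to a signed integer of the given bit depth."""
    max_int = 2 ** (bit_depth - 1) - 1   # 32767 for 16-bit
    return round(value * max_int)

def sample_sine(freq_hz, sample_rate_hz, num_samples, bit_depth=16):
    """Measure a sine wave at regular intervals and quantize each reading."""
    return [
        quantize(math.sin(2 * math.pi * freq_hz * n / sample_rate_hz), bit_depth)
        for n in range(num_samples)
    ]

# The first eight samples of a 1 kHz tone at the CD sample rate.
samples = sample_sine(1_000, 44_100, num_samples=8)
print(samples)
```

Each number in the list is one measurement of the waveform, rounded to the nearest step the bit depth allows. More bits means finer steps.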
WAV and AIFF files use this technique for capturing audio information in the digital realm. However, other methods are sometimes employed, which brings us to our next section:
DSD (Direct Stream Digital)
DSD is a specialized audio format intended for audiophiles and those who are easily charmed by shiny objects. Unlike traditional PCM-based formats that use multiple bits to describe the dynamic range of audio samples, DSD employs a very different approach.
DSD encodes the audio as a continuous stream of very high-frequency pulses. An extraordinarily high sampling rate is used: 2.8224 MHz (64 times the sampling rate of CDs) and upwards.
Now, here’s where it gets really different: instead of the customary 16 or 24 bits of the PCM format, DSD uses a 1-bit system. The amplitude isn’t stored as a 16- or 24-bit number for each sample. Instead, it is charted in a relational manner: each bit says only whether the signal should move up or down from where it was, so the density of 1s over time traces the waveform. The choice is binary (1 for up, 0 for down).
This is a very simplified explanation, but for our purposes, it does the trick.
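For the curious, 1-bit streams like this are generally produced by a technique called delta-sigma modulation, in which the density of 1s over time follows the level of the signal. The sketch below is a bare-bones first-order modulator in Python, offered as a toy illustration of the principle and not as the modulator any real DSD converter uses. The function name `one_bit_stream` is made up for the example.

```python
CD_RATE_HZ = 44_100
DSD64_RATE_HZ = 64 * CD_RATE_HZ   # 2,822,400 Hz, i.e. 2.8224 MHz

def one_bit_stream(samples):
    """First-order delta-sigma modulator: floats in [-1, 1] -> list of 0/1 bits."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate the running error
        bit = 1 if integrator >= 0 else 0   # one bit per sample
        feedback = 1.0 if bit else -1.0     # feed the decision back in
        bits.append(bit)
    return bits

# A steady input halfway between zero and full scale.
bits = one_bit_stream([0.5] * 1000)
density = sum(bits) / len(bits)
print(density)
```

With a steady input of 0.5, roughly three out of every four bits come out as 1. Raise the input and the 1s get denser; lower it and they thin out.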
DSD files are highly specialized. While DSD formats undoubtedly carry more information than PCM formats, it is still up for debate whether DSD files are any more “realistic” when compared to high-resolution PCM formats in double-blind studies.
Some audiophiles will tell you they hear a difference, but every ear is prone to confirmation bias, as evinced by the audiophile MoFi scandal of 2022.
Lossy compressed audio formats
Now we’ll cover audio formats that are both compressed and lossy. They may skimp on audio quality, but they’re great for mass-market streaming, as they don’t take up too much room on a hard drive, or take too many bites/bytes (both words apply) out of your data plan.
MP3 (MPEG-1 Audio Layer-3)
In the late 1990s, the MP3 nearly brought the recording industry to its knees. The format was designed by European engineers with the goal of reducing file size while staying true to the original source recording.
The team behind MP3s used psychoacoustic models to filter out parts of the audio they deemed unnecessary to communicate the information in a given song. With sizes much smaller than the original WAV files, MP3s were the perfect vehicles for peer-to-peer sharing services such as Napster and LimeWire, which took off like wildfire as the 20th century came to a close.
These days, individual piracy has largely been replaced by corporate piracy (streaming services that don’t pay out their artists honestly), but the MP3 still remains the most popular lossy format.
AAC (Advanced Audio Coding)
AAC is similar to MP3 in principle, but the format was designed to have more sonic fidelity to the original source. It supports higher sample rates and more channels. The history behind the AAC is long and boring; suffice to say that these days the format is less widely used than MP3s, but sounds marginally better. Apple platforms often default to AAC for lossy deliverables.
In my experience, AACs are sent internally within production teams to evaluate a project when WAV files would take up too much space—but MP3s still tend to be the final lossy deliverable to the market.
Ogg Vorbis
Ogg Vorbis is another form of lossy, data-compressed encoding. But with a name like Ogg Vorbis, you know it has to be open source. And because Ogg Vorbis files are open source, this particular lossy format has an interesting and somewhat anarchic history, allowing for more possibilities in the metadata department.
Ogg Vorbis doesn’t get a ton of play compared to AAC and MP3 formats, with one notable exception: Spotify uses Ogg Vorbis for its higher-quality streaming.
WMA (Windows Media Audio)
The WMA file is a creation born from the minds of Microsoft. Like other lossy codecs, the WMA discards data deemed unimportant to create a smaller file. WMA files are optimized for Windows-based systems, working with players like the Windows Media Player. You don’t often see them in the production environment, but they do come up every once in a while. When you get one, you often have to convert it to some other kind of file, because Apple-based DAWs don’t handle WMA files.
Lossless compressed audio formats
Over the years, people have developed codecs that can compress file size without sacrificing audio fidelity. They sound exactly like WAV files, but take up less space. We’ll cover a couple of them now.
FLAC (Free Lossless Audio Codec)
FLAC arrived around the year 2000 and later came under the umbrella of the Xiph.Org Foundation, the same organization behind Ogg Vorbis. The FLAC file uses lossless encoding of linear pulse-code modulation data (the PCM we talked about earlier), achieving a result that is identical to a WAV file at a smaller file size. It’s the middle ground, with a file size bigger than an MP3 but not as big as a WAV.
Many hi-res streaming outlets deliver audio in the FLAC format. In fact, Qobuz streams their high-res files in FLAC.
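The defining property of a lossless codec is that you get back exactly the bytes you put in. Python has no FLAC encoder in its standard library, so the sketch below swaps in `zlib`, a general-purpose lossless compressor, purely to demonstrate that round trip. It is not how FLAC works internally, and the test tone is an assumption chosen for the example.

```python
import math
import struct
import zlib

# One second of a 441 Hz sine as 16-bit little-endian PCM (mono, 44.1 kHz).
pcm = b"".join(
    struct.pack("<h", round(32767 * math.sin(2 * math.pi * 441 * n / 44_100)))
    for n in range(44_100)
)

packed = zlib.compress(pcm)         # smaller on disk...
restored = zlib.decompress(packed)  # ...and bit-for-bit identical when unpacked

print(len(pcm), len(packed), restored == pcm)
```

A steady test tone squeezes down far more than real music would, but the principle is the same: smaller file, nothing thrown away.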
ALAC (Apple Lossless Audio Codec)
ALAC is Apple’s take on a lossless, compressed file. It’s much like FLAC, and although it started out as a proprietary format, Apple released it as open source in 2011. It works very well within Apple’s ecosystem. It’s the only game in town if you want to listen to compressed lossless audio on an iOS device’s built-in audio options.
WMA Lossless
You can think of WMA Lossless as the Windows equivalent of ALAC: it’s the lossless compressed format of choice for Windows Media Player.
Choosing the right bitrate
When encoding a WAV file into a lossy audio format such as MP3 or AAC, selecting the appropriate bitrate is essential. Higher bitrates result in better audio quality, but also larger file sizes. Finding the right balance between file size and audio quality is crucial for efficient music distribution and storage. Oftentimes distribution companies will tell you what specs they require out of their MP3 deliverables.
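The trade-off is easy to put numbers on, because bitrate is simply bits per second. Here is a rough sketch in Python that assumes a constant bitrate and ignores tags and headers. The function name `lossy_size_bytes` and the three-minute track length are made up for illustration.

```python
def lossy_size_bytes(bitrate_kbps, seconds):
    """Approximate size of a constant-bitrate file, ignoring tags and headers."""
    return bitrate_kbps * 1000 * seconds // 8

for kbps in (128, 192, 320):
    size_mb = lossy_size_bytes(kbps, seconds=180) / 1_000_000
    print(f"{kbps} kbps -> {size_mb:.2f} MB for a 3-minute track")
```

For comparison, the same three minutes as a CD-quality stereo WAV comes to roughly 32 MB, so even the highest common MP3 bitrate is less than a quarter of the size.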
You’ll find that your DAW of choice will give you plenty of bitrate options when it comes to exporting an MP3:
It’s worth noting that MP3s and other lossy formats can add clipping distortion during the rendering process, especially if the file is already pushing anything close to 0 dBFS.
For this reason, iZotope RX has a feature that helps MP3 files sound better upon export, pictured here:
If you check that “prevent clipping” box and choose “normalize,” RX will intelligently ride the level of the file upon encoding so the resulting file won’t incur inter-sample peak distortion on playback. The process takes longer, but is sonically worth it, especially if you’re delivering loud MP3s to a delivery service.
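If you want a quick sanity check on how close to 0 dBFS a file sits before encoding, you can measure its peak yourself. The sketch below is a simplified stand-in and not what RX does: it looks only at sample peaks, so it will miss inter-sample peaks, and the -1 dBFS target is an assumption chosen for the example. The function names are made up for illustration.

```python
import math

def peak_dbfs(samples):
    """Sample-peak level in dBFS for floats where 1.0 is full scale."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

def gain_to_target_db(samples, target_dbfs=-1.0):
    """Gain in dB that would move the sample peak to the target level."""
    return target_dbfs - peak_dbfs(samples)

mix = [0.2, -0.98, 0.5]   # hypothetical sample values from a loud mix
print(round(peak_dbfs(mix), 2), round(gain_to_target_db(mix), 2))
```

A negative gain means the mix needs to come down before encoding to leave the chosen headroom.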
Embrace the power of audio formats
In the vast world of audio production, understanding audio file formats empowers you to make informed decisions about preserving audio quality, ensuring compatibility, and optimizing storage efficiency. By familiarizing yourself with the major formats, you can choose the most suitable format for each stage of your music production process.
Remember to consider factors like audio fidelity, file size, platform compatibility, and specific workflow requirements. And always check the format of your export when you’re bouncing down a track. You don’t want to wind up with the wrong one!
Hopefully, this article has given you sufficient information to make your choices with confidence. With knowledge of file formats in tow, you can navigate the ever-evolving landscape of audio production and unlock the full potential of your musical creations.