What's the difference between audio file formats?
In this article, you’ll learn the difference between MP3, WAV, M4A, and AIFF audio formats as well as when to use each format and the pros and cons of each.
As technology advanced in the realm of recorded sound, so did the way that we listen to music. Within a few decades of the invention of sound recording, the record was born. The modern iteration of the vinyl record was fully established by the middle of the 20th century, and was the standard for many years. Analog formats ruled as the 1970s and 1980s brought us cassettes and 8-tracks…and then CDs arrived, and the digital revolution in audio took off in earnest.
Here we are in the 2020s, and we’ve grown used to the variety of audio file formats we come across while sharing, streaming, or working with music. If you’re an engineer, which format is best for your project? When you’re finished, on what service will it be heard? What format does that service use? What type of files should you give to your clients? The options may seem confusing, but we’ll break it down here.
General audio file types
When differentiating between audio files, the overarching categories are uncompressed audio, lossless audio, and lossy audio formats. Compression in this case is not the dynamic range compression we use when processing audio, but rather data compression.
Uncompressed audio refers to a file that has not been data compressed.
Lossless audio is a data compressed format that preserves all the original data. You can think of it similarly to a ZIP file. The file is encoded (compressed) and then when decoded all the information is retained.
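To make the ZIP comparison concrete, here is a minimal Python sketch. It uses the general-purpose zlib compressor as a stand-in for a lossless audio codec (an assumption for illustration only; FLAC and ALAC use their own audio-specific methods), and the test tone is made up for the example.

```python
import math
import struct
import zlib

def make_pcm_tone(freq_hz=441.0, sample_rate=44100, seconds=0.5):
    """Return 16-bit mono PCM bytes for a sine tone (illustrative test data)."""
    count = int(sample_rate * seconds)
    samples = [
        int(round(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate)))
        for n in range(count)
    ]
    return struct.pack("<%dh" % count, *samples)

original = make_pcm_tone()
compressed = zlib.compress(original, 9)   # encode: smaller, but nothing is discarded
restored = zlib.decompress(compressed)    # decode: every byte comes back

print(len(original), "bytes before,", len(compressed), "bytes after")
print("bit-for-bit identical:", restored == original)
```

The key point is the last line: after decoding, the data matches the original exactly, which is what "lossless" means.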
Lossy audio uses data compression to make the audio file smaller by discarding material judged to be inaudible. This allows for easier transmission of the audio data (say, over email, or on a website). This compression is destructive; that is, once the audio has been compressed, the discarded information cannot be recovered.
Uncompressed audio file formats
WAV
AIFF
Lossless audio file formats
ALAC
FLAC
Lossy audio file formats
MP3
M4A
Defining audio file formats
Uncompressed audio
We understand tape, records, etc. as analog formats. Analog audio refers to a continuous-time signal where electrical voltages are analogous to sound pressure levels. Digital audio, by contrast, is a discrete-time signal captured as numerical samples by an analog-to-digital converter using pulse code modulation, or PCM. PCM samples the audio at uniform intervals, and the process of quantization rounds each sample to the nearest available digital value. Linear PCM (or LPCM) is the variety of PCM in which those quantization levels are evenly spaced, so each value is directly proportional to amplitude. In most cases, PCM is used as the catchall term for both.
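As a rough illustration of sampling and quantization, the sketch below measures a sine wave at uniform intervals and rounds each measurement to the nearest 16-bit level. The function name and the 1 kHz / 48 kHz values are assumptions chosen for the example, not part of any standard.

```python
import math

def sample_and_quantize(freq_hz, sample_rate, bit_depth, num_samples):
    """Sample a sine wave at uniform intervals and round each sample
    to the nearest evenly spaced integer level (linear PCM)."""
    max_level = 2 ** (bit_depth - 1) - 1  # 32767 for 16-bit audio
    samples = []
    for n in range(num_samples):
        t = n / sample_rate                              # time of this sample
        amplitude = math.sin(2 * math.pi * freq_hz * t)  # continuous value in [-1, 1]
        samples.append(int(round(amplitude * max_level)))
    return samples

# 13 samples of a 1 kHz tone at 48 kHz: the first quarter of one cycle.
pcm = sample_and_quantize(1000.0, 48000, 16, 13)
print(pcm)
```

A real converter does this in hardware, but the idea is the same: a smooth signal goes in, a list of whole numbers comes out.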
A bit of backstory on “sampling,” for context. The Nyquist-Shannon sampling theorem states (broadly speaking) that to sample a signal accurately, the sampling rate must be greater than two times the highest frequency in the signal. That gives the converter at least two samples per cycle, enough to capture both the peak and the trough of the wave. So a sample rate of 44.1 kHz can faithfully capture frequencies up to 22.05 kHz, which is above the highest frequency nearly all humans can hear.
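The sampling limit can be checked with a little arithmetic. This sketch computes the highest frequency a given sample rate can represent, plus the frequency at which a tone that is too high would show up instead (an effect known as aliasing). The function names are hypothetical helpers written for this example.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2.0

def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after sampling.
    Tones below the Nyquist frequency come back unchanged;
    tones above it fold back into the representable range."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(nyquist_frequency(44100))       # 22050.0
print(alias_frequency(20000, 44100))  # 20000 (captured correctly)
print(alias_frequency(30000, 44100))  # 14100 (folded back, i.e. aliased)
```

This is why converters filter out content above half the sample rate before sampling: anything left up there would come back as a false, lower tone.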
Another important thing to note is bit depth. A digital system stores everything as on and off states (1s and 0s), a number system called binary. A byte is made up of 8 bits; a word is one or more bytes. A 16-bit word holds 16 binary digits and can represent 65,536 (2^16) different values, while a 24-bit word holds 24 binary digits and can represent 16,777,216 (2^24) values. Bit depth is how many bits are used for each sample. The more bits, the higher the resolution, which affects both the dynamic range of a file and its signal-to-noise ratio.
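To put numbers on bit depth, the sketch below counts the values each word size can hold and estimates the theoretical dynamic range as 20 * log10(2^bits), roughly 6 dB per bit. This is a simplified textbook figure under ideal conditions, not a measurement of any real converter.

```python
import math

def quantization_levels(bit_depth):
    """Number of distinct values a sample of this bit depth can take."""
    return 2 ** bit_depth

def dynamic_range_db(bit_depth):
    """Theoretical ratio, in decibels, between the largest value
    and the smallest step: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24):
    print(bits, "bits:", quantization_levels(bits), "levels,",
          round(dynamic_range_db(bits), 1), "dB")
```

Under these assumptions, 16-bit audio comes out at about 96 dB and 24-bit audio at about 144 dB.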
.WAV file
.WAV files – Waveform Audio File Format – are uncompressed audio files, developed by Microsoft and IBM back in 1991. They utilize LPCM encoding. WAV files are one of the more popular digital audio formats and a gold standard in studio recording. WAV was one of the first digital audio formats, and quickly became a staple across all platforms. Despite decades of progress, it still maintains its position as one of the world’s leading pro audio formats.
WAV files capture and recreate an original audio waveform at the highest quality without altering the sonic characteristics of the sound in any way. WAV uses PCM (pulse code modulation) to encode the audio as a series of discrete samples. Because the PCM data is stored without any lossy compression, nothing is thrown away after conversion: what gets recorded is the closest digital representation of the original waveform that the chosen sample rate and bit depth allow.
At minimum, you want to track 24-bit, 44.1 kHz files. The 44.1 kHz sample rate covers the full frequency range of human hearing, and the 24-bit depth keeps the noise floor low and leaves plenty of dynamic range to work with.
Another related file is the .BWF – Broadcast Wave File. It has the same information as a .WAV file from an audio quality perspective, but contains extra header information that can be useful for broadcast. This may contain timecode, and other information about the file itself (max momentary and integrated loudness, date of origin, name of originator, etc). Not all systems can read this metadata, but it can be helpful in film and broadcast situations.
WAV files are also uncompressed, meaning that the data is stored as-is in full original format that doesn’t require decoding. This provides enormous versatility allowing for superb editing and manipulation.
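For a hands-on look at what a WAV file contains, here is a minimal sketch using the wave module from Python's standard library. It writes one second of a 16-bit mono test tone into an in-memory buffer and reads the header back. The tone, the helper name, and the 16-bit setting are choices made for this example (16-bit keeps the packing simple; a tracking session would typically use 24-bit).

```python
import io
import math
import struct
import wave

def write_tone_wav(target, freq_hz=440.0, sample_rate=44100, seconds=1.0):
    """Write a mono 16-bit sine tone to a file path or file-like object."""
    count = int(sample_rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(round(0.5 * 32767 * math.sin(
            2 * math.pi * freq_hz * n / sample_rate))))
        for n in range(count)
    )
    with wave.open(target, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)        # raw LPCM data, stored as-is

buffer = io.BytesIO()
write_tone_wav(buffer)

buffer.seek(0)
with wave.open(buffer, "rb") as wav_file:
    info = (wav_file.getnchannels(), wav_file.getsampwidth() * 8,
            wav_file.getframerate(), wav_file.getnframes())

print("channels, bit depth, sample rate, frames:", info)
```

Passing a filename such as "tone.wav" instead of the buffer would write the same data to disk.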
.AIFF file
.AIFF (Audio Interchange File Format) files are functionally the same as .WAV files, but are encoded a bit differently. AIFF files are the Apple equivalent of .WAV files. They also use LPCM encoding, but have the ability to save more metadata into the header of the file. If you are working on a Mac with Logic or GarageBand, this is the file type you are most likely to encounter.
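One concrete encoding difference is byte order: standard AIFF stores each sample with the most significant byte first (big-endian), while WAV stores the least significant byte first (little-endian). The sketch below packs the same 16-bit sample value both ways; the sample value itself is arbitrary.

```python
import struct

sample = 0x1234  # an arbitrary 16-bit sample value

wav_bytes = struct.pack("<h", sample)   # little-endian, as in WAV
aiff_bytes = struct.pack(">h", sample)  # big-endian, as in standard AIFF

print(wav_bytes.hex())   # 3412
print(aiff_bytes.hex())  # 1234

# Both byte sequences decode to the same number, so the audio is identical.
same_value = struct.unpack("<h", wav_bytes)[0] == struct.unpack(">h", aiff_bytes)[0]
print(same_value)
```

The audio data is the same either way; only the order of the bytes on disk differs.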
For the most part, whether you use .AIFF or .WAV files comes down to personal preference and the artists you collaborate with. In either case, these are the formats you want to be working with for tracking, mixing, or mastering. AIFF provides studio-grade audio recording and playback. It offers the same sample rate and bit depth options as WAV, and it stores the waveform as PCM samples to offer the highest possible recording quality and sound reproduction. Just like WAV, AIFF stores data in an uncompressed, lossless format, meaning you get no quality loss, just pure sonic happiness.
So what’s the difference between the two? It mainly boils down to history. AIFF was created by Apple in 1988, allowing full studio-quality audio recording and playback on Macintosh computers. WAV was created through a partnership between Microsoft and IBM in 1991, so WAV files played back natively on Windows machines. Nowadays both formats can be recorded and played back natively on any operating system, so they’re easily interchangeable and offer the same high-quality audio.
.WAV vs. .AIFF
So if WAV and AIFF can both offer the same highest studio-quality audio, which one should you choose? Well, that will really depend on your use case. For starters, the historical prevalence still stands today. WAV files are more popular on Windows, whereas AIFF files keep their ground on Macs. If you’re planning to send your audio files to the studio for further overdubbing or mixing, consistency with your session is important, so talk with your sound engineer about what format they plan to use in the session, and make sure your audio bounces match. The great news is, regardless of which of the two formats you choose, you will achieve exactly the same superb audio quality.
Lossless audio file formats
.FLAC file
FLAC stands for “Free Lossless Audio Codec.” It is an open source, data compressed format that retains all of the file’s information through the encoding and decoding process. After compression, the file is usually reduced to about 50-70% of its original size. The compression level chosen when encoding dictates the exact percentage, as well as how long the encode takes. The codec is designed so that decoding speed stays about the same regardless of the compression level.
.ALAC file
Not wanting to be left out, Apple created its own format, ALAC, which stands for “Apple Lossless Audio Codec” and is functionally similar to FLAC. It is usually placed in the MP4 container, which has the extension .m4a. The same extension is used for Apple’s lossy audio files: the container is the same, but the encoding is different. ALAC is the format used for Apple Music’s lossless playback. ALAC files use more CPU power to decode than FLAC files.
Lossy audio file formats
.MP3 file
Uncompressed audio formats like WAV and AIFF provide gorgeous sound quality, but at the cost of large file sizes. With the boom of internet file-sharing in the mid-90s, people quickly realized that sending uncompressed files over dial-up connections was impractical, and oftentimes impossible. That is where the MP3 (MPEG-1 Audio Layer III) came in.
The most common lossy audio file format is the .mp3. The format began as the third audio layer of the MPEG-1 standard and was then extended in MPEG-2 into the .mp3 format we know today. It was developed mainly by the Fraunhofer Society. The files are encoded using perceptual coding, which reduces the precision of, or removes entirely, information that has been determined to be beyond what most humans can hear.
While a three-minute song would average 30MB in WAV or AIFF format, that same song converted to MP3 would take up a tenth of the space—only around 3MB. With compression algorithms that were capable of achieving impressively small file sizes, MP3 became a staple of the internet era and has maintained its strong position to date.
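These sizes can be reproduced with simple arithmetic. The sketch below assumes CD-quality stereo (16-bit, 44.1 kHz) for the uncompressed file and a 128 kbps MP3; those settings are assumptions made for the example, and the small header overhead of each format is ignored.

```python
def pcm_size_bytes(seconds, sample_rate=44100, bit_depth=16, channels=2):
    """Approximate size of uncompressed PCM audio (header not included)."""
    return int(seconds * sample_rate * (bit_depth // 8) * channels)

def lossy_size_bytes(seconds, bitrate_kbps):
    """Approximate size of a constant-bitrate lossy file (header not included)."""
    return int(seconds * bitrate_kbps * 1000 / 8)

three_minutes = 180
wav_size = pcm_size_bytes(three_minutes)         # 31,752,000 bytes, about 30 MB
mp3_size = lossy_size_bytes(three_minutes, 128)  # 2,880,000 bytes, about 3 MB

print(wav_size, mp3_size, round(wav_size / mp3_size, 1))
```

Changing the bit depth to 24 or the MP3 bitrate to 320 shows how quickly both numbers move.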
However, small file size came at the cost of sound quality. Think of a digital photo. In the full-resolution original, you can see every little wrinkle and color vividly. A highly compressed copy of the same image, however, becomes very pixelated and loses much of its clarity and detail. The same happens when you compress an audio file.
Different compression formats use varying methods to re-encode the data in a way that saves space. But this saving of space means some data has to get lost in the process. Usually, high frequencies are the first ones to go, as the majority of people can’t hear the details in really high frequencies. The lower the encoding quality, the more frequencies and details will get lost in your audio.
Having said that, modern encoders handle higher bitrates well, which means they can still shrink a file considerably with little noticeable loss in audio quality. Bitrate represents the amount of data conveyed per second of audio content, with the general rule of thumb being: smaller bitrates = smaller file sizes, but also lower quality. So if you want to maintain good quality, yet still make use of the fact that MP3s are easy to share with friends and family, keep your bitrate above 128 kbps (kilobits per second).
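To see how bitrate relates to compression, it helps to compare a lossy bitrate with the bitrate of the uncompressed source. The sketch below assumes a CD-quality stereo source (16-bit, 44.1 kHz); the function names are hypothetical helpers for this example.

```python
def pcm_bitrate_kbps(sample_rate=44100, bit_depth=16, channels=2):
    """Data rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate * bit_depth * channels / 1000

def compression_ratio(lossy_kbps, sample_rate=44100, bit_depth=16, channels=2):
    """How many times smaller the lossy stream is than the PCM source."""
    return pcm_bitrate_kbps(sample_rate, bit_depth, channels) / lossy_kbps

print(pcm_bitrate_kbps())                # 1411.2 kbps for CD-quality stereo
print(round(compression_ratio(128), 1))  # about 11 times smaller
print(round(compression_ratio(320), 1))  # about 4.4 times smaller
```

A higher bitrate keeps more of the original data, so the file is larger but the encoder has to discard less.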
.M4A/MP4 file
M4A files were Apple’s response to MP3s. The .m4a extension designates audio stored in the MP4 (MPEG-4 Part 14) container. As noted above, this extension can be used for ALAC files, but it is also the extension for Apple’s lossy files. Often seen as the successor to the MP3, this Mac-centric compressed audio format found its true place with the birth of the iTunes Store, where it became the primary format for all music purchases made through the online music store. It is still the preferred format for all audio included in apps that are released on the Mac and iOS App Stores, as well as Nintendo and PlayStation products. With more and more developers including support for M4A, it’s quickly becoming the go-to audio format for compressed audio files.
Lossy M4A files are encoded with the Advanced Audio Coding (AAC) codec, which supports the same bitrates as MP3 but compresses more efficiently. In practice this means higher audio quality at the same bitrate, or similar quality from a smaller file at a lower bitrate. It’s like a golden unicorn, which is why it’s become such a popular format for lightweight audio deliveries.
Although many audio players across various platforms can play back M4A files, the format still can’t compete with MP3’s universal compatibility, which is why MP3s still rule the world.
The MP4 container is not limited to audio; it can also hold video, images, and subtitles.
.MP3 vs. .M4A
The majority of desktop and mobile devices sold nowadays come with native support for MP3 and M4A files alike. For higher quality results, I recommend you choose M4A, which can sound better than MP3 at the same bitrate or match it at a smaller file size. On the other hand, if guaranteed compatibility is what you need most, MP3 will probably be the wiser choice of the two.
.OGG file
This open source format takes its name from jargon used in the video game Netrek. It is similar in purpose to .mp3 and is used in Spotify’s streaming service. The Ogg container isn’t restricted to a particular codec, but it is most commonly used to store audio encoded with the Vorbis codec, which aims for quality similar to .mp3 at a lower bitrate.
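File extensions can be misleading, but each of the containers covered here begins with a recognizable byte signature. The sketch below guesses the container from the first 12 bytes of a file. It is a simplified illustration, not a full parser: the function name is hypothetical, it cannot tell ALAC from AAC inside an MP4 container, and the MPEG sync check can also match other MPEG audio streams.

```python
def sniff_audio_container(header):
    """Guess the container type from the first 12 bytes of a file."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"
    if header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "AIFF"
    if header[:4] == b"fLaC":
        return "FLAC"
    if header[:4] == b"OggS":
        return "Ogg"
    if header[4:8] == b"ftyp":
        return "MP4 (M4A)"  # could hold ALAC or AAC; the header alone does not say
    if header[:3] == b"ID3":
        return "MP3"        # file starts with an ID3v2 tag
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "MP3"        # MPEG audio frame sync; most likely an MP3
    return "unknown"

print(sniff_audio_container(b"RIFF\x24\x00\x00\x00WAVE"))
print(sniff_audio_container(b"FORM\x00\x00\x00\x24AIFF"))
print(sniff_audio_container(b"fLaC\x00\x00\x00\x22\x00\x00\x00\x00"))
```

To check a real file, read its first 12 bytes in binary mode and pass them to the function.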
Which format should I use?
Uncompressed audio, without a doubt, should be used for tracking, mixing, and mastering. That way the music retains the highest sound quality while engineers and mixers are working on the tracks. AIFF and WAV are both uncompressed, so the choice between them tends to come down to preference. If you receive AIFF files but tend to work in WAV format, you’ll be able to pull them into any DAW of choice and work on them in WAV format without any degradation in quality.
As you’re working, if a client wants to share files, it is common to email an .mp3 bounce as a reference for quick back and forth. While not the highest quality, .mp3s work well for sharing files with clients, gathering quick mix notes, and getting approval to move forward: the file size is small, and the quality is good enough to make broad judgments about what you’re listening to.
For listening to the final mix, it is better to use a file sharing service such as WeTransfer, Hightail, Dropbox, etc. These services can handle full-resolution files, which give a more accurate picture for critical listening.
I tend to deliver high bitrate MP3s along with full resolution files for my final masters, so if my client wants to share the MP3s I don’t have to worry about something going wrong on their end with the conversion – I can quality check everything and make sure it all sounds good.
FLAC and ALAC are not inherently better than WAV files, but they do allow for the sharing of very large files in a more compact package. The main drawback of sending an engineer FLAC files is that the format isn’t supported by many DAWs, so in order to import the files the engineer must either use compatible software or find a way to convert them. This can get complicated, so if you plan on sending files in this format, talk to your engineer first about whether it would be better to send WAV or AIFF instead. ALAC files are supported by Pro Tools, but might not be by other DAWs. When in doubt, ask your engineer!
How to bounce a session
You want to send your friend your latest song. You could send your session file, but what if they don’t have your DAW? And on the off chance they can open the file, if they’re missing just one plug-in you used, your song won’t sound the same. How can you bypass all of this stress?
You should bounce your session instead.
What is bouncing in audio?
Bouncing is the process of rendering your entire project as a single stereo audio file that can be played on any device. It’s the process of down-mixing all of your tracks into a two-channel (left speaker and right speaker) audio source. Unlike a session file, an audio bounce means you can send the audio file to your grandma and know that she’ll be able to listen to it without any special equipment.
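Conceptually, the down-mix is just addition. The sketch below sums a few mono tracks into a left and a right channel using a simple linear pan and a hard clip at full scale. It is a deliberately simplified model: a real DAW also renders plug-ins, automation, and bus processing, and typically uses a different pan law. All names here are made up for the example.

```python
def bounce_to_stereo(tracks):
    """Mix mono tracks down to one stereo pair.

    tracks: list of (samples, gain, pan) tuples, where samples are floats
    in [-1.0, 1.0] and pan runs from -1.0 (hard left) to +1.0 (hard right).
    """
    length = max(len(samples) for samples, _, _ in tracks)
    left = [0.0] * length
    right = [0.0] * length
    for samples, gain, pan in tracks:
        left_gain = gain * (1.0 - pan) / 2.0   # simple linear pan law
        right_gain = gain * (1.0 + pan) / 2.0
        for i, value in enumerate(samples):
            left[i] += value * left_gain
            right[i] += value * right_gain

    def clip(x):
        # Keep the summed signal within full scale.
        return max(-1.0, min(1.0, x))

    return [clip(x) for x in left], [clip(x) for x in right]

vocal = ([1.0, 0.5], 1.0, -1.0)        # panned hard left
guitar = ([0.5, 0.5, 0.5], 0.5, 1.0)   # panned hard right at half gain
left, right = bounce_to_stereo([vocal, guitar])
print(left, right)
```

However many tracks go in, exactly two channels come out, which is why the bounced file plays anywhere.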
When bouncing, you’re presented with multiple options: audio format, sample rate, bit depth, and sometimes even normalization. Each one of them is important, so I recommend you check out Griffin Brown’s great explanation of sample rate and bit depth in his Basics of Digital Audio article.
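Of those options, normalization is the simplest to show in code. This sketch implements peak normalization, which scales the whole file so that its loudest sample reaches a target level. It assumes floating-point samples in the range -1.0 to 1.0; DAWs may offer other kinds of normalization (loudness-based, for example) that work differently.

```python
def normalize_peak(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence (or empty input): nothing to scale
    scale = target_peak / peak
    return [s * scale for s in samples]

quiet_mix = [0.25, -0.5, 0.1]
print(normalize_peak(quiet_mix))  # the loudest sample now sits at full scale
```

Every sample is multiplied by the same factor, so the balance of the mix does not change, only its overall level.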
Conclusion
I hope that this guide was able to shine some light on the difference between popular audio file formats and when to use them. Most modern DAWs allow you to bounce your song in multiple formats at once. As a general rule of thumb, I recommend you choose one uncompressed lossless audio format (AIFF or WAV) and one compressed lossy audio format (M4A or MP3). That way, regardless of what kind of format you need, you have it ready and you don’t have to re-open your session just to re-bounce the song in a new format. Additionally, if you have your song bounced in at least one uncompressed lossless format, there are plenty of great audio converters on the market that will allow you to convert your song into any of the other audio formats when you need them.
Audio formats have changed quite a bit since the days of only records. Hopefully this article gives a good overview of the most common digital audio formats in use today, as well as when to use them. Communicating with your client about their needs is important for understanding what you have to deliver, and with this overview you have a good jumping-off point!
The audio world is filled with many options, and the four basic formats above are just a few of over a dozen different audio formats. Ultimately, whatever use case you may have, you’ll find an audio format that’s able to fulfill your needs – including a few compressed lossless file formats. Now that you know how to use the basics, I can’t wait to hear the music you create.