Audio File Formats: MP3, AAC, OGG, FLAC, WAV, AIFF — A Practical Guide
Audio formats split into two worlds: lossy (smaller files, some quality loss) and lossless (perfect quality, larger files). Understanding which to use and when is straightforward once you know the tradeoffs — bitrate, codec efficiency, browser support, metadata support, and use case. This guide covers all of it.
Lossy vs Lossless Explained
Every audio format is either lossy, lossless, or uncompressed. Understanding the difference changes how you think about every other decision.
Lossy: Discards audio data permanently using psychoacoustic models, removing the parts of the sound humans are least likely to notice.
Formats: MP3, AAC, OGG Vorbis, Opus, WMA
Size: 3-10 MB per song
Lossless: Compresses without losing any data. Decompresses to the exact original PCM audio.
Formats: FLAC, ALAC, APE
Size: 15-30 MB per song
Uncompressed: Raw PCM audio. No compression at all. Identical quality to lossless but much larger files.
Formats: WAV, AIFF
Size: 30-50 MB per song
How Lossy Compression Works
Lossy audio codecs use psychoacoustic models: mathematical representations of what humans hear. The core insight is that human hearing is neither linear nor uniform. We are less sensitive to sounds that occur simultaneously with louder sounds (masking), less sensitive to very high frequencies, and hear essentially nothing below ~20 Hz or above ~20 kHz.
An MP3 encoder analyzes the audio and allocates bits where they matter most perceptually, discarding or reducing bits in ranges where the codec predicts you won't notice. Higher bitrates = more bits allocated = less discarded = closer to the original.
Can You Hear the Difference?
In controlled blind tests (ABX testing), most people cannot reliably distinguish 320 kbps MP3 from lossless CD audio. Some trained listeners can detect differences in specific challenging passages — complex polyphonic music, recordings with lots of cymbal and high-frequency detail, or ambience recordings. But for most music, most of the time, on consumer-grade headphones or speakers, it's genuinely indistinguishable.
The reason audiophiles insist on lossless has less to do with direct listening and more to do with future-proofing: if you archive in FLAC, you can always create a perfectly accurate MP3 later. If you archive in MP3, you cannot get the original quality back without the original source.
Format Deep Dive
MP3 (MPEG-1 Audio Layer III)
MP3 is 30+ years old and still the most universally supported audio format. The standard was finalized in 1993. Its patents expired in 2017, making it fully free to implement. Despite being an aging codec, MP3 at 256-320 kbps is transparent enough that most modern workflows still use it for distribution.
Technical details: Uses a hybrid filter bank (a 32-band polyphase filter bank followed by a Modified Discrete Cosine Transform, MDCT) steered by a psychoacoustic model. Supports stereo modes including mid-side stereo, which is more efficient for correlated stereo content. Maximum bitrate: 320 kbps. Sample rates: 32, 44.1, 48 kHz (MPEG-1; the MPEG-2 extension adds 16, 22.05, and 24 kHz).
Strengths: Universal compatibility — literally every audio playback device and app supports MP3. Very fast to encode and decode. The 320 kbps ceiling is adequate for all realistic use cases.
Weaknesses: Older codec — AAC and Opus produce better quality at the same bitrate. No support for lossless or high-resolution audio. ID3 tag metadata can be inconsistently implemented.
Use when: Maximum compatibility is required. Distributing audio to the widest possible audience. The oldest target devices or software must work.
AAC (Advanced Audio Coding)
AAC was designed in the mid-1990s as the successor to MP3, and it delivers on that promise — noticeably better quality at the same bitrate, especially below 256 kbps. Apple chose AAC for iTunes, iPod, and later Apple Music. YouTube uses AAC. Most streaming services use AAC or a proprietary variant.
Technical details: Also MDCT-based but with a more sophisticated psychoacoustic model and larger transform blocks. The "LC" (Low Complexity) profile is the one most commonly used for streaming. HE-AAC (High Efficiency) adds spectral band replication for low bitrates. Container: typically M4A for audio-only, MP4 for video. AAC-LC at 256 kbps is Apple's standard for Apple Music.
Strengths: Better quality than MP3 at equal bitrates. Widely supported on modern devices. Efficient for streaming. Handles high-frequency content better than MP3.
Weaknesses: Slight latency in decoding (historically an issue in games/interactive audio, now mostly solved). Some encoder quality variation — Fraunhofer's encoder produces better results than many open-source implementations at lower bitrates.
Use when: Apple ecosystem. Streaming. Any modern context where you want better quality than MP3 without changing file size. YouTube uploads.
OGG Vorbis
Vorbis is the audio codec inside OGG containers. It's fully open source with no licensing fees — which is why game developers, Spotify (historically), and open platforms adopted it. Quality is comparable to AAC at the same bitrate.
Technical details: Uses MDCT with a floor curve that models the spectral envelope and a residue stage that codes the remaining spectral detail. The OGG container is flexible and also hosts other codecs (Opus, Speex, FLAC). Vorbis targets quality settings (typically -q0 to -q10) rather than fixed bitrates, though bitrate ranges can be specified.
Strengths: Completely free. Good quality. The quality-based encoding produces consistently good results without tweaking bitrate. Supported in major browsers and Linux platforms.
Weaknesses: Poor native support on Apple platforms. The iOS ecosystem largely ignores OGG. Losing ground to Opus for streaming and voice.
Use when: Game audio (Unity supports it natively, which is huge). Linux desktop applications. Open-source projects. Situations where licensing matters.
FLAC (Free Lossless Audio Codec)
FLAC is the standard for lossless audio archiving. It typically compresses CD audio to 50-60% of the raw WAV size with no quality loss whatsoever. The format is open, the spec is documented, and support is broad.
Technical details: Uses linear prediction and Rice coding. Compression level (0-8) trades encoding speed for file size; the difference is small (a few percent) — use level 8 for archiving, level 5 for real-time encoding. Supports up to 32-bit depth and 655,350 Hz sample rate (far beyond any practical audio). Supports embedded Vorbis Comment tags and album art.
Strengths: Perfect fidelity. Widely supported on Windows, Linux, Android, and macOS (outside Apple's Music app). Best archiving format. Full metadata support including embedded artwork. Open and free.
Weaknesses: Not supported by Apple's Music app. iOS has decoded FLAC at the system level since iOS 11, but library playback still requires third-party apps. Not universal on embedded systems. Files are 3-5x larger than equivalent MP3.
Use when: Archiving any audio you care about. Music library archival. Source files for re-encoding. Master recordings. Any time you want to preserve the original quality indefinitely.
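The "linear prediction and Rice coding" idea mentioned in the technical details can be sketched in a few lines. This is a simplified illustration, not the FLAC bitstream: it assumes an order-2 fixed predictor, a single hand-picked Rice parameter `k`, and Rice-codes the warm-up samples too, whereas a real encoder chooses predictors and parameters per block. The function names and sample values are made up for the example.

```python
def fixed_residual(samples):
    """Order-2 fixed predictor: predict x[n] as 2*x[n-1] - x[n-2], keep the error."""
    residual = list(samples[:2])  # the first two samples have no history
    for n in range(2, len(samples)):
        prediction = 2 * samples[n - 1] - samples[n - 2]
        residual.append(samples[n] - prediction)
    return residual


def restore(residual):
    """Invert fixed_residual: rebuild the samples from the prediction errors."""
    out = list(residual[:2])
    for n in range(2, len(residual)):
        out.append(residual[n] + 2 * out[n - 1] - out[n - 2])
    return out


def rice_encode(values, k):
    """Rice-code signed integers into a string of '0'/'1' characters."""
    pieces = []
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1        # fold signed to unsigned
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        low_bits = format(remainder, f"0{k}b") if k else ""
        pieces.append("0" * quotient + "1" + low_bits)  # unary part, stop bit, k low bits
    return "".join(pieces)


def rice_decode(bits, k, count):
    """Decode `count` values produced by rice_encode with the same k."""
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "0":
            quotient += 1
            pos += 1
        pos += 1  # skip the stop bit
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        u = (quotient << k) | remainder
        values.append(u >> 1 if u % 2 == 0 else -((u + 1) >> 1))
    return values


samples = [0, 3, 7, 12, 16, 19, 21, 22]  # a smooth, made-up waveform
residual = fixed_residual(samples)
bits = rice_encode(residual, k=1)
decoded = restore(rice_decode(bits, k=1, count=len(residual)))
print(residual, len(bits), decoded == samples)
```

Smooth audio leaves small prediction errors, and small numbers get short Rice codes. That is where the size reduction comes from, and the round trip shows nothing is lost.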
WAV (Waveform Audio File Format)
WAV is essentially raw PCM audio with a small header. No compression, no quality loss, enormous files. It's the default for audio recording software and the format that every DAW can import without issues.
Technical details: Container format that typically holds uncompressed PCM audio. It can technically hold compressed audio (WAV with ADPCM encoding exists) but almost never does in practice. 16-bit 44.1 kHz stereo (CD quality) produces about 10.6 MB per minute. 24-bit 96 kHz stereo produces about 34.6 MB per minute.
Strengths: Universal compatibility with audio software. No encoding artifacts whatsoever. Direct interface to hardware. Supported everywhere, no decoder needed.
Weaknesses: No compression means huge files. Limited metadata support (LIST INFO chunks work but aren't widely used). 4 GB file size limit in standard format (RF64 extension removes this). No native album art support.
Use when: Audio production workflows where you're editing and re-editing audio. Recording sessions. Exchanging audio between applications. Sound effects libraries where full fidelity is critical and storage cost is acceptable. Anything where you'll be processing the audio further.
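The per-minute sizes quoted in the technical details follow directly from sample rate, bit depth, and channel count. A small sketch of that arithmetic, ignoring the WAV header (a few dozen bytes) and using decimal megabytes; the function names are illustrative:

```python
def pcm_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Raw PCM data rate: one sample of bit_depth bits per channel per tick."""
    return sample_rate_hz * (bit_depth // 8) * channels


def pcm_megabytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of the audio data in decimal megabytes (1 MB = 1,000,000 bytes)."""
    return pcm_bytes_per_second(sample_rate_hz, bit_depth, channels) * seconds / 1_000_000


# One minute of CD-quality stereo and one minute of 24-bit/96 kHz stereo
print(f"{pcm_megabytes(44_100, 16, 2, 60):.1f} MB per minute")
print(f"{pcm_megabytes(96_000, 24, 2, 60):.1f} MB per minute")
```

The same function also shows where the 4 GB ceiling lands: at CD quality it is several hours of audio, but at high sample rates and multichannel layouts it arrives much sooner.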
AIFF (Audio Interchange File Format)
AIFF is Apple's equivalent of WAV: uncompressed PCM audio in an Apple-designed container, developed in 1988 and based on IFF. AIFF-C (AIFF Compressed) exists but is rarely used. Functionally equivalent to WAV for most purposes, with somewhat better metadata support. Because it also uses 32-bit chunk sizes, it shares WAV's roughly 4 GB size limit.
Use when: Mac-based audio production. Logic Pro workflows. Situations where you're working entirely in Apple software. Otherwise, WAV is equally suitable and more universally recognized.
Opus
Opus is the newest codec in this list and in some ways the most impressive. Developed by Xiph.Org and standardized by IETF in 2012, it's designed to handle everything from voice to high-quality music, adapting dynamically to the content. WebRTC uses Opus. Discord uses Opus. Many modern streaming APIs use Opus.
Technical details: Combines the SILK codec (excellent for voice, used in Skype) and CELT codec (excellent for music) and intelligently switches between them or blends them. Variable bitrate from 6 kbps to 510 kbps. Handles sample rates from 8 to 48 kHz (higher-rate sources are typically resampled to 48 kHz first). Very low latency (frames as short as 2.5 ms), making it suitable for interactive applications.
Strengths: Outperforms MP3 and AAC at most bitrates, especially below 128 kbps. Excellent for voice and speech (podcasts, VoIP). Native WebRTC support. Native browser support in Chrome, Firefox, Edge, and recent Safari.
Weaknesses: Relatively new — less software support than MP3 or AAC. OGG/Opus has limited support in some media players. Not suitable for archiving (lossy codec).
Use when: Web audio that needs to be efficient (podcast streams, voice chat). WebRTC applications. Any web application where you control the format. Real-time communication.
Bitrate and Quality Tradeoffs
Bitrate measures data per second of audio. Higher bitrate = more data = better quality = larger file. For lossy formats, more bits means fewer compromises in the psychoacoustic model.
MP3 and AAC Bitrate Guide
| Bitrate | Quality | 4-min song size | Best Use |
|---|---|---|---|
| 64 kbps | Poor — artifacts audible | ~2 MB | Spoken word, podcasts at low bandwidth |
| 128 kbps | Acceptable — artifacts audible on music | ~4 MB | Background music, low-bandwidth streaming |
| 192 kbps | Good — artifacts rarely noticeable | ~6 MB | Casual listening, web audio |
| 256 kbps | Very good — transparent for most people | ~8 MB | Music distribution, Spotify "Very High" |
| 320 kbps | Excellent — maximum MP3 quality | ~10 MB | When you want the best MP3 possible |
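The sizes in the table are plain arithmetic on bitrate and duration. A short sketch that reproduces them and also shows how far each bitrate sits below CD-quality PCM (16-bit, 44.1 kHz, stereo); container and tag overhead are ignored, and the names are illustrative:

```python
CD_PCM_KBPS = 44_100 * 16 * 2 / 1000  # uncompressed CD audio, about 1411 kbps


def lossy_size_mb(bitrate_kbps, seconds):
    """Approximate file size in decimal megabytes for a constant average bitrate."""
    return bitrate_kbps * 1000 / 8 * seconds / 1_000_000


for kbps in (64, 128, 192, 256, 320):
    size = lossy_size_mb(kbps, 4 * 60)  # a 4-minute song, as in the table
    ratio = CD_PCM_KBPS / kbps
    print(f"{kbps} kbps: {size:.2f} MB, roughly {ratio:.1f}x smaller than CD PCM")
```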
AAC vs MP3 at Same Bitrate
AAC consistently produces better perceptual quality than MP3 at the same bitrate. The difference is most noticeable at lower bitrates:
- 64 kbps: AAC is significantly better — MP3 sounds hollow, AAC is passable
- 128 kbps: AAC sounds noticeably cleaner and fuller than MP3
- 256 kbps: Both sound very good; difference is minimal
- 320 kbps: Essentially equivalent; both are close to lossless
Opus Performance
Opus at 128 kbps is widely regarded as close to transparent for music and holds up well against MP3 at much higher bitrates. For streaming podcasts and voice communication, Opus at 32-64 kbps produces quality that MP3 struggles to match even at 128 kbps.
VBR vs CBR
CBR (Constant Bitrate): The same number of bits per second throughout the file. Simple, predictable file sizes, easier to seek. Legacy streaming systems preferred CBR because it's predictable to buffer.
VBR (Variable Bitrate): Allocates more bits to complex passages, fewer to simple ones. Produces smaller files at the same perceptual quality, or better quality at the same file size. The quality setting is a target quality level, not a target bitrate. Almost always the better choice for general use.
# VBR encoding examples
# LAME (MP3) - VBR quality levels (0=best, 9=worst)
lame -V 0 input.wav output.mp3 # ~245 kbps average, highest quality
lame -V 2 input.wav output.mp3 # ~190 kbps, transparent quality
lame -V 4 input.wav output.mp3 # ~165 kbps, good quality
lame -V 6 input.wav output.mp3 # ~115 kbps, acceptable
# FFmpeg - AAC VBR (native encoder; -q:a is experimental, useful range roughly 0.1-2)
ffmpeg -i input.wav -c:a aac -q:a 1 output.m4a # lower quality, smaller file
ffmpeg -i input.wav -c:a aac -q:a 2 output.m4a # higher quality, larger file
# Opus - VBR is default, specify quality or bitrate
ffmpeg -i input.wav -c:a libopus -b:a 128k output.opus
Metadata and ID3 Tags
Audio metadata stores information about the track — title, artist, album, year, genre, track number, and artwork. Each format has its own metadata system, which is one reason migrating between formats requires re-tagging.
ID3 Tags (MP3)
MP3 uses ID3 tags, stored at the beginning (ID3v2) or end (ID3v1) of the file. ID3v2 is the modern version and supports all practical needs: embedded artwork, lyrics, multiple tags of the same type, Unicode text. ID3v1 is ancient (128 bytes fixed) and should be ignored.
Common ID3v2 frames:
TIT2 Title
TPE1 Lead artist/performer
TALB Album
TYER Year (TDRC in ID3v2.4)
TRCK Track number (e.g., "3" or "3/12")
TCON Genre (either text or numeric genre index)
APIC Attached picture (album art, embedded JPEG or PNG)
USLT Unsynchronized lyrics
COMM Comment
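Before any of these frames can be read, a parser has to find the ID3v2 tag itself. The tag starts with a 10-byte header: the ASCII bytes "ID3", two version bytes, one flags byte, and a 4-byte "syncsafe" size in which only the low 7 bits of each byte are used. A minimal sketch of reading that header (the function name and the sample bytes are made up for illustration):

```python
def parse_id3v2_header(data: bytes):
    """Return (major, revision, flags, tag_size) or None if no ID3v2 header is present.

    tag_size counts the bytes that follow the 10-byte header.
    """
    if len(data) < 10 or data[:3] != b"ID3":
        return None
    major, revision, flags = data[3], data[4], data[5]
    size_bytes = data[6:10]
    if any(b & 0x80 for b in size_bytes):
        return None  # the high bit must be clear in a syncsafe integer
    size = 0
    for b in size_bytes:
        size = (size << 7) | b  # 7 useful bits per byte
    return major, revision, flags, size


# A hand-built ID3v2.4 header announcing a 257-byte tag body
header = b"ID3" + bytes([4, 0, 0]) + bytes([0, 0, 2, 1])
print(parse_id3v2_header(header))
```

For real files, a library such as mutagen handles this along with the many frame-level details.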
Vorbis Comments (FLAC, OGG)
FLAC and OGG use Vorbis Comments — a simple, flexible key=value system with no fixed schema. Conventional tag names:
TITLE=Track Name
ARTIST=Artist Name
ALBUM=Album Name
DATE=2024
TRACKNUMBER=3
TRACKTOTAL=12
GENRE=Electronic
DESCRIPTION=Comment here
METADATA_BLOCK_PICTURE=(base64 image data; used in OGG files, while FLAC stores artwork in its own PICTURE metadata block)
Vorbis Comments support multiple values per key (ARTIST=Alice\nARTIST=Bob for featured artists), Unicode, and arbitrary custom fields — more flexible than ID3.
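Because a key can repeat, the natural in-memory shape for Vorbis Comments is a mapping from field name to a list of values. A minimal sketch of that model, assuming the comments have already been extracted as `KEY=value` strings (the function name is illustrative; field names compare case-insensitively, so they are normalized to upper case here):

```python
from collections import defaultdict


def parse_vorbis_comments(lines):
    """Group KEY=value strings into {KEY: [values...]}, preserving repeated keys."""
    tags = defaultdict(list)
    for line in lines:
        key, sep, value = line.partition("=")  # split on the first '=' only
        if not sep:
            continue  # skip malformed entries that have no '='
        tags[key.upper()].append(value)
    return dict(tags)


comments = ["TITLE=Track Name", "ARTIST=Alice", "artist=Bob", "DESCRIPTION=a=b"]
print(parse_vorbis_comments(comments))
```

This list-per-key shape is also what mutagen exposes for FLAC files, which is why the FLAC examples in the next section index with `[0]`.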
AAC/iTunes Metadata (M4A/MP4)
AAC files in M4A containers use iTunes-style atoms. The tag names are less readable but map to the same information:
©nam Title
©ART Artist
©alb Album
©day Year
trkn Track number
©gen Genre
©lyr Lyrics
covr Album artwork
Reading and Writing Tags in Python
from mutagen.mp3 import MP3
from mutagen.id3 import ID3, TIT2, TPE1, TALB, APIC
from mutagen.flac import FLAC
# Read MP3 tags
audio = MP3('song.mp3')
tags = ID3('song.mp3')
print(tags['TIT2'].text[0]) # Title
print(tags['TPE1'].text[0]) # Artist
# Write MP3 tags
tags = ID3('song.mp3')
tags['TIT2'] = TIT2(encoding=3, text='New Title') # encoding=3 = UTF-8
tags['TPE1'] = TPE1(encoding=3, text='Artist Name')
# Embed album art (add() files the frame under its full key, e.g. "APIC:Cover")
with open('cover.jpg', 'rb') as img:
    tags.add(APIC(
        encoding=3,
        mime='image/jpeg',
        type=3,  # Cover (front)
        desc='Cover',
        data=img.read()
    ))
tags.save()
# Read FLAC tags
audio = FLAC('song.flac')
print(audio['title'][0])
print(audio['artist'][0])
audio['title'] = ['New Title']
audio.save()
# Duration and technical info
audio = MP3('song.mp3')
print(f"Duration: {audio.info.length:.1f}s")
print(f"Bitrate: {audio.info.bitrate} bps")
print(f"Sample rate: {audio.info.sample_rate} Hz")
Batch Tag Editing
import os
from mutagen.mp3 import MP3
from mutagen.id3 import ID3, TALB
# Set album tag on all MP3s in a directory
album_name = "My Album 2024"
for filename in os.listdir('.'):
    if filename.endswith('.mp3'):
        tags = ID3(filename)
        tags['TALB'] = TALB(encoding=3, text=album_name)
        tags.save()
        print(f"Tagged: {filename}")
Browser Audio Support
Browser audio support matters for web developers. The HTML5 <audio> element and the Web Audio API both depend on the browser's ability to decode the format.
| Format | Chrome | Firefox | Safari | Edge | Notes |
|---|---|---|---|---|---|
| MP3 | Yes | Yes | Yes | Yes | Universal — safest fallback |
| AAC (M4A) | Yes | Yes | Yes | Yes | Preferred on Apple devices |
| OGG Vorbis | Yes | Yes | No | Yes | No Safari support — needs fallback |
| Opus (WebM) | Yes | Yes | Yes* | Yes | *Safari 16.4+ |
| WAV | Yes | Yes | Yes | Yes | Large files, best for short clips |
| FLAC | Yes | Yes | Yes* | Yes | *Safari 11+, iOS 11+ |
Providing Multiple Sources
Use multiple <source> elements to cover all browsers:
<audio controls>
  <source src="audio.opus" type="audio/ogg; codecs=opus">
  <source src="audio.ogg" type="audio/ogg; codecs=vorbis">
  <source src="audio.mp3" type="audio/mpeg">
  Your browser doesn't support HTML audio.
</audio>
Detecting Support with JavaScript
const audio = document.createElement('audio');
// Returns "probably", "maybe", or "" (empty = no support)
const canPlayMp3 = audio.canPlayType('audio/mpeg');
const canPlayOgg = audio.canPlayType('audio/ogg; codecs=vorbis');
const canPlayOpus = audio.canPlayType('audio/ogg; codecs=opus');
const canPlayFlac = audio.canPlayType('audio/flac');
function supportsFormat(type) {
  return audio.canPlayType(type) !== '';
}
Streaming Considerations
Bitrate and Buffer Size
For web streaming, match your bitrate to your expected network conditions. A 320 kbps MP3 stream requires 40 KB/s of sustained bandwidth. On a 1 Mbps mobile connection, that is about 32% of the available bandwidth, which is workable. On an unreliable 2G connection, it will constantly buffer.
- Podcast streaming: 64-128 kbps MP3 or Opus — works on any reasonable connection
- Music streaming: 128-256 kbps AAC or 320 kbps MP3 — standard quality tiers
- Voice/VoIP: 32-64 kbps Opus — Opus excels here
- High-quality music: FLAC or 256+ kbps AAC — for audiophile streaming (Tidal, Apple Music Lossless)
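The tiers above can be checked against a given connection with a little arithmetic. In this sketch the 2x headroom factor is an assumption for illustration (real players tune buffering differently), and the 500 kbps link is hypothetical, as are the function names:

```python
def stream_kilobytes_per_second(bitrate_kbps):
    """Sustained throughput a stream needs, in kilobytes per second."""
    return bitrate_kbps / 8


def fits_link(bitrate_kbps, link_kbps, headroom=2.0):
    """True if the link offers at least `headroom` times the stream bitrate.

    headroom=2.0 is an illustrative safety margin, not a standard value.
    """
    return bitrate_kbps * headroom <= link_kbps


link_kbps = 500  # hypothetical congested mobile link
tiers = [("voice (Opus)", 48), ("podcast", 96), ("music (AAC)", 256), ("music (MP3)", 320)]
for label, kbps in tiers:
    verdict = "ok" if fits_link(kbps, link_kbps) else "likely to buffer"
    print(f"{label}: {stream_kilobytes_per_second(kbps):.0f} KB/s -> {verdict}")
```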
Adaptive Bitrate Streaming
For music streaming apps, HLS (HTTP Live Streaming) lets you provide multiple quality levels and have the player automatically switch based on network conditions:
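# Generate one HLS media playlist per bitrate with FFmpeg
# (needs an FFmpeg build with libfdk_aac; a master playlist listing the variants is still required)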
ffmpeg -i input.wav \
-codec:a libfdk_aac -b:a 64k -map 0:a hls_64k.m3u8 \
-codec:a libfdk_aac -b:a 128k -map 0:a hls_128k.m3u8 \
-codec:a libfdk_aac -b:a 256k -map 0:a hls_256k.m3u8
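For a player to switch between those renditions it needs a master playlist that lists each variant with its bandwidth. A minimal sketch that builds one as text, reusing the playlist names from the command above. The 10% allowance for container overhead is an assumption, the `CODECS` value `mp4a.40.2` assumes AAC-LC output, and the file name `master.m3u8` is only a suggestion:

```python
def master_playlist(variants, overhead_percent=10):
    """Build an HLS master playlist from (audio_kbps, playlist_uri) pairs.

    overhead_percent pads the audio bitrate to cover container overhead;
    10 is an illustrative guess, not a measured figure.
    """
    lines = ["#EXTM3U"]
    for audio_kbps, uri in variants:
        bandwidth = audio_kbps * 1000 * (100 + overhead_percent) // 100  # bits per second
        lines.append(f'#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},CODECS="mp4a.40.2"')
        lines.append(uri)
    return "\n".join(lines) + "\n"


playlist = master_playlist([
    (64, "hls_64k.m3u8"),
    (128, "hls_128k.m3u8"),
    (256, "hls_256k.m3u8"),
])
print(playlist)  # save this text as e.g. master.m3u8 next to the variant playlists
```

The player is pointed at the master playlist and picks a variant based on measured throughput.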
Seeking in Streams
VBR files are slightly harder to seek accurately (because you can't calculate byte offset from time directly), but all modern players handle this correctly using TOC (Table of Contents) headers in MP3 or index data in other formats. Seeking in constant bitrate streams is mathematically exact.
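For the CBR case, the byte offset for a given time is a direct calculation. A rough sketch, assuming the audio data starts at a known offset (for example, right after an ID3v2 tag); the function name and parameters are illustrative, and a real player would then resynchronize to the next frame header from that position:

```python
def cbr_seek_offset(seconds, bitrate_kbps, audio_start=0):
    """Estimate the byte position of a timestamp in a constant-bitrate stream."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return audio_start + int(seconds * bytes_per_second)


# One minute into a 320 kbps file with no leading tag
print(cbr_seek_offset(60, 320))
# 1.5 seconds into a 128 kbps file whose audio starts after a 267-byte tag
print(cbr_seek_offset(1.5, 128, audio_start=267))
```

With VBR the bytes-per-second figure changes from frame to frame, which is why the player needs a lookup table instead of this formula.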
Format Licensing and Streaming Services
Streaming services choose formats partly based on licensing costs and patent status. MP3's patents expired in 2017. AAC patents remain active but are licensed at scale. Vorbis and Opus are royalty-free. That is part of why Spotify has long served OGG Vorbis in its apps, with AAC used in the web player: a license-free codec is cheaper at millions of streams.
When to Use Which Format
Use MP3 when:
- Maximum compatibility is the priority
- Target devices include old hardware, cars, non-smart speakers
- Distributing to users who might use any player
- You need 320 kbps (maximum MP3 bitrate)
Use AAC when:
- Target is Apple ecosystem (iPhone, Mac, iPod)
- Uploading to YouTube
- You want better quality than MP3 at same bitrate
- Building for modern mobile where AAC hardware decoding is standard
Use OGG Vorbis when:
- Game audio (Unity, Godot, Unreal handle it natively)
- Linux or open-source application
- You need to avoid patent/licensing concerns
- Web audio where you can rely on Chrome/Firefox
Use Opus when:
- Web audio or WebRTC applications
- Podcasts or voice streaming
- VoIP or real-time communication
- You want the best quality-to-size ratio
Use FLAC when:
- Archiving music you care about
- Master copies you'll re-encode later
- Audiophile listening on desktop
- Any lossless audio storage need
Use WAV when:
- Audio production and editing
- Importing into DAWs
- Short sound effects where simplicity matters
- Exchanging audio between audio software
Conversion Workflows
The Golden Rule of Audio Conversion
Always convert from the highest quality source available. Never transcode between lossy formats if you can avoid it — you're compounding quality loss. The chain should be:
Source (WAV/FLAC/recording) → Lossy distribution copy (MP3/AAC/Opus)
                                        ↓
                              Never convert this back to lossless
                              or to a different lossy format
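The rule is simple enough to encode as a guard in a conversion script. A sketch that classifies a planned conversion by file extension; the extension sets and return labels are made up for illustration, and the extension is only a hint (an .m4a file can hold lossless ALAC, and .ogg can hold FLAC), so treat the result as a hint as well:

```python
from pathlib import Path

# Extension-based guesses; the real codec can differ from what the name suggests
LOSSY = {".mp3", ".m4a", ".aac", ".ogg", ".opus", ".wma"}
LOSSLESS = {".flac", ".wav", ".aiff", ".aif"}


def check_conversion(src, dst):
    """Classify a conversion as 'ok', 'avoid', 'pointless', or 'unknown'."""
    s = Path(src).suffix.lower()
    d = Path(dst).suffix.lower()
    if s in LOSSLESS and (d in LOSSY or d in LOSSLESS):
        return "ok"          # encoding from a full-quality source
    if s in LOSSY and d in LOSSY:
        return "avoid"       # transcoding compounds the quality loss
    if s in LOSSY and d in LOSSLESS:
        return "pointless"   # bigger file, no quality regained
    return "unknown"


for src, dst in [("master.flac", "song.mp3"), ("song.mp3", "song.opus"), ("song.mp3", "song.flac")]:
    print(f"{src} -> {dst}: {check_conversion(src, dst)}")
```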
FFmpeg Conversion Examples
# WAV to MP3 (320 kbps CBR)
ffmpeg -i input.wav -codec:a libmp3lame -b:a 320k output.mp3
# WAV to MP3 (VBR, highest quality)
ffmpeg -i input.wav -codec:a libmp3lame -q:a 0 output.mp3
# FLAC to AAC (256 kbps)
ffmpeg -i input.flac -codec:a aac -b:a 256k output.m4a
# MP3 to FLAC (no quality improvement: the lossy audio is stored losslessly in a larger file)
ffmpeg -i input.mp3 -codec:a flac output.flac
# WAV to Opus (128 kbps)
ffmpeg -i input.wav -codec:a libopus -b:a 128k output.opus
# Batch convert all FLAC to MP3 in current directory
for f in *.flac; do
ffmpeg -i "$f" -codec:a libmp3lame -q:a 0 "${f%.flac}.mp3"
done
Browser-Based Converters
For one-off conversions without installing FFmpeg, the Audio Tools handle all common conversions in-browser.
Converting MP3 to FLAC doesn't restore quality — it preserves what exists.
Format Comparison Table
| Format | Type | Typical Size | Quality | Browser | Best Use |
|---|---|---|---|---|---|
| MP3 | Lossy | 3-10 MB | Good (at 256+ kbps) | Universal | Distribution |
| AAC | Lossy | 3-8 MB | Better than MP3 | Universal | Apple/streaming |
| OGG Vorbis | Lossy | 3-8 MB | Comparable to AAC | No Safari | Games/Linux |
| Opus | Lossy | 2-6 MB | Best at low bitrate | Good | Web/voice |
| FLAC | Lossless | 15-30 MB | Perfect | Good | Archiving |
| WAV | Uncompressed | 30-50 MB | Perfect | Universal | Production |
| AIFF | Uncompressed | 30-50 MB | Perfect | Good | Mac production |