Base64 Encoding Explained: How It Works and When to Use It
Base64 turns binary data into plain ASCII text. Here's exactly how the encoding works, when it's the right tool, and when it's the wrong one, with code examples in JavaScript, Node.js, Python, and PHP.
What Base64 Actually Is
Plenty of systems were built to handle text, not arbitrary binary. SMTP (email), HTTP headers, XML, and JSON all work on the assumption that data is printable ASCII. If you try to shove raw binary bytes through them, things break. Base64 solves this by mapping every possible byte sequence to a set of 64 safe printable characters. The result is longer, but it passes through any text-based system without corruption. The "64" comes from the character set: 26 uppercase letters, 26 lowercase letters, 10 digits, plus two symbols. That's 64 characters total, each representing 6 bits of data.
The Encoding Table
Every 6-bit value from 0 to 63 maps to one character. Here's the full table:
| Value | Char | Value | Char | Value | Char | Value | Char |
|---|---|---|---|---|---|---|---|
| 0 | A | 16 | Q | 32 | g | 48 | w |
| 1 | B | 17 | R | 33 | h | 49 | x |
| 2 | C | 18 | S | 34 | i | 50 | y |
| 3 | D | 19 | T | 35 | j | 51 | z |
| 4 | E | 20 | U | 36 | k | 52 | 0 |
| 5 | F | 21 | V | 37 | l | 53 | 1 |
| 6 | G | 22 | W | 38 | m | 54 | 2 |
| 7 | H | 23 | X | 39 | n | 55 | 3 |
| 8 | I | 24 | Y | 40 | o | 56 | 4 |
| 9 | J | 25 | Z | 41 | p | 57 | 5 |
| 10 | K | 26 | a | 42 | q | 58 | 6 |
| 11 | L | 27 | b | 43 | r | 59 | 7 |
| 12 | M | 28 | c | 44 | s | 60 | 8 |
| 13 | N | 29 | d | 45 | t | 61 | 9 |
| 14 | O | 30 | e | 46 | u | 62 | + |
| 15 | P | 31 | f | 47 | v | 63 | / |
The = character is a padding marker, not part of the 64-character alphabet itself.
URL-safe Base64 substitutes - for + (value 62) and _ for / (value 63). Everything else stays the same.
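As a quick check, the table can be regenerated from Python's standard `string` constants. This is only a sketch; the names `STANDARD` and `URL_SAFE` are made up for illustration and are not part of any library.

```python
import string

# Standard alphabet in table order: A-Z, a-z, 0-9, then + and /
STANDARD = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
# URL-safe variant: only values 62 and 63 change
URL_SAFE = STANDARD[:62] + "-_"

print(len(STANDARD))                             # 64
print(STANDARD[18], STANDARD[6], STANDARD[36])   # S G k
print(STANDARD.index("w"))                       # 48
print(URL_SAFE[62], URL_SAFE[63])                # - _
```

Indexing the string by a 6-bit value is the whole lookup: the character at position 18 is `S`, exactly as in the table.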
Step-by-Step Encoding
Let's encode the two-character string Hi manually.
Step 1 — Get the byte values
H = ASCII 72, i = ASCII 105.
Step 2 — Write each byte as 8-bit binary
H = 01001000
i = 01101001
Concatenated: 0100100001101001, 16 bits total.
Step 3 — Split into 6-bit groups
010010 | 000110 | 1001??
The last group is only 4 bits. Pad it with zeros on the right to make 6: 100100.
Step 4 — Look up each value in the table
| 6-bit group | Decimal | Base64 char |
|---|---|---|
| 010010 | 18 | S |
| 000110 | 6 | G |
| 100100 | 36 | k |
Step 5 — Add padding
Two bytes of input means three 6-bit groups, but Base64 output must be in multiples of 4 characters. Three characters + one = = four. Result: SGk=.
You can verify this: btoa("Hi") in a browser console returns exactly SGk=.
Padding rules
- Input length divisible by 3 → no padding
- 1 byte remainder → == appended
- 2 bytes remainder → = appended
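The five steps and the padding rules can be sketched as a small Python function. It is deliberately naive (it builds a string of binary digits) so that each line mirrors one step; `encode_by_hand` and `ALPHABET` are illustrative names, and real code should call `base64.b64encode` instead.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_by_hand(data: bytes) -> str:
    """Encode bytes to Base64 by following the manual steps (illustrative, not optimized)."""
    # Steps 1-2: write every byte as 8 binary digits and concatenate them
    bits = "".join(f"{byte:08b}" for byte in data)
    # Step 3: pad with zeros on the right so the length is a multiple of 6
    bits += "0" * (-len(bits) % 6)
    # Step 4: look up each 6-bit group in the table
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Step 5: pad with '=' so the output length is a multiple of 4
    return "".join(chars) + "=" * (-len(chars) % 4)

print(encode_by_hand(b"Hi"))                               # SGk=
print(encode_by_hand(b"Hi") == base64.b64encode(b"Hi").decode("ascii"))  # True
```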
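Size Overhead Math
Base64 output is 4/3 the size of the input (rounded up to a whole 4-character group), an overhead of approximately 33.3%. The reason is that every 3 input bytes (24 bits) become 4 output characters of 6 bits each: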
| Original size | Base64 size | Increase |
|---|---|---|
| 1 KB | ~1.33 KB | ~340 bytes |
| 10 KB | ~13.3 KB | ~3.3 KB |
| 100 KB | ~133 KB | ~33 KB |
| 1 MB | ~1.33 MB | ~340 KB |
| 10 MB | ~13.3 MB | ~3.3 MB |
MIME email encoding adds a line break every 76 characters. With two-byte CRLF line endings that is another ~2.6% on top, so a 1 MB email attachment becomes roughly 1.37 MB encoded.
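The arithmetic can be checked with a couple of helper functions. This is a sketch under two assumptions: output is padded to a multiple of 4 characters, and MIME lines are 76 characters long and each ends in a two-byte CRLF. The function names are invented for this example.

```python
def base64_size(n_bytes: int) -> int:
    """Length of padded Base64 output: 4 characters per 3-byte group, rounded up."""
    return 4 * ((n_bytes + 2) // 3)

def mime_base64_size(n_bytes: int, line_len: int = 76) -> int:
    """Base64 size plus one CRLF (2 bytes) after every line of up to line_len characters."""
    encoded = base64_size(n_bytes)
    lines = -(-encoded // line_len)   # ceiling division
    return encoded + 2 * lines

for size in (1024, 1024 * 1024):
    plain = base64_size(size)
    mime = mime_base64_size(size)
    print(size, plain, round(plain / size, 3), mime, round(mime / size, 3))
```

For 1024 input bytes this gives 1368 characters of plain Base64 and 1404 bytes with MIME line breaks.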
When you gzip Base64 text, much of the overhead disappears — Base64 output compresses well because its character set is small and patterns repeat. But gzip on the wire doesn't help with in-memory or parsed sizes in the browser.
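A rough way to see the compression effect is to encode random bytes (a stand-in for already-compressed image data, which is the worst case) and run the result through DEFLATE. The sketch uses `zlib`, which applies the same compression algorithm as gzip with a slightly different wrapper; the exact numbers vary from run to run.

```python
import base64
import os
import zlib

raw = os.urandom(100_000)              # incompressible stand-in for image data
encoded = base64.b64encode(raw)        # about 33% larger than raw
compressed = zlib.compress(encoded, 9)

print("raw bytes:        ", len(raw))
print("Base64 characters:", len(encoded))
print("after DEFLATE:    ", len(compressed))
```

Because each Base64 character carries only 6 bits of information, the compressed size tends to land close to the raw size rather than the encoded size.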
Image Embedding and Data URIs
A data URI is a way to embed file content directly in a URL. The format is:
data:[mediatype][;base64],<encoded-data>
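To make the format concrete, here is a minimal Python sketch that builds a Base64 data URI and splits one back apart. It handles only the simple `data:<mediatype>;base64,<data>` shape, and the helper names `to_data_uri` and `from_data_uri` are hypothetical.

```python
import base64

def to_data_uri(data: bytes, media_type: str) -> str:
    """Build a Base64 data URI from raw bytes."""
    return f"data:{media_type};base64,{base64.b64encode(data).decode('ascii')}"

def from_data_uri(uri: str) -> tuple[str, bytes]:
    """Split a Base64 data URI back into (media type, raw bytes)."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a Base64 data URI")
    media_type = header[len("data:"):-len(";base64")]
    return media_type, base64.b64decode(payload)

uri = to_data_uri(b"Hi", "text/plain")
print(uri)                  # data:text/plain;base64,SGk=
print(from_data_uri(uri))   # ('text/plain', b'Hi')
```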
Practical examples
<!-- PNG embedded in an img tag -->
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUh...">
/* SVG embedded in CSS as a background */
.icon {
  background-image: url('data:image/svg+xml;base64,PHN2Zy...');
}
/* Inline SVG in CSS without Base64 (often better for SVGs) */
.icon {
background-image: url('data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg">...</svg>');
}
SVGs don't actually need Base64 encoding in CSS. URL-encoding the raw SVG is often smaller because SVG text compresses better than Base64. Only use Base64 for binary formats like PNG or JPEG.
Email image embedding
Many email clients block external image URLs for privacy reasons. Embedding images as Base64 data URIs in the HTML body (not as MIME attachments) gets around this for webmail clients, though some email clients still restrict data URIs. The safest approach for email is MIME inline attachments with Content-ID references:
Content-Type: multipart/related; boundary="boundary"
--boundary
Content-Type: text/html
<img src="cid:logo@example.com">
--boundary
Content-Type: image/png
Content-Transfer-Encoding: base64
Content-ID: <logo@example.com>
iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJ...
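A message with this structure can be produced with Python's standard `email` package. The sketch below is one possible way to do it: the image bytes are a placeholder rather than a real PNG, and the subject line is invented for the example.

```python
from email.message import EmailMessage

# Placeholder bytes standing in for a real PNG file (assumption for this sketch)
logo_bytes = b"\x89PNG\r\n\x1a\n" + bytes(16)

msg = EmailMessage()
msg["Subject"] = "Inline image example"
# HTML body that points at the image through its Content-ID
msg.set_content('<img src="cid:logo@example.com">', subtype="html")
# add_related converts the message to multipart/related and attaches the image;
# bytes payloads get Content-Transfer-Encoding: base64 by default
msg.add_related(logo_bytes, maintype="image", subtype="png", cid="<logo@example.com>")

print(msg.as_string())
```

The library picks the boundary string and performs the Base64 encoding itself, so the image never has to be encoded by hand.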
When to use data URIs for images
| Scenario | Recommendation |
|---|---|
| Small icons, sprites (<2 KB) | Good — one less HTTP request |
| Inline SVGs in HTML | Good — or just use raw <svg> directly |
| Email images | Good — MIME inline works reliably |
| Images used on multiple pages | Avoid — no browser caching across pages |
| Images larger than 10 KB | Avoid — the overhead isn't worth it |
| Background images in external CSS files | Avoid — bloats the CSS file, delays rendering |
Convert any image to a Base64 data URI with the Image to Base64 tool.
Other Real-World Uses
HTTP Basic Authentication
The Authorization: Basic header encodes username:password in Base64:
// username:password → dXNlcm5hbWU6cGFzc3dvcmQ=
Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=
This is transport encoding, not security. Always use HTTPS. The Base64 portion decodes trivially; it's just there because the HTTP spec requires the header to be ASCII.
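A short Python sketch shows both directions: building the header value, and reversing it in one line. `basic_auth_header` is a hypothetical helper, not a library function.

```python
import base64

def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization: Basic header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"

header = basic_auth_header("username", "password")
print(header)  # Basic dXNlcm5hbWU6cGFzc3dvcmQ=

# Anyone who sees the header can recover the credentials immediately
print(base64.b64decode(header.split(" ", 1)[1]).decode("utf-8"))  # username:password
```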
Binary data in JSON APIs
JSON has no binary type, so binary payloads get Base64-encoded:
{
"filename": "receipt.pdf",
"contentType": "application/pdf",
"content": "JVBERi0xLjQKJeLjz9MKMSAwIG9iag..."
}
This is common in webhook payloads, document generation APIs, and anywhere binary data needs to cross a JSON boundary.
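A round trip through JSON might look like the following in Python. The PDF bytes are a short placeholder, and the field names simply mirror the example payload.

```python
import base64
import json

pdf_bytes = b"%PDF-1.4\n% placeholder content"   # stand-in for a real file

# Sending side: bytes -> Base64 text -> JSON string
payload = json.dumps({
    "filename": "receipt.pdf",
    "contentType": "application/pdf",
    "content": base64.b64encode(pdf_bytes).decode("ascii"),
})

# Receiving side: parse the JSON, then decode the field back to bytes
received = json.loads(payload)
restored = base64.b64decode(received["content"])
print(restored == pdf_bytes)  # True
```

The `.decode("ascii")` step matters: `b64encode` returns bytes, and `json.dumps` refuses to serialize bytes directly.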
MIME email attachments
Content-Type: application/pdf
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="report.pdf"
JVBERi0xLjQKJeLjz9MKMSAwIG9iago8PAovVHlwZSAvQ2F0YWxvZw...
Storing binary in text columns
Old database schemas sometimes have only VARCHAR or TEXT columns available. Base64 lets you store a small binary blob as text. This is a last resort — if you control the schema, use a proper binary column (BLOB, BYTEA) instead.
Cryptographic outputs
Hash functions and ciphers output raw bytes. Base64 is the standard way to represent them as printable strings:
// SHA-256 hash as Base64
const data = new TextEncoder().encode('test'); // example input that produces the hash below
const hash = await crypto.subtle.digest('SHA-256', data);
const b64 = btoa(String.fromCharCode(...new Uint8Array(hash)));
// → "n4bQgYhMfWWaL+qgxVrQFaO/TxsrC4Is0V1sFbDwCgg="
// JWT tokens are three Base64url-encoded sections joined by dots:
// header.payload.signature
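The same idea in Python, as a sketch: the input `b"test"` is an assumption, chosen because its SHA-256 digest encodes to the sample string shown in the JavaScript snippet.

```python
import base64
import hashlib

digest = hashlib.sha256(b"test").digest()        # 32 raw bytes
print(digest.hex())                              # hex form: 64 characters
print(base64.b64encode(digest).decode("ascii"))  # n4bQgYhMfWWaL+qgxVrQFaO/TxsrC4Is0V1sFbDwCgg=
```

Base64 needs 44 characters for a 32-byte digest where hex needs 64, which is one reason it is popular for hashes and signatures.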
URL-Safe Base64
Standard Base64 uses + and /. Both have meaning in URLs: + represents a space in query strings, and / is a path separator. Embedding standard Base64 in a URL requires percent-encoding those characters, which makes the string longer and uglier.
URL-safe Base64 (RFC 4648 §5) makes two substitutions:
| Standard Base64 | URL-safe Base64 | Why |
|---|---|---|
| + | - | Safe in URLs |
| / | _ | Safe in URLs and filenames |
| = (padding) | Often omitted | Padding is redundant if length is known |
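JWT tokens use URL-safe Base64 without padding, and Node's Buffer.toString('base64url') produces the same unpadded form. Python's base64.urlsafe_b64encode() uses the URL-safe alphabet but keeps the = padding, so strip it yourself if you need the JWT style.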
Converting between the two is trivial:
// Standard → URL-safe
const urlSafe = standard.replace(/\+/g, '-').replace(/\//g, '_').replace(/=/g, '');
// URL-safe → Standard (add padding back)
const standard = urlSafe.replace(/-/g, '+').replace(/_/g, '/')
  + '='.repeat((4 - urlSafe.length % 4) % 4);
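For comparison, here is a Python sketch of the unpadded URL-safe form. The helper names `b64url_encode` and `b64url_decode` are made up for this example; the standard library functions they wrap keep and expect the `=` padding, so the helpers strip and restore it.

```python
import base64

def b64url_encode(data: bytes) -> str:
    """URL-safe Base64 without padding (the form JWTs use)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Decode URL-safe Base64, restoring any stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

print(base64.b64encode(b"\xfb\xff").decode("ascii"))  # +/8=
print(b64url_encode(b"\xfb\xff"))                     # -_8
print(b64url_decode("-_8"))                           # b'\xfb\xff'
print(b64url_encode(b'{"alg":"HS256","typ":"JWT"}'))  # eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
```

The two-byte input `\xfb\xff` is chosen because it hits values 62 and 63, so the difference between the two alphabets is visible in the output.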
Code Examples
JavaScript (Browser)
// Basic encode/decode (Latin-1 only)
const encoded = btoa('Hello, World!'); // "SGVsbG8sIFdvcmxkIQ=="
const decoded = atob('SGVsbG8sIFdvcmxkIQ=='); // "Hello, World!"
// UTF-8 encode (handles emoji, CJK, etc.)
function encodeUtf8(str) {
const bytes = new TextEncoder().encode(str);
const binary = String.fromCharCode(...bytes);
return btoa(binary);
}
// UTF-8 decode
function decodeUtf8(b64) {
const binary = atob(b64);
const bytes = Uint8Array.from(binary, c => c.charCodeAt(0));
return new TextDecoder().decode(bytes);
}
// Encode a File object to Base64 (e.g., from a file input)
async function fileToBase64(file) {
const arrayBuffer = await file.arrayBuffer();
const bytes = new Uint8Array(arrayBuffer);
const binary = bytes.reduce((acc, b) => acc + String.fromCharCode(b), '');
return btoa(binary);
}
// Create a downloadable file from Base64
function downloadFromBase64(b64, filename, mimeType) {
const link = document.createElement('a');
link.href = `data:${mimeType};base64,${b64}`;
link.download = filename;
link.click();
}
Node.js
// String encoding
const encoded = Buffer.from('Hello, World!', 'utf8').toString('base64');
// → "SGVsbG8sIFdvcmxkIQ=="
const decoded = Buffer.from('SGVsbG8sIFdvcmxkIQ==', 'base64').toString('utf8');
// → "Hello, World!"
// URL-safe Base64, no padding (Node 14.18+ / 15.7+)
const urlSafe = Buffer.from('Hello!').toString('base64url');
// File to Base64
import { readFileSync } from 'fs';
const imageBase64 = readFileSync('photo.jpg').toString('base64');
const dataUri = `data:image/jpeg;base64,${imageBase64}`;
// Base64 back to file
import { writeFileSync } from 'fs';
const buffer = Buffer.from(base64String, 'base64');
writeFileSync('output.jpg', buffer);
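// Streaming approach for large files (avoids loading entire file into memory)
import { createReadStream, createWriteStream } from 'fs';
import { Transform } from 'stream';
let leftover = Buffer.alloc(0); // 0-2 bytes carried over between chunks
const base64Transform = new Transform({
  transform(chunk, encoding, callback) {
    // Encode only whole 3-byte groups so no = padding lands mid-stream
    const data = Buffer.concat([leftover, chunk]);
    const usable = data.length - (data.length % 3);
    leftover = data.subarray(usable);
    callback(null, data.subarray(0, usable).toString('base64'));
  },
  flush(callback) {
    // Encode the remaining bytes; any padding belongs here, at the very end
    callback(null, leftover.toString('base64'));
  }
});
createReadStream('large-file.bin')
  .pipe(base64Transform)
  .pipe(createWriteStream('large-file.b64'));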
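The same chunked approach can be sketched in Python. Encoding chunks independently only produces valid output if every chunk except the last is a multiple of 3 bytes long; otherwise padding characters end up in the middle of the stream. The function name and chunk size below are assumptions for illustration.

```python
import base64
import io

CHUNK = 3 * 16 * 1024   # a multiple of 3, so padding can only appear at the very end

def encode_stream(src, dst, chunk_size: int = CHUNK) -> None:
    """Base64-encode a binary stream chunk by chunk without loading it all into memory."""
    if chunk_size % 3:
        raise ValueError("chunk_size must be a multiple of 3")
    # Assumes src.read(n) returns exactly n bytes until EOF, as buffered files
    # and BytesIO do; sockets and raw streams may return short reads.
    while chunk := src.read(chunk_size):
        dst.write(base64.b64encode(chunk))

src = io.BytesIO(bytes(range(256)) * 1000)
dst = io.BytesIO()
encode_stream(src, dst)
print(dst.getvalue() == base64.b64encode(src.getvalue()))  # True
```

With real files, `src` and `dst` would be objects returned by `open(path, 'rb')` and `open(path, 'wb')`.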
Python
import base64
# Encode bytes to Base64
encoded = base64.b64encode(b'Hello, World!')
# → b'SGVsbG8sIFdvcmxkIQ=='
# Decode Base64 to bytes
decoded = base64.b64decode('SGVsbG8sIFdvcmxkIQ==')
# → b'Hello, World!'
# String to Base64 (handle encoding explicitly)
text = "Héllo, Wörld!"
encoded = base64.b64encode(text.encode('utf-8')).decode('ascii')
# URL-safe Base64 (keeps the = padding)
url_safe = base64.urlsafe_b64encode(b'Hello+World/Test')
# → b'SGVsbG8rV29ybGQvVGVzdA==' (identical to standard Base64: this output contains no + or /)
url_safe = base64.urlsafe_b64encode(b'\xfb\xff')
# → b'-_8=' (standard Base64 gives b'+/8=')
# File to Base64
with open('image.png', 'rb') as f:
    image_b64 = base64.b64encode(f.read()).decode('ascii')
data_uri = f"data:image/png;base64,{image_b64}"
# Strict decoding: validate=True rejects characters outside the Base64 alphabet
import binascii
try:
    data = base64.b64decode(untrusted_input, validate=True)
except binascii.Error as e:
    print(f"Invalid Base64: {e}")
PHP
<?php
// Encode
$encoded = base64_encode('Hello, World!');
// → "SGVsbG8sIFdvcmxkIQ=="
// Decode
$decoded = base64_decode('SGVsbG8sIFdvcmxkIQ==');
// → "Hello, World!"
// File to data URI
$imageData = base64_encode(file_get_contents('image.png'));
$dataUri = "data:image/png;base64,{$imageData}";
// URL-safe Base64
function base64url_encode(string $data): string {
return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}
function base64url_decode(string $data): string {
return base64_decode(strtr($data, '-_', '+/') . str_repeat('=', 3 - (3 + strlen($data)) % 4));
}
Performance Tradeoffs
Size increase
The 33% overhead is the main cost. A page with a 50 KB CSS file that embeds five 2 KB icons as Base64 carries about 13.3 KB of icon data instead of 10 KB, an overhead of roughly 3.3 KB. That's acceptable. A page that embeds a 200 KB hero image as Base64 carries about 267 KB for it, roughly 67 KB of pure overhead, and that's not acceptable because the image can no longer be cached separately from the HTML around it.
No separate caching
When you embed an image as a data URI in an HTML file, the image is cached only as part of that HTML response. If 100,000 users visit your page, all 100,000 of them download that image data as part of the HTML. If you serve it as a separate file with a long Cache-Control: max-age, returning users download it zero additional times.
Parsing overhead
Browsers decode Base64 data URIs synchronously during HTML/CSS parsing. A large data URI blocks the parser. External files load asynchronously and in parallel. For anything over 2–3 KB, external files with HTTP/2 multiplexing almost always win.
Encoding/decoding CPU cost
Encoding and decoding Base64 in the browser is fast for small payloads. At scale (say, encoding thousands of images server-side per second) the CPU overhead adds up. Node.js Buffer operations are implemented in C++ and are fast. Python's base64.b64encode and base64.b64decode are thin wrappers around the C-implemented binascii module, so they are fast as well; calling binascii directly saves only a little function-call overhead.
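Rather than guessing, the cost is easy to measure. The following is a rough single-run sketch, not a rigorous benchmark; `encode_throughput` is a made-up helper, and the result depends entirely on the machine it runs on.

```python
import base64
import os
import time

def encode_throughput(n_bytes: int = 10_000_000) -> float:
    """Return a rough Base64 encoding throughput in MB/s for n_bytes of random data."""
    data = os.urandom(n_bytes)
    start = time.perf_counter()
    base64.b64encode(data)
    elapsed = time.perf_counter() - start
    return n_bytes / elapsed / 1e6

print(f"{encode_throughput():.0f} MB/s")
```

Dividing the expected bytes per second of your workload by this figure gives a first estimate of how much of one CPU core the encoding would consume.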
When Base64 is clearly the right choice
- Inlining a 1 KB SVG icon directly in CSS to eliminate one HTTP request on a high-traffic page
- Sending a PDF to a document-signing API that only accepts JSON
- Embedding a company logo in HTML email where external images get stripped
- Storing a user's avatar thumbnail (under 2 KB after resizing) in a database text field for fast retrieval without a separate storage lookup
When Base64 is the wrong choice
- Any image over 10 KB on a public webpage: use <img src="..."> with proper caching
- Images shared across multiple pages: they can't be cached independently when embedded
- Large file transfers — use multipart/form-data or presigned upload URLs instead
- "Hiding" data — it provides no obfuscation to anyone who knows what Base64 looks like