Universal Text Encoding Converter - 50+ Encodings
Convert between 50+ text encodings: UTF-8, Shift-JIS, GB2312, Windows-1252. Fix mojibake, migrate legacy systems. Browser-based, no uploads.
Source (Input)
Select the current encoding of your text
Drop text file here or click to browse
Supports .txt, .csv, .log, .xml, .html, etc.
Target (Output)
Select the desired output encoding
Common Mojibake Patterns & Fixes
| You See (Mojibake) | Should Be | Cause | Solution |
|---|---|---|---|
| Ã© | é | UTF-8 read as Windows-1252 | Source: Windows-1252 → Target: UTF-8 |
| Ã¨ | è | UTF-8 read as ISO-8859-1 | Source: ISO-8859-1 → Target: UTF-8 |
| â‚¬ | € | UTF-8 Euro sign read as Windows-1252 | Source: Windows-1252 → Target: UTF-8 |
| ?????? | Any non-ASCII | Characters not in the target encoding | Re-convert the original data to UTF-8 (supports all characters); characters already replaced by ? cannot be recovered |
| ÑÐ»Ð¾Ð²Ð¾ | слово (Cyrillic) | UTF-8 Cyrillic read as Windows-1252 | Source: Windows-1252 → Target: UTF-8 |
| Random symbols | 日本語 (Japanese) | Wrong Japanese encoding | Try Shift-JIS, EUC-JP, ISO-2022-JP |
| Garbled Chinese | 中文 (Chinese) | Wrong Chinese encoding | Try GBK, GB2312, Big5, GB18030 |
| 한글 broken | 한글 (Korean) | Wrong Korean encoding | Try EUC-KR as source |
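The "UTF-8 read as Windows-1252" rows can also be reversed in code. The following is a minimal Python sketch, assuming the text went through exactly one wrong decode; the helper name `fix_mojibake` is hypothetical and is not part of this tool.

```python
def fix_mojibake(garbled: str, wrong: str = "cp1252", right: str = "utf-8") -> str:
    """Undo one round of mis-decoding.

    Re-encode the garbled text with the codec that was wrongly used to read it,
    then decode the recovered bytes with the codec that should have been used.
    """
    try:
        return garbled.encode(wrong).decode(right)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not a clean single-pass mojibake: return the input unchanged.
        return garbled


print(fix_mojibake("Ã©"))    # é
print(fix_mojibake("â‚¬"))  # €
```

Text containing bytes that Windows-1252 leaves undefined (such as 0x81, which appears in some UTF-8 Cyrillic sequences) does not round-trip through Python's `cp1252` codec, so this sketch returns such input unchanged.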
Encoding Families Quick Reference
Unicode (Modern Standard)
- UTF-8: Variable-width, 1-4 bytes, universal support
- UTF-16: 2 or 4 bytes, used by Windows internally
- Use for: All new projects, web, databases
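The byte widths listed above can be checked directly. This is a small Python sketch; the helper name `byte_widths` is made up for illustration.

```python
def byte_widths(ch: str) -> tuple[int, int]:
    """Return (UTF-8 length, UTF-16 length) in bytes for a single character."""
    utf8_len = len(ch.encode("utf-8"))
    # "utf-16-le" omits the byte order mark, so only the character itself is counted.
    utf16_len = len(ch.encode("utf-16-le"))
    return utf8_len, utf16_len


for ch in ["A", "é", "€", "😀"]:
    utf8_len, utf16_len = byte_widths(ch)
    print(f"U+{ord(ch):04X}: UTF-8 {utf8_len} bytes, UTF-16 {utf16_len} bytes")
```

The four samples cover the 1-, 2-, 3-, and 4-byte UTF-8 cases; only the emoji, which lies outside the Basic Multilingual Plane, needs 4 bytes in UTF-16.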
ISO-8859 Series (Legacy European)
- ISO-8859-1 (Latin-1): Western Europe, no Euro
- ISO-8859-15 (Latin-9): Western Europe with Euro
- ISO-8859-2: Central/Eastern Europe
- ISO-8859-5: Cyrillic
- Use for: Legacy Unix, old databases
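The Euro difference between Latin-1 and Latin-9 is easy to demonstrate. Here is a short Python sketch, with `can_encode` as a hypothetical helper name.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode("€", "iso-8859-1"))   # False: Latin-1 has no Euro sign
print(can_encode("€", "iso-8859-15"))  # True: Latin-9 added it
print("€".encode("iso-8859-15"))       # b'\xa4'
```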
Windows Codepages
- Windows-1252: Western European (matches Latin-1 except for bytes 0x80-0x9F, where it adds €, curly quotes, and other printable symbols)
- Windows-1251: Cyrillic (most common)
- Windows-1250: Central European
- Use for: Windows legacy apps, old emails
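Where Windows-1252 and Latin-1 disagree can be seen by decoding the same bytes with both. A minimal Python sketch, using Python's `cp1252` and `latin-1` codec names:

```python
# Bytes 0x80-0x9F are where the two encodings differ:
# Windows-1252 has printable characters there, Latin-1 has control codes.
for byte in (0x80, 0x93, 0x94):
    raw = bytes([byte])
    print(hex(byte), repr(raw.decode("cp1252")), repr(raw.decode("latin-1")))

# From 0xA0 upward the two encodings map every byte to the same character.
same_high_range = all(
    bytes([b]).decode("cp1252") == bytes([b]).decode("latin-1")
    for b in range(0xA0, 0x100)
)
print(same_high_range)  # True
```

This is why text labelled ISO-8859-1 but containing curly quotes or a Euro sign is usually Windows-1252 in practice.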
East Asian (CJK)
- Shift-JIS: Japanese Windows, most common
- EUC-JP: Japanese Unix/Linux
- ISO-2022-JP: Japanese email (JIS)
- GBK/GB2312: Simplified Chinese
- Big5: Traditional Chinese
- EUC-KR: Korean legacy
- Use for: Legacy Asian system integration
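When the source encoding of CJK data is unknown, one approach is to try each candidate and inspect the results. The sketch below assumes Python's standard codec names; `try_decodings` is a hypothetical helper, not this tool's implementation.

```python
def try_decodings(data: bytes, candidates: list[str]) -> dict[str, str]:
    """Decode data with each candidate encoding, keeping only the ones that succeed."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; skip it.
            continue
    return results


sample = "日本語".encode("shift_jis")
for name, text in try_decodings(sample, ["shift_jis", "euc_jp", "iso2022_jp"]).items():
    print(name, text)
```

A decode that succeeds is not proof of the right encoding, since some byte sequences are valid in several encodings. The output still needs to be checked by someone who can read the language.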
Cyrillic Family
- Windows-1251: Most common, Windows systems
- KOI8-R: Russian Internet, email, Unix
- KOI8-U: Ukrainian variant
- ISO-8859-5: Standard Cyrillic
- Use for: Russian/Ukrainian legacy data
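The Cyrillic encodings assign the same byte values to different letters, so a wrong guess often yields readable but wrong Cyrillic rather than obvious garbage. A small Python sketch of this, with `decode_with_each` as a hypothetical helper name:

```python
def decode_with_each(raw: bytes, encodings: tuple[str, ...]) -> dict[str, str]:
    """Decode the same bytes under several single-byte Cyrillic encodings."""
    # errors="replace" substitutes U+FFFD for any byte an encoding leaves undefined.
    return {name: raw.decode(name, errors="replace") for name in encodings}


word = "слово"
raw = word.encode("koi8_r")  # pretend this arrived from a KOI8-R system
for name, text in decode_with_each(raw, ("koi8_r", "cp1251", "iso8859_5")).items():
    print(name, text)
```

Only the `koi8_r` line reproduces the original word. The other two print different Cyrillic letters from the same five bytes.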
When to Use Each
- New projects: Always UTF-8
- Legacy migration: Source encoding → UTF-8
- Government/bank data: Check source system docs
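A legacy migration of the "source encoding → UTF-8" kind comes down to one decode and one encode. This is a minimal Python sketch, assuming the source encoding is already known; `transcode` is an illustrative name, not this tool's API.

```python
def transcode(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Convert bytes from a known source encoding to the target encoding."""
    # The default errors="strict" raises on invalid input instead of
    # silently writing "?" or U+FFFD into the migrated data.
    text = data.decode(source)
    return text.encode(target)


legacy = "Grüße".encode("cp1252")   # stand-in for bytes read from a legacy file
print(transcode(legacy, "cp1252"))  # UTF-8 bytes for the same text
```

Failing loudly on undecodable bytes is a deliberate choice here: during a migration, an exception is easier to notice and fix than replacement characters found later in production data.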