URL Encoding (Percent-Encoding), Explained
URLs can only safely carry a limited set of characters, and some of those (? & = / #) have jobs in the URL's own grammar. Percent-encoding is how everything else (spaces, ampersands inside values, non-English text) travels without breaking the structure.
The mechanics: %XX
Each unsafe byte is replaced by % followed by its two-digit hex value, using the character's UTF-8 bytes. A space becomes %20. An ampersand inside a value becomes %26. An é becomes %C3%A9: two bytes, two escapes.
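To make the byte-level rule concrete, here is a minimal sketch of a hand-rolled encoder. The name percentEncode is hypothetical, and the sketch assumes that only letters, digits, and - _ . ~ are safe; in real code you would normally reach for the built-in encodeURIComponent instead.

```javascript
// Hypothetical helper for illustration only: percent-encode a string by hand.
// Assumption: only letters, digits, and - _ . ~ pass through unescaped.
const SAFE = /^[A-Za-z0-9\-_.~]$/;

function percentEncode(text) {
  let out = "";
  // TextEncoder always produces UTF-8 bytes.
  for (const byte of new TextEncoder().encode(text)) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && SAFE.test(ch)) {
      out += ch; // safe ASCII character, copied as-is
    } else {
      // Any other byte becomes % plus two uppercase hex digits.
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode(" "));      // %20
console.log(percentEncode("&"));      // %26
console.log(percentEncode("\u00e9")); // %C3%A9 (é is two UTF-8 bytes)
```

The loop runs over bytes, not characters, which is why a single é produces two escapes.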
Reserved vs unreserved characters
- Unreserved (letters, digits, - _ . ~): never need encoding.
- Reserved (: / ? # [ ] @ ! $ & ' ( ) * + , ; =): legal in a URL but only in their structural roles; encode them when they appear inside a value.
- The classic failure: a search query like 'fish & chips' sent unencoded. The & silently splits it into two parameters.
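The 'fish & chips' failure is easy to reproduce. The sketch below parses the query with the standard URLSearchParams class; the parameter name q is an assumption made for the example.

```javascript
const query = "fish & chips";

// Unencoded: the & inside the value is treated as a parameter separator.
const broken = new URLSearchParams("q=" + query);
console.log([...broken.keys()]); // [ 'q', ' chips' ]
console.log(broken.get("q"));    // 'fish ' (the rest of the value is gone)

// Encoded: the & travels as %26 and the value arrives intact.
const fixed = new URLSearchParams("q=" + encodeURIComponent(query));
console.log(fixed.get("q"));     // 'fish & chips'
```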
encodeURI vs encodeURIComponent
JavaScript ships both. encodeURIComponent encodes every reserved character except ! ' ( ) *, so use it for individual values you insert into a query string. encodeURI leaves the URL's structural characters (such as / ? & = : #) intact, so it is only suitable for encoding a complete URL that is already structurally correct. Nine times out of ten, you want encodeURIComponent.
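A quick side-by-side shows the difference. The sample value and the example.com URL are made up for the demonstration.

```javascript
// A single value that happens to contain structural characters.
const value = "fish & chips/salt?";
const asComponent = encodeURIComponent(value);
const asUri = encodeURI(value);
console.log(asComponent); // fish%20%26%20chips%2Fsalt%3F
console.log(asUri);       // fish%20&%20chips/salt?  (& / ? left alone)

// A complete URL that is already structurally correct, apart from spaces.
const fullUrl = "https://example.com/search results?q=a b";
const wholeOk = encodeURI(fullUrl);
const wholeBroken = encodeURIComponent(fullUrl);
console.log(wholeOk);     // https://example.com/search%20results?q=a%20b
console.log(wholeBroken); // https%3A%2F%2Fexample.com%2Fsearch%20results%3Fq%3Da%20b
```

Running encodeURIComponent over a whole URL escapes the : / ? = that give it structure, so the result is no longer usable as a URL.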
Two bugs to avoid
- Double-encoding: encoding an already-encoded value turns %20 into %2520, which shows up as literal '%20' text in the destination page. Encode exactly once, at the moment you build the URL.
- Decoding too early: decode a query string before splitting on &, and any encoded %26 inside a value becomes a real & that corrupts the parse. Split first, then decode each part.
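Both bugs can be reproduced in a few lines. The parseQuery helper below is a hypothetical, deliberately small parser written for this illustration; it assumes plain percent-encoding and does not treat + as a space.

```javascript
// Bug 1: double-encoding.
const once = encodeURIComponent("a b"); // "a%20b"
const twice = encodeURIComponent(once); // "a%2520b" (the % itself got escaped)
console.log(decodeURIComponent(twice)); // "a%20b": literal %20 reaches the page

// Bug 2: decoding before splitting.
const raw = "q=fish%20%26%20chips&page=2";
const wrong = decodeURIComponent(raw).split("&");
console.log(wrong); // [ 'q=fish ', ' chips', 'page=2' ]: three parts, not two

// Split first, then decode each name and value on its own.
function parseQuery(queryString) {
  const result = {};
  for (const pair of queryString.split("&")) {
    if (pair === "") continue; // skip empty segments such as a trailing &
    const eq = pair.indexOf("=");
    const name = eq === -1 ? pair : pair.slice(0, eq);
    const val = eq === -1 ? "" : pair.slice(eq + 1);
    result[decodeURIComponent(name)] = decodeURIComponent(val);
  }
  return result;
}

console.log(parseQuery(raw)); // { q: 'fish & chips', page: '2' }
```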
Frequently asked questions
What is the difference between %20 and + for spaces?
%20 is the universal percent-encoding of a space. + means space only inside application/x-www-form-urlencoded data (classic HTML form submissions); elsewhere a + is a literal plus sign.
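In JavaScript the two conventions show up in different APIs. As a rough illustration, URLSearchParams follows the form-urlencoded rules while encodeURIComponent and decodeURIComponent stick to plain percent-encoding; the sample value is made up.

```javascript
// Form-style serialization: space becomes +, a literal plus becomes %2B.
const formStyle = new URLSearchParams({ q: "a b+c" }).toString();
console.log(formStyle);    // q=a+b%2Bc

// Plain percent-encoding: space becomes %20.
const percentStyle = "q=" + encodeURIComponent("a b+c");
console.log(percentStyle); // q=a%20b%2Bc

// Decoding differs in the same way.
const formDecoded = new URLSearchParams("q=a+b").get("q");
const plainDecoded = decodeURIComponent("a+b");
console.log(formDecoded);  // "a b"  (+ read as a space)
console.log(plainDecoded); // "a+b"  (+ stays a plus sign)
```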
Why does é turn into %C3%A9?
Percent-encoding works on bytes. é is two bytes in UTF-8 (0xC3 0xA9), so it encodes as two escapes.
When should I NOT encode?
Don't encode the structural characters of a URL you're assembling (the / between path segments, the ? and & of the query). Encode the values you slot into that structure.
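One way to keep the two apart is to let a small helper own the structure and encode only what gets slotted in. The buildUrl function, its parameters, and the example.com origin are all hypothetical; treat this as a sketch rather than a complete URL builder.

```javascript
// Hypothetical helper: the /, ? and & are written by the code itself,
// and only the caller-supplied pieces are percent-encoded.
function buildUrl(origin, segments, params) {
  const path = segments.map((s) => encodeURIComponent(s)).join("/");
  const query = Object.entries(params)
    .map(([k, v]) => encodeURIComponent(k) + "=" + encodeURIComponent(v))
    .join("&");
  return origin + "/" + path + (query ? "?" + query : "");
}

const built = buildUrl(
  "https://example.com",
  ["docs", "a/b", "fish & chips"],
  { q: "x=1&y=2", lang: "fr" }
);
console.log(built);
// https://example.com/docs/a%2Fb/fish%20%26%20chips?q=x%3D1%26y%3D2&lang=fr
```

The slash inside the "a/b" segment is data, so it is escaped as %2F, while the slashes joining the segments are structure and stay as they are.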