URL Encoding (Percent-Encoding), Explained

URLs can only safely carry a limited set of characters, and some of those (? & = / #) have jobs in the URL's own grammar. Percent-encoding is how everything else (spaces, ampersands inside values, non-ASCII text) travels without breaking the structure.

The mechanics: %XX

Each character that needs escaping is converted to its UTF-8 bytes, and each byte is written as % followed by its two-digit hexadecimal value. A space becomes %20. An ampersand inside a value becomes %26. An é becomes %C3%A9: two bytes, two escapes.
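The byte-by-byte rule can be sketched in a few lines of JavaScript. This is only an illustration: percentEncode is a hypothetical helper, not a built-in, and it escapes everything outside the unreserved set, which is slightly stricter than the built-in encodeURIComponent.

```javascript
// Hypothetical helper (not a built-in): percent-encode every byte
// that is not an unreserved character (letters, digits, - _ . ~).
const UNRESERVED = /^[A-Za-z0-9\-_.~]$/;

function percentEncode(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes of the string
  let out = "";
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && UNRESERVED.test(ch)) {
      out += ch; // safe as-is
    } else {
      // % followed by the byte's two-digit uppercase hex value
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("a b"));    // a%20b
console.log(percentEncode("&"));      // %26
console.log(percentEncode("\u00E9")); // %C3%A9 (é is two UTF-8 bytes)
```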

Reserved vs unreserved characters

  • Unreserved: letters, digits, and - _ . ~ never need encoding.
  • Reserved: the characters : / ? # [ ] @ ! $ & ' ( ) * + , ; = are legal in a URL, but only in their structural roles; encode them when they appear inside a value.
  • The classic failure: a search query like 'fish & chips' sent unencoded. The & silently splits it into two parameters.
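The 'fish & chips' failure is easy to reproduce. The sketch below assumes a runtime with the standard URLSearchParams class (browsers, Node.js); the parameter name q is made up for the example.

```javascript
const query = "fish & chips";

// Unencoded: the & inside the value is read as a parameter separator.
const broken = new URLSearchParams("q=" + query);
console.log(broken.get("q"));    // "fish " (the rest of the value is gone)
console.log([...broken.keys()]); // [ "q", " chips" ] (a stray second parameter)

// Encoded: the & travels as %26 and the value survives intact.
const fixed = new URLSearchParams("q=" + encodeURIComponent(query));
console.log(fixed.get("q"));     // "fish & chips"
```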

encodeURI vs encodeURIComponent

JavaScript ships both. encodeURIComponent encodes nearly every reserved character (it leaves only ! ' ( ) * untouched), so use it for individual values you insert into a query string. encodeURI leaves the URL's structural characters (such as / ? & = #) intact, so it is only suitable for a complete URL that is already structurally correct. Nine times out of ten, you want encodeURIComponent.
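A side-by-side run makes the difference concrete. The sample value and the example.com URL are invented for illustration.

```javascript
// A value that happens to contain structural characters.
const value = "a/b?c=d&e";

const asComponent = encodeURIComponent(value);
const asUri = encodeURI(value);

console.log(asComponent); // a%2Fb%3Fc%3Dd%26e  (safe to drop into a query string)
console.log(asUri);       // a/b?c=d&e          (structural characters untouched)

// encodeURI fits a whole URL whose structure is already right
// and only needs stray characters such as spaces escaped.
const wholeUrl = encodeURI("https://example.com/search results?q=1");
console.log(wholeUrl);    // https://example.com/search%20results?q=1
```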

Two bugs to avoid

  • Double-encoding: encoding an already-encoded value turns %20 into %2520, which shows up as literal '%20' text in the destination page. Encode exactly once, at the moment you build the URL.
  • Decoding too early: decode a query string before splitting on &, and any encoded %26 inside a value becomes a real & that corrupts the parse. Split first, then decode each part.
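Both bugs can be shown in a few lines. In this sketch, parseQuery is a hypothetical hand-rolled parser written only to show the order of operations. It assumes plain percent-encoding and does not treat + as a space; in real code the built-in URLSearchParams is usually the better choice.

```javascript
// Bug 1: double-encoding. The % of %20 is itself escaped as %25.
const once = encodeURIComponent("fish & chips");
const twice = encodeURIComponent(once);
console.log(once);  // fish%20%26%20chips
console.log(twice); // fish%2520%2526%2520chips

// Bug 2: decoding before splitting. The %26 becomes a real separator.
const raw = "q=fish%20%26%20chips&page=2";
const wrong = decodeURIComponent(raw).split("&");
console.log(wrong); // [ "q=fish ", " chips", "page=2" ]

// Split first, then decode each name and value on its own.
function parseQuery(queryString) {
  return queryString.split("&").map((pair) => {
    const i = pair.indexOf("=");
    const name = i === -1 ? pair : pair.slice(0, i);
    const value = i === -1 ? "" : pair.slice(i + 1);
    return [decodeURIComponent(name), decodeURIComponent(value)];
  });
}

const right = parseQuery(raw);
console.log(right); // [ [ "q", "fish & chips" ], [ "page", "2" ] ]
```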

Frequently asked questions

What is the difference between %20 and + for spaces?

%20 is the universal percent-encoding of a space. + means space only inside application/x-www-form-urlencoded data (classic HTML form submissions); elsewhere a + is a literal plus sign.
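In JavaScript the two conventions map onto different built-ins: URLSearchParams follows the form-urlencoded rules, while encodeURIComponent and decodeURIComponent use plain percent-encoding. A short sketch, with a made-up parameter name q:

```javascript
// Form-style serialization: space becomes +, a literal + becomes %2B.
const formStyle = new URLSearchParams({ q: "a b+c" }).toString();
console.log(formStyle);    // q=a+b%2Bc

// Plain percent-encoding: space becomes %20.
const percentStyle = encodeURIComponent("a b+c");
console.log(percentStyle); // a%20b%2Bc

// Decoding differs too: decodeURIComponent leaves + alone,
// while URLSearchParams reads it as a space.
const plainDecoded = decodeURIComponent("a+b");
const formDecoded = new URLSearchParams("q=a+b").get("q");
console.log(plainDecoded); // a+b
console.log(formDecoded);  // a b
```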

Why does é turn into %C3%A9?

Percent-encoding works on bytes. é is two bytes in UTF-8 (0xC3 0xA9), so it encodes as two escapes.

When should I NOT encode?

Don't encode the structural characters of a URL you're assembling (the / between path segments, the ? and & of the query). Encode the values you slot into that structure.
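One way to keep the two apart is to encode each value on its own and add the structural characters as literals. The helper below is a hypothetical sketch (buildUrl, its parameters, and example.com are all invented for the example), not a replacement for the built-in URL class.

```javascript
// Hypothetical helper: encode each path segment and each query value
// separately, then join them with literal / ? & = characters.
function buildUrl(base, segments, params) {
  const path = segments.map((s) => encodeURIComponent(s)).join("/");
  const query = Object.entries(params)
    .map(([name, value]) => `${encodeURIComponent(name)}=${encodeURIComponent(value)}`)
    .join("&");
  return `${base}/${path}?${query}`;
}

const built = buildUrl(
  "https://example.com",
  ["docs", "a/b"],                  // the second segment contains a slash
  { q: "fish & chips", lang: "en" }
);
console.log(built);
// https://example.com/docs/a%2Fb?q=fish%20%26%20chips&lang=en
```

The slash inside the segment "a/b" is escaped because it is data, while the slashes between segments are left alone because they are structure.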