UTF-7

UTF-7 (7-bit Unicode Transformation Format) is a variable-length character encoding that represents Unicode text as a stream of ASCII characters, for use in MIME messages.

MIME technically requires that the encoding used to send email be ASCII, so, strictly speaking, an email sent in a Unicode encoding that uses bytes outside the ASCII range is invalid. In practice, this restriction is widely ignored. UTF-7 allows mail to carry Unicode text while still following the standards.

Description

UTF-7 was originally specified in RFC 1642, A Mail-Safe Transformation Format of Unicode, which has since been obsoleted by RFC 2152.

Characters in the ASCII range (below 0x80 in hexadecimal notation), except for the + character, are encoded as-is. Any character at or above 0x80 is encoded as an escape sequence: a + byte, followed by the character's big-endian UTF-16 representation encoded in Base64 (without the trailing = padding), and terminated by a - byte. A literal + character is encoded as +-.
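
The rules above can be illustrated with a short Python sketch. It follows only the simplified description given here: the function name encode_utf7_sketch is hypothetical, and the sketch passes every ASCII character other than + through unchanged, whereas RFC 2152 also requires a few ASCII characters (such as ~ and \) to be escaped. It should be read as an illustration, not a conforming encoder.

```python
import base64


def encode_utf7_sketch(text: str) -> str:
    """Encode text using the simplified UTF-7 rules described above.

    Assumption: every ASCII character except '+' is passed through as-is.
    """
    out = []   # finished pieces of the output
    run = []   # pending non-ASCII characters awaiting a Base64 escape

    def flush() -> None:
        # Emit the pending run as '+' + Base64(UTF-16BE) + '-'.
        if run:
            raw = "".join(run).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii").rstrip("=")
            out.append("+" + b64 + "-")
            run.clear()

    for ch in text:
        if ch == "+":
            flush()
            out.append("+-")      # a literal plus sign
        elif ord(ch) < 0x80:
            flush()
            out.append(ch)        # ASCII is copied unchanged
        else:
            run.append(ch)        # collect consecutive non-ASCII characters
    flush()
    return "".join(out)


print(encode_utf7_sketch("1 + 1"))   # 1 +- 1
print(encode_utf7_sketch("£1"))      # +AKM-1
```

Consecutive non-ASCII characters are grouped into a single escape sequence, so a run of such characters costs only one + and one - byte. Python also ships a built-in "utf-7" codec, which can decode the sketch's output to check it, for example with encode_utf7_sketch("£1").encode("ascii").decode("utf-7").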

Examples