
Interword separation

Interword separation is the set of symbols or spacing conventions that the orthography of a script uses to separate words. According to Paul Saenger in Space Between Words, the early Semitic scripts, which had no letters for vowels, used interword separation, but scripts that wrote vowels (principally Greek and Latin) lost the separation and did not regain it until much later.

Types of separations

The ancient Anatolian hieroglyphs frequently (but not always) used vertical lines to separate words. Similarly, Linear B used short vertical lines. However, this practice mostly died out.

One reference implies that Phoenician originally used slashes and dots to mark word boundaries. It goes on to say that Hebrew and Aramaic scribes borrowed the slash and the dot, and that Aramaic scribes also used a space.

Ethiopic inscriptions used a vertical line, but on paper the separator was written as two dots, resembling a colon. This double-dot symbol also appears in ancient Turkic.

The Romans used the interpunct (a small dot) to separate words for a while before abandoning it.
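
As an illustration of how these separator marks delimit words, the following minimal Python sketch splits a line of text on a few of them. The choice of Unicode characters (MIDDLE DOT for the interpunct, ETHIOPIC WORDSPACE for the double dot, the ASCII bar for the vertical line) and the names SEPARATORS and split_words are assumptions made for this example, not something prescribed by the scripts themselves.

```python
import re

# Hypothetical table of separator marks; the code points are modern Unicode
# stand-ins chosen for illustration.
SEPARATORS = {
    "interpunct": "\u00B7",          # MIDDLE DOT, as in Roman inscriptions
    "ethiopic_wordspace": "\u1361",  # ETHIOPIC WORDSPACE, the two-dot mark
    "vertical_line": "|",            # stand-in for a vertical stroke
}

def split_words(text, separators=tuple(SEPARATORS.values())):
    """Split text on any run of separator marks or whitespace."""
    char_class = "".join(re.escape(s) for s in separators)
    pattern = "[" + char_class + r"\s]+"
    # Drop empty strings produced by leading or trailing separators.
    return [word for word in re.split(pattern, text) if word]

print(split_words("SENATVS\u00B7POPVLVSQVE\u00B7ROMANVS"))
```

The same function handles text that mixes marks and blank spaces, since whitespace is included in the pattern.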

Because Hebrew script and Arabic script do not normally write most vowels, it is particularly important to recognize word boundaries. While Hebrew and Arabic have always used spaces between words, some letters also have different shapes depending upon their position.

Five Hebrew letters take a different shape when they are at the end of a word. Arabic characters have up to three different shapes, depending upon whether they are at the beginning, middle, or end of a word. Additionally, characters can have yet another shape when they stand alone, as headings in an index.
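
The positional shapes described above can be sketched in Python. This is only an illustration: the table FINAL_FORMS and the functions apply_final_form and positional_form are names invented here. The Arabic part relies on Unicode's legacy "presentation form" code points, whose compatibility decompositions carry tags such as <initial> or <final>; ordinary Arabic text normally stores only the base letter and leaves the choice of shape to the rendering software.

```python
import unicodedata

# The five Hebrew letters with a distinct word-final shape:
# regular form -> final form.
FINAL_FORMS = {
    "\u05DB": "\u05DA",  # kaf   -> final kaf
    "\u05DE": "\u05DD",  # mem   -> final mem
    "\u05E0": "\u05DF",  # nun   -> final nun
    "\u05E4": "\u05E3",  # pe    -> final pe
    "\u05E6": "\u05E5",  # tsadi -> final tsadi
}

def apply_final_form(word):
    """Replace the last letter of a Hebrew word with its final form, if any."""
    if not word:
        return word
    return word[:-1] + FINAL_FORMS.get(word[-1], word[-1])

def positional_form(ch):
    """Return 'isolated', 'initial', 'medial' or 'final' for an Arabic
    presentation-form character, or None if the character carries no
    such compatibility tag."""
    decomposition = unicodedata.decomposition(ch)
    if decomposition.startswith("<"):
        return decomposition[1:decomposition.index(">")]
    return None

# Shin, lamed, vav, mem: the closing mem becomes final mem.
print(apply_final_form("\u05E9\u05DC\u05D5\u05DE"))

# The four presentation forms of the Arabic letter beh.
for code_point in (0xFE8F, 0xFE90, 0xFE91, 0xFE92):
    print(hex(code_point), positional_form(chr(code_point)))
```

A reader or program can use the same cue in reverse: a final-form letter is strong evidence that a word ends at that point.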

The Nastaliq version of the Arabic script also uses vertical space to separate words. The beginning of each word is written high up above the baseline, while the end of the word is low, near the baseline. (The line of text ends up looking a little bit like the teeth of a saw.) While Nastaliq script is sometimes used to write Arabic, it's more often used for Farsi, Uyghur, Pushtu, and Urdu.

Rediscovery of spaces in Latin

The Irish appear to have been the first to consistently use blank spaces to delimit word boundaries in the Latin alphabet, somewhere between about AD 600 and AD 800. (As Gaelic belongs to a different branch of the Indo-European language family than Latin, the Irish would have found Latin much harder to read than the Romans did, and so would have had a greater incentive to make reading it easier.)

See also Space (punctuation).