
Ubykh language

Ubykh is a language of the Northwestern Caucasian group, which was spoken by the people of the same name up until the early 1990s. It is characterised, like most other Northwest Caucasian languages, by the following features:

Ubykh is known in the linguistic literature by many names: Ubykh (derived from Abdzakh Circassian wbkh) and variants Ubikh, Ubih (Turkish) and Oubykh (French), and Pekhi (derived from Ubykh twaqh) and its Germanicised variant Päkhy.

Table of contents
1 Phonetics
2 Grammar
3 Lexicon
4 Evolution
5 History


Phonetics

Unfortunately, since Ubykh is so consonantally complex, a satisfactory ASCII transcription for it is not yet in place. A phonemic transcription that can be used is as follows:


Open    a, aa
Close   ə (schwa)

Vowel Behaviour

Two (arguably three) basic phonemic vowels occur: /a/, /ə/ (schwa, as in English about) and /aa/, which, despite the phonemic notation, does not differ from /a/ in length.

Ubykh presents many vowel allophones, since so many consonants are present. Ten basic phonetic vowels appear, derived from the two phonemic vowels adjacent to labialised or palatalised consonants: a e i o u a: e: i: o: u:. These are the standard five vowels found in many of the world's languages, such as Georgian, together with the same five vowels with phonetic length. In general, the following rule applies: Cwa > Co; Cja > Ce; Cwə > Cu; and Cjə > Ci.
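The colouring rule can be written out as a small lookup table. The following Python sketch is only an illustration under stated assumptions: the close vowel (schwa) is written ə, the secondary articulation of the preceding consonant is passed in explicitly as "w" (labialised), "j" (palatalised) or None (plain), and the name surface_vowel is hypothetical. Vowel length and the nasalised allophones are not modelled.

```python
# Colouring of the two phonemic vowels by the secondary articulation of the
# preceding consonant: Cwa > Co; Cja > Ce; Cwə > Cu; Cjə > Ci.
# Assumption: schwa is written "ə"; all names here are illustrative only.
COLOURING = {
    ("w", "a"): "o",
    ("j", "a"): "e",
    ("w", "ə"): "u",
    ("j", "ə"): "i",
}


def surface_vowel(secondary, vowel):
    """Return the phonetic vowel heard after a consonant.

    secondary -- "w" (labialised), "j" (palatalised) or None (plain)
    vowel     -- phonemic "a" or "ə"
    """
    if vowel not in ("a", "ə"):
        raise ValueError(f"not a phonemic vowel: {vowel!r}")
    if secondary not in ("w", "j", None):
        raise ValueError(f"unknown secondary articulation: {secondary!r}")
    # Plain consonants leave the vowel unchanged.
    return COLOURING.get((secondary, vowel), vowel)


if __name__ == "__main__":
    for secondary in (None, "w", "j"):
        for vowel in ("a", "ə"):
            label = "C" + (secondary or "")
            print(f"{label}{vowel} > {label[0]}{surface_vowel(secondary, vowel)}")
```

Passing the secondary articulation as a separate argument avoids having to decide, from the spelling alone, whether a final w or j belongs to the consonant or is a consonant in its own right.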

Other, more complex vowels have been noted in Ubykh: on occasion, nasal sonorants (particularly n) may decay into vowel nasality. For instance, /naynsy/ young man has been noted as SAMPA /nE~ys'_w/, not /nayns'_w/ as the phonemic notation would indicate.

a appears initially very frequently, particularly in the function of the definite article. ə is extremely restricted initially, appearing only in doubly transitive verb forms where all three arguments are third person. Even then, ə itself may be dropped to provide an even shorter form:  =>  =>  (0)
He gave it to him

Both vowels appear without restriction word-finally, although when ə is unstressed in final position it tends to be dropped: twə father becomes the definite form atw the father.


                       Voiced  Voiceless  Ejective  Nasal  Approximant
Bilabial stop          b       p          p'        m      w
Phar. bilabial stop    b       p          p'        m      w
Bilabial fricative     v       f
Alveolar stop          d       t          t'        n      r
Alveolar fricative     z       s
Alveolar affricate     j       c          c' 
Alv. labialised stop   dw      tw          tw' 
Alveolar lateral               lh         lh'/l'           l
Postalveolar fric.     zh      sh                          y
Postalveolar affr.     jh      ch         ch' 
Postalv. lab. fric.    zhw     shw
Alveolopalatal fric.   zj      sj
Alveolopalatal affr.   jj      cj         cj' 
Alv-pal. lab. fric.    zy      sy
Alv-pal. lab. affric.  jy      cy         cy' 
Retroflex fric.        zr      sr
Retroflex affr.        jr      cr         cr' 
Velar stop             g *     k *        k' *
Velar fricative        g       k
Palatalised velar stop gj      kj         kj' 
Labialised velar stop  gw      kw         kw' 
Uvular stop                    q          q' 
Uvular fricative       gh      qh
Pal. uvular stop               qj         qj' 
Pal. uvular fric.      ghj     qhj
Labialised uvular stop         qw         qw' 
Lab. uvular fricative  ghw     qhw
Phar. uvular stop              q          q' 
Phar. uvular fric.     gh      qh
Phar. lab. uvular stop         qw         qw' 
Phar. lab. uv. fric.   ghw     qhw
Glottal                        h

* borrowed from Turkish and Circassian

A SAMPA-compliant rendition of the above is available in the Ubykh phonology article.

/r/ and /h/ are two of Ubykh's rarest consonant phonemes, freeing them for use as digraphic elements. Underlining marks a pharyngealised consonant, /h/ marks postalveolarisation and frication of dorsal sounds (a convention from English orthography), /r/ marks retroflexion (a convention from Vietnamese orthography) and apostrophe marks ejectivity.
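Because the transcription builds consonants out of digraphs and trigraphs, a transcribed word has to be split by taking the longest matching symbol at each position. The Python sketch below illustrates this under several assumptions: the symbol list is copied from the table above, underlining (pharyngealisation) is not represented because it does not survive in plain text, the schwa is written ə, and the names tokenize and cv_skeleton are hypothetical. The sample words akj'an and gaarga are forms cited elsewhere in this article.

```python
# Symbols copied from the consonant table; pharyngealised (underlined)
# consonants are not distinguished here. This is an illustrative sketch.
CONSONANTS = """
b p p' m w v f d t t' n r z s j c c' dw tw tw' lh lh' l' l
zh sh y jh ch ch' zhw shw zj sj jj cj cj' zy sy jy cy cy'
zr sr jr cr cr' g k k' gj kj kj' gw kw kw' q q' gh qh qj qj'
ghj qhj qw qw' ghw qhw h
""".split()
VOWELS = ["aa", "a", "ə"]  # assumption: schwa is written "ə"

# Longest symbols first, so that "kj'" wins over "kj" and "k".
SYMBOLS = sorted(set(CONSONANTS) | set(VOWELS), key=len, reverse=True)


def tokenize(word):
    """Split a transcribed word into symbols by greedy longest match."""
    tokens = []
    i = 0
    while i < len(word):
        for symbol in SYMBOLS:
            if word.startswith(symbol, i):
                tokens.append(symbol)
                i += len(symbol)
                break
        else:
            raise ValueError(f"no symbol matches at position {i} in {word!r}")
    return tokens


def cv_skeleton(word):
    """Return the consonant/vowel pattern of a word, e.g. 'VCVC'."""
    return "".join("V" if t in VOWELS else "C" for t in tokenize(word))


if __name__ == "__main__":
    print(" ".join(tokenize("akj'an")))   # a kj' a n
    print(cv_skeleton("akj'an"))          # VCVC
    print(" ".join(tokenize("gaarga")))   # g aa r g a
```

Greedy matching always prefers the digraph reading, so a sequence such as s followed by h would be read as sh; the rarity of /h/ and /r/ is what makes that choice mostly safe.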

Consonant Behaviour

Eighty-three basic consonants are noted. Bilabial, labiodental, alveolar, palatoalveolar, alveolopalatal, retroflex, velar, uvular and glottal consonants are all present. Labialisation is present on all classes barring the glottal, bilabial, labiodental and retroflex consonants; palatalisation may be noted on uvulars and velars. Eighty of the 83 consonants are found in native vocabulary. The plain velars [k' g k] are found only in loanwords: gaarga crow (from Turkish), kawar slat, batten (from Abdzakh Adyghe), mak'f estate, legacy. As well, the pharyngealised labial consonants p and p' are almost exclusively noted in words where they are associated with another pharyngealised consonant (for instance, q'aap'a handful), but are occasionally found outside this context (the verb root t'ap' is an example, meaning to explode, to burst).

Some consonants are extremely rare: g (fricative) is noted in the words adga Circassia and ga testis, and v is noted in just six words: va (four homophones meaning oak, to spy on, moustache and acorn), vacr'kj' meaning spark, vasra firebrand, ava thick (of fabric) and sp'ava coarse flour. The frequency of consonants in Ubykh is very variable; together, the two phonemes n and q' account for over 20% of the consonant phonemes encountered.

Far fewer allophones of consonants are noted, mainly because a small acoustic difference can be phonemic when so many consonants are involved. The alveolopalatal labialised fricatives were sometimes realised as alveolar labialised fricatives, and the uvular stop q had an allophonic glottal stop due to the influence of the Kabardian and Adyghe languages spoken in the same area.

All consonants can appear word-initially. Restrictions on word-final consonants have not yet been investigated; however, Ubykh has a preference for open syllables (CV) over closed ones (VC or CVC). The pharyngealised consonants m, w, p and p' have not been noted word-finally.


Grammar

Linguistically speaking, Ubykh is agglutinative and polysynthetic, meaning that many sentence components can be incorporated into one word:

we shall not be able to go back

a.w.q'a.q''.ba
3sg-OBJ.2sg-SUBJ.say.PAST.PERF.COND
if you had said it

Ubykh is often also extremely concise in its word forms. The word meaning they had given them to us - sjantwq'ayt' - is just two syllables in length, compared to seven in English.


The noun system in Ubykh is quite simple. Ubykh presents four noun cases (the oblique-ergative case may be argued to be two homophonous cases with differing function, thus presenting five cases in total). A pair of postpositions, -laaq and -ghaafa, have been noted as synthetic datives (cf. a.xj.laaq a.s.twad(ə).aw I will send it to the prince), but their status as cases is best discounted.

Nouns do not distinguish grammatical gender; feminine gender is distinguished in the verb paradigm only. The definite article is a-: att the man. There is no indefinite article, but za-(root)-gwara (literally one-(root)-certain) translates French un and Turkish bir: za.naynsy.gwara a certain young man.

Number is only marked directly on the noun in the ergative case, with -na. The number marking of the absolutive argument is either by usage of alternate roots (e.g. akwn blas he is in the car vs akwn blazhwa they are in the car) or by a verb suffix -aa: akj'an he goes, akj'aan they go. Interestingly, the appearance of the second person plural prefix sy- triggers this plural suffix regardless of whether that prefix represents the ergative, the absolutive or the oblique argument:

syastwaan    I give you all to him (absolutive)
ssyntwaan    he gives me to you all (oblique)
assytwaan    you all give it/them to me (ergative)

Note that in this last sentence, the plurality of it (a-) is obscured, and the meaning can be either you all give it to me or you all give them to me.

Adjectives, in most cases, are simply suffixed to the noun: chbzjya pepper with plh red becomes chbzjyaplh red pepper.

Postpositions are rare; most locative semantic functions, as well as some non-local ones, are provided with preverbal elements: a.w.s.qhja.tx.n you wrote it for me. However, there are a few postpositions: sghwa s.gjacr', like me; a.xj.laaq, near the prince.


A past-present-future distinction of verb tense exists (the suffixes -q'a and -awt represent past and future) and an imperfective aspect suffix is also found (-yt', which can combine with tense suffixes). Dynamic and stative verbs are contrasted, as in Arabic, and verbs have several nominal forms. Any verb root may be treated as a noun by using noun case-endings with it: a.q'a.n he spoke, s.q'a my speech.

Verbs agree with the subject, the direct object and the indirect object. Pronominal benefactives are also part of the verbal complex:
He gives it to you for me

Gender only appears as part of the second person paradigm, and then only at the speaker's discretion. The feminine second person index is qha-, which behaves exactly like all other pronominal prefixes:
He gives it to you (masc/fem) for me
He gives it to you (fem) for me

Note that the normal second person prefix w- can represent either a male or female referent, borne out by the fact that the free pronoun for second person singular is wghwa, and *qhaghwa is no longer used:

wghwa w.gjacr' 
like you (masc/fem)

wghwa qha.gjacr'
you 2sg(fem)
like you (fem)

*qhaghwa qha.gjacr'
??? 2sg(fem)

Questions may be marked grammatically, using verb suffixes or prefixes:

wana a.w.bya.q'
that 3sg-OBJ.2sg-SUBJ.see.PAST.QUESTION(yes/no)
Did you see that?

w.p'c'a.y
What is your name?

Other types of questions, involving the pronouns where and what, may also be marked only in the verbal complex:

Where are you going?

saa.w.q'a.q''.y
what.2sg-SUBJ.say.PAST.IMPF.QUESTION(complex)
What had you said?

Many local and other functions are provided by preverbal elements, and it is here that Ubykh becomes extremely complex. Some indicate simple direction: y- signifies towards the speaker:

You went

w.y.kj'a.q'a
2sg.hither.go.PAST
You came here

However, preverbs can have meanings that would take up entire phrases in English. The preverb ycy'aa signifies on the earth or in the earth, for instance:

gha.dya a.ycy'aa.naa.lh.q'a
They buried his body (lit. They put his body in the earth)

and faa signifies that an action is done out of, into or with regard to a fire:

a.mjja.n za.cjcjaqja faa.s.tqhw.n
fire.1sg-SUBJ.extract.PRES
I take a brand out of the fire 
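The examples in this section separate morphemes with full stops and give one gloss label per morpheme, with hyphens joining the parts of a single label (as in 2sg-SUBJ). As a rough illustration of that convention, the Python sketch below pairs each morpheme with its label; the function name align_gloss is hypothetical, and the sketch assumes that a form and its gloss contain the same number of words and of full-stop-separated pieces. It raises an error when they do not, which is also a quick way to spot an example whose segmentation and gloss have drifted apart.

```python
def align_gloss(form, gloss):
    """Pair each morpheme of a segmented form with its gloss label.

    Words are separated by spaces and morphemes by full stops, as in the
    examples above. Illustrative sketch only.
    """
    words = form.split()
    labels = gloss.split()
    if len(words) != len(labels):
        raise ValueError("form and gloss have different numbers of words")
    pairs = []
    for word, label in zip(words, labels):
        morphemes = word.split(".")
        parts = label.split(".")
        if len(morphemes) != len(parts):
            raise ValueError(
                f"{word!r} has {len(morphemes)} morphemes "
                f"but {len(parts)} gloss labels"
            )
        pairs.extend(zip(morphemes, parts))
    return pairs


if __name__ == "__main__":
    # "You came here", from the directional preverb example above.
    for morpheme, label in align_gloss("w.y.kj'a.q'a", "2sg.hither.go.PAST"):
        print(f"{morpheme:<6}{label}")
```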


Lexicon

Native Vocabulary

Ubykh syllables have a strong tendency to be CV, although VC and CVC also exist. Consonant clusters are not so large as in Abzhui Abkhaz or in Georgian, being almost always of two terms. Three-term clusters exist in two words - ndgha sun and psta to swell up - but psta is a loan from Adyghe, and ndgha is more often pronounced nədgha when it appears alone. Compounding plays a large part in Ubykh and, indeed, in all Northwest Caucasian semantics. There is no verb to love, for instance; one says I love you this way:

ch'a.n w.z.bya.n
good.ADV 2sg-OBJ.1sg-SUBJ.see.PRES
I see you well

Reduplication occurs in some roots, often those with onomatopoeic values (qhaqha to curry(comb) from qha to scrape; k'rk'r, to cluck like a chicken (a loan from Adyghe); warqwarq, to croak like a frog).

Roots and affixes can be as small as one phoneme; fə, for instance, the root for to eat, drops its final schwa in emphatic imperative forms (w.f eat it!) and in the past tense (a.s.f.q'a I ate it). However, some words may be as long as seven syllables (although these are usually compounds): shqw'awsjalhadcja staircase.

Foreign Loans

The majority of loanwords in Ubykh are derived from either Adyghe or Turkish. Towards the end of Ubykh's life, a large influx of Adyghe words was noted; Hans Vogt's Ubykh dictionary of some 3000 roots notes more than a hundred words of Abdzakh Adyghe origin. The phonemes g, k, and k' (all as stops, not fricatives) were borrowed from Turkish and Adyghe. lh' also appears to be an Adyghe loan, although at a greater time depth. It is possible, too, that g (fricative) is a loan from Adyghe.

The following words are all loans:

alaman   Germany, German (Turkish)
aslan    lion (Turkish)
brw      drill, auger (Turkish, via Abdzakh)
ga       testis (Adyghe)
gaarga   crow (Turkish)
kawar    slat, batten (Abdzakh)
k'rk'r   to cluck like a chicken (Adyghe)
wrs      Russia, Russian (Russian)

Some words are borrowed from less influential stock: xwa pig is believed to be borrowed from a proto-Semitic *huka, and agjar slave from an Iranian root.


Evolution

In the scheme of Northwest Caucasian evolution, Ubykh is the most divergent language of the Abkhaz-Abaza branch, and has a number of features which are unique even within that family. It has fossilised palatal class markers where all other Northwest Caucasian languages preserve traces of an original labial class: the Ubykh word for heart, gjə, corresponds to gwə (where ə stands for schwa), which is the word for heart in Abkhaz, Abaza, Kabardian and Adyghe.

Ubykh also possesses groups of pharyngealised consonants otherwise found in the Northwest Caucasian family only in some dialects of Abkhaz and Abaza. All other NWC languages possess true pharyngeal consonants, but Ubykh is the only language to use pharyngealisation as a feature of secondary articulation.

With regard to the other languages of the family, Ubykh is closer to Abkhaz than to any other member, but is quite close, both lexically and grammatically, to Adyghe.


History

Ubykh was spoken in the Caucasus Mountains of Georgia until 1875, when the Russian Tsar of the time drove the Ubykhs out of Georgia. The Ubykh people eventually came to settle in Turkey, but the process of language death had already begun. Turkish and Circassian became the lingue franche of the Ubykh people, since these were the languages far more likely to be used in everyday exchanges. Ubykh was eventually spoken only in the household, then only by the elders of the people. A large influx of loan words entered Ubykh from both Turkish and Circassian as a result of the use of those languages. Finally, on the 7th of October, 1992, the Ubykh language died, when its last fluent speaker - a farmer named Tevfik Esenç - passed away in his sleep.

Fortunately, thousands of pages of material and many audio recordings had been collected and collated by a number of linguists, including Georges Dumézil, Hans Vogt and George Hewitt, with the help of some of the last speakers.

Julius von Mészáros, a Hungarian linguist, visited Turkey in 1930 to take down some material on Ubykh. Although his phonemic transcription was inadequate for representing all the phonemes of Ubykh, his book Die Päkhy-Sprache was extensive, and accurate inasmuch as his transcription allowed, and it set the foundation for Ubykh linguistics. The Frenchman Georges Dumézil also visited Turkey in 1930 to record some Ubykh, and would eventually become the most celebrated linguist of Ubykh of all time.

Dumézil published a collection of Ubykh folktales in the late 1950s, and soon thereafter the realisation that Ubykh had only two phonemic vowels became widely accepted. Hans Vogt, a Norwegian, followed Dumézil's work on myth-telling with a monumental dictionary. While the dictionary contained many errors, it is still one of the masterpieces of Ubykh linguistics and, along with a corrigendum written by Dumézil which appeared a few years later, remains a vital tool in the study of Ubykh.

Later in the 1960s and into the early 1970s, Dumézil began to concern himself with the etymology of Ubykh, publishing a series of notes in various journals on Ubykh etymology in particular and Northwest Caucasian etymology in general. Finally, in 1975, the monolith of Ubykh linguistics appeared, Dumézil's Le Verbe Oubykh, which gave a comprehensive account of the verbal and nominal morphology of the language.

Into the 1980s - and indeed, until today - Ubykh linguistics has slowed drastically. No other major treatises have been published; however, one Dutch linguist is currently trying to compile a new Ubykh dictionary incorporating the corrections to Vogt's dictionary of 1963, and a separate project is also underway in Australia. Some of the Ubykh people are also showing interest in relearning their difficult language.


Most material collated on Ubykh was printed in French. The following books are considered to be the foundation of Ubykh linguistics:

People who have published literature on Ubykh include

thefamouseccles, the author of this page, is learning Ubykh. If anyone's interested in joining him, or in knowing more about this language, he's happy to answer questions about it.