Welsh orthography

Welsh orthography uses 29 letters (including eight digraphs) of the Latin script to write native Welsh words as well as established loanwords. [1] [2]

Majuscule forms (also called uppercase or capital letters)
A B C CH D DD E F FF G NG H I J L LL M N O P PH R RH S T TH U W Y
Titlecase forms
A B C Ch D Dd E F Ff G Ng H I J L Ll M N O P Ph R Rh S T Th U W Y
Minuscule forms (also called lowercase or small letters)
a b c ch d dd e f ff g ng h i j l ll m n o p ph r rh s t th u w y

Welsh orthography makes use of multiple diacritics, which are primarily used on vowels, namely the acute accent (acen ddyrchafedig), the grave accent (acen ddisgynedig), the circumflex (acen grom, to bach, or hirnod) and the diaeresis (didolnod). They are considered variants of their base letter, i.e. they are not alphabetised separately.

The letter j has only recently been accepted into Welsh orthography, for use in words borrowed from English that retain the /dʒ/ sound, even where English does not spell that sound with j, as in garej ("garage") and ffrij ("fridge"). In older borrowings of English words containing /dʒ/, the sound was pronounced and spelled in various other ways, which produced occasional doublets such as Siapan and Japan ("Japan"). [lower-alpha 1]

The letters k, q, v, x, z are sometimes used in technical terms, like kilogram, volt and zero, but in all cases can be, and often are, nativised: cilogram, folt and sero. [3]

History

A 19th-century Welsh alphabet printed in Welsh, without j or rh

The earliest samples of written Welsh date from the 6th century and are in the Latin alphabet (see Old Welsh). The orthography differs from that of modern Welsh, particularly in the use of p, t, c to represent the voiced plosives /b, d, ɡ/ in non-initial position. Similarly, the voiced fricatives /v, ð/ were written b, d. [4]

By the Middle Welsh period, this had given way to considerable variability: although b, d, g were now used to represent /b, d, ɡ/, these sounds were also often written as in Old Welsh, while /v/ could be denoted by u, v, ỽ, f or w. In earlier manuscripts, moreover, fricatives were often not distinguished from plosives (e.g. t for /θ/, now written th). [5] The grapheme k was also used, unlike in the modern alphabet, particularly before front vowels. [4] The disuse of this letter is at least partly due to the publication of William Salesbury's Welsh New Testament and William Morgan's Welsh Bible. Their English printers, whose type was cast in the letter frequencies of English and Latin, did not have enough k sorts in their type cases to spell every /k/ as k, so the order went "C for K, because the printers have not so many as the Welsh requireth"; [6] this was not liked at the time, but has become standard usage.

In this period, ð (capital Ð) was also used interchangeably with dd, as in this passage from the 1567 New Testament, which contains both ð and dd: A Dyw y sych ymaith yr oll ðeigre oddiwrth y llygeid. Elsewhere, the same word is spelt in different ways, e.g. newydd and newyð. [7]

The printer and publisher Lewis Jones, one of the co-founders of Y Wladfa, the Welsh-speaking settlement in Patagonia, favoured a limited spelling reform that replaced Welsh f /v/ and ff /f/ with v and f. From circa 1866 to 1886 Jones employed this innovation in a number of newspapers and periodicals that he published or edited in the colony. [4] However, the only real relic of this practice today is the Patagonian placename Trevelin ("mill town"), which in standard Welsh orthography would be Trefelin.

In 1928, a committee chaired by Sir John Morris-Jones standardised the orthography of modern Welsh.

In 1987, a committee chaired by Professor Stephen J. Williams made further small changes, introducing j. Not all modern writers adhere to the conventions established by these committees. [8]

Letter names and sound values

"N" and "S" indicate variants specific to the northern and southern dialects of Welsh. Throughout Wales an alternative system is also in use in which all consonant letters are named using the corresponding consonant sound plus a schwa (e.g. cy/kə/ for èc). In this system the vowels are named as below.

| Letter | Name | Corresponding sounds | English approximation |
|---|---|---|---|
| a | a | /a, ɑː/ | cat (short) / father (long) |
| b | bi | /b/ | bat |
| c | èc | /k/ | case |
| ch | èch | /χ/ | No English equivalent; similar to loch in Scottish, but pronounced further back. |
| d [* 1] | di | /d/ | day |
| dd | èdd | /ð/ | these |
| e | e | /ɛ, eː/ | bed (short) / closest to hey (long) |
| f | èf | /v/ | of |
| ff | èff | /f/ | four |
| g | èg | /ɡ/ | gate |
| ng | èng | /ŋ/ | thing |
| h [* 2] | aets | /h/ | hat |
| i | i, i dot (S) | /ɪ, iː, j/ | bit (short) / machine (long) / yes (as consonant; before vowels) |
| j | je | /d͡ʒ/ | jump (only found in loanwords, usually from English, but still in wide use, such as jeli ('jelly', IPA: [dʒɛlɪ]) and jîns ('jeans', IPA: [dʒɪnz])) |
| l | èl | /l/ | lad |
| ll | èll | /ɬ/ | Not present in English; a voiceless alveolar lateral fricative. A bit like what the consonant cluster "hl" would sound like. |
| m | èm | /m/ | mat |
| n | èn | /n/ | net |
| o | o | /ɔ, oː/ | Short, like "bog" in RP; long, like dawn in RP or stove in Scottish English |
| p | pi | /p/ | pet |
| ph | ffi | /f/ | phone |
| r | èr | /r/ | Rolled R |
| rh | rhi | /r̥/ | Voiceless rolled R |
| s [* 1] | ès | /s/ | sat |
| t [* 1] | ti | /t/ | stick |
| th | èth | /θ/ | thin |
| u | u (N), u bedol (S) | /ɨ̞, ɨː/ (N), [* 3] /ɪ, iː/ (S) | For Southern variants: bit (short) / machine (long); in Northern dialects /ɨ̞, ɨː/, not found in English. Identical to "î" and "â" in Romanian, and similar to the "e" in English roses. |
| w | w | /ʊ, uː, w/ | push (short) / pool (long) / wet (as consonant) |
| y [* 4] | | /ɨ̞, ɨː, ə/ (N), [* 3] /ɪ, iː, ə, əː/ (S) | For Southern variants: bit (final syllable, short) / machine (final syllable, long); above (other places, short). Northern /ɨ̞, ɨː/ as in roses, found in certain dialects of English that differentiate "Rosa's" and "roses", for example General American. |
Notes
  1. The sequence si indicates /ʃ/ when followed by a vowel; similarly, di and ti sometimes indicate /dʒ/ and /tʃ/ respectively when followed by a vowel, although these sounds are spelled j and ts in loanwords like jẁg "jug" and wats "watch".
  2. In addition to representing the phoneme /h/, h indicates voicelessness in the graphemes mh, nh, ngh and rh. The digraph ph, which indicates the aspirate mutation of p (e.g. ei phen-ôl), may also be found very occasionally in words derived from Greek (e.g. Pharo), although most words of Greek origin are spelt with ff (e.g. ffotograff).
  3. In the North, the letters u and y are occasionally pronounced /ɪ, iː/, the same as in the South, rather than /ɨ̞, ɨː/. This is usually the case when the preceding vowel is /ɪ/ or when y is preceded or followed by g /ɡ/ or followed by w /u/, forming a diphthong. Source: "Morffoleg y Gymraeg". Geiriadur yr Academi. Bangor University. Retrieved 25 July 2014.
  4. The vowel letter y indicates /ə/ in unstressed monosyllabic words (e.g. y "the", fy "my") or non-final syllables (regardless of whether these are stressed or not), but /ɨ̞, ɨː/ (N) or /ɪ, iː/ (S) in word-final syllables (again, regardless of stress).

Diphthongs

| Orthography | Northern dialects | Southern dialects | English (approximation only) |
|---|---|---|---|
| ae | /ɑːɨ̯/, /eːɨ̯/ | /ai̯/, /ɛi̯/ | eye, may |
| ai | /ai̯/ | /ai̯/ | eye |
| au | /aɨ̯/, /a/ | /ai̯/, /ɛ/ | eye. Realised as bet (south) and cat (north) in plural endings. |
| aw | /au̯, ɑːu̯/ | /au̯/ | how |
| ei | /ɛi̯/ | /ɛi̯/ | As in eight |
| eu | /əɨ̯/ | /əi̯/ | As in height |
| ew | /ɛu̯, eːu̯/ | /ɛu̯/ | Roughly like Edward with the d removed: E'ward, or Cockney pronunciation of -ell in words like well, hell. |
| ey | /e.ɨ̯/ | /e.ɪ/ | Two distinct vowels. |
| iw | /ɪu̯/ | /ɪu̯/ | Not usually present in English except in the interjection Ew!; closest to 'i-oo' (short i). A small number of English dialects have this sound in words that have "ew" or "ue". Such words, in the majority of English dialects that distinguish ew/ue and oo, would usually have /juː/ instead. See the Phonological history of English consonant clusters article for more information. |
| oe | /ɔɨ̯, ɔːɨ̯/ | /ɔi̯/ | boy |
| oi | /ɔi̯/ | /ɔi̯/ | boy |
| ou | /ɔɨ̯, ɔːɨ̯/ | /ɔi̯/ | boy |
| ow | /ɔu̯/ | /ɔu̯/ | goal |
| uw | /ɨu̯/ | /ɪu̯/ | Not present in English; closest to 'i-oo' (short i) |
| wy | /ʊ̯ɨ, u̯ɨ/ | /ʊ̯i/ | Not present in English; closest to gooey |
| yw | /ɨu̯, əu̯/ | /ɪu̯, əu̯/ | /ɪu̯/: not present in English; closest to 'i-oo' (short i). /əu/: like "goat" in Received Pronunciation or like "house" in Canadian English |

Diacritics

Welsh makes use of a number of diacritics.

The circumflex (ˆ) is mostly used to mark long vowels, so â, ê, î, ô, û, ŵ, ŷ are always long. However, not all long vowels are marked with a circumflex, so the letters a, e, i, o, u, w, y with no circumflex do not necessarily represent short vowels; see § Predicting vowel length from orthography.

The grave accent (`) is sometimes used, usually in words borrowed from another language, to mark vowels that are short when a long vowel would normally be expected, e.g. pas /paːs/ (a cough), pàs /pas/ (a pass/permit or a lift in a car); mwg /muːɡ/ (smoke), mẁg /mʊɡ/ (a mug).

The acute accent (´) is sometimes used to mark a stressed final syllable in a polysyllabic word. Thus the words gwacáu (to empty) and dicléin (decline) have final stress. However, not all polysyllabic words with final stress are marked with the acute accent (Cymraeg "Welsh" and ymlaen "forward/onward", for example, are written with none). The acute may also be used to indicate that a letter w represents a vowel where a glide might otherwise be expected, e.g. gẃraidd /ˈɡʊ.raið/ (two syllables) "manly", as opposed to gwraidd /ˈɡwraið/ (one syllable) "root".

Similarly, the diaeresis (¨) is used to indicate that two adjoining vowels are to be pronounced separately (not as a diphthong). However, it is also used to show that the letter i is used to represent the cluster /ij/ which is always followed by another vowel, e.g. copïo (to copy) pronounced /kɔ.ˈpi.jɔ/, not */ˈkɔp.jɔ/.

The grave and acute accents in particular are very often omitted in casual writing, and the same is true to a lesser extent of the diaeresis. The circumflex, however, is usually included. Accented vowels are not considered distinct letters for the purpose of collation.
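Because accented vowels collate with their base letters, software that sorts or compares Welsh words often needs an accent-free form. The following is a minimal sketch in Python, assuming the input is ordinary Unicode text; the function name `strip_diacritics` is invented for this illustration and is not part of any standard.

```python
import unicodedata


def strip_diacritics(word: str) -> str:
    """Return the word with all combining marks removed."""
    # NFD splits a precomposed letter such as "ŵ" into "w" + U+0302.
    decomposed = unicodedata.normalize("NFD", word)
    # Drop every combining mark (circumflex, grave, acute, diaeresis, ...).
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", base)


words = ["pàs", "mẁg", "gwacáu", "gẃraidd", "copïo", "jîns"]
print([strip_diacritics(w) for w in words])
# ['pas', 'mwg', 'gwacau', 'gwraidd', 'copio', 'jins']
```

The stripped form is suitable only as a comparison key; the accented spelling should still be the one displayed, since pairs such as pas and pàs are different words.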

Predicting vowel length from orthography

As mentioned above, vowels marked with the circumflex are always long, and those marked with the grave accent are always short. If a vowel is not marked with a diacritic, its length must be determined by its environment; the rules vary a bit according to dialect. [9] [10]
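The part of this rule that depends only on the diacritic can be sketched in a few lines of Python. This is an illustration under the assumption that the input is a single vowel letter; the function name `marked_length` and its return labels are hypothetical, and the environment-based rules for unmarked vowels are deliberately not attempted.

```python
import unicodedata

CIRCUMFLEX = "\u0302"  # combining circumflex accent
GRAVE = "\u0300"       # combining grave accent


def marked_length(vowel: str) -> str:
    """Classify one vowel letter as 'long', 'short' or 'unmarked'."""
    decomposed = unicodedata.normalize("NFD", vowel)
    marks = [ch for ch in decomposed if unicodedata.combining(ch)]
    if CIRCUMFLEX in marks:
        return "long"    # a circumflex always marks a long vowel
    if GRAVE in marks:
        return "short"   # a grave accent always marks a short vowel
    # Acute and diaeresis carry no length information, so they fall through.
    return "unmarked"    # length depends on environment and dialect


print(marked_length("â"), marked_length("ẁ"), marked_length("a"))
# long short unmarked
```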

In all dialects, only stressed vowels may be long; unstressed vowels are always short.

An unmarked (stressed) vowel is long:

An unmarked vowel is short:

The vowel y, when it is pronounced /ə/, is always short even when it appears in an environment where other vowels would be long: cyfan (whole) /ˈkəvan/. When pronounced as a close or near-close vowel (/ɨ/ or /ɨ̞/ in the North, /i/ or /ɪ/ in the South), y follows the same rules as other vowels: dydd (day) /ˈdɨːð/ (North) ~ /ˈdiːð/ (South), gwynt (wind) /ˈɡwɨ̞nt/ (North) ~ /ˈɡwɪnt/ (South).

Before l, m, n, and r, unmarked vowels are long in some words and short in others:

| vowel | long | short |
|---|---|---|
| i | gwin /ɡwiːn/ (wine) | prin /prɪn/ (scarcely) |
| e | hen /heːn/ (old) | pen /pɛn/ (head) |
| y | dyn /dɨːn/ ~ /diːn/ (man) | gwyn /ɡwɨ̞n/ ~ /ɡwɪn/ (white) |
| w | stwmo /ˈstuːmo/ (bank up a fire) | amal /ˈamal/ (often) |
| e | celyn /ˈkeːlɪn/ (holly) | calon /ˈkalɔn/ (heart) |

(The last four examples are given in South Welsh pronunciation only since vowels in nonfinal syllables are always short in North Welsh.)

Before nn and rr, vowels are always short: onn /ˈɔn/ (ash trees), ennill /ˈɛnɪɬ/ (to win), carreg /ˈkarɛɡ/ (stone).

In Northern dialects, long vowels are stressed and appear in the final syllable of the word. Vowels in non-final syllables are always short. In addition to the rules above, a vowel is long in the North before a consonant cluster beginning with s: tyst /tɨːst/ (witness). Before ll, a vowel is short when no consonant follows the ll: gwell /ɡwɛɬ/ (better). It is long when another consonant does follow the ll: gwallt /ɡwɑːɬt/ (hair).

In Southern dialects, long vowels may appear in a stressed penultimate syllable as well as in a stressed word-final syllable. Before ll, a stressed vowel in the last syllable can be either long (e.g. gwell "better" /ɡweːɬ/) or short (e.g. twll "hole" /tʊɬ/). However, a stressed vowel in the penult before ll is always short: dillad /ˈdɪɬad/ (clothes). Before s, a stressed vowel in the last syllable is long, as mentioned above, but a stressed vowel in the penult is short: mesur /ˈmɛsir/ (measure). Vowels are always short before consonant clusters: sant /sant/ (saint), gwallt /ɡwaɬt/ (hair), tyst /tɪst/ (witness).

Digraphs

The Middle-Welsh LL ligature.
Unicode: U+1EFA and U+1EFB.

While the digraphs ch, dd, ff, ng, ll, ph, rh, th are each written with two symbols, they are all considered to be single letters. This means, for example, that Llanelli (a town in South Wales) is considered to have only six letters in Welsh, compared to eight letters in English. Consequently, each digraph takes up only a single space in Welsh crosswords. Ll itself was written as a ligature in Middle Welsh.
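The letter count can be illustrated with a short Python sketch that splits a word into Welsh letters by greedily matching digraphs. The function name `welsh_letters` is invented for this example, and the greedy approach is an assumption that treats every ng as a single letter, which is not always correct.

```python
import unicodedata

# The eight digraphs that count as single letters.
DIGRAPHS = ("ch", "dd", "ff", "ng", "ll", "ph", "rh", "th")


def welsh_letters(word: str) -> list[str]:
    """Split a word into Welsh letters, matching digraphs greedily."""
    # NFC keeps an accented vowel as one code point, so it stays one letter.
    text = unicodedata.normalize("NFC", word)
    letters = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair.lower() in DIGRAPHS:
            letters.append(pair)     # two symbols, one letter
            i += 2
        else:
            letters.append(text[i])
            i += 1
    return letters


print(welsh_letters("Llanelli"))       # ['Ll', 'a', 'n', 'e', 'll', 'i']
print(len(welsh_letters("Llanelli")))  # 6
print(len("Llanelli"))                 # 8
```

A word in which n and g are separate letters would be miscounted by this sketch; distinguishing the two cases needs a word list or other lexical information.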

Sorting is done in correspondence with the alphabet. For example, la comes before ly, which comes before lla, which comes before ma. Automated sorting may occasionally be complicated by the fact that additional information may be needed to distinguish a genuine digraph from a juxtaposition of letters; for example llom comes after llong (in which the ng stands for /ŋ/) but before llongyfarch (in which n and g are pronounced separately as /ŋɡ/).
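One way to approximate this ordering in software is a sort key that maps each Welsh letter to its position in the alphabet. The Python sketch below is an illustration only: the names `sort_key` and `split_ng` are hypothetical, and the set of words in which n and g are separate letters has to be supplied by the caller, since it cannot be derived from spelling alone.

```python
import unicodedata

ALPHABET = ("a b c ch d dd e f ff g ng h i j l ll m n o p ph "
            "r rh s t th u w y").split()
RANK = {letter: i for i, letter in enumerate(ALPHABET)}
DIGRAPHS = {letter for letter in ALPHABET if len(letter) == 2}


def sort_key(word, split_ng=frozenset()):
    """Return a list of alphabet positions usable as a sort key."""
    lowered = word.lower()
    ng_is_letter = lowered not in split_ng
    # Accented vowels collate with their base letters.
    decomposed = unicodedata.normalize("NFD", lowered)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    key = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS and (pair != "ng" or ng_is_letter):
            letter, i = pair, i + 2
        else:
            letter, i = text[i], i + 1
        if letter in RANK:
            key.append(RANK[letter])
        else:
            # Assumption: characters outside the alphabet sort after it.
            key.append(len(ALPHABET) + ord(letter))
    return key


print(sorted(["ma", "lla", "ly", "la"], key=sort_key))
# ['la', 'ly', 'lla', 'ma']

SPLIT = {"llongyfarch"}  # n and g are separate letters in this word
print(sorted(["llongyfarch", "llom", "llong"],
             key=lambda w: sort_key(w, SPLIT)))
# ['llong', 'llom', 'llongyfarch']
```

Production systems would normally rely on a locale-aware collation library rather than a hand-written key; the sketch only makes the ordering rule concrete.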

Although the digraphs above are considered to be single letters, only their first component letter is capitalised when a word in lower case requires an initial capital letter. Thus:

Llandudno, Ffestiniog, Rhuthun, etc. (place names)
Llŷr, Rhian, etc. (personal names)
Rhedeg busnes dw i. Llyfrgellydd ydy hi. (other sentences starting with a digraph)

The two letters in a digraph are both capitalised only when the whole word is in uppercase:

LLANDUDNO, LLANELLI, Y RHYL (as on a poster or sign)
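The two capitalisation rules can be made concrete with a small Python sketch that derives the titlecase and majuscule forms of every letter from the minuscule alphabet. The helper names `titlecase` and `majuscule` are invented for this illustration.

```python
# The 29 letters of the Welsh alphabet in minuscule form.
ALPHABET = ("a b c ch d dd e f ff g ng h i j l ll m n o p ph "
            "r rh s t th u w y").split()


def titlecase(letter: str) -> str:
    """Capitalise only the first component, as at the start of a name."""
    return letter[0].upper() + letter[1:]


def majuscule(letter: str) -> str:
    """Capitalise both components, as in a word set entirely in capitals."""
    return letter.upper()


print(len(ALPHABET))                           # 29
print(titlecase("ll"), majuscule("ll"))        # Ll LL
print("llandudno".capitalize(), "llandudno".upper())
# Llandudno LLANDUDNO
```

For whole words, Python's built-in `str.capitalize` and `str.upper` already behave this way, because they operate on individual characters rather than on Welsh letters.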

The status of the digraphs as single letters is reflected in the stylised forms used in the logos of the National Library of Wales and Cardiff University.

References

  1. "Yr Wyddor Gymraeg/The Welsh Alphabet". Retrieved 4 March 2015.
  2. "Alphabets". Retrieved 30 May 2017.
  3. Thomas, Peter Wynn (1996) Gramadeg y Gymraeg. Cardiff: University of Wales Press: 757.
  4. Watkins, T. Arwyn (1993) "Welsh" in Ball, Martin J. with Fife, James (Eds) The Celtic Languages. London/New York: Routledge: 289-348.
  5. Evans, Simon D. (1964) A Grammar of Middle Welsh. Dublin: ColourBooks Ltd.
  6. English and Welsh, an essay by J. R. R. Tolkien.
  7. Testament Newydd (1567) Pen 21 [The 1567 New Testament, Revelation 21].
  8. Thomas, Peter Wynn (1996) Gramadeg y Gymraeg. Cardiff: University of Wales Press: 749.
  9. Awbery, Gwenllian M. (1984). "Phonotactic constraints in Welsh". In Ball, Martin J.; Jones, Glyn E. (eds.). Welsh Phonology: Selected Readings. Cardiff: University of Wales Press. pp. 65–104. ISBN 0-7083-0861-9.
  10. Morris Jones, J. (1913). A Welsh Grammar, Historical and Comparative. Oxford: Clarendon Press. pp. 11–18, 65–74.
  11. Rhys, John (December 2003). Example of a book using the "ll" ligature. Adegi Graphics LLC. ISBN 9781402153075. Retrieved 20 September 2014.

Notes

  [lower-alpha 1] While the International Rugby Club uses the term "Siapan" in Welsh, sources such as Yr Atlas Cymraeg Newydd and the Welsh Wikipedia use the term "Japan".