The Latin alphabet, also called the Roman alphabet, as used by the English language consists of the following characters:

A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z

History

The Latin alphabet derives mainly from the Etruscan alphabet. According to Hammarström (in Jensen 521), the letters B, D, O and X come from a Southern Italian Greek alphabet. However, there are Etruscan abecedaria that contain B, D, O and X (Sampson 108). Rix (203) claims that the sound values of those letters in Latin are to be attributed to Greek influence, while the letters themselves were probably all present when the Romans took over the alphabet from the Etruscans (Wachter 33).

It is uncontested that the alphabet is mainly of Etruscan origin. The sound value of C shows this clearly. Etruscan had no voiced plosives, so this symbol, derived from the Greek gamma, came to stand for the unvoiced /k/ in Etruscan, as it later did in Latin. Jensen (521) notes that the letters C, K and Q were originally used in Latin according to Etruscan usage: C in front of /e, i/; K in front of /a/; Q in front of /u, o/. The letters thus stand for different allophones of /k/ (in the case of Latin, also /g/, and probably the phonemes /kʷ/ and /gʷ/ in the case of QU and GU). These spelling rules are due to the names of the letters: gamma or gemma; kappa; qoppa or quppa (Wachter 15). Etruscan had no /o/, so in Latin Q was used in front of both /o/ and /u/. Y and Z were later additions taken from the Greek alphabet. G was created in approximately the 3rd century BC by Spurius Carvilius Ruga as a modification of C (Sampson 109). F (digamma) stood for /w/ in both Etruscan and Latin, but the Romans simplified the combination FH, used for /f/, to plain F. The semi-vowels /w, j/ and the vowels /u, u:, i, i:/ were written with the same letters, namely V and I respectively.

There was no 'U'; instead, there was the semi-vowel 'V'. There was no 'W', although 'V' was pronounced like the modern English 'W'. Nor was there a 'J'; instead, there was the semi-vowel 'I'. Because 'C' was hard in Classical Latin, 'K' was used for words borrowed from Greek, such as the abbreviations "K." or "Kal." for "kalendae" (the first day of a month).


Use in other languages

In the course of its history, the Latin alphabet was adopted for new languages, and so some new letters and diacritics were created, e.g.:

  • the cedilla in ç (originally a little z written below the c), which symbolized /ts/ in Romance
  • the háček in Slavonic languages, used to mark palatalised versions of the base letter, e.g. č
  • the tilde in Spanish ñ and on some Portuguese vowels (originally a little n written above the letter), used first to mark the elision of a former N and later to mark nasalisation of the base letter

W is a letter made up of two U's. It was added in late Roman times to represent a Germanic sound. U and J were originally not distinguished from V and I respectively. In Old English, thorn þ, edh ð and wynn ƿ (a Runic letter) were added. In modern Icelandic, thorn and edh are still used. The additional letters used in German are special presentations of earlier ligature forms (ae → ä, ue → ü, ſs → ß). French uses the circumflex to record elided consonants that were present in earlier forms and are often still present in the modern English cognates (Old French hostel → French hôtel = English hotel, or Late Latin pasta → Middle French paste → French pâte and English paste).

Some Slavic languages use the Latin alphabet rather than the Cyrillic. Among these, Polish uses a variety of digraphs with z to represent special phonetic values, and a dark l, ł, for a sound similar to w. Czech uses diacritics, as in Dvořák. The Slavic regions which stayed with the Orthodox church generally use Cyrillic instead, which is much closer to the Greek alphabet. Hausa uses three additional consonants: ɓ, ɗ and ƙ.

Collating in other languages

Languages that use the Roman alphabet have varying collating rules:

  • In French and English, characters with diaeresis (ä, ë, ï, ö, ü, ÿ) are treated just like their unaccented versions. If two French words differ only by an accent, the one with the accent sorts after the one without.
  • In German, the umlauts (Ä, Ö, Ü) are generally treated just like their non-umlauted versions; ß is always sorted as ss. This gives the alphabetic order Arg, Ärgerlich, Arm, Assistent, Aßlar, Assoziation. For phone directories and similar lists of names, the umlauts are collated like the letter combinations "ae", "oe", "ue". This gives the alphabetic order Udet, Übelacker, Uell, Ülle, Ueve, Üxküll, Uffenbach.
  • In the Swedish alphabet, "W" is seen as a variant of "V" and not as a separate letter. It is, however, recognised and retained in names such as "William". The alphabet also has three extra vowels placed at its end (..., X, Y, Z, Å, Ä, Ö).
  • The same extra vowels as in Swedish are also present in the Danish and Norwegian alphabets, but in a different order and with different glyphs (..., X, Y, Z, Æ, Ø, Å). Also, "Aa" collates as an equivalent of "Å". The Danish alphabet sees "W" as a variant of "V".
  • Some languages have more complex rules: for example, until 1994 Spanish treated "CH" and "LL" as single letters, giving the orderings CINCO, CREDO, CHISPA and LOMO, LUZ, LLAMA. This is no longer the case: in 1994 the RAE adopted the more common usage, so that LL is now collated between LI and LO, and CH between CE and CI. The only remaining collating rule specific to Spanish is that Ñ (eñe) is a separate letter collated after N.
  • In Dutch, the combination IJ was formerly collated as Y (or sometimes as a separate letter, Y < IJ < Z), but is currently mostly collated as two letters (II < IJ < IK). Note that a word starting with ij that is written with a capital I is also written with a capital J, e.g. the town IJmuiden (municipality of Velsen) and the river IJssel.
  • The Hungarian language has accents, umlauts, and double accents. The accent is ignored in collating, and the double accent, which indicates a long umlaut vowel, is treated as equal to the umlaut.
  • In Icelandic, Þ is added, and D is followed by Ð. Both letters were also used by Anglo-Saxon scribes, who in addition used the Runic letter wynn to represent /w/. Þ (called thorn; lowercase þ) is a Runic letter, although some scholars derive it from Latin D. Ð (called eth; lowercase ð) is the letter D with an added stroke.
  • In Polish, the letters specific to Polish are collated after the Latin letters they are derived from: A, Ą, B, C, Ć, D, E, Ę, ..., L, Ł, M, N, Ń, O, Ó, P, ..., S, Ś, T, ..., Z, Ź, Ż.
  • In Esperanto, consonants with circumflex accents (ĉ, ĝ, ĥ, ĵ, ŝ), as well as ŭ (u with breve), are counted as separate letters and collated separately (c, ĉ, d, e, f, g, ĝ, h, ĥ, i, j, ĵ ... s, ŝ, t, u, ŭ, v, z).
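
Several of the rules above can be expressed as sort keys. The following Python sketch is only an illustration, not a full collation implementation: it covers the two German orderings and the older and newer Spanish orderings described in the list, and it ignores secondary distinctions such as the tie-breaking of accented against unaccented words. All function names are made up for this example; for real applications a locale-aware collation library is the safer choice.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Drop combining marks, so that 'ä' becomes 'a' and 'é' becomes 'e'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def german_dictionary_key(word: str) -> str:
    """Dictionary order: umlauts sort as their base vowels, ß as ss."""
    # str.casefold() lowercases and also expands 'ß' to 'ss'.
    return strip_marks(word.casefold())


_UMLAUT_EXPANSION = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})


def german_phone_book_key(word: str) -> str:
    """Name-list order: ä, ö, ü sort as ae, oe, ue."""
    composed = unicodedata.normalize("NFC", word).casefold()
    return strip_marks(composed.translate(_UMLAUT_EXPANSION))


def _ranks(letters):
    return {letter: rank for rank, letter in enumerate(letters)}


# Hypothetical alphabet tables for this sketch; ñ always follows n.
_MODERN = _ranks(list("abcdefghijklmnñopqrstuvwxyz"))
_TRADITIONAL = _ranks(
    list("abc") + ["ch"] + list("defghijkl") + ["ll"] + list("mnñopqrstuvwxyz")
)


def spanish_key(word: str, traditional: bool = False) -> tuple:
    """Rank each letter; optionally treat CH and LL as single letters."""
    ranks = _TRADITIONAL if traditional else _MODERN
    text = unicodedata.normalize("NFC", word).casefold()
    key = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if traditional and pair in ("ch", "ll"):
            letter, i = pair, i + 2
        else:
            # Keep ñ intact; strip accents from every other character.
            letter = text[i] if text[i] == "ñ" else strip_marks(text[i])
            i += 1
        if letter in ranks:  # skip hyphens, spaces and similar
            key.append(ranks[letter])
    return tuple(key)


names = ["Uffenbach", "Üxküll", "Udet", "Ülle", "Uell", "Übelacker", "Ueve"]
print(sorted(names, key=german_phone_book_key))
# ['Udet', 'Übelacker', 'Uell', 'Ülle', 'Ueve', 'Üxküll', 'Uffenbach']

words = ["CHISPA", "CREDO", "CINCO"]
print(sorted(words, key=lambda w: spanish_key(w, traditional=True)))
# ['CINCO', 'CREDO', 'CHISPA']
print(sorted(words, key=spanish_key))
# ['CHISPA', 'CINCO', 'CREDO']
```

Each key turns a word into a value that ordinary comparison then sorts correctly, so the language-specific rule sits in one small function and the sorting itself is left to the standard library.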


