People keep asking me where to find information about how to write a country or language name in the native language and script for links to localized versions of their web pages, so I thought I'd put a list together myself.

I make no guarantees of correctness at this stage; the list is still a work in progress. Since 3 May 2003, the Notes column has named the individual informants who confirmed or corrected entries, and those entries are likely to be correct. The list sources are not always 100% correct, so caution is advised.

Please send corrections or additional entries to ishida@w3.org, preferably as UTF-8-encoded HTML attachments for native language text. Thanks!
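For anyone unsure how to produce a UTF-8 encoded HTML attachment, here is a minimal Python sketch. The function name `write_utf8_html`, the file name and the page layout are illustrative assumptions, not a required submission format.

```python
import tempfile
from html import escape
from pathlib import Path


def write_utf8_html(path, english, native):
    """Write a tiny HTML page holding one language-name entry, encoded as UTF-8.

    The page layout is only an illustration; any well-formed HTML file
    that declares and uses UTF-8 should serve the same purpose.
    """
    page = (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        '<head><meta charset="utf-8"><title>Language name entry</title></head>\n'
        f"<body><p>{escape(english)}: {escape(native)}</p></body>\n"
        "</html>\n"
    )
    # Pass the encoding explicitly so the platform default is never used.
    Path(path).write_text(page, encoding="utf-8")
    return page


if __name__ == "__main__":
    # Write a sample entry to a temporary directory and read it back.
    with tempfile.TemporaryDirectory() as folder:
        target = Path(folder) / "correction.html"
        write_utf8_html(target, "Greek", "Ελληνικά")
        print(target.read_text(encoding="utf-8"))
```

Declaring the encoding in the `meta` element as well as using it when writing the file keeps the declared and actual encodings in agreement.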

This page is encoded using UTF-8 (Unicode). In many cases you will need appropriate fonts, and sometimes rendering capabilities, to view the text correctly.

Note: The source created by Michael Everson appears, for the most part, to provide translations of the phrase 'the X language' rather than the short form that would normally appear in a list of languages. Many of these are therefore listed in the Notes column.

English ISO Native form Notes
Abkhaz ab аҧсуа бызшәа
Afrikaans af Afrikaans
Albanian sq shqip Geonames lists shqipe - please tell me if you are in a position to authoritatively validate either of these.
Amharic am አማርኛ
Arabic ar العربية Informant: Najib Tounsi
Armenian hy Հայերեն Everson: հայերեն լեզու
Azerbaijani az azərbaycan Everson: azərbaycan dil.
Bambara bm Bamanankan Informant: Don Osborn
Basque eu euskera Werner Fröhlich: '"euskara" being the adjective and "euskera" the language name'.
Belorussian be Беларуская Everson: Беларуская мова.
Bulgarian bg Български Informant: Ivan Herman
Catalan ca Català
Chinese zh 中文 Note that written Chinese usually uses a simplified or traditional form - the relevant list entries are shown immediately below.
Chinese (Simplified script) zh-Hans 简体中文 Informant: Ivan Herman. This is Chinese written with the Simplified version of the script (used principally in mainland China and Singapore). Since zh-Hans was only introduced recently, zh-CN is a common ISO code sequence for this.
Chinese (Traditional script) zh-Hant 繁體中文 Informant: Ivan Herman. This is Chinese written with the Traditional version of the script (used principally in Taiwan and Hong Kong). Since zh-Hant was only introduced recently, zh-TW is a commonly used ISO code sequence for this.
Croatian hr Hrvatski
Czech cs čeština Informant: Martin Prachař. "Český jazyk (=Czech language) is locally used mainly as a name of a subject to study at schools". Werner Fröhlich: "Český ... is the general adjective Czech. If you refer to the language, it's čeština".
Danish da Dansk
Dutch nl Nederlands
Estonian et Eesti
Ewe ee Ɛʋɛ
Finnish fi suomi Informants: Ossi Nykänen, Åke Persson. Everson: Suomen kieli.
French fr français
French (Canadian) fr-CA français canadien
Fula/Fulah/Fulani ff Fulfulde, Pulaar, Pular Informant: Don Osborn. "Name varies by region."
Galician gl Galego
Georgian ka Everson: ქართული ენა
German de Deutsch Informant: Klaus Birkenbihl
Greek el Ελληνικά
Hausa ha Hausa
Hebrew he עברית Informant: Michel Bercovier
Hindi hi हिंदी
Hungarian hu Magyar Informant: Ivan Herman. Everson: magyar nyelv
Icelandic is Íslenska
Indonesian id Bahasa Indonesia
Irish ga Gaeilge
Italian it italiano Informant: Oreste Signore
Japanese ja 日本語
Kannada kn ಕನ್ನಡ Informant: Narasimha Datta
Kazakh kk Қазақ
Kinyarwanda rw Kinyarwanda
Kirghiz ky Кыргыз
Kirundi rn Kirundi
Korean ko 한국어 Informant: Ismael Funes-Aguilera. "한글 is the name of the Korean alphabet. The language is properly called 한국어/韓國語 or 한국말."
Latvian lv Latviešu Everson: latviešu valoda
Lithuanian lt Lietuviškai Informant: Åke Persson. Everson: lietuvių kalba. I was also told Lietuvių by a Lithuanian student once.
Luo luo Dholuo Informant: Rose Oburra
Macedonian mk Македонски
Malaysian ms Bahasa Melayu
Maltese mt Malti Everson: il-Malti
Norwegian no Norsk This could be bokmål or nynorsk variants - do they need to be distinguished?
Pashto ps پښتو Informant: Said Marjan Zazai
Persian fa فارسی
Polish pl polski
Portuguese pt português
Portuguese (Brazilian) pt-BR português brasileiro
Romanian ro Română Ismael Funes-Aguilera: "român or limbă română (= Romanian language) ". Validation by Romanian speakers welcome.
Romansch ? Rumantsch
Russian ru Русский
Serbian sr Српски, Srpski First is in Cyrillic, second in Latin script.
Somali so Somali
Spanish es Español Ismael Funes-Aguilera: "but also widely known as castellano (Spanish Constitution states that both names are correct.)"
Slovak sk Slovenčina Informant: Åke Persson. Slovenský jazyk means 'the Slovak language'.
Slovenian sl Slovenščina Informant: Åke Persson. Slovenski jezik means 'the Slovene language'.
Swahili sw Kiswahili Informant: Don Osborn
Swedish sv svenska Informant: Don Olsson
Telugu te తెలుగు Informants: P.Sivaram Prasad, K.Siva Basava
Thai th ภาษาไทย The BBC page simply says ไทย. Werner Fröhlich: "ไทย being the general adjective for Thai, and ภาษาไทย when referring to the language".
Turkish tr Türkçe
Ukrainian uk Українська
Urdu ur اردو
Uzbek uz o'zbek Informant: Werner Fröhlich. "Uzbek is cited as Ўзбек in Cyrillic spelling. However the country has changed to the Latin script in 1993 and the spelling is now: o'zbek as adjective or o'zbekcha meaning 'in Uzbek' (that is 'in the Uzbek language'). All Turkic languages use this form when referring to the language."
Vietnamese vi Tiếng Việt Note that the acute accent associated with the e in the first word should appear to the right of the circumflex, not above it. You'll need a Unicode font that is Vietnamese capable and aware that this is Vietnamese text to see this correctly.
Welsh cy Cymraeg
Wolof wo Wolof Informant: Don Osborn
Xhosa xh isiXhosa Informant: Don Osborn
Yoruba yo Yorùbá Informant: Don Osborn
Zulu zu isiZulu Informant: Don Osborn
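
As an illustration of how entries from this list might be used for links to localized versions of a page, here is a minimal Python sketch. The names `NATIVE_NAMES`, `LEGACY_ALIASES`, `normalize_tag` and `language_link`, and the URL pattern, are assumptions made for the example; only the language tags and native names come from the table above.

```python
from html import escape

# A few entries copied from the table above: language tag -> native name.
NATIVE_NAMES = {
    "ar": "العربية",
    "de": "Deutsch",
    "fr": "français",
    "ja": "日本語",
    "zh-Hans": "简体中文",
    "zh-Hant": "繁體中文",
}

# Older region-based tags that the notes above mention as common stand-ins.
LEGACY_ALIASES = {"zh-CN": "zh-Hans", "zh-TW": "zh-Hant"}


def normalize_tag(tag):
    """Map a legacy tag such as zh-CN to the script-based tag used in the table."""
    return LEGACY_ALIASES.get(tag, tag)


def language_link(tag, url_pattern="/{tag}/index.html"):
    """Return one <a> element labelled with the native name of the language.

    url_pattern is a hypothetical site layout, not something this page prescribes.
    """
    tag = normalize_tag(tag)
    name = NATIVE_NAMES[tag]  # raises KeyError for tags outside this sample
    href = url_pattern.format(tag=tag)
    return (
        f'<a href="{escape(href)}" hreflang="{escape(tag)}" '
        f'lang="{escape(tag)}">{escape(name)}</a>'
    )


if __name__ == "__main__":
    for sample in ["fr", "ja", "zh-CN"]:
        print(language_link(sample))
```

Putting the `lang` attribute on each link tells browsers which language the label is in, which can help with font selection for the native text.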