Mimer SQL Unicode Collation Charts

Prerequisite

The collation charts available from this page may require additional fonts to be installed on your computer in order to display properly. We suggest consulting the Unicode Display Problems? page or the Babelstone Custom Font List for help. Alternatively, each chart provides a PDF link that opens a PDF version of the chart with a complete character display.


Languages - Predefined and Downloadable

Below are specifications of the sorting adjustments, so-called tailorings, that various languages need in order to get the correct national sort order relative to the Unicode default sort order.
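To make the idea of a tailoring concrete, here is a minimal Python sketch. It is a toy illustration only, not how Mimer SQL or the Unicode Collation Algorithm is implemented. The "default-like" key is a rough approximation that ignores accents, and the "Swedish-like" alphabet is an assumption chosen for the example, in which å, ä and ö sort after z.

```python
import unicodedata

# Hypothetical tailored alphabet: the basic Latin letters followed by å, ä, ö.
BASE_ORDER = "abcdefghijklmnopqrstuvwxyz"
SWEDISH_LIKE_ORDER = BASE_ORDER + "åäö"


def default_like_key(word):
    """Rough stand-in for an untailored order: compare base letters only."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    # Drop combining marks so that 'å' and 'ä' compare like 'a'.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_tailored_key(alphabet):
    """Build a sort key that ranks characters by their position in alphabet."""
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def tailored_key(word):
        composed = unicodedata.normalize("NFC", word.lower())
        # Characters outside the alphabet sort after it, by code point.
        return [(0, rank[ch]) if ch in rank else (1, ord(ch)) for ch in composed]

    return tailored_key


words = ["öl", "zebra", "ärta", "apa", "år"]
default_sorted = sorted(words, key=default_like_key)
tailored_sorted = sorted(words, key=make_tailored_key(SWEDISH_LIKE_ORDER))

print(default_sorted)
print(tailored_sorted)
```

In the untailored approximation the accented letters sort next to their base letters, while the tailored key places them after z. A real tailoring expresses the same kind of adjustment as changes to collation weights rather than as a hand-written alphabet.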

In the table below, languages whose names are in bold are among the predefined collations included in the current version of Mimer SQL. For a summary, see our Collation Tailorings overview.

For some of the languages that are not in bold, the collation definition is available and can be used by copy and paste. Where applicable (see Uyghur for an example), the language's page contains a Collation link at the top of the page that leads to the CREATE COLLATION statement used to define the collation.

Afrikaans
Albanian
Amharic
Arabic
Armenian
Arumanian
Assamese
Asturian
Azerbaijani

Basque
Belarusian
Bengali
Bosnian
Breton
Bulgarian

Catalan
Chinese (康熙 KangXi)
Chinese (拼音 PinYin)
Chinese (五笔画 WuBiHua)
Chinese (注音 ZhuYin)
Corsican
Croatian
Czech

Danish
Dari
Dutch
Dzongkha

Edo
Elfdalian
English
Esperanto
Estonian
Ewe

Faroese
Filipino
Finnish
French
Frisian
Friulian

Galician
Georgian
German
German (Phonebook)
Greek
Greek-Latin
Greenlandic
Guarani
Gujarati

Hausa
Hebrew
Hindi
Hungarian

Icelandic
Igbo
Indonesian
Irish Gaelic
Italian

Japanese
Javanese

Kannada
Kashmiri
Kazakh
Khmer
Kirghiz
Konkani
Korean
Kurdish

Lao
Lao (Traditional)
Latin
Latvian
Lithuanian
Luxembourgish

Macedonian
Malay
Malayalam
Maltese
Manipuri
Maori
Marathi
Moldavian
Mongolian
Moore
Myanmar

Ndebele
Nepali
Norwegian

Occitan
Oriya
Oromo

Pashto
Persian
Polish
Portuguese
Punjabi

Quechua

Romanian
Romansch
Russian

Sami
Sanskrit
Scots
Scottish Gaelic
Sepedi
Serbian
Sesotho
Sindhi
Sinhala
Slovak
Slovenian
Somali
Sorani
Sorbian (Lower)
Sorbian (Upper)
Spanish
Spanish (Traditional)
Swahili
Swati
Swedish

Tajik
Tamil
Tatar
Telugu
Thai
Tibetan
Tigrinya
Tongan
Tsonga
Tswana
Turkish
Turkmen

Ukrainian
Urdu
Uyghur
Uzbek

Venda
Vietnamese
Vietnamese (Traditional)

Welsh
Wolof

Xhosa

Yiddish
Yoruba

Zulu


Scripts

In this context, a script is a collection of symbols used to represent textual information. The Unicode Character Database (UCD) provides data that maps Unicode characters to script names.

European Ordering Rules (EOR) is a standard that defines how the Latin, Greek and Cyrillic scripts should be sorted. It is intended to provide guidance on sorting European repertoires in Unicode.

ISO/IEC 8859-1 (SQL datatype CHAR)
The following script for Latin-1 representation is used with the CHAR datatype in SQL.

Latin-1

Unicode (SQL datatype NCHAR)
Below are scripts for the Unicode representation, used with the NCHAR datatype in SQL. The Default Unicode Collation Element Table (DUCET) is provided in the AllKeys table, as stated in the specification for the Unicode Collation Algorithm (UCA). This table provides a mapping from characters to collation elements. The following scripts represent different parts of the table, given in the order they are defined.

Variable  Common  Latin  Greek  Coptic 
Cyrillic  Glagolitic  Georgian  Armenian  Hebrew 
Phoenician  Samaritan  Arabic  Syriac  Mandaic 
Thaana  Nko  Tifinagh  Ethiopic  Devanagari 
Bengali  Gurmukhi  Gujarati  Oriya  Tamil 
Telugu  Kannada  Malayalam  Sinhala  Meetei‑Mayek 
Syloti‑Nagri  Saurashtra  Kaithi  Sharada  Takri 
Sundanese  Brahmi  Kharoshthi  Thai  Lao 
Tai‑Viet  Tibetan  Lepcha  Phags‑Pa  Limbu 
Tagalog  Hanunoo  Buhid  Tagbanwa  Buginese 
Batak  Rejang  Kayah‑Li  Myanmar  Chakma 
Khmer  Tai‑Le  New‑Tai‑Lue  Tai‑Tham  Cham 
Balinese  Javanese  Mongolian  Ol‑Chiki  Cherokee 
Canadian‑Aboriginal  Ogham  Runic  Old‑Turkic  Vai 
Bamum  Hangul  Hiragana‑Katakana  Bopomofo  Yi 
Lisu  Miao  Lycian  Carian  Lydian 
Old‑Italic  Gothic  Deseret  Shavian  Osmanya 
Sora‑Sompeng  Linear‑B  Cypriot  Old‑South‑Arabian  Avestan 
Imperial‑Aramaic  Inscriptional‑Parthian  Inscriptional‑Pahlavi  Ugaritic  Old‑Persian 
Cuneiform  Egyptian  Meroitic  CJK 

The Variable script above contains the characters that can be set to Ignorable by using a collation option. These include the space character, punctuation marks and most symbols. The Common script above includes digits, currency symbols, etc.
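The mapping from characters to collation elements, and the effect of treating Variable characters as ignorable, can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the sample lines follow the general layout of the UCA allkeys file (a `*` marks a variable element, a `.` a non-variable one), but the weights are placeholders that may not match any particular DUCET version. The function names are hypothetical, only primary weights are compared, and the sketch does not reproduce how Mimer SQL applies its collation option.

```python
import re

# Illustrative sample in the allkeys layout; the weights are placeholders.
SAMPLE_ALLKEYS = """\
0020 ; [*0209.0020.0002] # SPACE
002D ; [*020D.0020.0002] # HYPHEN-MINUS
0061 ; [.1C47.0020.0002] # LATIN SMALL LETTER A
0062 ; [.1C60.0020.0002] # LATIN SMALL LETTER B
"""

# One collation element: marker, then primary.secondary.tertiary weights.
ELEMENT = re.compile(r"\[([.*])([0-9A-F]{4})\.([0-9A-F]{4})\.([0-9A-F]{4})\]")


def parse_allkeys(text):
    """Map each character sequence to a list of collation elements."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line or line.startswith("@"):
            continue
        code_points, elements = line.split(";", 1)
        chars = "".join(chr(int(cp, 16)) for cp in code_points.split())
        table[chars] = [
            (marker == "*", int(p, 16), int(s, 16), int(t, 16))
            for marker, p, s, t in ELEMENT.findall(elements)
        ]
    return table


def primary_key(text, table, ignore_variable):
    """Collect primary weights, optionally skipping variable elements."""
    key = []
    for ch in text:
        for is_variable, primary, _secondary, _tertiary in table[ch]:
            if is_variable and ignore_variable:
                continue
            if primary:
                key.append(primary)
    return key


TABLE = parse_allkeys(SAMPLE_ALLKEYS)
same_when_ignored = primary_key("a-b", TABLE, True) == primary_key("ab", TABLE, True)
same_when_kept = primary_key("a-b", TABLE, False) == primary_key("ab", TABLE, False)
print(same_when_ignored, same_when_kept)
```

With variable elements ignored, "a-b" and "ab" produce the same primary key. With them kept, the hyphen contributes its own primary weight and the two strings differ.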


 

Powered by Mimer SQL
