The Unicode Standard and ISO/IEC 10646

The Unicode Standard is fully compatible with the international standard ISO/IEC 10646, Information Technology - Universal Multiple-Octet Coded Character Set (UCS).

While modeled on the ASCII character set, the Unicode Standard goes far beyond ASCII's limited ability to encode only the upper- and lowercase letters A through Z. It provides the capacity to encode all characters used for the written languages of the world: its code space holds more than 1 million code points (1,114,112). The Unicode character encoding treats alphabetic characters, ideographic characters, and symbols equivalently, which means they can be used in any mixture and with equal facility.
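
The capacity figure can be checked with a little arithmetic. The following is a minimal Python sketch, not part of Mimer SQL; the variable names are chosen for illustration only. It assumes the Unicode code space of 17 planes with 65,536 code points each, and shows that letters, ideographs, and symbols can be mixed freely in one string.

```python
import sys

# Assumption for illustration: the Unicode code space is 17 planes
# of 0x10000 (65,536) code points each, U+0000 through U+10FFFF.
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000

# Surrogate code points (U+D800..U+DFFF) are reserved for UTF-16
# and never represent characters on their own.
SURROGATES = 0xDFFF - 0xD800 + 1

total_code_points = PLANES * CODE_POINTS_PER_PLANE   # 1,114,112
scalar_values = total_code_points - SURROGATES       # 1,112,064

# Python exposes the highest code point as sys.maxunicode (0x10FFFF).
highest_code_point = sys.maxunicode

# Alphabetic, ideographic, and symbol characters in a single string:
# Latin A, Polish L with stroke, a CJK ideograph, and a summation sign.
mixed = "A\u0141\u4e2d\u2211"
code_points = [f"U+{ord(ch):04X}" for ch in mixed]

print(total_code_points, scalar_values)
print(code_points)
```

All four characters are handled the same way, one code point each, regardless of the script they belong to.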

In addition to the Unicode Standard, there is a technical standard called the Unicode Collation Algorithm (UCA), which is kept synchronized with the ISO/IEC 14651 standard for International String Ordering.

EOR - European Ordering Rules

The Unicode Default Order and ISO/IEC 14651 define the default Latin alphabet to contain not only the base letters A through Z, but also a number of more or less language-specific base letters. For example, the Polish L with stroke (Ł) is not a variant of L; it is a separate base letter between L and M.

The European Ordering Rules (EOR), ENV 13710, and ISO 12199 (Alphabetical ordering of multilingual terminological and lexicographical data represented in the Latin alphabet) take a more natural approach: the alphabet is A through Z, and the other language-specific letters are secondary variants of the corresponding base letter.
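
The difference between the two approaches can be illustrated with a toy example. The following Python sketch is an assumption-laden simplification, not the actual UCA, ISO/IEC 14651, or EOR tables, and not how Mimer SQL implements collation. The function names `separate_letter_key` and `secondary_variant_key` are hypothetical, and only lowercase a through z plus ł are handled.

```python
# Toy sort keys only; real collation tables have many more levels and letters.
BASE = "abcdefghijklmnopqrstuvwxyz"

def separate_letter_key(word):
    """Default-order style: ł is its own base letter between l and m."""
    alphabet = "abcdefghijklłmnopqrstuvwxyz"
    return [alphabet.index(ch) for ch in word.lower()]

def secondary_variant_key(word):
    """EOR style: ł has the same primary weight as l and differs
    only at the secondary level."""
    variants = {"ł": "l"}
    lowered = word.lower()
    primary = [BASE.index(variants.get(ch, ch)) for ch in lowered]
    secondary = [1 if ch in variants else 0 for ch in lowered]
    # Tuples compare the primary weights first; the secondary weights
    # only break ties between otherwise equal words.
    return (primary, secondary)

words = ["lz", "ła", "ma", "la"]

print(sorted(words, key=separate_letter_key))    # ['la', 'lz', 'ła', 'ma']
print(sorted(words, key=secondary_variant_key))  # ['la', 'ła', 'lz', 'ma']
```

In the first ordering every word starting with ł sorts after every word starting with l. In the second, "ła" sorts next to "la" because the stroke matters only when the words are otherwise identical.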

Mimer SQL uses the EOR tailoring as the basis for all language-specific tailorings.

