Mimer SQL Documentation

Mimer SQL Developer Site


Unicode and ANSI Interfaces


The Mimer ODBC driver is Unicode-based, so applications can use either the ANSI or the Unicode interface when working with Mimer SQL. Unicode-based applications can both store and retrieve Unicode data through SQL statements and/or Unicode host variables.

ANSI applications can use Unicode host variables, but are restricted to single-byte (8-bit) characters when passing data within SQL statements. Please note that database objects still have to be named with the same character set as before.

The Unicode SQL data types in Mimer SQL are called NATIONAL CHARACTER (or NCHAR), NATIONAL CHARACTER VARYING (NCHAR VARYING), and NATIONAL CHARACTER LARGE OBJECT (NCLOB).

You can find more information about these data types in the Mimer SQL Reference Manual. The ODBC documentation contains specifics about the SQL_WCHAR, SQL_WVARCHAR, and SQL_WLONGVARCHAR SQL data types and the SQL_C_WCHAR host language (C) data type.

External Character Set Support

When an application passes single-byte character strings to ODBC, the system uses the current locale setting on the machine to determine which characters are stored or retrieved.

Character data in Mimer SQL can be stored in CHAR or VARCHAR columns, or in NCHAR or NVARCHAR columns. Data in CHAR and VARCHAR columns uses the Latin-1 character representation (also called ISO 8859-1). This character set can represent only 256 different characters. For the exact characters that can be stored, see the Mimer SQL Reference Manual, Appendix B, Character Sets. To store any other character, the data type NCHAR or NVARCHAR must be used. These column types can store any character.

If the application uses a locale containing characters that are not included in Latin-1, the database must use NCHAR or NVARCHAR columns to store those characters correctly. Previously, each character in the application was simply stored as-is in a character column. When these characters were retrieved with, for example, DbVisualizer or another Unicode-enabled application, they were interpreted differently and the wrong characters were displayed. With the new locale support, the Mimer SQL client understands how the application represents its characters and maps them to the internal representation accordingly.

When retrieving data from the database, the translation works the other way. That is, when data is retrieved from a CHAR or NCHAR column into a SQL_C_CHAR variable, the current locale must be able to represent all the characters returned from the database. When this is not possible, conversion error -10401 is returned. If characters stored in the database have no representation in the chosen locale, the application must use a wide character data type instead (SQLWCHAR rather than SQLCHAR).

On Windows, the external character set is configured in the Regional and Language Options in the Control Panel, under the Advanced tab. The Mimer ODBC client uses this setting automatically.

On VMS the system continues to use the Latin-1 character representation regardless of locale settings.

On other platforms (Unix/Linux, Mac OS X, others) the application must call the runtime library routine setlocale to pick the locale to use. For example, the call setlocale(LC_CTYPE, "") selects the default locale as determined by the environment settings. The actual conversions in the Mimer client are made through the library routines mbstowcs (multibyte character string to wide character string) and wcstombs (the reverse).

Please note that if an application does not call setlocale, a default 7-bit locale is used. This means that no 8-bit characters can be used without getting a conversion error. For applications where the source is not available, it is possible to set an environment variable, MIMER_LOCALE, that is used when the Mimer client is called. The value of the environment variable is used as the second argument to setlocale.

To use the default locale, set MIMER_LOCALE to current. On Windows, the environment variable is set to the desired code page, so only numeric values may be specified (for example, 1250: ANSI Central Europe, 1251: ANSI Cyrillic, 1252: Latin 1, 1253: ANSI Greek, 1254: ANSI Turkish, and so on).

Because the character type is treated as a multibyte character set, any external character representation can be used. In particular, character sets such as Traditional Chinese Big5 and Japanese Shift-JIS may be used. The character set may, of course, also be a single-byte character set, such as the Greek ISO 8859-7 character set (whose Windows counterpart is code page 1253). On Unix platforms the prevalent representation is UTF-8, which allows any Unicode character to be stored in a character variable.


Mimer
Mimer Information Technology AB
Voice: +46 18 780 92 00
Fax: +46 18 780 92 40
info@mimer.se