NAME

kiconv_open - code conversion descriptor allocation function

SYNOPSIS

#include <sys/sunddi.h>
kiconv_t kiconv_open(const char *tocode, const char *fromcode);

INTERFACE LEVEL

illumos DDI specific (illumos DDI).

PARAMETERS

tocode

Points to a target codeset name string.

fromcode

Points to a source codeset name string.

DESCRIPTION

The kiconv_open() function returns a code conversion descriptor that describes a conversion from the codeset specified by fromcode to the codeset specified by tocode. For state-dependent encodings, the conversion descriptor is in a codeset-dependent initial state (ready for immediate use with the kiconv() function).

Supported code conversions are between UTF-8 and the following:

Name                    Description

Big5                    Traditional Chinese Big5
Big5-HKSCS              Traditional Chinese Big5-Hong Kong Supplementary Character Set
CP720                   DOS Arabic
CP737                   DOS Greek
CP850                   DOS Latin-1 (Western European)
CP852                   DOS Latin-2 (Eastern European)
CP857                   DOS Latin-5 (Turkish)
CP862                   DOS Hebrew
CP866                   DOS Cyrillic Russian
CP932                   Japanese Shift JIS (Windows)
CP950-HKSCS             Traditional Chinese HKSCS-2001 (Windows)
CP1250                  Central Europe
CP1251                  Cyrillic
CP1252                  Western Europe
CP1253                  Greek
CP1254                  Turkish
CP1255                  Hebrew
CP1256                  Arabic
CP1257                  Baltic
EUC-CN                  Simplified Chinese EUC
EUC-JP                  Japanese EUC
EUC-JP-MS               Japanese EUC MS
EUC-KR                  Korean EUC
EUC-TW                  Traditional Chinese EUC
GB18030                 Simplified Chinese GB18030
GBK                     Simplified Chinese GBK
ISO-8859-1              Latin-1 (Western European)
ISO-8859-2              Latin-2 (Eastern European)
ISO-8859-3              Latin-3 (Southern European)
ISO-8859-4              Latin-4 (Northern European)
ISO-8859-5              Cyrillic
ISO-8859-6              Arabic
ISO-8859-7              Greek
ISO-8859-8              Hebrew
ISO-8859-9              Latin-5 (Turkish)
ISO-8859-10             Latin-6 (Nordic)
ISO-8859-13             Latin-7 (Baltic)
ISO-8859-15             Latin-9 (Western European with euro sign)
KOI8-R                  Cyrillic
Shift_JIS               Japanese Shift JIS (JIS)
TIS_620                 Thai (a.k.a. ISO 8859-11)
Unified-Hangul          Korean Unified Hangul

UTF-8 and the names above can be used as tocode and fromcode to specify the desired code conversion. The following aliases are also accepted as alternative names:

Aliases                 Original Name

720                     CP720
737                     CP737
850                     CP850
852                     CP852
857                     CP857
862                     CP862
866                     CP866
932                     CP932
936, CP936              GBK
949, CP949              Unified-Hangul
950, CP950              Big5
1250                    CP1250
1251                    CP1251
1252                    CP1252
1253                    CP1253
1254                    CP1254
1255                    CP1255
1256                    CP1256
1257                    CP1257
ISO-8859-11             TIS_620
PCK, SJIS               Shift_JIS
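The alias table can be read as a simple lookup from an alias to the name listed under "Original Name". The following is a minimal sketch of that mapping in plain C, not the kernel's implementation: the function name canonical_codeset_name() and the struct are hypothetical, and exact (case-sensitive) string matching is an assumption of the sketch. kiconv_open() accepts the aliases directly, so a caller does not need such a helper before opening a conversion; it may be useful only when comparing or logging codeset names.

```c
#include <stddef.h>
#include <string.h>

/* One row of the alias table: alias -> name listed under "Original Name". */
struct codeset_alias {
        const char *alias;
        const char *name;
};

/* Hypothetical table transcribed from the alias list in this manual page. */
static const struct codeset_alias alias_table[] = {
        { "720", "CP720" },   { "737", "CP737" },   { "850", "CP850" },
        { "852", "CP852" },   { "857", "CP857" },   { "862", "CP862" },
        { "866", "CP866" },   { "932", "CP932" },
        { "936", "GBK" },     { "CP936", "GBK" },
        { "949", "Unified-Hangul" }, { "CP949", "Unified-Hangul" },
        { "950", "Big5" },    { "CP950", "Big5" },
        { "1250", "CP1250" }, { "1251", "CP1251" }, { "1252", "CP1252" },
        { "1253", "CP1253" }, { "1254", "CP1254" }, { "1255", "CP1255" },
        { "1256", "CP1256" }, { "1257", "CP1257" },
        { "ISO-8859-11", "TIS_620" },
        { "PCK", "Shift_JIS" }, { "SJIS", "Shift_JIS" },
};

/*
 * Return the original name for a known alias, or the argument itself
 * when it is not an alias. Matching is exact; this is an assumption of
 * the sketch, not a statement about how kiconv_open() compares names.
 */
const char *
canonical_codeset_name(const char *name)
{
        size_t i;

        for (i = 0; i < sizeof (alias_table) / sizeof (alias_table[0]); i++) {
                if (strcmp(name, alias_table[i].alias) == 0)
                        return (alias_table[i].name);
        }
        return (name);
}
```

With this sketch, "936" and "CP936" both map to "GBK", while a name that is not an alias, such as "UTF-8", is returned unchanged.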

A conversion descriptor remains valid until it is closed by using kiconv_close().

RETURN VALUES

Upon successful completion, kiconv_open() returns a code conversion descriptor for use on subsequent calls to kiconv(). Otherwise, if the conversion specified by fromcode and tocode is not supported, or if the code conversion descriptor cannot be allocated for any other reason, kiconv_open() returns (kiconv_t)-1 to indicate the error.

CONTEXT

kiconv_open() can be called from user context only.

EXAMPLES

Example 1 Opening a Code Conversion

The following example shows how to open a code conversion from ISO 8859-15 to UTF-8:

#include <sys/sunddi.h>

kiconv_t cd;

cd = kiconv_open("UTF-8", "ISO-8859-15");
if (cd == (kiconv_t)-1) {
        /* Cannot open up the code conversion. */
        return (-1);
}

ATTRIBUTES

See attributes(7) for descriptions of the following attributes:

ATTRIBUTE TYPE	ATTRIBUTE VALUE
Interface Stability	Committed

NOTES

Code conversions are available only between UTF-8 and the codesets listed above. To convert between two non-UTF-8 codesets, go through UTF-8. For example, to convert from EUC-JP to Shift_JIS, first convert EUC-JP to UTF-8 and then convert UTF-8 to Shift_JIS.
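The two-step route through UTF-8 can be planned before any descriptor is opened. The following is a minimal sketch in plain C under stated assumptions: struct kiconv_step and plan_kiconv_steps() are hypothetical names introduced for illustration, names are compared exactly, and aliases are not resolved. It only decides which (tocode, fromcode) pairs are needed; it does not call any kernel interface.

```c
#include <string.h>

/* One planned conversion; the fields mirror the kiconv_open() arguments. */
struct kiconv_step {
        const char *tocode;
        const char *fromcode;
};

/*
 * Fill steps[] with the conversions needed to get from fromcode to
 * tocode and return how many there are:
 *   0 - source and target are the same name, nothing to convert
 *   1 - one side is UTF-8, so a single conversion suffices
 *   2 - neither side is UTF-8, so UTF-8 is used as the intermediate
 */
int
plan_kiconv_steps(const char *fromcode, const char *tocode,
    struct kiconv_step steps[2])
{
        if (strcmp(fromcode, tocode) == 0)
                return (0);

        if (strcmp(fromcode, "UTF-8") == 0 || strcmp(tocode, "UTF-8") == 0) {
                steps[0].tocode = tocode;
                steps[0].fromcode = fromcode;
                return (1);
        }

        /* Neither side is UTF-8: source -> UTF-8, then UTF-8 -> target. */
        steps[0].tocode = "UTF-8";
        steps[0].fromcode = fromcode;
        steps[1].tocode = tocode;
        steps[1].fromcode = "UTF-8";
        return (2);
}
```

Each planned step would then correspond to one kiconv_open(steps[i].tocode, steps[i].fromcode) call, with the output of the first conversion used as the input of the second and each descriptor released with kiconv_close() when it is no longer needed.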

The code conversions supported are based on simple one-to-one mappings. There is no special treatment or processing done during code conversions such as case conversion, Unicode Normalization, or mapping between combining or conjoining sequences of UTF-8 and pre-composed characters in non-UTF-8 codesets.

All supported non-UTF-8 codesets use pre-composed characters only, whereas UTF-8 also allows combining and conjoining characters. For this reason, it might be necessary to apply a Unicode Normalization form to UTF-8 text with u8_textprep_str() before or after doing code conversions.