Standards, Environments, Macros, Character Sets, and miscellany
                                                                iconv_extra(7)



NAME
       iconv_extra - codeset conversion for non-Unicode encodings

DESCRIPTION
       iconv and cconv support conversions to and from a wide range of
       codesets.


       The lists below provide basic information about encodings used mainly
       in the EMEA (Europe, Middle East, and Africa) region. For information
       on Asian encodings, refer to the iconv_ja(7), iconv_ko(7), iconv_zh(7),
       iconv_zh_HK(7), and iconv_zh_TW(7) manual pages. For information on
       Unicode encodings, refer to the iconv_unicode(7) manual page.


       The codeset names shown are in their canonical form directly usable  as
       fromcode   or   tocode  parameters  to  iconv(1),  iconv_open(3C),  and
       cconv_open(3C), with aliases in parentheses where applicable.
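
       The fromcode/tocode pairing can be illustrated with a minimal Python
       sketch. This is not the Solaris iconv interface: the transcode()
       helper is hypothetical, and the codeset names are resolved by Python's
       own codec registry, which may spell names differently from the
       canonical names listed on this page.

```python
def transcode(data: bytes, fromcode: str, tocode: str) -> bytes:
    """Decode bytes from one codeset and re-encode them in another.

    Hypothetical helper that mirrors the fromcode/tocode idea; it uses
    Python's codec registry, not /usr/lib/iconv.
    """
    return data.decode(fromcode).encode(tocode)


latin1_bytes = b"caf\xe9"  # "café" encoded in ISO8859-1
utf8_bytes = transcode(latin1_bytes, "iso8859-1", "utf-8")
print(utf8_bytes)  # b'caf\xc3\xa9'
```

       On an actual Solaris system the equivalent conversion would go through
       iconv(1) or iconv_open(3C) with the names shown in the tables below.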


       The iconv and cconv conversions available on the current system can be
       listed by running iconv -l, as described in the iconv(1) manual page.


       For  additional information on the mappings between canonical names and
       supported aliases with optional variant levels, refer to  the  alias(5)
       manual page and also the /usr/lib/iconv/alias file.

   646 - ISO/IEC 646:1991 and Variants
       The codeset of the "C" locale in Oracle Solaris, the ISO basic Latin
       alphabet, is referred to by the canonical name 646. Common aliases such
       as US-ASCII and ASCII are also defined.
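
       As an analogy for how several aliases resolve to one canonical name,
       the sketch below queries Python's codec registry. It does not read
       /usr/lib/iconv/alias; the canonical() helper is hypothetical, and the
       canonical spellings it returns are Python's, not Solaris's.

```python
import codecs


def canonical(name: str) -> str:
    """Return the name Python's codec registry treats as canonical."""
    return codecs.lookup(name).name


for alias in ("646", "US-ASCII", "ASCII"):
    print(alias, "->", canonical(alias))  # each resolves to 'ascii'
```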


       The following national variants of the 646 codeset are also available:


       646 Codeset       National Variant
       ----------------------------------
       646de             Germany
       646ch             Switzerland
       646gb (646en)     United Kingdom
       646fr             France
       646ca             Canada
       646fi             Finland
       646sv             Sweden
       646it             Italy
       646dk (646da)     Denmark
       646es (646sp)     Spain
       646pt             Portugal
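
       A national variant reassigns a handful of 7-bit code points to
       national letters. The sketch below shows the idea for the German
       variant, assuming the DIN 66003 assignments; the table and the
       decode_646de() helper are hypothetical illustrations and are not taken
       from the Solaris conversion modules.

```python
# Code points where the German variant (DIN 66003, assumed here)
# differs from US-ASCII.
ISO646_DE = {
    0x40: "§", 0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü",
    0x7B: "ä", 0x7C: "ö", 0x7D: "ü", 0x7E: "ß",
}


def decode_646de(data: bytes) -> str:
    """Decode 7-bit German ISO 646 text into a Python string."""
    chars = []
    for byte in data:
        if byte > 0x7F:
            raise ValueError(f"not a 7-bit code point: {byte:#x}")
        chars.append(ISO646_DE.get(byte, chr(byte)))
    return "".join(chars)


print(decode_646de(b"Gr}~e"))  # Grüße
```

       The same bytes read as plain 646 would display as "Gr}~e", which is
       why the variant must be named explicitly when converting.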


   ISO 8859 Character Sets
       ISO 8859 Character Set  Description
       ----------------------------------------------------------------------
       ISO8859-1 (Latin1)      For most West European languages, including:

                                 Albanian   Finnish    Italian
                                 Catalan    French     Norwegian
                                 Danish     German     Portuguese
                                 Dutch      Galician   Spanish
                                 English    Irish      Swedish
                                 Faeroese   Icelandic

       ISO8859-2 (Latin2)      For most Latin-written Slavic and Central
                               European languages:

                                 Czech      Polish     Slovak
                                 German     Romanian   Slovene
                                 Hungarian  Croatian

       ISO8859-3 (Latin3)      Used for Esperanto, Galician, Maltese, and
                               Turkish.
       ISO8859-4 (Latin4)      Introduces letters for Estonian, Latvian, and
                               Lithuanian. It is an incomplete predecessor of
                               ISO 8859-10.
       ISO8859-5               For languages that use the Cyrillic alphabet,
                               such as Belarusian, Bulgarian, Macedonian,
                               Russian, Serbian, and Ukrainian.
       ISO8859-6               Latin + Arabic.
       ISO8859-7               Latin + Greek. Does not include accents used
                               in polytonic Greek.
       ISO8859-8               Latin + Hebrew.
       ISO8859-9 (Latin5)      Replaces the rarely needed Icelandic letters
                               in ISO 8859-1 (Latin 1) with the Turkish ones.
       ISO8859-10 (Latin6)     Adds the last Inuit (Greenlandic) and Sami
                               (Lappish) letters that were not included in
                               ISO 8859-4 (Latin 4) to complete coverage of
                               the Nordic area.
       ISO8859-11              Latin + Thai. ISO/IEC 8859-11:2001 is
                               equivalent to TIS 620-2533 (1990) with the
                               addition of 0xA0 NO-BREAK SPACE.
       ISO8859-13 (Latin7)     Includes characters for Baltic languages that
                               were missing from Latin-4 and Latin-6.
       ISO8859-14 (Latin8)     Covers Celtic languages such as Gaelic and
                               Breton.
       ISO8859-15 (Latin9)     Variant of 8859-1 that modifies eight
                               less-used characters and introduces the euro
                               sign.
       ISO8859-16 (Latin10)    Supports Albanian, Croatian, English, Finnish,
                               French, German, Hungarian, Irish Gaelic (new
                               orthography), Italian, Latin, Polish,
                               Romanian, and Slovenian. The currency sign is
                               replaced with the euro sign.
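
       The differences described in the table can be observed by decoding
       the same bytes under different parts of ISO 8859. The following is a
       minimal sketch using Python's built-in codecs as a stand-in for the
       Solaris conversions; the byte values chosen are illustrative.

```python
# 0xA4 is the currency sign in ISO8859-1 but the euro sign in
# ISO8859-15 and ISO8859-16.
CURRENCY_BYTE = b"\xa4"
decoded = {cs: CURRENCY_BYTE.decode(cs)
           for cs in ("iso8859-1", "iso8859-15", "iso8859-16")}
print(decoded)  # {'iso8859-1': '¤', 'iso8859-15': '€', 'iso8859-16': '€'}

# The six positions where ISO8859-9 swaps Icelandic letters for Turkish ones.
swapped = bytes([0xD0, 0xDD, 0xDE, 0xF0, 0xFD, 0xFE])
as_latin1 = swapped.decode("iso8859-1")
as_latin5 = swapped.decode("iso8859-9")
print(as_latin1)  # ÐÝÞðýþ
print(as_latin5)  # ĞİŞğış
```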


   IBM EBCDIC Code Pages
       EBCDIC (Extended Binary Coded Decimal Interchange Code) is an 8-bit
       character encoding used mainly on IBM mainframes. The following table
       lists the supported IBM EBCDIC-based code pages and the countries or
       regions they serve.


       IBM PC and EBCDIC code pages are prefixed with "IBM-" in the codeset
       name, as in IBM-037.


       EBCDIC Code Page  Country/Region
       ----------------------------------------------------------------------
       IBM-037           Latin-1 character set
       IBM-273           Austria, Germany
       IBM-277           Denmark, Norway
       IBM-278           Finland, Sweden
       IBM-280           Italy
       IBM-284           Latin America, Spain
       IBM-285           Ireland, United Kingdom
       IBM-297           France
       IBM-420           Egypt, Iraq, Jordan, Saudi Arabia, Syria
       IBM-424           Israel
       IBM-500           Australia, Austria, Belgium, Brazil, Canada,
                         Denmark, Finland, France, Germany, Iceland,
                         Ireland, Italy, Japan, Latin America, Multinational,
                         Netherlands, New Zealand, Norway, Portugal, South
                         Africa, Spain, Sweden, Switzerland, United Kingdom,
                         and United States
       IBM-838           Thailand
       IBM-875           Greece
       IBM-933           Korea
       IBM-935           Simplified Chinese
       IBM-937           Traditional Chinese
       IBM-1025          Belarus, Bosnia-Herzegovina, Bulgaria, Macedonia
                         (FYR), Montenegro, Russia, Serbia,
                         Serbia-Montenegro, and Yugoslavia
       IBM-1026          Multinational, Turkey
       IBM-1112          Estonia, Latvia, Lithuania
       IBM-1122          Estonia
       IBM-1140          Australia, Brazil, Canada, Multinational,
                         Netherlands, New Zealand, Portugal, South Africa,
                         Taiwan, and United States
       IBM-1141          Austria, Germany
       IBM-1142          Denmark, Norway
       IBM-1143          Finland, Sweden
       IBM-1144          Italy
       IBM-1145          Latin America, Spain
       IBM-1146          Ireland, United Kingdom
       IBM-1147          France
       IBM-1148          Australia, Austria, Belgium, Brazil, Canada,
                         Denmark, Finland, France, Germany, Iceland,
                         Ireland, Italy, Japan, Latin America, Multinational,
                         Netherlands, New Zealand, Norway, Portugal, South
                         Africa, Spain, Sweden, Switzerland, United Kingdom,
                         and United States
       IBM-1149          Iceland
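
       EBCDIC assigns even the basic Latin letters to different byte values
       than 646 does. The sketch below uses Python's cp037 and cp1140 codecs
       as assumed stand-ins for IBM-037 and IBM-1140; the Solaris modules are
       not involved, and the names passed to encode() are Python's.

```python
text = "Hello"
ebcdic = text.encode("cp037")   # assumed equivalent of IBM-037
ascii_bytes = text.encode("ascii")
print(ebcdic.hex())       # c885939396
print(ascii_bytes.hex())  # 48656c6c6f

# In Python's tables, cp1140 matches cp037 except that byte 0x9F is the
# euro sign instead of the currency sign.
print(b"\x9f".decode("cp037"))   # ¤
print(b"\x9f".decode("cp1140"))  # €
```

       Because no byte is shared with 646 for these letters, EBCDIC data
       must always be converted explicitly before it is treated as text on
       an ASCII-based system.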


   IBM-PC Code Pages
       The  following table covers the supported IBM-PC (DOS and Windows) code
       pages.


       IBM-PC Code Page  Country/Region
       ----------------------------------------------------------------------
       IBM-850           Albania, Australia, Austria, Belgium,
                         Bosnia-Herzegovina, Brazil, Bulgaria, Canada,
                         Croatia, Czech Republic, Denmark, Egypt, Finland,
                         France, Germany, Greece, Hungary, Iceland, Iraq,
                         Ireland, Italy, Jordan, Latin America,
                         Multinational, Netherlands, New Zealand, Norway,
                         Poland, Portugal, Romania, Russia, Saudi Arabia,
                         Slovakia, Slovenia, South Africa, Spain, Sweden,
                         Switzerland, Syria, United Kingdom, and United
                         States
       IBM-852           Albania, Bosnia-Herzegovina, Croatia, Czech
                         Republic, Hungary, Multinational, Poland, Romania,
                         Slovakia, and Slovenia
       IBM-855           Bosnia-Herzegovina, Bulgaria, Macedonia (FYR),
                         Montenegro, Multinational, Serbia,
                         Serbia-Montenegro, and Yugoslavia
       IBM-856           Israel
       IBM-857           Multinational, Turkey
       IBM-862           Israel
       IBM-864           Egypt, Iraq, Jordan, Saudi Arabia, and Syria
       IBM-866           Russia
       IBM-869           Greece
       IBM-870           Albania, Bosnia-Herzegovina, Croatia, Czech
                         Republic, Hungary, Multinational, Poland, Romania,
                         Slovakia, and Slovenia
       IBM-871           Iceland
       IBM-874           Thailand
       IBM-921           Estonia, Latvia, Lithuania
       IBM-922           Estonia


   Microsoft Code Pages
       The following table covers the supported Microsoft DOS and Windows code
       pages. Microsoft code pages are prefixed with "CP" in the codeset name,
       as in CP850.


       Code Page   Description
       ----------------------------------------------
       CP437       MS-DOS, Latin United States
       CP720       MS-DOS, Arabic
       CP737       MS-DOS, Greek
       CP775       MS-DOS, Baltic
       CP850       MS-DOS, Multilingual Latin I
       CP852       MS-DOS, Latin II
       CP855       MS-DOS, Cyrillic
       CP857       MS-DOS, Turkish
       CP860       MS-DOS, Portuguese
       CP861       MS-DOS, Icelandic
       CP862       MS-DOS, Hebrew
       CP863       MS-DOS, French Canada
       CP864       MS-DOS, Arabic
       CP865       MS-DOS, Nordic
       CP866       MS-DOS, Cyrillic (Russian)
       CP869       MS-DOS, Greek 2
       CP874       MS-DOS, Thai
       CP949       Windows, Korean
       CP1250      Windows, Central Europe
       CP1251      Windows, Cyrillic
       CP1252      Windows, Latin
       CP1253      Windows, Greek
       CP1254      Windows, Turkish
       CP1255      Windows, Hebrew
       CP1256      Windows, Arabic
       CP1257      Windows, Baltic
       CP1258      Windows, Vietnam
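
       The MS-DOS and Windows code pages place the same letter at different
       byte values, so naming the wrong one silently produces the wrong
       character. The following sketch illustrates this with Python's
       built-in codecs, assumed here to correspond to the code pages of the
       same number.

```python
letter = "é"
encoded = {cp: letter.encode(cp) for cp in ("cp437", "cp850", "cp1252")}
print(encoded)  # {'cp437': b'\x82', 'cp850': b'\x82', 'cp1252': b'\xe9'}

# Reading the MS-DOS byte as CP1252 yields a low quotation mark (U+201A)
# instead of the accented letter.
misread = encoded["cp437"].decode("cp1252")
print(repr(misread))  # '‚'
```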


   Other Code Pages
       Code Page               Description
       ----------------------------------------------------------------------
       KOI8-R, KOI8-U          8-bit codesets for Russian and Ukrainian
                               Cyrillic
       PTCP154                 Paratype CP154 for Cyrillic; based on CP1251
                               with added Asian Cyrillic symbols
       ALT                     8-bit Alternative PC Cyrillic
       MAC                     8-bit Macintosh Cyrillic
       DHN                     Dom Handlowy Nauki, 8-bit codeset for Polish
                               text
       Mazovia                 8-bit codeset for Polish text
       VISCII                  Vietnamese Standard Code for Information
                               Interchange is a modification of ASCII for
                               Vietnamese.
       TCVN                    Vietnamese Standard Code for Information
                               Interchange TCVN 5712:1993.
       TIS-620                 Thai Industrial Standard 620-2533 is
       (TIS620-2533, EUC-TH)   practically identical to the ISO 8859-11
                               codeset (see above).
       ISCII (ISCII91)         Indian Script Code for Information Interchange
                               is an ASCII-compatible codeset for Indic
                               scripts.
       ACE (IDNA2008-REGIST)   ASCII Compatible Encoding defined in RFCs
                               3490, 3492, and 5890 without allowing
                               unassigned characters; it also uses STD3 ASCII
                               rules. IDNA2008-REGIST is an alias for ACE
                               that uses the IDNA2008 terminology described
                               in RFC 5890.
       ACE-ALLOW-UNASSIGNED    Same as ACE except that it allows unassigned
       (IDNA2008-LOOKUP)       characters. It is more suitable for query
                               purposes; ACE is more suitable for storing
                               host or domain names or giving them to
                               machines. IDNA2008-LOOKUP is an alias for
                               ACE-ALLOW-UNASSIGNED that uses the IDNA2008
                               terminology described in RFC 5890.
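
       To show what an ASCII Compatible Encoding looks like, the sketch
       below uses Python's "punycode" (RFC 3492) and "idna" (RFC 3490)
       codecs. These are assumptions standing in for the ACE conversion:
       Python's idna codec follows the older IDNA2003 rules, so results for
       some names may differ from what the Solaris ACE module produces. The
       host name is an invented example.

```python
label = "bücher"
print(label.encode("punycode"))  # b'bcher-kva'

# The idna codec applies the encoding per label and adds the "xn--" prefix.
hostname = "bücher.example"      # illustrative name, not from this page
ace = hostname.encode("idna")
print(ace)                       # b'xn--bcher-kva.example'
print(ace.decode("idna"))        # bücher.example
```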


FILES
       /usr/lib/iconv/*.so

           iconv conversion modules


       /usr/lib/iconv/*.bt

           cconv  code  conversion  binary  tables for iconv(1), cconv(3C) and
           iconv(3C)


       /usr/lib/iconv/geniconvtbl/binarytables/*.bt

           geniconvtbl conversion binary tables


       /usr/lib/iconv/alias

           Alias table file of codeset names



SEE ALSO
       geniconvtbl(1), iconv(1), cconv(3C), cconv_close(3C), cconv_open(3C),
       cconvctl(3C), iconv(3C), iconv_close(3C), iconv_open(3C), iconvctl(3C),
       alias(5), geniconvtbl-cconv(5), iconv_ja(7), iconv_ko(7),
       iconv_unicode(7), iconv_zh(7), iconv_zh_HK(7), iconv_zh_TW(7)


       Chernov, A., Registration of a Cyrillic Character Set, RFC 1489, RELCOM
       Development Team, July 1993.


       Nussbacher, H., and Y. Bourvine, Hebrew Character Encoding for Internet
       Messages, RFC 1555, Israeli Inter-University, Hebrew University,
       December 1993.


       Reynolds, J., and J. Postel, ASSIGNED NUMBERS, RFC 1700, University  of
       Southern California/Information Sciences Institute, October 1994.


       Simonsen, K., Character Mnemonics & Character Sets, RFC 1345, Rationel
       Almen Planlaegning, June 1992.


       Spinellis, D., Greek Character Encoding for Electronic  Mail  Messages,
       RFC 1947, SENA S.A., May 1996.



Oracle Solaris 11.4               4 Nov 2014                    iconv_extra(7)