
Standards, Environments, Macros, Character Sets, and miscellany
                                                                   iconv_ja(7)



NAME
       iconv_ja - codeset conversions for Japanese encodings

DESCRIPTION
       Iconv  and  cconv support conversions to and from a wide range of code‐
       sets.


       The table below provides basic information about the supported Japanese
       codesets. For information on other codesets, refer to iconv_unicode(7),
       iconv_extra(7),   iconv_ko(7),   iconv_zh(7),    iconv_zh_TW(7),    and
       iconv_zh_HK(7).


       The  following  table  describes the Japanese codeset names used by
       iconv and cconv, with aliases in parentheses where applicable:


       Code Sets                Description
       ------------------------------------------------------------------------
       eucJP (EUC-JP)           Japanese EUC. See eucJP(7). Conversions to and
       eucJP-S11 (EUC-JP-S11)   from Unicode are Windows-compatible. eucJP-S11
                                is a variant of eucJP that is compatible with
                                eucJP in Solaris 11 and earlier releases.

       PCK (Shift_JIS)          PC Kanji. See PCK(7). Conversions to and from
       PCK-S11                  Unicode are Windows-compatible. PCK-S11 is a
                                variant of PCK that is compatible with PCK in
                                Solaris 11 and earlier releases.

       ISO-2022-JP (JIS7)       Code representation of the character sets ISO
                                646 IRV or JIS X 0201 (both Roman and
                                Katakana), JIS X 0208, and JIS X 0212, using
                                the designation sequences to G0 specified by
                                ISO/IEC 2022 and the UI-OSF Application
                                Platform Profile for Japanese Environment
                                Version 1.1.

       ISO-2022-JP.RFC1468      Code representation of the character sets ISO
                                646 IRV or JIS X 0201 (Roman only) and JIS X
                                0208, using the designation sequences to G0
                                specified by RFC 1468.

       JIS                      JIS 7-bit code used in JLE, JFP 2.4, and
                                preceding releases.

       IBM-930, IBM-931,        IBM codesets based on EBCDIC. The IBM CCSID is
       IBM-939, IBM-5026,       prefixed with "IBM-". For example, "IBM-930"
       IBM-5035                 represents the IBM CCSID 930 codeset.

       IBMJ, IBMJ-EBCDIK        IBM codesets based on EBCDIC. IBMJ and
                                IBMJ-EBCDIK differ in their SBCS (Single-Byte
                                Character Set). The SBCS of "IBMJ" is RFC 1345
                                IBM038 (EBCDIC-INT). The SBCS of "IBMJ-EBCDIK"
                                is RFC 1345 IBM290 (EBCDIC-JP-kana), also known
                                as EBCDIK, which contains Japanese half-width
                                Katakana but no lowercase alphabetic
                                characters. The DBCS for both codesets is IBM
                                code page (CPGID) 300.

       FujitsuJEF-ascii-code,   Fujitsu JEF code. The four variants differ in
       FujitsuJEF-ascii-face,   how they convert SBCS characters and characters
       FujitsuJEF-kana-code,    that are mapped differently between JIS C 6226
       FujitsuJEF-kana-face     and JIS X 0208. The "-ascii" variants use
                                EBCDIC(ASCII) for the SBCS, and the "-kana"
                                variants use EBCDIC(Kana). The "-code" variants
                                convert JIS C 6226 characters by code value,
                                while the "-face" variants convert them by
                                character face.

       HitachiKEIS83,           Hitachi KEIS83 and KEIS90. In the Solaris iconv
       HitachiKEIS90            implementation, the SBCS of these codesets is
                                equivalent to IBM code page 290, Japanese
                                (Katakana) Extended.

       NECJIPS                  NEC JIPS(J). In the Solaris iconv
                                implementation, the SBCS of this codeset is
                                equivalent to IBM code page 290, Japanese
                                (Katakana) Extended.

       EUC-JIS-2004             Extended eucJP codeset that supports JIS X
                                0213. It does not contain JIS X 0212, although
                                eucJP does.

       Shift_JIS-2004           Extended PCK codeset that supports JIS X 0213.
                                All characters in PCK are also contained in
                                Shift_JIS-2004.

       ISO-2022-JP.2004         Extended ISO-2022-JP that supports JIS X 0213.
                                Two designators are added to designate JIS X
                                0213 characters.

       UTF-8-CP932              UTF-8 encoded Unicode converted from CP932.

       UTF-8-Java               UTF-8 encoded Unicode, Java implementation.
                                User-defined and vendor-defined characters are
                                not mapped in this codeset; they are replaced
                                with the substitute character during
                                conversion. See NOTES.
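
       As a rough illustration of how the encodings described above differ,
       the following sketch encodes one character in three of them. It uses
       Python's built-in codecs (euc_jp, shift_jis, iso2022_jp) as stand-ins
       for eucJP, PCK, and ISO-2022-JP. That correspondence is an assumption
       made for illustration: these codecs are not the Solaris iconv modules,
       and their mappings of vendor-defined characters may differ.

```python
# Python codec names used as stand-ins (an assumption, not Solaris iconv names):
#   euc_jp ~ eucJP, shift_jis ~ PCK, iso2022_jp ~ ISO-2022-JP
TEXT = "あ"  # HIRAGANA LETTER A, a JIS X 0208 character

CODECS = ("euc_jp", "shift_jis", "iso2022_jp")


def encode_all(text):
    """Return a dict mapping each stand-in codec name to the encoded bytes."""
    return {name: text.encode(name) for name in CODECS}


encoded = encode_all(TEXT)
for name, data in encoded.items():
    # Print each byte sequence in hexadecimal for comparison.
    print(f"{name:12} {data.hex(' ')}")
```

       In the iso2022_jp result, the two data bytes are wrapped in escape
       sequences: ESC $ B designates JIS X 0208 to G0 before the character,
       and ESC ( B switches back to ASCII afterwards, as specified by RFC
       1468. The EUC and Shift_JIS forms carry no escape sequences.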



       The iconv and cconv conversions available on the current system can be
       listed  by  running  'iconv  -l',  as described in the iconv(1) manual
       page.


       For additional information on the mappings between canonical names  and
       supported  aliases,  with  optional variant levels, refer to the
       alias(5) manual page and the /usr/lib/iconv/alias file.

FILES
       /usr/lib/iconv/*.so

           iconv conversion modules


       /usr/lib/iconv/*.bt

           cconv code conversion binary tables for  iconv(1),  cconv(3C),  and
           iconv(3C)


       /usr/lib/iconv/geniconvtbl/binarytables/*.bt

           geniconvtbl conversion binary tables


       /usr/lib/iconv/alias

           alias table file of codeset names



SEE ALSO
       geniconvtbl(1),  iconv(1),  cconv(3C), cconv_close(3C), cconv_open(3C),
       cconvctl(3C), iconv(3C), iconvctl(3C), alias(5), geniconvtbl(5),  geni‐
       convtbl-cconv(5),     attributes(7),     environ(7),    iconv_extra(7),
       iconv_ko(7),     iconv_unicode(7),     iconv_zh(7),     iconv_zh_HK(7),
       iconv_zh_TW(7)


       Murai, J., M. Crispin, and E. van der Poel, Japanese Character Encoding
       for Internet Messages, RFC 1468, Keio  University,  Panda  Programming,
       June 1993.


       Ohta,  M.,  Character Sets ISO-10646 and ISO-10646-J-1, RFC 1815, Tokyo
       Institute of Technology, July 1995.


       Ohta, M.,  and  K.  Handa,  ISO-2022-JP-2:  Multilingual  Extension  of
       ISO-2022-JP, RFC 1554, Tokyo Institute of Technology, December 1993.


       Simonson,  K., Character Mnemonics & Character Sets, RFC 1345, Rationel
       Almen Planlaegning, June 1992.


       UI-OSF Japanese Localization Group, UI-OSF Application Platform Profile
       for Japanese Environment Version 1.1, May 1993.


       ISO/IEC  2022:1994  Information  technology -- Character code structure
       and extension techniques, 1994.

NOTES
       User-defined characters are mapped sequentially to  the  corresponding
       values  in  the  target  codeset. When the target codeset is a Unicode
       encoding such as UTF-8, they are mapped to values in the  Private  Use
       Area  (U+E000  to U+F8FF). When the target codeset has no user-defined
       characters, they are mapped to the  substitute  character.  When  the
       source  codeset  has  a  larger user-defined character area than the
       target  codeset,  the  overflowing  characters  are  mapped  to  the
       substitute character.
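
       The sequential mapping and overflow rule described above can be
       sketched as follows for a Unicode target. This is a minimal model,
       not the Solaris implementation: the function name map_user_defined,
       the 0-based index, and the target_capacity parameter are hypothetical
       names introduced only for this illustration.

```python
PUA_START = 0xE000   # first code point of the Private Use Area
PUA_END = 0xF8FF     # last code point of the Private Use Area
PUA_SIZE = PUA_END - PUA_START + 1
REPLACEMENT = "\ufffd"  # Unicode replacement character


def map_user_defined(index, target_capacity=PUA_SIZE):
    """Map the index-th (0-based) user-defined character of a source codeset.

    Hypothetical helper: characters are assigned sequentially from U+E000.
    Anything beyond the target's user-defined capacity (or beyond the
    Private Use Area itself) becomes the substitute character.
    """
    capacity = min(target_capacity, PUA_SIZE)
    if 0 <= index < capacity:
        return chr(PUA_START + index)
    return REPLACEMENT


# First, last, and first overflowing user-defined character.
for i in (0, PUA_SIZE - 1, PUA_SIZE):
    print(i, f"U+{ord(map_user_defined(i)):04X}")
```

       Passing target_capacity=0 models a target codeset with no user-defined
       characters, in which case every input maps to the substitute
       character.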


       A vendor-defined character is mapped to the  corresponding  code  value
       in  the target codeset. If the target codeset does not have that value,
       the character is replaced with the substitute character.


       Some codesets contain vendor-defined characters  that  duplicate  other
       characters.   Vendor-defined   characters   that   duplicate   standard
       characters, such as those in JIS X 0208, are mapped  to  the  standard
       characters.  In  the  PCK  and  CP932  codesets,  some  characters are
       duplicated between the NEC special characters  and  the  IBM  extended
       characters; those characters are mapped to the NEC special characters.


       The substitute character differs by codeset. When the  target  codeset
       is  a  Unicode  encoding,  it  is  a  representation  of  the Unicode
       replacement  character  (U+FFFD).  When  the   target   codeset   is
       ASCII-compatible or EBCDIC-compatible, it is the question mark '?'.
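
       The two substitute characters named above can be observed with any
       converter that substitutes rather than fails. The sketch below uses
       Python's "replace" error handler as an analogy only. It is not Solaris
       iconv, and the function name substitute_demo is hypothetical.

```python
def substitute_demo():
    """Show '?' for an ASCII target and U+FFFD for a Unicode target."""
    # ASCII target: the unmappable character becomes a question mark.
    ascii_out = "JIS あ".encode("ascii", errors="replace")
    # Unicode target: the undecodable byte becomes U+FFFD.
    unicode_out = b"ab\xffcd".decode("utf-8", errors="replace")
    return ascii_out, unicode_out


ascii_out, unicode_out = substitute_demo()
print(ascii_out)            # bytes containing '?' in place of the kana
print(ascii(unicode_out))   # string containing \ufffd in place of 0xFF
```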



Oracle Solaris 11.4               19 Apr 2019                      iconv_ja(7)