iconv_ja(7)  Standards, Environments, Macros, Character Sets, and miscellany  iconv_ja(7)
NAME
iconv_ja - codeset conversions for Japanese encodings
DESCRIPTION
Iconv and cconv support conversions to and from a wide range of code‐
sets.
The list below provides basic information about the supported Japanese
codesets. For information on other codesets, refer to iconv_unicode(7),
iconv_extra(7), iconv_ko(7), iconv_zh(7), iconv_zh_TW(7), and
iconv_zh_HK(7).
Following are the descriptions of Japanese codeset names used by iconv
and cconv with aliases in parentheses where applicable:
Code Sets              | Description
-----------------------+-----------------------------------------------
eucJP (EUC-JP)         | Japanese EUC. See eucJP(7). It is
eucJP-S11 (EUC-JP-S11) | Windows-compatible in conversions to and from
                       | Unicode. eucJP-S11 is a variant of eucJP that
                       | is compatible with eucJP in Solaris 11 and
                       | earlier releases.
-----------------------+-----------------------------------------------
PCK (Shift_JIS)        | PC Kanji. See PCK(7). It is Windows-compatible
PCK-S11                | in conversions to and from Unicode. PCK-S11 is
                       | a variant of PCK that is compatible with PCK
                       | in Solaris 11 and earlier releases.
-----------------------+-----------------------------------------------
ISO-2022-JP (JIS7)     | Code representation of the character sets ISO
                       | 646 IRV or JIS X 0201 (both Roman and
                       | Katakana), JIS X 0208, and JIS X 0212, using
                       | the designation sequences to G0 specified by
                       | ISO/IEC 2022 and the UI-OSF Application
                       | Platform Profile for Japanese Environment
                       | Version 1.1.
-----------------------+-----------------------------------------------
ISO-2022-JP.RFC1468    | Code representation of the character sets ISO
                       | 646 IRV or JIS X 0201 (Roman only) and JIS X
                       | 0208 using the designation sequences to G0
                       | specified by RFC 1468.
-----------------------+-----------------------------------------------
JIS                    | JIS 7-bit code used in JLE, JFP 2.4, and
                       | the preceding releases.
-----------------------+-----------------------------------------------
IBM-930, IBM-931,      | IBM codesets based on EBCDIC. The IBM CCSID is
IBM-939, IBM-5026,     | prefixed with "IBM-". For example, "IBM-930"
IBM-5035               | represents the IBM CCSID 930 codeset.
-----------------------+-----------------------------------------------
IBMJ, IBMJ-EBCDIK      | IBM codesets based on EBCDIC. IBMJ and
                       | IBMJ-EBCDIK differ in their SBCS (Single-Byte
                       | Character Set). The SBCS of "IBMJ" is RFC 1345
                       | IBM038 (EBCDIC-INT), and the SBCS of
                       | "IBMJ-EBCDIK" is RFC 1345 IBM290
                       | (EBCDIC-JP-kana), also known as EBCDIK. EBCDIK
                       | contains Japanese half-width Katakana but does
                       | not contain lowercase alphabetic characters.
                       | The DBCS for both codesets is IBM code page
                       | (CPGID) 300.
-----------------------+-----------------------------------------------
FujitsuJEF-ascii-code, | Fujitsu JEF code. There are four variants,
FujitsuJEF-ascii-face, | which differ in how the SBCS and the
FujitsuJEF-kana-code,  | characters mapped differently between JIS C
FujitsuJEF-kana-face   | 6226 and JIS X 0208 are converted. In the
                       | "-ascii" variants, EBCDIC(ASCII) is used for
                       | the SBCS, and in the "-kana" variants,
                       | EBCDIC(Kana) is used for the SBCS. With the
                       | "-code" variants, JIS C 6226 characters are
                       | converted by code value, while with the
                       | "-face" variants these characters are
                       | converted by character face.
-----------------------+-----------------------------------------------
HitachiKEIS83,         | Hitachi KEIS83 and KEIS90. In the Solaris
HitachiKEIS90          | iconv implementation, the SBCS of this codeset
                       | is equivalent to IBM code page 290, Japanese
                       | (Katakana) Extended.
-----------------------+-----------------------------------------------
NECJIPS                | NEC JIPS(J). In the Solaris iconv
                       | implementation, the SBCS of this codeset is
                       | equivalent to IBM code page 290, Japanese
                       | (Katakana) Extended.
-----------------------+-----------------------------------------------
EUC-JIS-2004           | Extended eucJP codeset to support JIS X 0213.
                       | It does not contain JIS X 0212, although eucJP
                       | does.
-----------------------+-----------------------------------------------
Shift_JIS-2004         | Extended PCK codeset to support JIS X 0213.
                       | All characters in PCK are contained in
                       | Shift_JIS-2004.
-----------------------+-----------------------------------------------
ISO-2022-JP.2004       | Extended ISO-2022-JP to support JIS X 0213.
                       | Two designators are added to designate JIS X
                       | 0213 characters.
-----------------------+-----------------------------------------------
UTF-8-CP932            | UTF-8 encoded Unicode that was converted from
                       | CP932.
-----------------------+-----------------------------------------------
UTF-8-Java             | UTF-8 encoded Unicode, Java implementation.
                       | User-defined characters and vendor-defined
                       | characters are not mapped in this codeset.
                       | They are replaced with the substitute
                       | character during conversion. See NOTES.
The iconv and cconv conversions available on the current system can be
listed by running 'iconv -l' as described in the iconv(1) manual
page.
For additional information on the mappings between canonical names and
supported aliases with optional variant levels, refer to the alias(5)
manual page and the /usr/lib/iconv/alias file.
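As an illustration of how some of the codesets listed above relate to each other, the following Python sketch encodes one string three ways. Python's built-in euc_jp, shift_jis, and iso2022_jp codecs are used as stand-ins for eucJP, PCK, and ISO-2022-JP.RFC1468. That correspondence is an assumption made for this example, and the Python tables may differ from the Solaris iconv modules, particularly for vendor-defined characters.

```python
# Assumption: Python's codecs approximate the Solaris codeset names:
#   euc_jp ~ eucJP, shift_jis ~ PCK, iso2022_jp ~ ISO-2022-JP.RFC1468
text = "日本語"

euc = text.encode("euc_jp")
sjis = text.encode("shift_jis")
jis7 = text.encode("iso2022_jp")

# Show the byte sequences side by side.
for name, data in (("euc_jp", euc), ("shift_jis", sjis), ("iso2022_jp", jis7)):
    print(f"{name:<11}{data.hex(' ')}")

# ISO-2022-JP designates JIS X 0208 to G0 with ESC $ B and switches back
# to ASCII with ESC ( B. Everything in between is 7-bit data.
body = jis7[3:-3]

# EUC-JP carries the same JIS X 0208 code values with the high bit set.
euc_from_jis = bytes(b | 0x80 for b in body)
print("EUC-JP equals JIS bytes with high bit set:", euc_from_jis == euc)
```

The 7-bit form needs designation sequences because its data bytes overlap the ASCII range, while EUC-JP and PCK mark multibyte characters with bytes that have the high bit set.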
FILES
/usr/lib/iconv/*.so
iconv conversion modules
/usr/lib/iconv/*.bt
cconv code conversion binary tables for iconv(1), cconv(3C), and
iconv(3C)
/usr/lib/iconv/geniconvtbl/binarytables/*.bt
geniconvtbl conversion binary tables
/usr/lib/iconv/alias
alias table file of codeset names
SEE ALSO
geniconvtbl(1), iconv(1), cconv(3C), cconv_close(3C), cconv_open(3C),
cconvctl(3C), iconv(3C), iconvctl(3C), alias(5), geniconvtbl(5), geni‐
convtbl-cconv(5), attributes(7), environ(7), iconv_extra(7),
iconv_ko(7), iconv_unicode(7), iconv_zh(7), iconv_zh_HK(7),
iconv_zh_TW(7)
Murai, J., M. Crispin, and E. van der Poel, Japanese Character Encoding
for Internet Messages, RFC 1468, Keio University, Panda Programming,
June 1993.
Ohta, M., Character Sets ISO-10646 and ISO-10646-J-1, RFC 1815, Tokyo
Institute of Technology, July 1995.
Ohta, M., and K. Handa, ISO-2022-JP-2: Multilingual Extension of
ISO-2022-JP, RFC 1554, Tokyo Institute of Technology, December 1993.
Simonson, K., Character Mnemonics & Character Sets, RFC 1345, Rationel
Almen Planlaegning, June 1992.
UI-OSF Japanese Localization Group, UI-OSF Application Platform Profile
for Japanese Environment Version 1.1, May 1993.
ISO/IEC 2022:1994 Information technology -- Character code structure
and extension techniques, 1994.
NOTES
User-defined characters are mapped sequentially to the corresponding
values in the target codeset. When the target codeset is a Unicode
encoding such as UTF-8, they are mapped to values in the Private Use
Area (U+E000 to U+F8FF). When the target codeset has no user-defined
characters, they are mapped to the substitute character. When the
source codeset has a larger user-defined character area than the target
codeset, the characters that overflow are mapped to the substitute
character.
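The sequential mapping of user-defined characters to a Unicode target can be sketched in Python as follows. This is a simplified model of the rule described above, not the Solaris implementation. The function name udc_to_unicode and the 0-based index are hypothetical and exist only for this example.

```python
PUA_FIRST = 0xE000      # start of the Private Use Area
PUA_LAST = 0xF8FF       # end of the Private Use Area
SUBSTITUTE = "\ufffd"   # Unicode replacement character

def udc_to_unicode(index: int) -> str:
    """Map the index-th user-defined character (0-based, hypothetical
    numbering) of a source codeset to a Unicode target."""
    if index < 0:
        raise ValueError("index must be non-negative")
    code_point = PUA_FIRST + index
    if code_point > PUA_LAST:
        # The source area is larger than the target area: overflow.
        return SUBSTITUTE
    return chr(code_point)

capacity = PUA_LAST - PUA_FIRST + 1
for i in (0, 1, capacity - 1, capacity):
    print(i, f"U+{ord(udc_to_unicode(i)):04X}")
```

The Private Use Area holds 6400 code points, so in this model any user-defined character numbered 6400 or higher falls back to the substitute character.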
A vendor-defined character is mapped to the corresponding code value
in the target codeset. If the target codeset does not have that value,
the character is replaced with the substitute character.
Some codesets contain duplicated characters among their vendor-defined
characters. Characters that duplicate standard characters, such as
those of JIS X 0208, are mapped to those standard characters. In the
PCK and CP932 codesets, some characters are duplicated between the NEC
special characters and the IBM extended characters. Those characters
are mapped to the NEC special characters.
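The handling of duplicated characters can be observed with Python's cp932 codec, which is assumed here to follow the same preference order as described above. The sketch shows behavior of the Python tables only and does not exercise the Solaris conversion modules. The byte values are taken from the CP932 layout: 0x8754 in the NEC special characters and 0xFA4A in the IBM extended characters both encode the Roman numeral one, and 0x8790 duplicates the JIS X 0208 character at 0x81E0.

```python
# Roman numeral one (U+2160) appears twice in CP932.
NEC_SPECIAL = b"\x87\x54"    # NEC special characters (row 13)
IBM_EXTENDED = b"\xfa\x4a"   # IBM extended characters

from_nec = NEC_SPECIAL.decode("cp932")
from_ibm = IBM_EXTENDED.decode("cp932")
print(from_nec == from_ibm)  # both decode to the same character

# Converting back picks a single code for the duplicated character.
round_trip = from_ibm.encode("cp932")
print(round_trip.hex(" "))

# U+2252 exists in JIS X 0208 (0x81E0) and again in NEC row 13 (0x8790).
# Converting back picks the standard JIS X 0208 position.
approx = b"\x87\x90".decode("cp932")
print(approx.encode("cp932").hex(" "))
```

A consequence of this rule is that a round trip through Unicode is not byte-preserving for the duplicated codes, so byte-level comparison of converted data can report differences even when the text is the same.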
The substitute character differs by codeset. It is a representation of
the Unicode replacement character (U+FFFD) when the target codeset is a
Unicode encoding. It is the question mark '?' when the target codeset
is ASCII-compatible or EBCDIC-compatible.
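The two substitute characters can be seen in a short Python sketch. Python's errors="replace" handler happens to follow the same convention, so it is used here as an analogy. It is not the Solaris iconv implementation, and the shift_jis codec is only assumed to approximate PCK.

```python
# Target is ASCII-compatible (Shift_JIS): the euro sign has no mapping,
# so it is replaced with a question mark.
encoded = "価格: €10".encode("shift_jis", errors="replace")
print(encoded.decode("shift_jis"))

# Target is Unicode: the byte 0xFF is not valid Shift_JIS, so it is
# replaced with the replacement character U+FFFD.
decoded = b"abc\xffdef".decode("shift_jis", errors="replace")
print([f"U+{ord(ch):04X}" for ch in decoded])
```

Because '?' is also an ordinary character, a substituted question mark cannot be told apart from a real one after conversion, whereas U+FFFD in Unicode output marks a lost character unambiguously.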
Oracle Solaris 11.4 19 Apr 2019 iconv_ja(7)