Standard C Library Functions                                         cconv(3C)



NAME
       cconv - per character sequence based code conversion function

SYNOPSIS
       #include <iconv.h>
       size_t cconv(cconv_t cd, char *inbuf, size_t *inlen, char *outbuf,
               size_t *outlen);

DESCRIPTION
       The cconv() function converts a character sequence from one codeset, in
       the array specified by inbuf, into a corresponding  character  sequence
       in  another codeset, in the array specified by outbuf. The codesets are
       those specified in the  cconv_open()  call  (optionally  modified  with
       variants  and  flag arguments) that returned the conversion descriptor,
       cd. The inbuf argument points to the first character in the input array
       and inlen indicates the number of bytes in the array to be converted.
       The outbuf argument points to the first available byte in the output
       array and outlen indicates the number of bytes available in the array.


       Unlike iconv(3C), which performs buffer-based character code conversion,
       cconv() performs per-character-sequence code conversion: each call
       converts a single character sequence, the first one in inbuf, so the
       return of each call marks a character sequence boundary.


       A character sequence is one or more character codes that comprise a
       character. For graphic characters, it may be a single character
       code  or multiple character codes such as combining or conjoining char‐
       acter sequences  defined  in  the  current  code  conversion.  It  also
       includes  designator  character sequences for state-dependent encodings
       that may have one or more character codes  in  a  designator.  See  the
       NOTES section below for the definitions of the terms mentioned.


       For state-dependent encodings, the conversion descriptor (cd) is placed
       into its initial shift state by a  call  for  which  inbuf  is  a  null
       pointer.  When  cconv()  is  called in this way, and if outbuf is not a
       null pointer, and outlen has a positive value, cconv() will place, into
       the outbuf, the byte sequence to change the outbuf to its initial shift
       state. If the outbuf is not large  enough  to  hold  the  entire  reset
       sequence,  cconv()  will  fail and set errno to E2BIG. Subsequent calls
       with inbuf as other than a null pointer cause the  conversion  to  take
       place from the current state of the conversion descriptor.


       If  a  sequence of input bytes does not form a valid character sequence
       in the specified codeset, the conversion will fail  and  set  errno  to
       EILSEQ.  If  the  input  array  ends with an incomplete character or an
       incomplete shift sequence, conversion will fail and set errno  to  EIN‐
       VAL.  If  the  output array is not large enough to hold the entire con‐
       verted input character sequence, conversion will fail and set errno  to
       E2BIG. The value pointed to by inlen is decremented to reflect the num‐
       ber of bytes still not consumed by the code conversion  in  the  inbuf.
       The  value pointed to by outlen is decremented to reflect the number of
       bytes still available in outbuf.  For  state-dependent  encodings,  the
       conversion  descriptor  is updated to reflect the shift state in effect
       at the end of the successfully converted character sequence.  For  code
       conversions supporting combining or conjoining character sequences, the
       code conversion may consume and save some or  all  character  codes  of
       such  a  sequence  into  the  conversion descriptor as a new conversion
       state if the sequence is the last character sequence in the input array
       and potentially an incomplete sequence.


       If  cconv()  encounters a character sequence in the input array that is
       legal, but for which an identical character does not exist in the  tar‐
       get  codeset,  cconv()  performs  an implementation-defined conversion,
       i.e., non-identical conversion, on this character sequence.


       It is possible for cconv() to consume all the bytes from the input array
       without any failure return and yet produce no converted bytes, or fewer
       characters than expected, in the output array. This happens when the
       current code conversion supports combining or conjoining character
       sequences and the input array ends with a potentially incomplete
       sequence of that kind. For instance, suppose the current code conversion
       supports two combining character sequences: a base character 'A'
       followed by a combining acute mark, and the same base character 'A'
       followed by two combining marks, acute and dot below. If the input array
       holds the base character 'A' and a combining acute mark, the code
       conversion consumes both characters but may not produce any converted
       bytes, because the next call might supply dot below as the first
       character in the input array, which could result in different converted
       bytes. As with the reset operation required for state-dependent
       encodings, an additional reset call is therefore required at the end of
       the conversion to force any bytes saved in the conversion descriptor
       into the output array. See the EXAMPLES section below for actual usage.


       The default conversion behavior mentioned above can be modified if  one
       or  more of the conversion behavior modification flag values are speci‐
       fied and such conversion behavior modifications are  supported  by  the
       implementation  of  the  corresponding  cconv code conversion. For more
       detail, see cconv_open(3C) and cconvctl(3C).

RETURN VALUES
       The cconv() function updates the values pointed to by the inlen and
       outlen arguments to reflect the extent of the conversion and returns
       the number of non-identical conversions performed. If all the bytes in
       the input array are converted, the value pointed to by inlen will be 0.
       If the input conversion is stopped due to any of the error conditions
       mentioned above, cconv() returns (size_t)-1 and sets errno to indicate
       the error.

ERRORS
       The cconv() function will fail if:


       EILSEQ    Input  conversion  stopped due to an input byte that does not
                 belong to the input codeset.


       E2BIG     Input conversion stopped due to lack of space in  the  output
                 array.


       EINVAL    Input  conversion  stopped  due to an incomplete character or
                 shift sequence at the end of the input array.




       The cconv() function may fail if:


       EBADF     The cd argument is not a valid open conversion descriptor.


       ENOMEM    Insufficient storage space is available.



EXAMPLES
       Example 1 Code Conversion from UTF-32 to ISO8859-1



       The following example converts the first UTF-32 character  sequence  in
       inbuf into an ISO8859-1 character.



             #include <stdio.h>
             #include <stdint.h>
             #include <string.h>
             #include <errno.h>
             #include <iconv.h>

                 :

             #define MY_BUFFER_SIZE 24

             uint32_t ib[MY_BUFFER_SIZE];
             char ob[MY_BUFFER_SIZE];
             size_t il;
             size_t ol;
             size_t i;
             size_t ret;
             cconv_t cd;

                 :

             /*
              * As an example, initialize the input array, ib[],  with two
              * Unicode characters, 'a' U+0061 and COMBINING TILDE U+0303.
              */
             ib[0] = 0x61;
             ib[1] = 0x303;
             il = sizeof (uint32_t) * 2;
             ol = MY_BUFFER_SIZE;

             cd = cconv_open("ISO8859-1", 0, "UTF-32", 0, 0);
             if (cd == (cconv_t)-1) {
                 /* Code conversion not supported? */
                 return (-1);
             }

                 :

             /* Do the conversion. */
             ret = cconv(cd, (char *)ib, &il, ob, &ol);
             if (ret == (size_t)-1) {
                 /* Illegal character or bad conversion descriptor? */
                 if (errno == EILSEQ || errno == EBADF)
                     return (-2);

                 /* Output array too small? */
                 if (errno == E2BIG)
                     return (-3);
             } else {
                 for (i = 0; i < MY_BUFFER_SIZE - ol; i++)
                     printf("Converted byte value = %d\n", ob[i]);
             }

             /*
              * Make sure to flush any character bytes possibly stored in
              * the conversion descriptor by doing a reset operation.
              */
             ol = MY_BUFFER_SIZE;
             ret = cconv(cd, NULL, &il, ob, &ol);
             if (ret != (size_t)-1 && ol < MY_BUFFER_SIZE)
                 for (i = 0; i < MY_BUFFER_SIZE - ol; i++)
                     printf("Converted byte value = %d\n", ob[i]);

                 :

             (void) cconv_close(cd);




FILES
       /usr/lib/iconv/*.bt    cconv  code  conversion  binary  table files for
                              iconv(1), cconv(3C), and iconv(3C).



ATTRIBUTES
       See attributes(7) for descriptions of the following attributes:


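       +-----------------------------+-----------------------------+
       |      ATTRIBUTE TYPE         |      ATTRIBUTE VALUE        |
       +-----------------------------+-----------------------------+
       |Interface Stability          |Committed                    |
       +-----------------------------+-----------------------------+
       |MT-Level                     |MT-Safe                      |
       +-----------------------------+-----------------------------+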


SEE ALSO
       iconv(1),  cconv_close(3C),  cconv_open(3C),  cconvctl(3C),  iconv(3C),
       iconv_close(3C), iconv_open(3C), iconvctl(3C),  iconvstr(3C),  genicon‐
       vtbl(5), attributes(7)

NOTES
       A combining character sequence is a sequence of multiple characters
       starting with a base character followed by one or more combining mark
       characters and optionally control characters governing combining
       behavior. For instance, a base character 'A' followed by an acute
       combining mark and then another combining mark, dot above, forms a
       combining character sequence.


       A conjoining character sequence is a sequence of multiple characters
       starting with an initial character followed by one or more conjoining
       characters, combining characters, and optionally control characters
       governing conjoining or combining behavior in some Asian scripts.


       A designator sequence for state-dependent encodings is a sequence of
       one or more characters that changes the state of the encoding. For
       instance, in ISO-2022-JP encoding,  a  designator  (also  known  as  an
       escape  sequence)  ESC  $  B  (i.e., 0x1b 0x24 0x42) indicates that the
       bytes following the designator are Japanese double byte  characters  in
       7-bit encoding.



Oracle Solaris 11.4               3 Nov 2014                         cconv(3C)