awk(1)                           User Commands                          awk(1)



NAME
       nawk, awk - pattern scanning and processing language

SYNOPSIS
       /usr/bin/awk [-F ERE] [-v assignment] 'program' | -f progfile...
            [argument]...


       /usr/xpg4/bin/awk [-F ERE] [-v assignment]... 'program' | -f progfile...
            [argument]...

DESCRIPTION
       The /usr/bin/awk and /usr/xpg4/bin/awk utilities execute programs
       written in the awk programming language, which is specialized for
       textual data manipulation. An awk program is a sequence of patterns
       and corresponding actions. The string specifying program must be
       enclosed in single quotes (') to protect it from interpretation by
       the shell. The sequence of pattern-action statements can be specified
       on the command line as program or in one or more files specified by
       the -f progfile option. When input is read that matches a pattern,
       the action associated with the pattern is performed.


       Input  is interpreted as a sequence of records. By default, a record is
       a line, but this can be changed by using the RS built-in variable. Each
       record  of  input  is  matched to each pattern in the program. For each
       pattern matched, the associated action is executed.


       The awk utility interprets each input record as a  sequence  of  fields
       where,  by  default,  a field is a string of non-blank characters. This
       default white-space field delimiter (blanks and/or tabs) can be changed
       by  using the FS built-in variable or the -F ERE option. The awk utility
       denotes the first field in a record $1, the second $2,  and  so  forth.
       The  symbol  $0  refers  to  the entire record; setting any other field
       causes the reevaluation of $0. Assigning to $0 resets the values of all
       fields and the NF built-in variable.
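
       The field model described above can be sketched as follows. This is a
       minimal illustration, assuming a POSIX-compatible awk on the PATH; the
       sample input and the shell variable names out1 and out2 are made up
       for the example.

```shell
# Default splitting on blanks: $2 is the second field, $0 the whole record.
out1=$(printf 'alpha beta gamma\n' | awk '{ print $2; print $0 }')

# -F changes the field separator; here fields are delimited by a colon.
out2=$(printf 'root:x:0\n' | awk -F: '{ print $1, $3 }')

printf '%s\n%s\n' "$out1" "$out2"
```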

OPTIONS
       The following options are supported:

       -F ERE           Define  the  input  field separator to be the extended
                        regular expression ERE, before any input is read  (can
                        be a character).


       -f progfile      Specifies the pathname of the file progfile containing
                        an awk program. If multiple instances of this option
                        are specified, the concatenation of the files
                        specified as progfile, in the order specified, is the
                        awk program. The awk program can alternatively be
                        specified on the command line as a single argument.


       -v assignment    The assignment argument must be in the same form as an
                        assignment  operand.  The  assignment  is  of the form
                        var=value, where var is the name of one of  the  vari‐
                        ables described below. The specified assignment occurs
                        before  executing  the  awk  program,  including   the
                        actions  associated with BEGIN patterns (if any). Mul‐
                        tiple occurrences of this option can be specified.
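
       As a rough sketch of the -v and -f options, assuming a POSIX-compatible
       awk and the mktemp utility are available. The variable names greeting,
       n, label and total are hypothetical and chosen only for illustration.

```shell
# -v assignments take effect before BEGIN actions run.
out_v=$(awk -v greeting=hello -v n=3 'BEGIN { print greeting, n + 1 }')

# -f reads the program text from a file instead of the command line.
prog=$(mktemp)
printf '%s\n' '{ total += $1 }' 'END { print label, total }' > "$prog"
out_f=$(printf '1\n2\n3\n' | awk -v label=sum -f "$prog")
rm -f "$prog"

printf '%s\n%s\n' "$out_v" "$out_f"
```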


OPERANDS
       The following operands are supported:

       program

           If no -f option is specified, the first operand to awk is the  text
           of the awk program. The application supplies the program operand as
           a single argument to awk. If the text does not  end  in  a  newline
           character, awk interprets the text as if it did.


       argument

           Either of the following two types of argument can be intermixed:

           file

               A  pathname of a file that contains the input to be read, which
               is matched against the set of patterns in the  program.  If  no
               file  operands  are  specified,  or if a file operand is -, the
               standard input is used.


           assignment

               An operand that begins with an underscore or alphabetic
               character from the portable character set, followed by a
               sequence of underscores, digits and alphabetics from the
               portable character set, followed by the = character specifies
               a variable assignment rather than a pathname. The characters
               before the = represent the name of an awk variable. If that
               name is an awk reserved word, the behavior is undefined. The
               characters following the equal sign are interpreted as if they
               appeared in the awk program preceded and followed by a
               double-quote (") character, as a STRING token, except that if
               the last character is an unescaped backslash, it is
               interpreted as a literal backslash rather than as the first
               character of the sequence \". The variable is assigned the
               value of that STRING token. If the value is considered a
               numeric string, the variable is assigned its numeric value.
               Each such variable assignment is performed just before the
               processing of the following file, if any. Thus, an assignment
               before the first file argument is executed after the BEGIN
               actions (if any), while an assignment after the last file
               argument is executed before the END actions (if any). If there
               are no file arguments, assignments are executed before
               processing the standard input.
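
       The timing of an assignment operand can be illustrated with a small
       sketch, assuming a POSIX-compatible awk. The variable name v is
       hypothetical. With no file operands, the assignment is applied after
       the BEGIN action but before standard input is processed.

```shell
out=$(printf 'rec\n' | awk '
BEGIN { print "BEGIN sees [" v "]" }    # v is still uninitialized here
      { print "record sees [" v "]" }   # v was assigned before input was read
' v=7)

printf '%s\n' "$out"
```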



INPUT FILES
       Input files to the awk program from any of the following sources:

           o      any file operands or their equivalents, achieved by  modify‐
                  ing the awk variables ARGV and ARGC


           o      standard input in the absence of any file operands


           o      arguments to the getline function



       must  be  text  files.  Whether the variable RS is set to a value other
       than a newline character or not, for these files, implementations  sup‐
       port  records  terminated with the specified separator up to {LINE_MAX}
       bytes and can support longer records.


       If -f  progfile is specified, the files named by each of  the  progfile
       option-arguments must be text files containing an awk program.


       The standard input is used only if no file operands are specified, or
       if a file operand is -.

EXTENDED DESCRIPTION
       An awk program is composed of pairs of the form:

         pattern { action }



       Either the pattern or the action (including the enclosing brace charac‐
       ters)  can  be  omitted.  Pattern-action  statements are separated by a
       semicolon or by a newline.


       A missing pattern matches any record of input, and a missing action  is
       equivalent  to  an  action  that  writes the matched record of input to
       standard output.
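
       A short sketch of the two omissions, assuming a POSIX-compatible awk;
       the sample records are invented for the example.

```shell
# Pattern only: the missing action prints each matching record.
out1=$(printf 'a 1\nb 2\nc 3\n' | awk '$2 >= 2')

# Action only: the missing pattern matches every record.
out2=$(printf 'a 1\nb 2\nc 3\n' | awk '{ print $1 }')

printf '%s\n%s\n' "$out1" "$out2"
```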


       Execution of the awk program starts  by  first  executing  the  actions
       associated  with all BEGIN patterns in the order they occur in the pro‐
       gram. Then each file operand (or standard input if no files were speci‐
       fied) is processed by reading data from the file until a record separa‐
       tor is seen (a newline character by  default),  splitting  the  current
       record  into fields using the current value of FS, evaluating each pat‐
       tern in the program in the  order  of  occurrence,  and  executing  the
       action  associated  with  each pattern that matches the current record.
       The action for a matching pattern is executed before evaluating
       subsequent patterns. Last, the actions associated with all END
       patterns are executed in the order they occur in the program.
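
       The execution order can be sketched as below, assuming a
       POSIX-compatible awk. The variable name seen is hypothetical. Because
       the first action runs before the second pattern is evaluated, the
       second pattern already matches on the first record.

```shell
out=$(printf 'x\ny\n' | awk '
BEGIN { print "begin" }
      { seen = 1 }                   # runs before the next pattern is evaluated
seen  { print "record", NR, $0 }
END   { print "end", NR }')

printf '%s\n' "$out"
```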

   Expressions in awk
       Expressions describe computations used in patterns and actions. In  the
       following  table,  valid expression operations are given in groups from
       highest precedence first to lowest precedence last,  with  equal-prece‐
       dence operators grouped between horizontal lines. In expression evalua‐
       tion, where the grammar is formally ambiguous, higher precedence opera‐
       tors  are  evaluated  before  lower precedence operators. In this table
       expr, expr1, expr2, and expr3 represent any  expression,  while  lvalue
       represents  any  entity  that  can be assigned to (that is, on the left
       side of an assignment operator).


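       Syntax                 Name                       Type of Result    Associativity
       ---------------------------------------------------------------------------------
       ( expr )               Grouping                   type of expr      n/a
       ---------------------------------------------------------------------------------
       $expr                  Field reference            string            n/a
       ---------------------------------------------------------------------------------
       ++ lvalue              Pre-increment              numeric           n/a
       -- lvalue              Pre-decrement              numeric           n/a
       lvalue ++              Post-increment             numeric           n/a
       lvalue --              Post-decrement             numeric           n/a
       ---------------------------------------------------------------------------------
       expr ^ expr            Exponentiation             numeric           right
       ---------------------------------------------------------------------------------
       ! expr                 Logical not                numeric           n/a
       + expr                 Unary plus                 numeric           n/a
       - expr                 Unary minus                numeric           n/a
       ---------------------------------------------------------------------------------
       expr * expr            Multiplication             numeric           left
       expr / expr            Division                   numeric           left
       expr % expr            Modulus                    numeric           left
       ---------------------------------------------------------------------------------
       expr + expr            Addition                   numeric           left
       expr - expr            Subtraction                numeric           left
       ---------------------------------------------------------------------------------
       expr expr              String concatenation       string            left
       ---------------------------------------------------------------------------------
       expr < expr            Less than                  numeric           none
       expr <= expr           Less than or equal to      numeric           none
       expr != expr           Not equal to               numeric           none
       expr == expr           Equal to                   numeric           none
       expr > expr            Greater than               numeric           none
       expr >= expr           Greater than or equal to   numeric           none
       ---------------------------------------------------------------------------------
       expr ~ expr            ERE match                  numeric           none
       expr !~ expr           ERE non-match              numeric           none
       ---------------------------------------------------------------------------------
       expr in array          Array membership           numeric           left
       ( index ) in array     Multi-dimension array      numeric           left
                              membership
       ---------------------------------------------------------------------------------
       expr && expr           Logical AND                numeric           left
       ---------------------------------------------------------------------------------
       expr || expr           Logical OR                 numeric           left
       ---------------------------------------------------------------------------------
       expr1 ? expr2 : expr3  Conditional expression     type of selected  right
                                                         expr2 or expr3
       ---------------------------------------------------------------------------------
       lvalue ^= expr         Exponentiation assignment  numeric           right
       lvalue %= expr         Modulus assignment         numeric           right
       lvalue *= expr         Multiplication assignment  numeric           right
       lvalue /= expr         Division assignment        numeric           right
       lvalue += expr         Addition assignment        numeric           right
       lvalue -= expr         Subtraction assignment     numeric           right
       lvalue = expr          Assignment                 type of expr      right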



       Each expression has either a string value, a  numeric  value  or  both.
       Except  as  stated for specific contexts, the value of an expression is
       implicitly converted to the type needed for the context in which it  is
       used.  A string value is converted to a numeric value by the equivalent
       of the following calls:

         setlocale(LC_NUMERIC, "");
         numeric_value = atof(string_value);



       A numeric value that is exactly equal to the value  of  an  integer  is
       converted  to a string by the equivalent of a call to the sprintf func‐
       tion with the string %d as the fmt argument and the numeric value being
       converted  as the first and only expr argument. Any other numeric value
       is converted to a string by the equivalent of a  call  to  the  sprintf
       function with the value of the variable CONVFMT as the fmt argument and
       the numeric value being converted as the first and only expr argument.
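
       The two conversion rules can be sketched as follows, assuming an awk
       that supports CONVFMT with its default value of %.6g (the traditional
       /usr/bin/awk may behave differently). The variable names are
       illustrative only.

```shell
out=$(awk 'BEGIN {
    whole = 6 / 2      # exactly an integer: converted as if by %d
    frac  = 1 / 3      # not an integer: converted using CONVFMT
    s1 = whole ""      # concatenating "" forces conversion to a string
    s2 = frac ""
    print s1
    print s2
}')

printf '%s\n' "$out"
```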


       A string value is considered to be a numeric string  in  the  following
       case:

           1.     Any leading and trailing blank characters are ignored.


           2.     If the first unignored character is a + or -, it is ignored.


           3.     If  the  remaining  unignored  characters would be lexically
                  recognized as a NUMBER token, the  string  is  considered  a
                  numeric string.




       If  a  -  character is ignored in the above steps, the numeric value of
       the numeric string is the negation of the numeric value of  the  recog‐
       nized  NUMBER  token. Otherwise the numeric value of the numeric string
       is the numeric value of the recognized NUMBER token. Whether or  not  a
       string is a numeric string is relevant only in contexts where that term
       is used in this section.


       When an expression is used in a Boolean context, if it  has  a  numeric
       value,  a  value  of  zero  is  treated as false and any other value is
       treated as true. Otherwise, a  string  value  of  the  null  string  is
       treated as false and any other value is treated as true. A Boolean con‐
       text is one of the following:

           o      the first subexpression of a conditional expression.


           o      an expression operated on by logical NOT,  logical  AND,  or
                  logical OR.


           o      the second expression of a for statement.


           o      the expression of an if statement.


           o      the  expression  of the while clause in either a while or do
                  ...  while statement.


           o      an expression used as  a  pattern  (as  in  Overall  Program
                  Structure).
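
       A minimal sketch of the Boolean rules, assuming a POSIX-compatible awk.
       Note that the string constant "0" has only a string value, so it is
       tested against the null string rather than against zero.

```shell
out=$(awk 'BEGIN {
    if (0)   print "numeric zero"        # false: numeric value of zero
    if (2)   print "numeric nonzero"     # true
    if ("")  print "empty string"        # false: null string
    if ("0") print "string constant 0"   # true: non-null string
}')

printf '%s\n' "$out"
```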



       The  awk  language supplies arrays that are used for storing numbers or
       strings. Arrays need not be declared. They are initially empty, and
       their size changes dynamically. The subscripts, or element
       identifiers, are strings, providing a type of associative array capability.
       An  array  name  followed  by a subscript within square brackets can be
       used as an lvalue and as an expression, as described  in  the  grammar.
       Unsubscripted array names are used in only the following contexts:

           o      a parameter in a function definition or function call.


           o      the NAME token following any use of the keyword in.



       A  valid  array  index  consists of one or more comma-separated expres‐
       sions, similar to the way in which multi-dimensional arrays are indexed
       in some programming languages. Because awk arrays are really one-dimen‐
       sional, such a comma-separated list is converted to a single string  by
       concatenating the string values of the separate expressions, each sepa‐
       rated from the other by the value of the SUBSEP variable.


       Thus, the following two index operations are equivalent:

         var[expr1, expr2, ... exprn]
         var[expr1 SUBSEP expr2 SUBSEP ... SUBSEP exprn]



       A multi-dimensioned index used with the in  operator  must  be  put  in
       parentheses.  The  in operator, which tests for the existence of a par‐
       ticular array element, does not create  the  element  if  it  does  not
       exist.  Any  other  reference to a non-existent array element automati‐
       cally creates it.
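
       The subscript rules above can be sketched as follows, assuming a
       POSIX-compatible awk. The array name grid and the other variable names
       are hypothetical.

```shell
out=$(awk 'BEGIN {
    grid["x", "y"] = 1
    if (("x", "y") in grid)       print "found by list"
    if (("x" SUBSEP "y") in grid) print "found by joined key"
    if ("z" in grid)              print "unexpected"

    n = 0; for (k in grid) n++
    print "after in:", n             # the in test did not create grid["z"]

    unused = grid["z"]               # a plain reference creates the element
    n = 0; for (k in grid) n++
    print "after reference:", n
}')

printf '%s\n' "$out"
```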

   Variables and Special Variables
       Variables can be used in an awk program by referencing them.  With  the
       exception  of  function  parameters,  they are not explicitly declared.
       Uninitialized scalar variables and array elements have both  a  numeric
       value of zero and a string value of the empty string.


       Field variables are designated by a $ followed by a number or numerical
       expression. The effect of the field  number  expression  evaluating  to
       anything  other  than a non-negative integer is unspecified. Uninitial‐
       ized variables or string values need not be converted to numeric values
       in  this  context. New field variables are created by assigning a value
       to them. References to non-existent fields (that is, fields after $NF)
       produce the null string. However, assigning to a non-existent field
       (for example, $(NF+2) = 5) increases the value of NF, creates any
       intervening fields with the null string as their values, and causes
       the value of $0 to be recomputed, with the fields being separated by
       the value of OFS. Each field variable has a string value when created.
       If the
       string, with any occurrence of the  decimal-point  character  from  the
       current  locale  changed to a period character, is considered a numeric
       string (see Expressions in awk above), the field variable also has  the
       numeric value of the numeric string.
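
       The effect of assigning past $NF can be sketched as below, assuming a
       POSIX-compatible awk. OFS is set to a hyphen only to make the
       recomputed $0 easier to read.

```shell
out=$(printf 'a b c\n' | awk '
BEGIN { OFS = "-" }
{
    $(NF + 2) = 5       # creates $4 (null) and $5, and raises NF to 5
    print NF
    print $0            # $0 is rebuilt with OFS between fields
    print "[" $4 "]"    # the intervening field holds the null string
}')

printf '%s\n' "$out"
```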

   /usr/bin/awk, /usr/xpg4/bin/awk
       awk  sets  the  following  special variables that are supported by both
       /usr/bin/awk and /usr/xpg4/bin/awk:

       ARGC

           The number of elements in the ARGV array.


       ARGV

           An array of command line arguments, excluding options and the
           program argument, numbered from zero to ARGC-1.

           The arguments in ARGV can be modified or added to; ARGC can be
           altered. As each input file ends, awk treats the next non-null
           element of ARGV, up to the current value of ARGC-1, inclusive, as
           the name of the next input file. Setting an element of ARGV to null
           means that it is not treated as an input file. The name - indicates
           the standard input. If an argument matches the format of an
           assignment operand, this argument is treated as an assignment
           rather than a file argument.


       ENVIRON

           The variable ENVIRON is an array  representing  the  value  of  the
           environment. The indices of the array are strings consisting of the
           names of the environment variables, and the  value  of  each  array
           element  is  a  string consisting of the value of that variable. If
           the value of  an  environment  variable  is  considered  a  numeric
           string, the array element also has its numeric value.

           In  all  cases  where awk behavior is affected by environment vari‐
           ables (including the environment of any commands that awk  executes
           via the system function or via pipeline redirections with the print
           statement, the printf statement,  or  the  getline  function),  the
           environment  used  is the environment at the time awk began execut‐
           ing.


       FILENAME

           A pathname of the current input file. Inside  a  BEGIN  action  the
           value  is  undefined. Inside an END action the value is the name of
           the last input file processed.


       FNR

           The ordinal number of the  current  record  in  the  current  file.
           Inside  a  BEGIN action the value is zero. Inside an END action the
           value is the number of the last record processed in the  last  file
           processed.


       FS

           Input  field  separator  regular  expression;  a space character by
           default.


       NF

           The number of fields in the current record. Inside a BEGIN  action,
           the  use of NF is undefined unless a getline function without a var
           argument is executed previously. Inside an END action,  NF  retains
           the  value  it  had  for the last record read, unless a subsequent,
           redirected, getline function without a var  argument  is  performed
           prior to entering the END action.


       NR

           The  ordinal  number of the current record from the start of input.
           Inside a BEGIN action the value is zero. Inside an END  action  the
           value is the number of the last record processed.


       OFMT

           The printf format for converting numbers to strings in output
           statements; "%.6g" by default. The result of the conversion is
           unspecified if the value of OFMT is not a floating-point format
           specification.


       OFS

           The print statement output field separator; a  space  character  by
           default.


       ORS

           The print output record separator; a newline character by default.


       RLENGTH

           The length of the string matched by the match function.


       RS

           The  first  character of the string value of RS is the input record
           separator; a newline character by default. If RS contains more than
           one  character,  the  results  are unspecified. If RS is null, then
           records are separated by sequences of  one  or  more  blank  lines.
           Leading or trailing blank lines do not produce empty records at the
           beginning or end of input, and the field separator is  always  new‐
           line, no matter what the value of FS.


       RSTART

           The  starting position of the string matched by the match function,
           numbering from 1. This is always equivalent to the return value  of
           the match function.


       SUBSEP

           The  subscript  separator  string for multi-dimensional arrays. The
           default value is \034.
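
       A few of the special variables can be seen together in the following
       sketch, assuming a POSIX-compatible awk that provides the match
       function. The sample input is invented for the example.

```shell
out=$(printf 'one two\nthree four five\n' | awk '
{ print NR, NF, $NF }     # record number, field count, last field
END {
    # match sets RSTART and RLENGTH for the matched substring "oo".
    if (match("foobar", /o+/)) print RSTART, RLENGTH
}')

printf '%s\n' "$out"
```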


   /usr/xpg4/bin/awk
       The following variable is supported for /usr/xpg4/bin/awk only:

       CONVFMT

           The printf format for converting numbers  to  strings  (except  for
           output statements, where OFMT is used). The default is %.6g.
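
       The split between OFMT and CONVFMT can be sketched as follows,
       assuming an awk that supports CONVFMT (such as /usr/xpg4/bin/awk or
       another POSIX-compatible implementation). The formats chosen here are
       arbitrary.

```shell
out=$(awk 'BEGIN {
    OFMT = "%.2f"; CONVFMT = "%.3f"
    x = 3.14159
    print x        # number in an output statement: formatted with OFMT
    s = x ""       # number converted to a string: formatted with CONVFMT
    print s
}')

printf '%s\n' "$out"
```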


   Regular Expressions
       The   awk   utility  uses  the  same  regular  expression  notation  as
       /usr/bin/egrep. For more information, see the egrep(1) man page.


       The /usr/xpg4/bin/awk utility makes use of the extended regular expres‐
       sion  notation  (see  regex(7)) except that it allows the use of C-lan‐
       guage conventions to escape special characters within the EREs,  namely
       \\,  \a,  \b,  \f, \n, \r, \t, \v, and those specified in the following
       table. These escape sequences are recognized both  inside  and  outside
       bracket  expressions.  Records need not be separated by newline charac‐
       ters and string constants can contain newline characters, so  even  the
       \n  sequence  is  valid in awk EREs. Using a slash character within the
       regular expression requires escaping as shown in the table below:


       Escape Sequence   Description                       Meaning
       ---------------------------------------------------------------------------------
       \"                Backslash quotation-mark          Quotation-mark character
       ---------------------------------------------------------------------------------
       \/                Backslash slash                   Slash character
       ---------------------------------------------------------------------------------
       \ddd              A backslash character followed    The character encoded by the
                         by the longest sequence of one,   one-, two- or three-digit octal
                         two, or three octal-digit         integer. Multibyte characters
                         characters (01234567). If all     require multiple, concatenated
                         of the digits are 0 (that is,     escape sequences, including the
                         representation of the NULL        leading \ for each byte.
                         character), the behavior is
                         undefined.
       ---------------------------------------------------------------------------------
       \c                A backslash character followed    Undefined
                         by any character not described
                         in this table or special
                         characters (\\, \a, \b, \f, \n,
                         \r, \t, \v).



       A regular expression can be matched against a specific field or  string
       by  using  one  of the two regular expression matching operators, ~ and
       !~. These operators interpret their right-hand  operand  as  a  regular
       expression  and  their  left-hand  operand  as a string. If the regular
       expression matches the string, the ~ expression evaluates to the  value
       1,  and  the  !~  expression  evaluates  to the value 0. If the regular
       expression does not match the string, the ~ expression evaluates to the
       value  0, and the !~ expression evaluates to the value 1. If the right-
       hand operand is any expression other than the lexical  token  ERE,  the
       string  value  of  the expression is interpreted as an extended regular
       expression, including the escape conventions described above. Notice
       that these same escape conventions are also applied in determining
       the value of a string literal (the lexical token STRING), and are
       applied a second time when a string literal is used in this context.
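
       The matching operators and the double application of escapes can be
       sketched as below, assuming a POSIX-compatible awk. The variable names
       r1 through r4 are hypothetical.

```shell
out=$(awk 'BEGIN {
    s = "a.b"
    r1 = (s ~ /a\.b/)         # ERE token: \. matches a literal period
    r2 = (s ~ "a\\.b")        # string literal: \\ yields \, so the ERE is a\.b
    r3 = ("axb" ~ "a\\.b")    # no literal period in "axb", so no match
    r4 = ("axb" !~ /a\.b/)    # non-match operator gives the opposite result
    print r1, r2, r3, r4
}')

printf '%s\n' "$out"
```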


       When an ERE token appears as an expression in any context other than as
       the right-hand of the ~ or !~ operator or as one of the built-in  func‐
       tion  arguments  described below, the value of the resulting expression
       is the equivalent of:

         $0 ~ /ere/



       The ere argument to the gsub, match, and sub functions, and the fs
       argument to the split function (see String Functions), are interpreted
       as extended regular expressions. These can be either ERE tokens or
       arbitrary expressions, and are interpreted in the same manner as the
       right-hand side of the ~ or !~ operator.


       An extended regular expression can be used to separate fields by  using
       the  -F   ERE option or by assigning a string containing the expression
       to the built-in variable FS. The default value of the FS variable is  a
       single space character. The following describes FS behavior:

           1.     If FS is a single character:

               o      If  FS is the space character, skip leading and trailing
                      blank characters; fields are delimited by sets of one or
                      more blank characters.


               o      Otherwise,  if  FS  is any other character c, fields are
                      delimited by each single occurrence of c.



           2.     Otherwise, the string value of FS is  considered  to  be  an
                  extended  regular  expression. Each occurrence of a sequence
                  matching the extended regular expression delimits fields.
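
       The three FS cases can be sketched as follows, assuming a
       POSIX-compatible awk. The inputs are invented for the example.

```shell
# FS is a space (default): leading and trailing blanks are skipped.
n1=$(printf '  a   b  \n' | awk '{ print NF }')

# FS is another single character: every occurrence delimits a field.
n2=$(printf 'a,,b\n' | awk -F, '{ print NF }')

# FS is longer than one character: it is treated as an ERE.
n3=$(printf 'a1b22c\n' | awk -F'[0-9]+' '{ print $1 $2 $3 }')

printf '%s %s %s\n' "$n1" "$n2" "$n3"
```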




       Except in the gsub, match, split, and sub built-in  functions,  regular
       expression  matching is based on input records. That is, record separa‐
       tor characters (the first character of the value of the variable RS,  a
       newline character by default) cannot be embedded in the expression, and
       no expression matches the record separator  character.  If  the  record
       separator  is  not  a newline character, newline characters embedded in
       the expression can be matched. In those four built-in functions,
       regular expression matching is based on text strings. So, any character
       (including the newline character  and  the  record  separator)  can  be
       embedded  in the pattern and an appropriate pattern matches any charac‐
       ter. However, in all awk regular expression matching, the use of one or
       more  NULL  characters in the pattern, input record or text string pro‐
       duces undefined results.

   Patterns
       A pattern is any valid expression, a range specified by two expressions
       separated by a comma, or one of the two special patterns BEGIN or END.

   Special Patterns
       The  awk  utility  recognizes two special patterns, BEGIN and END. Each
       BEGIN pattern is matched once and its associated action executed before
       the  first  record of input is read (except possibly by use of the get‐
       line function in a prior BEGIN action) and before command line  assign‐
       ment  is  done.  Each  END  pattern  is matched once and its associated
       action executed after the last record of input has been read. These two
       patterns must have associated actions.


       BEGIN  and  END  do not combine with other patterns. Multiple BEGIN and
       END patterns are allowed. The actions associated with  the  BEGIN  pat‐
       terns  are  executed  in the order specified in the program, as are the
       END actions. An END pattern can precede a BEGIN pattern in a program.


       If an awk program consists of only actions with the pattern BEGIN,  and
       the  BEGIN action contains no getline function, awk exits without read‐
       ing its input when the last statement in the last BEGIN action is  exe‐
       cuted.  If an awk program consists of only actions with the pattern END
       or only actions with the patterns BEGIN and  END,  the  input  is  read
       before the statements in the END actions are executed.
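
       A small sketch of the two cases, assuming a POSIX-compatible awk. The
       first program never consumes its input; the second reads all of it
       before the END action runs.

```shell
# BEGIN only: awk exits without reading the piped input.
out1=$(printf 'ignored\n' | awk 'BEGIN { print "no input read" }')

# END only: the input is read first, so NR holds the record count.
out2=$(printf 'a\nb\nc\n' | awk 'END { print NR }')

printf '%s\n%s\n' "$out1" "$out2"
```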

   Expression Patterns
       An  expression  pattern  is  evaluated as if it were an expression in a
       Boolean context. If the result is true, the pattern  is  considered  to
       match, and the associated action (if any) is executed. If the result is
       false, the action is not executed.

   Pattern Ranges
       A pattern range consists of two expressions separated by  a  comma.  In
       this  case,  the action is performed for all records between a match of
       the first expression and the following match of the second  expression,
       inclusive. At this point, the pattern range can be repeated starting at
       input records subsequent to the end of the matched range.
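
       A pattern range can be sketched as below, assuming a POSIX-compatible
       awk. The markers start and end are arbitrary. The range restarts at
       the second start line and, with no closing match, runs to the end of
       input.

```shell
out=$(printf '1\nstart\n2\nend\n3\nstart\n4\n' | awk '/start/,/end/')

printf '%s\n' "$out"
```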

   Actions
       An action is a sequence of statements. A statement can be  one  of  the
       following:

         if ( expression ) statement [ else statement ]
         while ( expression ) statement
         do statement while ( expression )
         for ( expression ; expression ; expression ) statement
         for ( var in array ) statement
         delete array[subscript]   # delete an array element
         break
         continue
         { [ statement ] ... }
         expression                # commonly variable = expression
         print [ expression-list ] [ >expression ]
         printf format [ ,expression-list ] [ >expression ]
         next                      # skip remaining patterns on this input line
         exit [expr]               # skip the rest of the input; exit status is expr
         return [expr]



       Any  single  statement  can be replaced by a statement list enclosed in
       braces. The statements are terminated by newline  characters  or  semi‐
       colons, and are executed sequentially in the order that they appear.


       The  next  statement causes all further processing of the current input
       record to be abandoned. The behavior is undefined if a  next  statement
       appears or is invoked in a BEGIN or END action.


       The exit statement invokes all END actions in the order in which they
       occur in the program source and then terminates the program without
       reading further input. An exit statement inside an END action
       terminates the program without further execution of END actions. If an
       expression  is specified in an exit statement, its numeric value is the
       exit status of awk, unless subsequent errors are encountered or a  sub‐
       sequent exit statement with an expression is executed.
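
       The next and exit statements can be sketched as follows, assuming a
       POSIX-compatible awk. The exit status 3 is arbitrary, and the shell
       variable names are illustrative.

```shell
status=0
out=$(printf '1\n2\n3\n4\n' | awk '
$1 == 2 { next }              # abandon this record; later rules are skipped
$1 == 4 { exit 3 }            # stop reading input, run END, exit with 3
        { print "saw", $1 }
END     { print "END ran" }') || status=$?

printf '%s\nstatus=%s\n' "$out" "$status"
```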

   Output Statements
       Both  print  and printf statements write to standard output by default.
       The output is written to the location specified  by  output_redirection
       if one is supplied, as follows:

         > expression
         >> expression
         | expression



       In all cases, the expression is evaluated to produce a string that is
       used as a full pathname to write into (for > or >>) or as a command to
       be executed (for |). Using the first two forms, if the file of that
       name is not currently open, it is opened, creating it if necessary
       and, using the first form, truncating the file. The output then is
       appended to the file. As long as the file remains open, subsequent
       calls in which expression evaluates to the same string value simply
       append output to the file. The file remains open until the close
       function is called with an expression that evaluates to the same
       string value.


       The third form writes output onto a stream piped to the input of a com‐
       mand. The stream is created if no stream is  currently  open  with  the
       value  of expression as its command name. The stream created is equiva‐
       lent to one created by a call to the popen(3C) function with the  value
       of  expression  as  the  command  argument and a value of w as the mode
       argument. As long as the stream remains open, subsequent calls in which
       expression evaluates to the same string value write output to the
       existing stream. The stream remains open until the  close  function  is
       called  with  an expression that evaluates to the same string value. At
       that time, the stream is closed as if by a call to the pclose function.
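
       A short sketch of the | form, assuming the sort utility is available
       on the PATH. Every record is written to the same stream, and close is
       called so that the command finishes before awk prints its own final
       line:

```shell
pipe_out=$(printf 'pear\napple\nfig\n' | awk '
    { print $1 | "sort" }      # one sort process receives all records
    END {
        close("sort")          # wait for sort to write its output
        print "done"
    }
')
printf '%s\n' "$pipe_out"
```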


       These output statements take a comma-separated list of expressions,
       referred to in the grammar by the non-terminal symbols expr_list,
       print_expr_list or print_expr_list_opt. This list is referred  to  here
       as the expression list, and each member is referred to as an expression
       argument.


       The print statement writes the value of each expression  argument  onto
       the indicated output stream separated by the current output field sepa‐
       rator (see variable OFS above), and terminated  by  the  output  record
       separator (see variable ORS above). All expression arguments are taken
       as strings, being converted if necessary; with the exception  that  the
       printf format in OFMT is used instead of the value in CONVFMT. An empty
       expression list stands for the whole input record ($0).
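
       The sketch below, with arbitrarily chosen separator values, shows how
       OFS, ORS and OFMT shape the output of print, and how a comma-separated
       list differs from concatenation:

```shell
print_out=$(awk 'BEGIN {
    OFS = "-"; ORS = "|\n"; OFMT = "%.2f"
    print "a", "b", 3.14159    # OFS between arguments, OFMT for the number
    print "a" "b"              # concatenation: a single argument, no OFS
}')
printf '%s\n' "$print_out"
```

       The first statement writes a-b-3.14| and the second writes ab|, each
       terminated by the ORS value set above.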


       The printf statement produces output based on a notation similar to the
       File Format Notation used to describe file formats in this document.
       Output is produced as specified with the first expression argument as
       the  string  format  and subsequent expression arguments as the strings
       arg1 to argn, inclusive, with the following exceptions:

           1.     The format is an  actual  character  string  rather  than  a
                  graphical representation. Therefore, it cannot contain empty
                  character positions.  The  space  character  in  the  format
                  string,  in  any  context  other than a flag of a conversion
                  specification, is treated as an ordinary character  that  is
                  copied to the output.


           2.     If  the  character  set  contains a Delta character and that
                  character appears in the format string, it is treated as  an
                  ordinary character that is copied to the output.


           3.     The escape sequences beginning with a backslash character are
                  treated as sequences of ordinary characters that are copied
                  to the output. Note that these same sequences are interpreted
                  lexically by awk when they appear in literal strings, but
                  they are not treated specially by the printf statement.


           4.     A field width or precision can be specified as the * charac‐
                  ter instead of a digit string. In this case the  next  argu‐
                  ment  from  the  expression  list is fetched and its numeric
                  value taken as the field width or precision.


           5.     The implementation does not precede or  follow  output  from
                  the  d  or u conversion specifications with blank characters
                  not specified by the format string.


           6.     The implementation does not precede output from the  o  con‐
                  version  specification  with  leading zeros not specified by
                  the format string.


           7.     For the c conversion specification: if the  argument  has  a
                  numeric value, the character whose encoding is that value is
                  output. If the value is zero or is not the encoding  of  any
                  character  in  the character set, the behavior is undefined.
                  If the argument does not have a  numeric  value,  the  first
                  character  of the string value is output; if the string does
                  not contain any characters the behavior is undefined.


           8.     For each conversion specification that consumes an argument,
                  the  next  expression argument is evaluated. With the excep‐
                  tion of the c conversion, the  value  is  converted  to  the
                  appropriate type for the conversion specification.


           9.     If  there  are  insufficient expression arguments to satisfy
                  all the conversion specifications in the format string,  the
                  behavior is undefined.


           10.    If any character sequence in the format string begins with a
                  % character, but does not form a valid conversion specifica‐
                  tion, the behavior is unspecified.
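
       A few of the points above can be sketched as follows. The values are
       illustrative, and the example assumes an awk that implements the *
       field width described in item 4 (some minimal implementations may
       not):

```shell
printf_out=$(awk 'BEGIN {
    printf "[%*d]\n", 5, 42         # item 4: width 5 taken from the arguments
    printf "[%5.2f]\n", 3.14159     # width and precision as digit strings
    printf "%c%c\n", 65, "hello"    # item 7: numeric code, then first character
}')
printf '%s\n' "$printf_out"
```

       In an ASCII-compatible character set the last statement writes Ah.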




       Both print and printf can output at least {LINE_MAX} bytes.

   Functions
       The  awk  language  has  a  variety  of built-in functions: arithmetic,
       string, input/output and general.

   Arithmetic Functions
       The arithmetic functions, except for int, are based on the ISO  C stan‐
       dard.  The  behavior  is  undefined  in cases where the ISO  C standard
       specifies that an error be returned or that the behavior is  undefined.
       Although the grammar permits built-in functions to appear with no argu‐
       ments or parentheses, unless the argument or parentheses are  indicated
       as  optional  in  the following list (by displaying them within the [ ]
       brackets), such use is undefined.

       atan2(y,x)       Return arctangent of y/x.


       cos(x)           Return cosine of x, where x is in radians.


       sin(x)           Return sine of x, where x is in radians.


       exp(x)           Return the exponential function of x.


       log(x)           Return the natural logarithm of x.


       sqrt(x)          Return the square root of x.


       int(x)           Truncate its argument to an integer. It  is  truncated
                        toward 0 when x > 0.


       rand()           Return a random number n, such that 0 ≤ n < 1.


       srand([expr])    Set the seed value for rand to expr or use the time of
                        day if expr is omitted. The  previous  seed  value  is
                        returned.
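
       The following sketch exercises the arithmetic functions with
       arbitrarily chosen arguments. The seed 42 is an assumption made for
       the example; seeding twice with the same value is used only to show
       that rand then repeats its sequence:

```shell
math_out=$(awk 'BEGIN {
    print int(3.9)                   # truncated toward 0
    printf "%.4f\n", atan2(0, -1)    # pi
    printf "%.4f\n", exp(1)          # e
    print sqrt(16), log(1)
    srand(42); a = rand()
    srand(42); b = rand()            # same seed, same first value
    same = (a == b); inrange = (a >= 0 && a < 1)
    print same, inrange
}')
printf '%s\n' "$math_out"
```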


   String Functions
       The string functions in the following list shall be supported. Although
       the grammar permits built-in functions to appear with no  arguments  or
       parentheses,  unless  the  argument  or  parentheses  are  indicated as
       optional in the following list (by  displaying  them  within  the  [  ]
       brackets), such use is undefined.

       gsub(ere,repl[,in])

           Behave  like  sub  (see  below), except that it replaces all occur‐
           rences of the regular expression (like the ed utility  global  sub‐
           stitute) in $0 or in the in argument, when specified.


       index(s,t)

           Return  the  position, in characters, numbering from 1, in string s
           where string t first occurs, or zero if it does not occur at all.


       length[([s])]

           Return the length, in  characters,  of  its  argument  taken  as  a
           string, or of the whole record, $0, if there is no argument.


       match(s,ere)

           Return  the  position, in characters, numbering from 1, in string s
           where the extended regular expression ere occurs,  or  zero  if  it
           does  not  occur  at  all.  RSTART  is set to the starting position
           (which is the same as the returned value),  zero  if  no  match  is
           found; RLENGTH is set to the length of the matched string, −1 if no
           match is found.


       split(s,a[,fs])

           Split the string s into array elements a[1], a[2], ...,  a[n],  and
           return  n. The separation is done with the extended regular expres‐
           sion fs or with the field separator FS if fs  is  not  given.  Each
            array element has a string value when created. If the string
            assigned to any array element, with any occurrence of the
            decimal-point character from the current locale changed to a
            period character, would be considered a numeric string, the array
            element also has the numeric value of the numeric string. The
            effect of a null string as the value of fs is unspecified.


       sprintf(fmt,expr,expr,...)

           Format the expressions according to the printf format given by  fmt
           and return the resulting string.


       sub(ere,repl[,in])

           Substitute  the  string  repl in place of the first instance of the
            extended regular expression ere in string in and return the number
           of  substitutions.  An ampersand ( & ) appearing in the string repl
           is replaced by the string from in that matches the regular  expres‐
           sion.  An  ampersand preceded with a backslash ( \ ) is interpreted
           as the literal ampersand character. An occurrence of  two  consecu‐
           tive  backslashes is interpreted as just a single literal backslash
           character. Any other occurrence of a backslash (for  example,  pre‐
           ceding any other character) is treated as a literal backslash char‐
           acter. If repl is a string literal, the handling of  the  ampersand
           character  occurs after any lexical processing, including any lexi‐
           cal backslash escape sequence processing. If in is specified and it
           is  not  an lvalue the behavior is undefined. If in is omitted, awk
           uses the current record ($0) in its place.


       substr(s,m[,n])

           Return the at most n-character substring of s that begins at  posi‐
           tion  m,  numbering from 1. If n is missing, the length of the sub‐
           string is limited by the length of the string s.


       tolower(s)

           Return a string based on the string s. Each character in s that  is
           an  uppercase  letter  specified  to  have a tolower mapping by the
           LC_CTYPE category of the current locale is replaced in the returned
           string  by  the  lowercase  letter  specified by the mapping. Other
           characters in s are unchanged in the returned string.


       toupper(s)

           Return a string based on the string s. Each character in s that  is
           a  lowercase  letter  specified  to  have  a toupper mapping by the
           LC_CTYPE category of the current locale is replaced in the returned
           string  by  the  uppercase  letter  specified by the mapping. Other
           characters in s are unchanged in the returned string.



       All of the preceding functions that take ere as a parameter expect a
       pattern or a string-valued expression that is a regular expression as
       defined below.
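
       As an illustration of the string functions listed above, the sketch
       below applies them to a sample string. The string and the variable
       names are chosen only for the example:

```shell
str_out=$(awk 'BEGIN {
    s = "hello world"
    print length(s), index(s, "o"), substr(s, 7), substr(s, 1, 5)
    pos = match(s, /o+ w/)
    print pos, RSTART, RLENGTH
    n = split("a:b:c", parts, ":")
    print n, parts[1], parts[3]
    t = s; c = sub(/o/, "[&]", t)     # & stands for the matched text
    print c, t
    u = s; c = gsub(/o/, "0", u)      # every occurrence is replaced
    print c, u
    print toupper(s), tolower("ABC")
    print sprintf("%03d|%s", 7, "x")
}')
printf '%s\n' "$str_out"
```

       Note that sub and gsub modify their third argument in place and return
       the number of substitutions, so the copies t and u keep s intact.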

   Input/Output and General Functions
       The input/output and general functions are:

       close(expression)

           Close the file or pipe opened by a print or printf statement  or  a
           call  to  getline  with  the  same string-valued expression. If the
           close was successful, the function returns 0; otherwise, it returns
           non-zero.


       expression | getline [var]

           Read  a  record  of  input from a stream piped from the output of a
           command. The stream is created if no stream is currently open  with
           the  value of expression as its command name. The stream created is
           equivalent to one created by a call to the popen function with  the
           value of expression as the command argument and a value of r as the
            mode argument. As long as the stream remains open, subsequent calls
            in which expression evaluates to the same string value read
            subsequent records from the stream. The stream remains open until
            the close function is called with an expression that evaluates to
            the same string value. At that time, the stream is closed as if by
            a call to the pclose function. If var is missing, $0 and NF are
            set. Otherwise, var is set.

           The getline operator can form ambiguous constructs when  there  are
           operators  that  are  not in parentheses (including concatenate) to
           the left of the | (to the beginning of  the  expression  containing
           getline).  In the context of the $ operator, | behaves as if it had
            a lower precedence than $. The result of evaluating other operators
            is unspecified, and portable applications must parenthesize all
            such uses properly.


       getline

           Set $0 to the next input record from the current input  file.  This
           form of getline sets the NF, NR, and FNR variables.


       getline var

           Set  variable  var  to the next input record from the current input
           file. This form of getline sets the FNR and NR variables.


       getline [var] < expression

           Read the next record of input from a named file. The expression  is
           evaluated  to  produce a string that is used as a full pathname. If
            the file of that name is not currently open, it is opened. As long
            as the file remains open, subsequent calls in which expression
            evaluates to the same string value read subsequent records from
            the file. The file remains open until the close function is called
            with an expression that evaluates to the same string value. If var
            is missing, $0 and NF are set. Otherwise, var is set.

            The getline operator can form ambiguous constructs when there are
            binary operators that are not in parentheses (including
            concatenate) to the right of the < (up to the end of the expression
            containing the getline). The result of evaluating such a construct
            is unspecified, and portable applications must parenthesize all
            such uses properly.


       system(expression)

           Execute the command given by expression in a manner  equivalent  to
           the system(3C) function and return the exit status of the command.



       All  forms of getline return 1 for successful input, 0 for end of file,
       and −1 for an error.
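
       The return values make the following loop idiom possible. This is a
       sketch only: the scratch file comes from mktemp, and the path stored
       in missing is a hypothetical name assumed not to exist:

```shell
data=$(mktemp)
printf 'x 1\ny 2\n' > "$data"
getline_out=$(awk -v f="$data" 'BEGIN {
    while ((getline line < f) > 0) {   # 1 per record, 0 at end of file
        n++
        print n ": " line
    }
    close(f)
    missing = "/nonexistent/no-such-file"
    r = (getline line < missing)       # error: the file cannot be opened
    print "missing file returns", r
    print "NR is still", NR            # this form does not update NR
}')
rm -f "$data"
printf '%s\n' "$getline_out"
```

       Testing for > 0 rather than for a non-zero value matters: a loop
       written as while (getline line < f) never ends when the file cannot
       be read, because the error value is also non-zero.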


       Where strings are used as the name of a file or pipeline,  the  strings
       must  be  textually  identical.  The  terminology  "same  string value"
       implies that "equivalent strings", even those that differ only by space
       characters, represent different files.
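
       One way to guarantee textually identical strings is to keep the
       command in a variable, as in this sketch (the commands themselves are
       arbitrary examples). After close, using the same string starts the
       command again from the beginning:

```shell
io_out=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0)
        print "got", line
    close(cmd)                 # same string value closes the stream
    cmd | getline first        # the command is run a second time
    close(cmd)
    print "first again:", first
    rc = system("exit 3")
    print "system returned", rc
}')
printf '%s\n' "$io_out"
```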

   User-defined Functions
       The  awk  language also provides user-defined functions. Such functions
       can be defined as:

         function name(args,...) { statements }



       A function can be referred to anywhere in an awk program;  in  particu‐
       lar,  its  use  can  precede its definition. The scope of a function is
       global.


       Function arguments can be either scalars or  arrays;  the  behavior  is
       undefined  if  an array name is passed as an argument that the function
       uses as a scalar, or if a scalar expression is passed  as  an  argument
       that  the  function  uses as an array. Function arguments are passed by
       value if scalar and by reference if  array  name.  Argument  names  are
       local  to  the  function; all other variable names are global. The same
       name must not be used as both an argument name and as the name of a function
       or  a  special  awk  variable. The same name must not be used both as a
       variable name with global scope and as the name of a function. The same
       name  must  not be used within the same scope both as a scalar variable
       and as an array.
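
       The difference between passing a scalar and passing an array can be
       sketched as follows; the function name bump and the variables are
       hypothetical:

```shell
arg_out=$(awk '
    function bump(n, arr) {
        n++               # scalar argument: only the local copy changes
        arr["k"]++        # array argument: the array of the caller changes
    }
    BEGIN {
        x = 1; a["k"] = 1
        bump(x, a)
        print x, a["k"]
    }
')
printf '%s\n' "$arg_out"
```

       The scalar x is still 1 after the call, while a["k"] has become 2.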


       The number of parameters in the function definition need not match  the
       number of parameters in the function call. Excess formal parameters can
       be used as local variables. If fewer arguments are supplied in a  func‐
       tion  call  than  are  in the function definition, the extra parameters
       that are used in the function body as scalars are  initialized  with  a
       string  value  of  the null string and a numeric value of zero, and the
       extra parameters that are used in the function body as arrays are  ini‐
       tialized  as empty arrays. If more arguments are supplied in a function
       call than are in the function definition, the behavior is undefined.
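
       A common convention, sketched below with hypothetical names, is to
       list the intended local variables as extra formal parameters, set off
       by additional spaces, and to omit them in the call:

```shell
local_out=$(awk '
    function sum(arr, n,    i, total) {    # i and total act as locals
        for (i = 1; i <= n; i++)
            total += arr[i]                # total starts out as zero
        return total
    }
    BEGIN {
        i = 99                             # global i, untouched by sum
        split("1 2 3", v, " ")
        print sum(v, 3), i
    }
')
printf '%s\n' "$local_out"
```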


       When invoking a function, no white space  can  be  placed  between  the
       function name and the opening parenthesis. Function calls can be nested
       and recursive calls can be made upon functions. Upon  return  from  any
       nested  or  recursive  function  call, the values of all of the calling
       function's parameters are unchanged, except for array parameters passed
       by  reference. The return statement can be used to return a value. If a
       return statement appears outside of a function definition, the behavior
       is undefined.
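
       A recursive call can be sketched with a factorial function (the name
       fact is chosen for the example). Each level of the recursion has its
       own copy of n, and the call is written without white space before the
       opening parenthesis:

```shell
fact_out=$(awk '
    function fact(n) {
        return n <= 1 ? 1 : n * fact(n - 1)   # n is unchanged after the call
    }
    BEGIN { print fact(5) }
')
printf '%s\n' "$fact_out"
```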


       In  the function definition, newline characters are optional before the
       opening brace and after the closing  brace.  Function  definitions  can
       appear anywhere in the program where a pattern-action pair is allowed.

USAGE
       The  index,  length, match, and substr functions should not be confused
       with similar functions in the ISO C standard;  the  awk  versions  deal
       with characters, while the ISO C standard deals with bytes.


       Because  the concatenation operation is represented by adjacent expres‐
       sions rather than an explicit operator, it is often  necessary  to  use
       parentheses to enforce the proper evaluation precedence.
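
       The sketch below, using made-up values, shows how arithmetic
       operators bind more tightly than concatenation and how parentheses
       change the result:

```shell
concat_out=$(awk 'BEGIN {
    n = 2
    print "total: " n + 1    # + is applied first: "total: " (n + 1)
    a = 1 "" 2 * 2           # * is applied first: 1 "" (2 * 2)
    b = (1 "" 2) * 2         # concatenation forced first: 12 * 2
    print a, b
    print 2 " " (-1)         # parentheses keep the minus from being binary
}')
printf '%s\n' "$concat_out"
```

       Without the parentheses in the last statement, the minus sign can be
       parsed as a subtraction, so the output may differ between
       implementations.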

EXAMPLES
       The  awk program specified in the command line is most easily specified
       within single-quotes (for example, 'program')  for  applications  using
       sh,  because  awk programs commonly contain characters that are special
       to the shell, including double-quotes. In the cases where an awk program
       contains single-quote characters, it is usually easiest to specify most
       of the program as strings  within  single-quotes  concatenated  by  the
       shell with quoted single-quote characters. For example:

         awk '/'\''/ { print "quote:", $0 }'



       prints  all  lines  from  the  standard input containing a single-quote
       character, prefixed with quote:.


       The following are examples of simple awk programs:

       Example 1 Write to the standard output all input lines for which  field
       3 is greater than 5:


         $3 > 5


       Example 2 Write every tenth line:


         (NR % 10) == 0


       Example  3 Write any line with a substring matching the regular expres‐
       sion:


         /(G|D)(2[0-9][[:alpha:]]*)/


       Example 4 Print any line with a substring containing a G or D, followed
       by a sequence of digits and characters:



       This  example uses character classes digit and alpha to match language-
       independent digit and alphabetic characters, respectively.


         /(G|D)([[:digit:][:alpha:]]*)/


       Example 5 Write any line in which the second field matches the  regular
       expression and the fourth field does not:


         $2 ~ /xyz/ && $4 !~ /xyz/


       Example  6  Write  any  line in which the second field contains a back‐
       slash:


         $2 ~ /\\/


       Example 7 Write any line in which the second field contains a backslash
       (alternate method):



       Notice  that  backslash  escapes are interpreted twice, once in lexical
       processing of the string and once in processing the regular expression.


         $2 ~ "\\\\"


       Example 8 Write the second to the last and the last field in each line,
       separating the fields by a colon:


         {OFS=":";print $(NF-1), $NF}


       Example 9 Write the line number and number of fields in each line:



       The  three strings representing the line number, the colon and the num‐
       ber of fields are concatenated and that string is written  to  standard
       output.


         {print NR ":" NF}


       Example 10 Write lines longer than 72 characters:


         length($0) > 72


       Example  11  Write  first two fields in opposite order separated by the
       OFS:


         { print $2, $1 }


       Example 12 Same, with input fields separated by comma or space and  tab
       characters, or both:


         BEGIN { FS = ",[ \t]*|[ \t]+" }
         { print $2, $1 }


       Example 13 Add up first column, print sum and average:


         {s += $1 }
         END {print "sum is ", s, " average is", s/NR}


       Example  14 Write fields in reverse order, one per line (many lines out
       for each line in):


         { for (i = NF; i > 0; --i) print $i }


       Example 15 Write all lines between occurrences of the  strings  "start"
       and "stop":


         /start/, /stop/


       Example 16 Write all lines whose first field is different from the pre‐
       vious one:


         $1 != prev { print; prev = $1 }


       Example 17 Simulate the echo command:


         BEGIN  {
                for (i = 1; i < ARGC; ++i)
                      printf "%s%s", ARGV[i], i==ARGC-1?"\n":" "
                }


       Example 18 Write the path prefixes contained in  the  PATH  environment
       variable, one per line:


         BEGIN  {
                n = split (ENVIRON["PATH"], path, ":")
                for (i = 1; i <= n; ++i)
                       print path[i]
                }


       Example  19 Print the file "input", filling in page numbers starting at
       5:



       If there is a file named input containing page headers of the form


         Page #




       and a file named program that contains


         /Page/{ $2 = n++; }
         { print }




       then the command line


         awk -f program n=5 input




       prints the file input, filling in page numbers starting at 5.

ENVIRONMENT VARIABLES
       See environ(7) for descriptions of the following environment  variables
       that affect execution: LC_COLLATE, LC_CTYPE, LC_MESSAGES, and NLSPATH.

       LC_NUMERIC

           Determine the radix character used when interpreting numeric input,
           performing conversions between numeric and string values  and  for‐
           matting  numeric output. Regardless of locale, the period character
           (the decimal-point character of the POSIX locale) is  the  decimal-
           point  character  recognized  in processing awk programs (including
           assignments in command-line arguments).


EXIT STATUS
       The following exit values are returned:

       0      All input files were processed successfully.


       > 0    An error occurred.



       The exit status can be altered within the  program  by  using  an  exit
       expression.

ATTRIBUTES
       See attributes(7) for descriptions of the following attributes:

   /usr/bin/awk
       ATTRIBUTE TYPE        ATTRIBUTE VALUE
       Availability          system/core-os


   /usr/xpg4/bin/awk
       ATTRIBUTE TYPE        ATTRIBUTE VALUE
       Availability          system/xopen/xcu4


SEE ALSO
       awk(1),   ed(1),   egrep(1),   grep(1),   lex(1),   sed(1),  popen(3C),
       printf(3C), system(3C), XPG4(7), attributes(7), environ(7), regex(7)


       Aho, A. V., B. W. Kernighan, and P. J. Weinberger, The AWK  Programming
       Language, Addison-Wesley, 1988.

DIAGNOSTICS
       If any file operand is specified and the named file cannot be accessed,
       awk writes a diagnostic message to standard error and terminates without
       any further action.


       If  the  program  specified by either the program operand or a progfile
       operand is not a valid awk program (as specified in  EXTENDED  DESCRIP‐
       TION), the behavior is undefined.

NOTES
       Input white space is not preserved on output if fields are involved.
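
       For example, in the following sketch (the input line is made up) the
       first print reproduces the record unchanged, while assigning to a
       field causes $0 to be rebuilt with single OFS characters between the
       fields:

```shell
ws_out=$(printf 'a   b\tc\n' | awk '{ print; $1 = $1; print }')
printf '%s\n' "$ws_out"
```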


       There are no explicit conversions between numbers and strings. To force
       an expression to be treated as a number add 0 to it; to force it to  be
       treated as a string concatenate the null string ("") to it.
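
       The two idioms can be sketched with a comparison, where the type of
       the operands decides whether the test is numeric or lexical. The
       variable names and values are illustrative:

```shell
conv_out=$(awk 'BEGIN {
    x = "10"; y = "9"          # string values
    p = 10;   q = 9            # numeric values
    s1 = (x < y)               # strings: "10" sorts before "9"
    n1 = (x + 0 < y + 0)       # adding 0 forces a numeric comparison
    n2 = (p < q)               # numbers: 10 is not less than 9
    s2 = (p "" < q "")         # concatenating "" forces a string comparison
    print s1, n1, n2, s2
}')
printf '%s\n' "$conv_out"
```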



Oracle Solaris 11.4               11 May 2021                           awk(1)