tcp(4P)                        Network Protocols                       tcp(4P)



NAME
       tcp, TCP - Internet Transmission Control Protocol

SYNOPSIS
       #include <sys/socket.h>


       #include <netinet/in.h>


       s = socket(AF_INET, SOCK_STREAM, 0);


       s = socket(AF_INET6, SOCK_STREAM, 0);


       t = t_open("/dev/tcp", O_RDWR);


       t = t_open("/dev/tcp6", O_RDWR);

DESCRIPTION
       TCP is the virtual circuit protocol of the Internet protocol family. It
       provides reliable, flow-controlled, in order, two-way  transmission  of
       data.  It is a byte-stream protocol layered above the Internet Protocol
       (IP), or the Internet Protocol Version 6 (IPv6), the Internet  protocol
       family's internetwork datagram delivery protocol.


       Programs  can  access  TCP  using the socket interface as a SOCK_STREAM
       socket type, or using the Transport Level Interface (TLI) where it sup‐
       ports the connection-oriented (T_COTS_ORD) service type.


       TCP  uses  IP's host-level addressing and adds its own per-host collec‐
       tion of "port addresses." The endpoints of a TCP connection are identi‐
       fied by the combination of an IP or IPv6 address and a TCP port number.
       Although other protocols, such as the User Datagram Protocol (UDP), can
       use the same host and port address format, the port space of these pro‐
       tocols is distinct. See inet(4P) and inet6(4P) for details on the  com‐
       mon aspects of addressing in the Internet protocol family.


       Sockets  utilizing TCP are either "active" or "passive." Active sockets
       initiate connections to passive sockets. Both  types  of  sockets  must
       have  their local IP or IPv6 address and TCP port number bound with the
       bind(3C) system call after the socket is created. By default, TCP sock‐
       ets  are  active. A passive socket is created by calling the listen(3C)
       system call after binding the socket with bind().  This  establishes  a
       queueing  parameter  for the passive socket. After this, connections to
       the passive socket can be received with  the  accept(3C)  system  call.
       Active  sockets use the connect(3C) call after binding to initiate con‐
       nections.
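
       The passive and active roles described above can be sketched in C as
       follows. This is a minimal illustration, not part of the interface:
       the helper names make_passive() and make_active() are hypothetical,
       and the loopback address is assumed only so that the example is
       self-contained.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Passive side: bind() to the loopback address, then listen().
 * *port is in host byte order; pass 0 to let the system choose a port.
 * The port actually bound is stored back in *port.
 * Returns the listening socket, or -1 on error. */
int make_passive(unsigned short *port, int backlog)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin.sin_port = htons(*port);
    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
        listen(s, backlog) < 0 ||
        getsockname(s, (struct sockaddr *)&sin, &len) < 0) {
        close(s);
        return -1;
    }
    *port = ntohs(sin.sin_port);
    return s;
}

/* Active side: bind() with INADDR_ANY and port 0 (both left to the
 * system), then connect() to the loopback listener on the given port.
 * Returns the connected socket, or -1 on error. */
int make_active(unsigned short port)
{
    struct sockaddr_in local, peer;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;
    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(0);
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    peer.sin_port = htons(port);
    if (bind(s, (struct sockaddr *)&local, sizeof(local)) < 0 ||
        connect(s, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

       A caller would typically create the passive socket first, pass the
       chosen port to make_active(), and then call accept(3C) on the passive
       socket to obtain the server side of the connection.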


       By using the special value  INADDR_ANY  with  IP,  or  the  unspecified
       address  (all  zeroes)  with  IPv6,  the  local  IP address can be left
       unspecified in the bind() call by either active or passive TCP sockets.
       This  feature is usually used if the local address is either unknown or
       irrelevant. If left unspecified, the local IP or IPv6 address is  bound
       at connection time to the address of the network interface used to ser‐
       vice the connection.


       No two TCP sockets can be bound to the same port unless  the  bound  IP
       addresses  are  different.  This  behavior  can be changed by using the
       SO_REUSEPORT option. If both the binding  and  existing  bound  sockets
       have  this  option enabled, and the user IDs of both sockets (at bind()
       calling time) are the same, then the bind() is allowed. However, only
       one of the two sockets can become a listener socket.


       When  comparing  addresses  at  bind()  time,  IPv4 INADDR_ANY and IPv6
       unspecified addresses compare as equal to any IPv4 or IPv6 address. For
       example,  if a socket is bound to INADDR_ANY or unspecified address and
       port X, no other socket can bind to port X, regardless of  the  binding
       address.  This  special  consideration  of  INADDR_ANY  and unspecified
       address can  be  changed  using  the  socket  option  SO_REUSEADDR.  If
       SO_REUSEADDR  is set on a socket doing a bind, IPv4 INADDR_ANY and IPv6
       unspecified address do not compare as equal to  any  IP  address.  This
       means  that  as  long  as  the  two  sockets  are  not  both  bound  to
       INADDR_ANY/unspecified address or the same IP address, the two  sockets
       can be bound to the same port.


       If  an  application  does  not  want  to allow another socket using the
       SO_REUSEADDR/SO_REUSEPORT option to bind to a port its socket is  bound
       to,  the  application  can set the socket level option SO_EXCLBIND on a
       socket. Option values of 1 and 0 enable and disable the option,
       respectively. Once this option is enabled on a socket, no other
       socket can be bound to the same port.
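
       The following sketch shows how these options might be set before
       calling bind(). The helper names are hypothetical. It assumes that,
       as with other boolean socket options, a non-zero value enables the
       option, and it guards SO_EXCLBIND with the preprocessor because that
       option is not defined on every system.

```c
#include <errno.h>
#include <sys/socket.h>

/* Ask that wildcard addresses not be treated as equal to every other
 * address when bind() compares addresses. Call before bind(). */
int allow_addr_reuse(int s)
{
    int on = 1;

    return setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
}

/* Ask that no other socket be allowed to bind to the same port.
 * Assumption: a value of 1 enables the option. Where SO_EXCLBIND is
 * not defined, fail with ENOPROTOOPT instead of failing to compile. */
int request_exclusive_bind(int s)
{
#ifdef SO_EXCLBIND
    int on = 1;

    return setsockopt(s, SOL_SOCKET, SO_EXCLBIND, &on, sizeof(on));
#else
    (void)s;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```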


       Once a connection has been established, data can be exchanged using the
       read(2) and write(2) system calls.


       Under  most  circumstances,  TCP  sends data when it is presented. When
       outstanding data has not  yet  been  acknowledged,  TCP  gathers  small
       amounts  of output to be sent in a single packet once an acknowledgment
       has been received. For a small number of clients, such as  window  sys‐
       tems  that send a stream of mouse events which receive no replies, this
       packetization can cause significant delays. To circumvent this problem,
       TCP provides a socket-level boolean option, TCP_NODELAY. TCP_NODELAY is
       defined in <netinet/tcp.h>, and is set with setsockopt(3C)  and  tested
       with  getsockopt(3C). The option level for the setsockopt() call is the
       protocol number for TCP, available from getprotobyname(3C).


       For some applications, it can be desirable for TCP not to send out data
       unless  a  full  TCP  segment  can be sent. To enable this behavior, an
       application can use the TCP_CORK socket option. When  TCP_CORK  is  set
       with  a  non-zero  value,  TCP  sends out a full TCP segment only. When
       TCP_CORK is set to zero after it has been enabled, all buffered data is
       sent  out  (as  permitted  by the peer's receive window and the current
       congestion window). TCP_CORK is defined in <netinet/tcp.h>, and is  set
       with  setsockopt(3C)  and  tested with getsockopt(3C). The option level
       for the setsockopt() call is the protocol  number  for  TCP,  available
       from getprotobyname(3C).
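
       One possible use of TCP_CORK is to hold back a small header until the
       body has also been written, as sketched below. The helper names are
       hypothetical, IPPROTO_TCP is used as the option level for brevity,
       and short writes are not retried in this simplified example.

```c
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Set (on != 0) or clear (on == 0) TCP_CORK. Where the option is not
 * defined, fail with ENOPROTOOPT instead of failing to compile. */
int set_cork(int s, int on)
{
#ifdef TCP_CORK
    return setsockopt(s, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
#else
    (void)s;
    (void)on;
    errno = ENOPROTOOPT;
    return -1;
#endif
}

/* Cork the socket, write a header and a body, then uncork so that any
 * buffered data is sent. Simplification: a short write is not retried. */
int send_corked(int s, const void *hdr, size_t hlen,
                const void *body, size_t blen)
{
    if (set_cork(s, 1) < 0)
        return -1;
    if (write(s, hdr, hlen) < 0 || write(s, body, blen) < 0) {
        (void)set_cork(s, 0);
        return -1;
    }
    return set_cork(s, 0);
}
```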


       The  TCP_MAXSEG  socket option can be used to determine the TCP maximum
       segment size (MSS) that will be used by a socket. When set after a con‐
       nection has been completed, the TCP_MAXSEG socket option will limit the
       sizes of the segments being sent. Only reductions in the  segment  size
       are allowed; this should never increase the size of segments.


       When  set before a connection is begun (by either the listen(3C) or the
       connect(3C) call) the option will additionally specify the size sent in
       the  outgoing TCP  MSS option on the SYN segment; this can be useful in
       dealing with networks that impose unusual restrictions on packet  size.
       If  the  user-specified  value is larger than the value that would have
       been sent otherwise, the smaller value is used.


       When retrieved before a connection is completed, the TCP_MAXSEG  socket
       option  will  return  the default segment size that will be used if the
       peer does not send an MSS option (536 for IPv4, 1220  for  IPv6).  When
       retrieved  after the connection is completed, the value returned is the
       current maximum segment size used by the  stack.  This  may  vary  over
       time,  due to Path MTU Discovery, but will never exceed any user-speci‐
       fied TCP_MAXSEG value.


       TCP_MAXSEG is defined in <netinet/tcp.h>,  and  is  set  with  setsock‐
       opt(3C)  and  retrieved  with  getsockopt(3C). The option level for the
       setsockopt() call is the protocol number for TCP, available  from  get‐
       protobyname(3C). The option value is an int.
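
       The following sketch retrieves and limits the maximum segment size.
       The helper names are hypothetical and IPPROTO_TCP is used as the
       option level for brevity. The range of values the system accepts is
       not specified here, so a caller should check the return value.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Returns the MSS reported for the socket, or -1 on error. Before a
 * connection is completed this is the default segment size; afterwards
 * it is the current maximum segment size. */
int get_mss(int s)
{
    int mss = 0;
    socklen_t len = sizeof(mss);

    if (getsockopt(s, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) < 0)
        return -1;
    return mss;
}

/* Ask TCP to limit segments to at most mss bytes. When called before
 * listen() or connect(), this also affects the MSS option sent in the
 * SYN segment. */
int limit_mss(int s, int mss)
{
    return setsockopt(s, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss));
}
```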


       The TCP_MD5SIG socket option can be set with setsockopt(3C) to sign
       TCP segments using an MD5 hash as described in RFC 2385. Setting this
       option has an adverse effect on performance. This option is turned off
       by default. Setting the TCP_MD5SIG option when no security
       associations (SAs) have been configured is not supported. Currently,
       an outgoing connection request proceeds without the MD5 option, but
       an incoming connection is not accepted. The TCP_MD5SIG option is
       applied only if it is set before initiating or accepting a connection.


       Another socket level option, SO_RCVBUF, can be used to control the win‐
       dow  that TCP advertises to the peer. IP level options can also be used
       with TCP. See ip(4P) and ip6(4P).


       TCP provides an urgent data mechanism, which can be invoked  using  the
       out-of-band  provisions  of  send(3C).  The caller can mark one byte as
       "urgent" with the MSG_OOB  flag  to  send(3C).  This  sets  an  "urgent
       pointer"  pointing  to this byte in the TCP stream. The receiver on the
       other side of the stream is notified of the urgent  data  by  a  SIGURG
       signal.  The  SIOCATMARK   ioctl(2)  request returns a value indicating
       whether the stream is at the urgent  mark.  Because  the  system  never
       returns  data  across  the  urgent mark in a single read(2) call, it is
       possible to advance to the urgent data in a  simple  loop  which  reads
       data, testing the socket with the SIOCATMARK  ioctl() request, until it
       reaches the mark.


       Incoming connection requests that include an IP source route option are
       noted, and the reverse source route is used in responding.


       A  checksum over all data helps TCP implement reliability. Using a win‐
       dow-based flow control mechanism that makes use of positive acknowledg‐
       ments, sequence numbers, and a retransmission strategy, TCP can usually
       recover when datagrams are damaged, delayed,  duplicated  or  delivered
       out of order by the underlying communication medium.


       If the local TCP receives no acknowledgments from its peer for a period
       of time (for example, if the remote machine crashes), the connection
       is closed and an error is returned.


       The TCP level socket options TCP_CONN_ABORT_THRESHOLD and
       TCP_ABORT_THRESHOLD can be used to change and retrieve this period of
       time. The option value is a uint32_t and the unit is milliseconds.
       TCP_CONN_ABORT_THRESHOLD controls the period before a connection is
       established, and TCP_ABORT_THRESHOLD controls the period after it is
       established. If the application does not want TCP to time out, it can
       use the option value 0.


       During this period, TCP tries to retransmit the unacknowledged data
       multiple times, each after a timeout. The timeout interval is backed
       off exponentially. The TCP level socket options TCP_RTO_INITIAL,
       TCP_RTO_MIN, and TCP_RTO_MAX can be used to control the timeout
       interval. TCP_RTO_INITIAL controls the initial retransmission timeout
       period. TCP_RTO_MIN and TCP_RTO_MAX control the minimum and maximum
       timeout period respectively. The option value is a uint32_t and the
       unit is milliseconds.


       The  default  values  of  the  above options, TCP_CONN_ABORT_THRESHOLD,
       TCP_ABORT_THRESHOLD, TCP_RTO_MIN, TCP_RTO_MAX, and TCP_RTO_INITIAL  are
       appropriate for most situations. An application should only alter their
       values in special circumstances and when it has detailed  knowledge  of
       the network environment.
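
       For the special circumstances mentioned above, the millisecond-valued
       options could be set as in the following sketch. The helper names are
       hypothetical, and IPPROTO_TCP is used as the option level for
       brevity. Because these options are not defined on every system, each
       use is guarded by the preprocessor and fails with ENOPROTOOPT where
       the option is missing.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Set a TCP level option whose value is a uint32_t in milliseconds. */
int set_tcp_ms_option(int s, int opt, uint32_t ms)
{
    return setsockopt(s, IPPROTO_TCP, opt, &ms, sizeof(ms));
}

/* Change how long an established connection may go unacknowledged
 * before it is aborted. A value of 0 means never time out. */
int set_abort_threshold_ms(int s, uint32_t ms)
{
#ifdef TCP_ABORT_THRESHOLD
    return set_tcp_ms_option(s, TCP_ABORT_THRESHOLD, ms);
#else
    (void)s;
    (void)ms;
    errno = ENOPROTOOPT;
    return -1;
#endif
}

/* Change the initial retransmission timeout. */
int set_initial_rto_ms(int s, uint32_t ms)
{
#ifdef TCP_RTO_INITIAL
    return set_tcp_ms_option(s, TCP_RTO_INITIAL, ms);
#else
    (void)s;
    (void)ms;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```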


       TCP follows the congestion control algorithm described in RFC 2581, and
       also supports the initial congestion window (cwnd) changes in RFC 3390.
       The  initial  cwnd  calculation  can be overridden by the socket option
       TCP_INIT_CWND. An application can use this option to  set  the  initial
       cwnd  to  a specified number of TCP segments. This applies to the cases
       when the connection first starts and restarts after an idle period. The
       process  must  have  the  PRIV_SYS_NET_CONFIG  privilege if it wants to
       specify a number greater than that calculated by RFC 3390.


       The TCP_INFO option can be used to collect various information about
       the current state of a TCP socket, such as connection state, window
       sizes, and so forth. The data structure used as an argument is struct
       tcp_info.


       The  TCP_CONGESTION option can be used to get or set a socket's conges‐
       tion control algorithm. Its argument is a pointer to a  null-terminated
       string.
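
       The following sketch gets and sets the algorithm name. The helper
       names are hypothetical. Passing the terminating null byte in the
       option length when setting is an assumption based on the argument
       being described as a null-terminated string. Which algorithm names
       are valid depends on the system.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Copy the socket's congestion control algorithm name into name[cap].
 * The result is always null-terminated. Returns 0, or -1 on error. */
int get_congestion_algorithm(int s, char *name, size_t cap)
{
    socklen_t len = (socklen_t)cap;

    if (cap == 0) {
        errno = EINVAL;
        return -1;
    }
    memset(name, 0, cap);
    if (getsockopt(s, IPPROTO_TCP, TCP_CONGESTION, name, &len) < 0)
        return -1;
    name[cap - 1] = '\0';
    return 0;
}

/* Select a congestion control algorithm by name. */
int set_congestion_algorithm(int s, const char *name)
{
    return setsockopt(s, IPPROTO_TCP, TCP_CONGESTION, name,
        (socklen_t)(strlen(name) + 1));
}
```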


       Oracle  Solaris  OS  supports  TCP Extensions for High Performance (RFC
       1323) which includes the window scale and time stamp options, and  Pro‐
       tection  Against Wrap Around Sequence Numbers (PAWS). Oracle Solaris OS
       also supports Selective Acknowledgment (SACK) capabilities  (RFC  2018)
       and Explicit Congestion Notification (ECN) mechanism (RFC 3168).


       Turn on the window scale option in one of the following ways:

           o      An  application  can  set SO_SNDBUF or SO_RCVBUF size in the
                  setsockopt() option to be larger than 64K. This must be done
                  before  the program calls listen() or connect(), because the
                  window scale option is negotiated  when  the  connection  is
                  established.  Once  the  connection has been made, it is too
                  late to increase the  send  or  receive  window  beyond  the
                  default TCP limit of 64K.


           o      For all applications, use ndd(8) to modify the configuration
                  parameter tcp_wscale_always. If tcp_wscale_always is set  to
                  1,  the window scale option is always set when connecting to
                  a remote system. If tcp_wscale_always is 0, the window scale
                  option  is  set  only  if  the  user has requested a send or
                  receive  window  larger  than  64K.  The  default  value  of
                  tcp_wscale_always is 1.


           o      Regardless  of  the  value  of tcp_wscale_always, the window
                  scale option is always included in a connect  acknowledgment
                  if the connecting system has used the option.
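
       The first method above, requesting buffers larger than 64K before the
       connection is established, might be coded as in this sketch. The
       helper name and the 128K figure used in the usage note are
       illustrative assumptions. The system may adjust or cap the requested
       size.

```c
#include <sys/socket.h>

/* Request send and receive buffers of the given size. To influence the
 * window scale negotiation, call this before listen() or connect(). */
int request_large_windows(int s, int bytes)
{
    if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    return setsockopt(s, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes));
}
```

       For example, request_large_windows(s, 128 * 1024) asks for 128K
       buffers on socket s. The size actually in effect can be read back
       with getsockopt(3C).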



       Turn on SACK capabilities in the following way:

           o      Use  ndd to modify the configuration parameter tcp_sack_per‐
                  mitted. If tcp_sack_permitted is set  to  0,  TCP  does  not
                  accept  SACK  or send out SACK information. If tcp_sack_per‐
                  mitted is set to 1, TCP does not initiate a connection  with
                  SACK  permitted  option in the SYN segment, but does respond
                  with SACK permitted option in  the  SYN|ACK  segment  if  an
                  incoming  connection  request has the SACK permitted option.
                  This means that TCP only accepts  SACK  information  if  the
                  other  side of the connection also accepts SACK information.
                  If tcp_sack_permitted is set to 2,  it  both  initiates  and
                  accepts  connections  with SACK information. The default for
                  tcp_sack_permitted is 2 (active enabled).



       Turn on TCP ECN mechanism in the following way:

           o      Use ndd to modify the configuration  parameter  tcp_ecn_per‐
                  mitted. If tcp_ecn_permitted is set to 0, TCP does not nego‐
                  tiate  with  a  peer  that  supports   ECN   mechanism.   If
                  tcp_ecn_permitted  is set to 1 when initiating a connection,
                  TCP does not tell a peer that  it  supports  ECN  mechanism.
                  However, it tells a peer that it supports ECN mechanism when
                  accepting a new incoming  connection  request  if  the  peer
                  indicates that it supports ECN mechanism in the SYN segment.
                  If tcp_ecn_permitted is set to 2, in addition to negotiating
                  with a peer on ECN mechanism when accepting connections, TCP
                  indicates in the outgoing SYN segment that it  supports  ECN
                  mechanism  when  TCP  makes active outgoing connections. The
                  default for tcp_ecn_permitted is 1.



       Turn on the time stamp option in the following way:

           o      Use ndd to modify the configuration parameter
                   tcp_tstamp_always. If tcp_tstamp_always is 1, the time stamp
                   option is always set when connecting to a remote system.
                   If tcp_tstamp_always is 0, the time stamp option is not
                   set when connecting to a remote system. The default for
                   tcp_tstamp_always is 0.


           o      Regardless of the value of tcp_tstamp_always, the time stamp
                  option is always included in a connect  acknowledgment  (and
                  all  succeeding  packets)  if the connecting system has used
                  the time stamp option.



       Use the following procedure to turn on the time stamp option only  when
       the window scale option is in effect:

           o      Use    ndd    to    modify   the   configuration   parameter
                  tcp_tstamp_if_wscale.  Setting  tcp_tstamp_if_wscale  to   1
                  causes  the time stamp option to be set when connecting to a
                  remote system, if the window scale option has been  set.  If
                  tcp_tstamp_if_wscale  is 0, the time stamp option is not set
                  when  connecting  to  a  remote  system.  The  default   for
                  tcp_tstamp_if_wscale is 1.



       Protection  Against  Wrap Around Sequence Numbers (PAWS) is always used
       when the time stamp option is set.


       Oracle Solaris OS also supports multiple methods of generating initial
       sequence numbers. One of these methods is the improved technique
       suggested in RFC 1948. It is strongly recommended that you set the
       sequence number generation parameters as close to boot time as
       possible. This prevents sequence number problems on connections that
       use the same connection ID as earlier connections that used a
       different sequence number generation method. The
       svc:/network/initial:default service configures the initial sequence
       number generation. The service reads the value contained in the
       configuration file /etc/default/inetinit to determine which method to
       use.


       The /etc/default/inetinit file is an unstable interface, and can change
       in future releases.


       TCP  can  be  configured to report some information on connections that
       terminate by means of an RST packet. By default, no logging is done. If
       the  ndd(8)  parameter  tcp_trace  is set to 1, then trace data is col‐
       lected for all new connections established after that time.


       The trace data consists of the TCP headers and IP source  and  destina‐
       tion  addresses  of  the last few packets sent in each direction before
       RST occurred. Those packets are logged in a series of strlog(9F) calls.
       This trace facility has a very low overhead, and so is superior to such
       utilities as snoop(8) for non-intrusive debugging for connections  ter‐
       minating by means of an RST.


       Oracle Solaris supports the keep-alive mechanism described in RFC 1122.
       It is enabled using the socket option SO_KEEPALIVE. When enabled, there
       are two keep-alive mechanisms.

           o      By  default,  the first keep-alive probe is sent out after a
                  TCP connection is idle for two hours. If the peer  does  not
                  respond  to  the probe within eight minutes, the TCP connec‐
                  tion is aborted. You can alter the interval for sending  out
                  the     first     probe     using    the    socket    option
                  TCP_KEEPALIVE_THRESHOLD in milliseconds or  TCP_KEEPIDLE  in
                  seconds.

                   The system default is controlled by the TCP ndd parameter
                   tcp_keepalive_interval. The minimum value is ten seconds.
                   The maximum is ten days, while the default is two hours.
                   To change how long TCP waits for a response to the probe
                   before aborting the connection, use the
                   TCP_KEEPALIVE_ABORT_THRESHOLD socket option.

                  The option value is an unsigned integer in milliseconds. The
                  value  zero  indicates  that  TCP  should never time out and
                  abort the connection when probing.  The  system  default  is
                  controlled      by      the      TCP       ndd     parameter
                  tcp_keepalive_abort_interval. The default is eight minutes.


           o      The second implementation is activated if the socket option
                   TCP_KEEPINTVL, TCP_KEEPCNT, or both are set. The time
                   between consecutive probes is set by TCP_KEEPINTVL in
                   seconds. The minimum value is ten seconds. The maximum is
                   ten days, while the default is two hours. The TCP
                   connection is aborted after the number of probes set by
                   TCP_KEEPCNT has been sent without receiving a response.
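
       The keep-alive options above could be combined as in the following
       sketch. The helper name is hypothetical and IPPROTO_TCP is used as
       the option level for brevity. Setting TCP_KEEPINTVL and TCP_KEEPCNT
       selects the second mechanism described above. Each TCP level option
       is guarded in case a system does not define it.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Enable keep-alive probing on a socket.
 *   idle_s   idle time in seconds before the first probe
 *   intvl_s  seconds between consecutive probes
 *   count    unanswered probes allowed before the connection is aborted
 * Returns 0 on success, -1 on error. */
int enable_keepalive(int s, int idle_s, int intvl_s, int count)
{
    int on = 1;

    /* Avoid unused-parameter warnings where an option is not defined. */
    (void)idle_s;
    (void)intvl_s;
    (void)count;

    if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
#ifdef TCP_KEEPIDLE
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE,
        &idle_s, sizeof(idle_s)) < 0)
        return -1;
#endif
#ifdef TCP_KEEPINTVL
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL,
        &intvl_s, sizeof(intvl_s)) < 0)
        return -1;
#endif
#ifdef TCP_KEEPCNT
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT,
        &count, sizeof(count)) < 0)
        return -1;
#endif
    return 0;
}
```

       For example, enable_keepalive(s, 600, 30, 5) asks for a first probe
       after ten idle minutes, then probes 30 seconds apart, with the
       connection aborted after five unanswered probes.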



       After an application closes a TCP connection, TCP enters the shutdown
       sequence. If the peer does not respond (for example, because it has
       crashed), the connection remains stuck in the FIN-WAIT-2 state. To
       prevent this, Oracle Solaris OS starts a timer when TCP enters this
       state. If the timer fires and the shutdown sequence has not completed,
       the connection is freed. The socket option TCP_LINGER2 can be used to
       change and retrieve this timeout period. The option value is an int
       and the unit is seconds. The option value cannot be set higher than
       the system default value, which is controlled by the TCP private
       parameter tcp_fin_wait_2_flush_interval. The default value is
       appropriate for most situations. An application should only change
       the value in special circumstances and when it has detailed knowledge
       of the network environment.
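
       A sketch of changing this timeout follows. The helper name is
       hypothetical and IPPROTO_TCP is used as the option level for
       brevity. The option is guarded in case a system does not define it.

```c
#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Set the FIN-WAIT-2 timeout for the socket, in seconds.
 * Returns 0 on success, -1 on error. */
int set_fin_wait_2_timeout(int s, int seconds)
{
#ifdef TCP_LINGER2
    return setsockopt(s, IPPROTO_TCP, TCP_LINGER2,
        &seconds, sizeof(seconds));
#else
    (void)s;
    (void)seconds;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```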

SEE ALSO
       svcs(1),   ioctl(2),  read(2),  write(2),  accept(3C),  bind(3C),  con‐
       nect(3C),  getprotobyname(3C),  getsockopt(3C),  listen(3C),  send(3C),
       inet(4P), inet6(4P), ip(4P), ip6(4P), smf(7), ndd(8), svcadm(8)


       Ramakrishnan,  K.,  Floyd,  S.,  Black,  D.,  RFC 3168, The Addition of
       Explicit Congestion Notification (ECN) to IP, September 2001.


       Mathis, M. and Mahdavi, J. Pittsburgh Supercomputing Center; Floyd,  S.
       Lawrence  Berkeley  National  Laboratory; Romanow, A. Sun Microsystems,
       Inc. RFC 2018, TCP Selective Acknowledgment Options, October 1996.


       Bellovin, S., RFC 1948, Defending Against Sequence Number Attacks,  May
       1996.


       Jacobson,  V., Braden, R., and Borman, D., RFC 1323, TCP Extensions for
       High Performance, May 1992.


       Postel, Jon, RFC 793, Transmission Control Protocol  -  DARPA  Internet
       Program  Protocol Specification, Network Information Center, SRI Inter‐
       national, Menlo Park, CA., September 1981.

DIAGNOSTICS
       A socket operation can fail if:

       EISCONN          A connect() operation was attempted  on  a  socket  on
                        which  a  connect()  operation  had  already been per‐
                        formed.


       ETIMEDOUT        A connection was dropped due to excessive  retransmis‐
                        sions.


       ECONNRESET       The  remote  peer  forced  the connection to be closed
                        (usually because the remote  machine  has  lost  state
                        information about the connection due to a crash).


       ECONNREFUSED     The remote peer actively refused connection establish‐
                        ment (usually because no process is listening  to  the
                        port).


       EADDRINUSE       A  bind()  operation  was attempted on a socket with a
                        network address/port pair that has already been  bound
                        to another socket.


       EADDRNOTAVAIL    A  bind()  operation  was attempted on a socket with a
                        network address for which no network interface exists.


       EACCES           A bind() operation was  attempted  with  a  "reserved"
                        port  number  and the effective user ID of the process
                        was not the privileged user.


       ENOBUFS          The system ran out of memory for internal data  struc‐
                        tures.
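
       An application might map these error codes to short hints for its own
       log messages, as in the following sketch. The function name and the
       hint strings are hypothetical. They paraphrase the descriptions
       above and are not system messages.

```c
#include <errno.h>

/* Return a short hint for the TCP-related errno values listed above,
 * or a generic string for any other value. */
const char *tcp_errno_hint(int err)
{
    switch (err) {
    case EISCONN:
        return "socket is already connected";
    case ETIMEDOUT:
        return "connection dropped after excessive retransmissions";
    case ECONNRESET:
        return "remote peer forced the connection closed";
    case ECONNREFUSED:
        return "remote peer refused the connection";
    case EADDRINUSE:
        return "address and port already bound to another socket";
    case EADDRNOTAVAIL:
        return "no network interface has this address";
    case EACCES:
        return "reserved port requires a privileged user";
    case ENOBUFS:
        return "system out of memory for internal data structures";
    default:
        return "not one of the errors listed for TCP";
    }
}
```

       A caller would typically pass the value of errno immediately after a
       failed connect(), bind(), read(), or write() call.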


NOTES
       The  tcp service is managed by the service management facility, smf(7),
       under the service identifier:

         svc:/network/initial:default



       Administrative actions on this service, such as enabling, disabling, or
       requesting  restart,  can  be  performed using svcadm(8). The service's
       status can be queried using the svcs(1) command.



Oracle Solaris 11.4               11 May 2021                          tcp(4P)