TCP(4) Linux Programmer's Manual TCP(4)
NAME
tcp - TCP protocol.
SYNOPSIS
#include <sys/socket.h>
#include <netinet/in.h>
tcp_socket = socket(PF_INET, SOCK_STREAM, 0);
DESCRIPTION
This is an implementation of the TCP protocol defined in
RFC793, RFC1122 and RFC2001 with the NewReno extensions.
It implements a reliable, stream-oriented, full-duplex
connection between two sockets. TCP ensures that packets are
not reordered and retransmits them when they are dropped.
It generates and checks a per packet checksum to catch
transmission errors.
A fresh TCP socket has no remote or local address and is
not fully specified. To create an outgoing TCP connection
the connect(2) function is called on the socket. To accept
incoming connections bind(2) the socket first to a local
address and port and then call listen(2) to allow the
accepting of incoming connections. Then use accept(2) to
get a new socket with the incoming connection. The listen-
ing socket stays. After accept(2) or connect(2) a socket
is fully specified. Data may only be transferred on fully
specified sockets.
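
The sequence described above can be sketched as follows. This is a
minimal illustration, not part of the kernel interface: the helper
names tcp_listen_loopback, tcp_local_port and tcp_connect_loopback
are hypothetical, and the loopback address is assumed only so that
both ends can live in one program.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill in an AF_INET address for 127.0.0.1 and the given port. */
static void loopback_addr(struct sockaddr_in *addr, unsigned short port)
{
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(port);
}

/* bind(2) and listen(2): returns a listening socket, or -1 on error.
 * A port of 0 lets the kernel pick a free port. */
int tcp_listen_loopback(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(PF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    loopback_addr(&addr, port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Report the local port of a bound socket, or -1 on error. */
int tcp_local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}

/* connect(2): returns a fully specified socket, or -1 on error. */
int tcp_connect_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(PF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    loopback_addr(&addr, port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server would then call accept(2) on the listening socket to obtain
a new socket for each incoming connection; the listening socket
itself remains open for further connections.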
When the initial connection request packet carries IP
options and the accept_source_routes sysctl is enabled all
outgoing datagrams on this connection will carry the
reversed source route.
Linux 2.2 supports the RFC1323 TCP high performance exten-
sions. They include window scaling to support large win-
dows and the timestamp option with protection against
wrapped sequence numbers (PAWS). Large windows are
needed for good performance over links with long latencies
or very high bandwidth. To use them the send and receive
buffers have to be increased from the default values. This
can be done either globally using the
net.core.wmem_default and net.core.rmem_default sysctls,
or on a per socket basis using the SO_SNDBUF and SO_RCVBUF
socket options. The maximum send and receive buffer sizes
settable on a socket are limited by the global
net.core.wmem_max and net.core.rmem_max sysctls. See
socket(4) for more information.
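
The per-socket approach might look like the sketch below. The helper
names tcp_set_buffers and tcp_get_rcvbuf are hypothetical. The
kernel may adjust or cap the requested sizes, so the value read back
is not assumed to equal the value requested.

```c
#include <sys/socket.h>

/* Request new send and receive buffer sizes in bytes.
 * Returns 0 on success, -1 on error (errno is set). */
int tcp_set_buffers(int fd, int sndbuf, int rcvbuf)
{
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf)) < 0)
        return -1;
    return setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf));
}

/* Read back the receive buffer size the kernel actually uses,
 * or -1 on error. */
int tcp_get_rcvbuf(int fd)
{
    int size = 0;
    socklen_t len = sizeof(size);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) < 0)
        return -1;
    return size;
}
```

For window scaling to take the larger buffers into account, the
sizes are best set before connect(2) or listen(2) is called.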
TCP supports urgent data. Urgent data is used to signal
the receiver that some important message is part of the
data stream and that it should be processed as soon as
possible. To send urgent data specify the MSG_OOB option
to send(2). When urgent data is received the kernel
sends a SIGURG signal to the reading process or the pro-
cess or process group that has been set for the socket
using the SIOCSPGRP or FIOSETOWN ioctls. When the
SO_OOBINLINE socket option is enabled urgent data is put
into the normal data stream (and can be tested for by the
SIOCATMARK ioctl), otherwise it can be only received when
the MSG_OOB flag is set for recvmsg(2). Note that Linux by
default uses the BSD-compatible interpretation of the urgent
pointer field; see the tcp_stdurg sysctl below.
ADDRESS FORMATS
TCP is built on top of IP (see ip(4)). The address for-
mats defined by ip(4) apply to TCP. TCP only supports
point-to-point communication; broadcasting and multicast-
ing are not supported.
SYSCTLS
These sysctls can be accessed by the /proc/sys/net/ipv4/*
files or with the sysctl(2) interface. In addition, most
IP sysctls also apply to TCP; see ip(4).
tcp_window_scaling
Enable RFC1323 TCP window scaling.
tcp_sack
Enable RFC2018 TCP Selective Acknowledgements.
tcp_timestamps
Enable RFC1323 TCP timestamps.
tcp_fin_timeout
How many seconds to wait for a final FIN packet
before the socket is forcibly closed. This is
strictly a violation of the TCP specification, but
required to prevent denial-of-service attacks.
tcp_keepalive_probes
Maximum TCP keep-alive probes to send before giving
up. Keep-alives are only sent when the SO_KEEPALIVE
socket option is enabled.
tcp_keepalive_time
How often keep-alives are sent on a connection.
Defined in seconds. Default is 2 hours.
tcp_max_ka_probes
How many keep-alive probes are sent per slow timer
run. To prevent bursts, this value should not be
set too high.
tcp_stdurg
Enable the strict RFC793 interpretation of the TCP
urgent-pointer field. The default is to use the
BSD-compatible interpretation of the urgent-
pointer, pointing to the first byte after the
urgent data. The RFC793 interpretation is to have
it point to the last byte of urgent data. Enabling
this option may lead to interoperability problems.
tcp_syncookies
Enable TCP syncookies. The kernel must be compiled
with CONFIG_SYN_COOKIES. They defend against a
particular TCP denial-of-service attack. Note that
the concept of a socket backlog is abandoned; this
means the peer may not receive reliable error mes-
sages from an overloaded server with syncookies
enabled.
tcp_max_syn_backlog
Length of the per-socket backlog queue. As of Linux
2.2, the backlog specified in listen(2) only speci-
fies the length of the backlog queue of already
established sockets. The maximum queue of sockets
not yet established (in SYN_RECV state) per listen
socket is set by this sysctl. When more connection
requests arrive, Linux starts to drop packets. When
syncookies are enabled, the packets are still
answered and the maximum queue is effectively
ignored.
tcp_retries1
Defines how many times an answer to a TCP connec-
tion request is retransmitted before giving up.
tcp_retries2
Defines how many times a TCP packet is retransmit-
ted in established state before giving up.
tcp_syn_retries
Defines how many times to try to send an initial
SYN packet to a remote host before giving up and
returning an error. Must be below 255. This is only
the timeout for outgoing connections; for incoming
connections the number of retransmits is defined by
tcp_retries1.
tcp_retrans_collapse
Try to send full-sized packets during retransmit.
This is used to work around TCP bugs in some
stacks.
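
As an illustration of the /proc interface, the sketch below reads
one of the sysctls listed above as an integer. The helper name
read_ipv4_sysctl is hypothetical, and the code assumes that /proc is
mounted and that the sysctl holds a single numeric value.

```c
#include <stdio.h>

/* Read the integer sysctl /proc/sys/net/ipv4/<name> into *value.
 * Returns 0 on success, -1 if the file is missing or not numeric. */
int read_ipv4_sysctl(const char *name, long *value)
{
    char path[256];
    FILE *fp;
    int n, ok;

    n = snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;              /* name too long for the buffer */

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;              /* sysctl not present on this kernel */

    ok = (fscanf(fp, "%ld", value) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

For example, read_ipv4_sysctl("tcp_syncookies", &v) would store the
current setting in v on kernels that provide that sysctl. Writing a
new value works the same way with the file opened for writing, but
normally requires root privileges.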
SOCKET OPTIONS
To set or get a TCP socket option, call getsockopt(2) to
read or setsockopt(2) to write the option with the option
level argument set to SOL_TCP. In addition, most SOL_IP
socket options are valid on TCP sockets. For more informa-
tion see ip(4).
TCP_NODELAY
Turn the Nagle algorithm off. This means that pack-
ets are always sent as soon as possible and no
unnecessary delays are introduced, at the cost of
more packets in the network. Expects an integer
boolean flag.
TCP_MAXSEG
Set or receive the maximum segment size for outgo-
ing TCP packets. If this option is set before con-
nection establishment, it also changes the MSS
value announced to the other end in the initial
packet. Values greater than the interface MTU are
ignored and have no effect.
TCP_CORK
If enabled don't send out partial frames. All
queued partial frames are sent when the option is
cleared again. This is useful for prepending head-
ers before calling sendfile(2), or for throughput
optimization. This option cannot be combined with
TCP_NODELAY.
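
The options above might be used as in the following sketch. The
helper names are hypothetical. The level is written as IPPROTO_TCP,
which has the same value as SOL_TCP on Linux and is visible under
stricter compiler settings.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn the Nagle algorithm off (on != 0) or back on (on == 0).
 * Returns 0 on success, -1 on error. */
int tcp_set_nodelay(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hold back partial frames (on != 0) or flush them (on == 0).
 * Returns 0 on success, -1 on error. */
int tcp_set_cork(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
}

/* Return the current maximum segment size, or -1 on error. */
int tcp_get_maxseg(int fd)
{
    int mss = 0;
    socklen_t len = sizeof(mss);

    if (getsockopt(fd, IPPROTO_TCP, TCP_MAXSEG, &mss, &len) < 0)
        return -1;
    return mss;
}
```

A typical TCP_CORK sequence is tcp_set_cork(fd, 1), then writing
the headers, then sendfile(2), then tcp_set_cork(fd, 0) to push out
whatever partial frame remains.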
IOCTLS
These ioctls can be accessed using ioctl(2). The correct
syntax is:
int value;
error = ioctl(tcp_socket, ioctl_type, &value);
FIONREAD
Returns the amount of queued unread data in the
receive buffer. Argument is a pointer to an inte-
ger.
SIOCATMARK
Returns true when all urgent data has already been
received by the user program. This is used
together with SO_OOBINLINE. Argument is a pointer
to an integer for the test result.
TIOCOUTQ
Returns the amount of unsent data in the socket
send queue. Argument is a pointer to an integer.
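
The queue-length ioctls can be wrapped as shown below. This is a
sketch only; the helper names tcp_unread_bytes and tcp_unsent_bytes
are hypothetical.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Number of bytes waiting in the receive buffer, or -1 on error. */
int tcp_unread_bytes(int fd)
{
    int value = 0;

    if (ioctl(fd, FIONREAD, &value) < 0)
        return -1;
    return value;
}

/* Number of bytes still in the send queue, or -1 on error. */
int tcp_unsent_bytes(int fd)
{
    int value = 0;

    if (ioctl(fd, TIOCOUTQ, &value) < 0)
        return -1;
    return value;
}
```

Both values are snapshots and may already be out of date when the
call returns, since the peer can send or acknowledge data at any
time.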
ERROR HANDLING
When a network error occurs, TCP tries to resend the
packet. If it doesn't succeed after some time, either
ETIMEDOUT or the last received error on this connection is
reported.
Some applications require a quicker error notification.
This can be enabled with the SOL_IP level IP_RECVERR
socket option. When this option is enabled, all incoming
errors are immediately passed to the user program. Use
this option with care - it makes TCP less tolerant to
routing changes and other normal network conditions.
When the other end closes the socket without doing a
proper closing handshake, a SIGPIPE signal is raised and
EPIPE is returned. This can be prevented by the MSG_NOSIG-
NAL flag.
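
A sending loop that relies on MSG_NOSIGNAL might look like the
sketch below. The helper name tcp_send_all is hypothetical. With
the flag set, a broken connection is reported through the return
value and errno instead of a SIGPIPE signal.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Send len bytes from buf, retrying after partial sends and EINTR.
 * Returns len on success, or -1 with errno set (EPIPE when the
 * connection is gone). No SIGPIPE is raised. */
ssize_t tcp_send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = send(fd, p, left, MSG_NOSIGNAL);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal; retry */
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

The caller checks for -1 and inspects errno, for example treating
EPIPE as a closed connection and ETIMEDOUT as an unreachable peer.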
ERRORS
EPIPE The other end closed the socket unexpectedly.
ETIMEDOUT
The other end didn't acknowledge retransmitted
data after some time.
EAFNOTSUPPORT
Passed socket address type in sin_family was not
AF_INET.
Any errors defined for ip(4) or the generic socket layer
may also be returned for TCP.
BUGS
Not all errors are documented.
IPv6 is not described.
Transparent proxy options are not described.
VERSIONS
The sysctls are new in Linux 2.2. IP_RECVERR and
MSG_NOSIGNAL are new features in Linux 2.2. TCP_CORK is
new in 2.2.
SEE ALSO
socket(4), socket(2), ip(4), sendmsg(2), recvmsg(2)
Linux Man Page 3 Oct 1998 1