Sockets are typically classified along two orthogonal dimensions: domain and type. Both appear as arguments of the system call used to create a socket:
int socket(int domain, int type, int protocol);
For typical IPC, protocol is zero, which selects the default protocol for the given domain and type.
Domain:
The domain determines two things:
- range of communication (e.g. on same host or between two remote hosts)
- address format used to identify a peer (e.g. a path name or (IPv4 address, port) pair)
Most operating systems support at least the following three domains:
- UNIX domain (identified by C macro AF_UNIX)
- IPv4 domain (AF_INET)
- IPv6 domain (AF_INET6)
Note that in the macro names above, the prefix PF_ can be used instead of AF_ (e.g. PF_INET instead of AF_INET). Both mean the same thing.
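To make the two meanings of "domain" concrete, here is a minimal sketch of how the address format differs between the UNIX and IPv4 domains. The helper names (make_unix_addr, make_ipv4_addr, can_create) and the example path are hypothetical and only for illustration; the structures and calls are the standard ones from <sys/un.h>, <netinet/in.h> and <arpa/inet.h>.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* UNIX domain: the peer is identified by a path name on the same host. */
void make_unix_addr(struct sockaddr_un *addr, const char *path)
{
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* Leave room for the terminating null byte. */
    strncpy(addr->sun_path, path, sizeof(addr->sun_path) - 1);
}

/* IPv4 domain: the peer is identified by an (IPv4 address, port) pair.
 * Returns 0 on success, -1 if ip is not a valid dotted-quad string. */
int make_ipv4_addr(struct sockaddr_in *addr, const char *ip,
                   unsigned short port)
{
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_port = htons(port);   /* port goes in network byte order */
    return inet_pton(AF_INET, ip, &addr->sin_addr) == 1 ? 0 : -1;
}

/* Try to create a socket in the given domain with protocol 0.
 * Returns 1 if the system supports that combination, 0 otherwise. */
int can_create(int domain, int type)
{
    int fd = socket(domain, type, 0);
    if (fd == -1)
        return 0;
    close(fd);
    return 1;
}
```

An address built this way would then be passed to bind(), connect() or sendto(), cast to struct sockaddr * together with its size.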
Type:
Two types of sockets are in common use:
- Stream sockets (identified by C macro SOCK_STREAM)
- Datagram sockets (SOCK_DGRAM)
Stream sockets are connection-oriented: one socket is connected to exactly one peer. They are byte-stream based and don’t preserve message boundaries, which means that the basic unit of data transfer between two SOCK_STREAM sockets is a byte. If a sender sends two messages in quick succession and the receiver then does a single receive, the bytes of the second message follow the bytes of the first as one continuous stream of bytes, rather than arriving as two separate messages. In contrast, a SOCK_DGRAM socket receives exactly one message in each call to recvfrom().
In addition, stream sockets provide reliable two-way communication: data arrives in order and without duplication.
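The loss of message boundaries can be observed with a small sketch. It assumes a connected pair of UNIX domain stream sockets created with socketpair(), which keeps the example in one process; the function name stream_merge_demo is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends "hello" and "world" with two separate send() calls on a stream
 * socket pair, then performs ONE recv() on the other end.
 * Returns the number of bytes that single recv() stored in buf,
 * or -1 on error. */
ssize_t stream_merge_demo(char *buf, size_t cap)
{
    int sv[2];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (send(sv[0], "hello", 5, 0) != 5 ||
        send(sv[0], "world", 5, 0) != 5) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Both messages are already queued, so this one call can return
     * bytes of the second message right after those of the first. */
    n = recv(sv[1], buf, cap, 0);

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With a buffer of 64 bytes, the single recv() typically returns the 10 bytes "helloworld" on Linux. A stream socket is allowed to return fewer bytes than are available, so real code loops and applies its own framing (for example a length prefix) to recover message boundaries.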
Datagram sockets are message-oriented: the unit of transfer is a single message. If the receiver's buffer is too small, i.e. the ‘length’ parameter of recvfrom() is less than the actual message length, then the message is silently truncated to ‘length’ bytes and the remainder is discarded. Datagram sockets are also unreliable (messages may be lost, duplicated or received out of order) and connectionless: unlike a SOCK_STREAM socket, which is connected to exactly one peer, a datagram socket has no fixed peer. The sender therefore has to specify the recipient's address every time it sends data; the sendto() syscall does that. Similarly, recvfrom() reports the sender's address to the receiver. That said, connectionlessness comes with one qualification, described below.
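Message boundaries and silent truncation can be sketched in the same way. For brevity this example assumes a pair of UNIX domain datagram sockets from socketpair(), which are already associated with each other, so plain send() and recv() are used instead of sendto() and recvfrom(). The function name dgram_boundary_demo is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends two datagrams, "hello" (5 bytes) and "world!!" (7 bytes).
 * The first is received into a large buffer, the second into a
 * 3-byte buffer. The two recv() return values are stored in *first
 * and *second. Returns 0 on success, -1 on error. */
int dgram_boundary_demo(ssize_t *first, ssize_t *second)
{
    int sv[2];
    char big[64];
    char small[3];

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;

    if (send(sv[0], "hello", 5, 0) != 5 ||
        send(sv[0], "world!!", 7, 0) != 7) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* One call returns exactly one message, even though a second
     * message is already waiting in the queue. */
    *first = recv(sv[1], big, sizeof(big), 0);

    /* The buffer is smaller than the message: only 3 bytes are
     * delivered and the rest of this datagram is discarded. */
    *second = recv(sv[1], small, sizeof(small), 0);

    close(sv[0]);
    close(sv[1]);
    return (*first == -1 || *second == -1) ? -1 : 0;
}
```

On Linux the first recv() returns 5 and the second returns 3. A receiver that needs to detect truncation can use recvmsg() and check for MSG_TRUNC in the returned msg_flags.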
Connected datagram socket:
Stream sockets use the connect() system call to connect to their peer, thus forming the one-to-one pairing mentioned above. It turns out that connect() can also be called on a datagram socket. The effect is that the kernel records an association between the calling socket and the remote address specified in connect(). The socket can then use the write() or send() syscall without specifying the recipient address every time. At the same time, that socket will only receive datagrams from the socket that it is connected to. Note that the connectedness of datagram sockets is asymmetrical: the remote socket doesn’t have to be connected to the local one which called connect().
The association can be changed by calling connect() again on the same datagram socket with a different remote address. To dissolve the association, call connect() with the address family of the peer address argument set to AF_UNSPEC. This works on Linux, but some other UNIX implementations do not support it, so it should not be assumed to be portable.
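The following sketch walks through the life cycle of a connected datagram socket under a few assumptions: it uses UNIX domain datagram sockets bound to illustrative path names under /tmp, it relies on the Linux behaviour of connect() with AF_UNSPEC, and the helper names (bound_dgram_socket, connected_dgram_demo) are hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a UNIX domain datagram socket bound to path.
 * Fills *addr with the bound address. Returns the fd, or -1 on error. */
int bound_dgram_socket(const char *path, struct sockaddr_un *addr)
{
    int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd == -1)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strncpy(addr->sun_path, path, sizeof(addr->sun_path) - 1);
    unlink(path);   /* remove a stale socket file, if any */
    if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 0 if every step behaved as described, -1 otherwise. */
int connected_dgram_demo(void)
{
    char pa[64], pb[64], buf[16];
    struct sockaddr_un aa, ab, from;
    struct sockaddr unspec;
    socklen_t fromlen = sizeof(from);
    int a = -1, b = -1, rc = -1;

    snprintf(pa, sizeof(pa), "/tmp/dgram_demo_a_%ld.sock", (long)getpid());
    snprintf(pb, sizeof(pb), "/tmp/dgram_demo_b_%ld.sock", (long)getpid());
    a = bound_dgram_socket(pa, &aa);
    b = bound_dgram_socket(pb, &ab);
    if (a == -1 || b == -1)
        goto out;

    /* Associate a with b; b itself stays unconnected (asymmetry). */
    if (connect(a, (struct sockaddr *)&ab, sizeof(ab)) == -1)
        goto out;

    /* a can now use send() without naming the recipient. */
    if (send(a, "ping", 4, 0) != 4)
        goto out;

    /* b is unconnected: recvfrom() tells it who sent the datagram,
     * and sendto() needs an explicit destination for the reply. */
    memset(&from, 0, sizeof(from));
    if (recvfrom(b, buf, sizeof(buf), 0,
                 (struct sockaddr *)&from, &fromlen) != 4)
        goto out;
    if (strcmp(from.sun_path, pa) != 0)
        goto out;
    if (sendto(b, "pong", 4, 0, (struct sockaddr *)&from, fromlen) != 4)
        goto out;
    if (recv(a, buf, sizeof(buf), 0) != 4)
        goto out;

    /* Dissolve the association (works on Linux, not everywhere). */
    memset(&unspec, 0, sizeof(unspec));
    unspec.sa_family = AF_UNSPEC;
    if (connect(a, &unspec, sizeof(unspec)) == -1)
        goto out;

    /* Without a peer, send() with no destination is expected to fail. */
    if (send(a, "x", 1, 0) != -1)
        goto out;

    rc = 0;
out:
    if (a != -1)
        close(a);
    if (b != -1)
        close(b);
    unlink(pa);
    unlink(pb);
    return rc;
}
```

Calling connect() again with a different address instead of AF_UNSPEC would simply re-point the socket at a new peer.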