Stream Control Transmission Protocol (SCTP) Associations

Jan Newmarch

Issue #162, October 2007

This second in a series of articles on the SCTP network protocol examines associations and connections.

An SCTP association is a generalisation of a TCP connection. Usually a TCP connection is one-to-one between two network interfaces, one on a server and the other on a client. In contrast, an SCTP association is many-to-many in two ways:

  • Multiple network interfaces on a server can be associated with multiple interfaces on a client. For example, suppose the server and client both have an Ethernet card and a Wi-Fi card connected to the Internet. Then, data can flow over a single association in up to four possible ways: Ethernet to Ethernet, Ethernet to Wi-Fi, Wi-Fi to Ethernet or Wi-Fi to Wi-Fi.

  • An association also can carry multiple logical streams. These are numbered from zero upward. So, for example, stream zero could carry control instructions, while stream one could carry small pieces of data (such as small files), and stream two could carry larger pieces of data (such as an MPEG movie). The three streams are logically independent, so that delays on one stream do not cause delays on any other stream.

Note that a single socket can have multiple associations—that is, one socket can be used to talk to multiple other hosts. In general, these different associations are distinguished by having an association ID. The socket API for SCTP distinguishes between situations where exactly one association can exist (a one-to-one socket) or where a socket can manage many associations (a one-to-many socket). The first case corresponds to the TCP-like case that I discussed in the first article on SCTP [LJ, September 2007]. The second case will be covered in the next article. In this article, I look only at a single association, which is applicable to both the one-to-one and the one-to-many sockets.

Using a Subset of Network Interfaces

TCP and UDP use a single network interface on an endpoint, by specifying its IP address in a socketaddr structure. If you specify the wildcard address INADDR_ANY, a server will listen on all interfaces while a client will choose only one. In any case, communication takes place only between a single interface on each endpoint. As an aside, if you want to know what all the interfaces on your machine are, use the call ioctl() with parameter SIOCGIFCONF. How to do this is described in WR Stevens et al., Unix Network Programming, vol 1, section 17.5.

Using only one interface reduces reliability when more than one is available. A network cable may have poor connections, or you may be too far from a wireless access point for a reliable signal. On the other hand, using all of the interfaces may not always be desirable. For example, in Australia, the download charges for 3-G or WiMAX connections are ridiculously expensive, so you would use that interface only if no others were available. Or, a bridge would expose the internal and external interfaces separately to different groups of users.

SCTP allows an application to choose a subset of interfaces on either the source or destination side of an association. Some implementations also will allow interfaces to be added or removed dynamically, so the application can adjust to different states of the network connections. By registering for association-change events (which will be discussed in the next article), one endpoint can track changes in the interfaces at the other end.

The normal socket call bind() just takes a single sockaddr parameter to bind the socket to a single IP address (or to the wildcard address). SCTP extends this by introducing a new call, sctp_bindx(), which takes an array of sockaddrs to bind the socket to all of these addresses. The socket is bound only to a single port though; all of the port numbers in the array of sockaddrs must have the same port number. And, if addresses are added or removed later, they must have the same bound port value. Otherwise, the call fails.

There is another wrinkle to sctp_bindx() regarding IPv4 and IPv6 addresses. The socket can be passed a set of only IPv4 sockaddrs, a set of only IPv6 sockaddrs or a mixture of both. The two types of socket address structures, sockaddr_in and sockaddr_in6, have different sizes, so mixing these in the same array can cause alignment issues. SCTP packs the structures together with no wasted space between them. So, you can't just use an index into the array, you have to copy the right number of bytes for each structure and then move up by that amount.

The call to bind a set of addresses is:

int sctp_bindx(int sd, struct sockaddr *addrs, 
 ↪int addrcount, int flags)

where flags can be one of SCTP_BINDX_ADD_ADDR or SCTP_BINDX_REM_ADDR, and the second parameter is a packed list of IPv4 and IPv6 socket address structures.

It is relatively easy to use this on a server to allow clients to connect on any of the server's bound interfaces. But, how do you do this on a client? Where is the bind() operation? Well, just like TCP, this is hidden under the covers. In TCP, if a call to connect() is made and the socket is not yet bound (the usual case for a client), the TCP stack will choose one of the interfaces and an ephemeral port. You can make an explicit call to bind() that will let you choose the interface, but usually the port is left as zero, so an ephemeral port is still chosen.

You can do exactly the same with SCTP—don't call bind() and leave it to the SCTP stack. This will choose an ephemeral port like TCP, but instead of using a single interface, it will choose a set of interfaces (probably all that are available). So calling connect() without an initial bind() or sctp_bindx() will give you multihoming on the client side automatically.

If you call bind() with a specified interface before connect() in the client, you get only that single client-side interface, losing one of the advantages of SCTP! If you call bind() with the wildcard address INADDR_ANY, SCTP will choose a set of interfaces for you. So, SCTP will try to give you multihoming unless you pin it down to a single address using bind() or to a specific set of addresses using sctp_bindx().

With SCTP, I would expect a call to sctp_bindx() with all ports set to zero to choose the same ephemeral port for all addresses. Instead, the current Linux implementation (up to kernel 2.6.21) gets an ephemeral port for the first address and then throws an error, because the ports in the later addresses are still zero instead of this ephemeral value. The workaround is to call bind() with one address with port zero, see what the system set the port to, and then call bindx() on the remaining addresses with this new port number. Listing 1 (multi_homed_client.c) shows an example of this. This workaround probably will become unnecessary in the next specification of SCTP following discussion on the SCTP mailing list.

Using Multiple Interfaces

You can set the local interfaces to be used by sctp_bindx(). A client also can specify the subset of addresses that it wants to use to connect to the server by using the call sctp_connectx(), which takes a list of socket address structures just like sctp_bindx(). Why do this? Well, using connect() with a single address is a possible point of failure at the time that the initial connection is done. This is what the function sctp_connectx() solves. It allows the client to try multiple addresses in order to connect to the server.

The set of addresses in sctp_connectx() is used just to make the initial connection. But, after the connection is established, an interchange of information takes place between the two endpoints. In that exchange, the remote peer tells the local peer which addresses it actually wants to use and vice versa. The set of remote addresses that the remote peer will use need not be the same as what the client used in the connection. However, you at least can assume that one (but you don't know which one) of the addresses passed to sctp_connectx() will appear in the list that the remote peer offers, because the local client had to connect to something!

So, if the remote peer chooses the set of addresses it uses, how does the local client find which ones they are? This is done by another function, sctp_getpaddrs(), that gives the set of remote peer addresses. There is also an sctp_getladdrs() function, in case the local peer forgets which addresses it is using!

Once an association is set up between two endpoints, messages can be sent between them. Note that SCTP does not concern itself with QoS (Quality-of-Service) issues, such as real-time delivery, but only with reliability issues. SCTP uses the multihomed capabilities to try as many possible routes as possible to get messages through. So on the sending side, there is no control over which interfaces are used; indeed, the sender might even use a scheme such as round-robin among its interfaces for each message. However, the sending application can indicate to its SCTP stack which of the remote peer's interface it would prefer to use, and it can tell the remote peer on which interfaces it would prefer to receive messages. These are done by using the setsockopt() call with option type as SCTP_PRIMARY_ADDR or SCTP_SET_PEER_PRIMARY_ADDR. Of course, if these particular addresses are not available, SCTP simply will use different addresses in the association.

Once SCTP is told which interfaces to use, it basically looks after things itself. It uses heartbeats to keep track of which interfaces are alive, and it switches interfaces transparently when failure occurs. This is to satisfy the design goals of SCTP for improved reliability over TCP. Applications can give hints to the SCTP stack about which interfaces to use, but the stack will ignore these hints on failure.

Streams

In TCP, a stream is just a sequence of bytes. In SCTP, it has a different meaning; a stream is a logical channel along which messages are sent, and a single association can have many streams. The original motivation for streams came from the telephony domain, where multiple reliable channels were needed, but the messages on each channel were independent of those on other channels. In last month's article, we pointed out some TCP applications that could benefit from streams, such as FTP, which uses two sockets for data and control messages. In addition, an increasing number of applications are multithreaded, and streams open up the possibility of a thread in one peer being able to communicate with a thread in another peer without worrying about being blocked by messages sent by other threads.

The socket I/O calls read/write/send/recv do not know about SCTP streams. By default, the write calls all use stream number zero (this can be changed by a socket option), but the read calls will read messages on all streams, and there is no indication as to which stream is used. So, to use streams effectively, you need to use some of the I/O calls that are designed specifically for SCTP.

Negotiating the Number of Streams

Each endpoint of an association will support a certain number of streams. A Linux endpoint, by default, will expect to be able to send to ten streams, while it can receive on 65,535 streams. Other SCTP stacks may have different default values. These values can be changed by setting the socket option SCTP_INITMSG, which takes a structure sctp_initmsg:

struct sctp_initmsg { 
    uint16_t sinit_num_ostreams; 
    uint16_t sinit_max_ostreams; 
    uint16_t sinit_max_attempts; 
    uint16_t sinit_max_init_timeo; 
}

If this socket option is used to set values, it must be done before an association is made. The parameters will be sent to the peer endpoint during association initialisation.

Each endpoint in an association will have an idea of how many input and output streams it will allow on an association, as discussed in the previous paragraph. During the establishment of the association, the endpoints exchange these values. Negotiation of final values is just a matter of taking the minimum values. If one end wants 20 output streams, and the other wants only 10 input streams, the result is the smaller, 10, and similarly for the number of streams in the opposite direction.

An endpoint will need to know how many output streams are available for writing in order not to exceed the limits. This value is determined during association setup. After setup, the endpoint can find this by making a query using getsockopt(). However, there is a little wrinkle here: a socket may have many associations (to different endpoints), and each association may have set different values. So, we have to make a query that asks for the parameters for a particular association, not just for the socket. The parameter to ask for is SCTP_STATUS, which takes a structure of type sctp_status:

struct sctp_status { 
       sctp_assoc_t    sstat_assoc_id; 
       int32_t         sstat_state; 
       uint32_t        sstat_rwnd; 
       uint16_t        sstat_unackdata; 
       uint16_t        sstat_penddata; 
       uint16_t        sstat_instrms; 
       uint16_t        sstat_outstrms; 
       uint32_t        sstat_fragmentation_point; 
       struct sctp_paddrinfo sstat_primary; 
};

This has fields sstat_instrms and sstat_outstrms, which contain the required information. See Listings 2 and 3 for a client and server negotiating the number of streams in each direction.

Association ID

For the one-to-one socket we discussed in the first article in this series, there can be only one association at any time. For the one-to-many sockets we will cover in the next article, there can be many associations active at any one time—a peer can be connected to many other peers simultaneously. This is different from TCP where only one connection on a socket can exist and also is different from UDP where no connections exist and messages are just sent to arbitrary peers.

When there can be many associations, you need to be able to distinguish between them. This is done by an opaque data type called an association ID. You need to use this sometimes, but not every time. For one-to-one sockets, there is only one association, so the association ID is always ignored. For one-to-many sockets, when the association is “obvious”, the association ID again is ignored. This occurs, for example, when you write to a peer and give the peer's socket address; there can be only one association to a peer (but many associations to many peers), so if the peer is known, the association is known, and there is no need for the ID. But the association ID has to be used when the SCTP stack cannot work out for itself which association is meant. One place where this happens is in the getsockopt() call described previously to find the number of streams of an association on a one-to-many socket. I will defer the discussion of how to find the association ID to the next article, where I look at one-to-many sockets.

Writing to and Reading from a Stream

There are several ways of writing to a stream and telling to which stream a read belongs. Some of them make use of a structure of type sctp_sndrcvinfo:

struct sctp_sndrcvinfo { 
      uint16_t sinfo_stream; 
      uint16_t sinfo_ssn; 
      uint16_t sinfo_flags; 
      uint32_t sinfo_ppid; 
      uint32_t sinfo_context; 
      uint32_t sinfo_timetolive; 
      uint32_t sinfo_tsn; 
      uint32_t sinfo_cumtsn; 
      sctp_assoc_t sinfo_assoc_id; 
}

Most of the fields in this structure are not of interest to us at the moment. The interesting one is the first one, sinfo_stream. To write to a particular stream, zero out all fields and set this one; to read, zero out all fields again, do the read, and then examine this field. (As an aside, if the SCTP stack cannot work out which association is meant, the last field, sinfo_assoc_id, must be set.)

The function call to write a message is:

int sctp_send(int sd, 
              const void *msg, 
              size_t len, 
              const struct sctp_sndrcvinfo *sinfo, 
              int flags);

where the field sinfo_stream of sinfo has been set.

The call to read is, conversely:

ssize_t sctp_recvmsg(int sd, 
                     void *msg, 
                     size_t len, 
                     struct sockaddr *from, 
                     socklen_t *fromlen 
                     struct sctp_sndrcvinfo *sinfo 
                     int *msg_flags)

The stream number is then available in sinfo.sinfo_stream.

The SCTP stack keeps a lot of information about each message that passes between peers. It also keeps information about the state of each association. To avoid overloading applications, most of this information is suppressed and is not passed to the application. In particular, by default, the structure sctp_sndrcvinfo is not filled in, so a reader cannot tell on which stream a read occurred! To enable this to be filled, a socket option must be called first as:


struct sctp_event_subscribe events; 
bzero(&events, sizeof(events)); 
events.sctp_data_io_event = 1; 
setsockopt(sockfd, IPPROTO_SCTP, 
           SCTP_EVENTS, &events, sizeof(events));

(More details on SCTP events will be given in the next article.) See Listings 4 (streamsend_echo_client.c) and 5 (streamsend_echo_server.c) for an example of a client and server using a specific stream for communication. .

There is no way to specify from which stream to read. This is deliberate; the intention is that when data is ready on any stream, then you read it. Otherwise, data could be blocked on a stream with no one to read it, which eventually could fill up system buffers. So, you can't restrict reading to any particular stream. But, once a read is done, you can tell which stream it has come from by using the mechanism above.

Typically, a server that reads and handles a message will have (pseudocode) that looks like this:


while (true) { 
    nread = sctp_recvmsg(..., msg, ..., &sinfo, ...) 
    if (nread <= 0) break; 
    assoc_id = sinfo.sinfo_assoc_id; 
    stream = sinfo.sinfo_stream; 
    handle_mesg(assoc_id, stream, msg, nread); 
}

This is a single-threaded read loop. It ensures that information is read, no matter what association or stream it is sent on. The application function handle_mesg() can, of course, dispatch the message to different threads if it wants. Writes, on the other hand can be sent from multiple threads if desired.

Conclusion

This article has discussed a feature novel to SCTP, streams. A stream allows multiple data channels on a single association. This avoids a major problem of TCP, head-of-line blocking, but it also allows applications that deal with multiple logical streams to be written more easily. The next and final article will look at how SCTP can handle multiple associations at once, simplifying the TCP model and also offering improvements over the UDP connection model.

Jan Newmarch is Honorary Senior Research Fellow at Monash University. He has been using Linux since kernel 0.98. He has written four books and many papers and given courses on many technical topics, concentrating on network programming for the last six years. His Web site is jan.newmarch.name.