.NC "Transport Service Interface" .sh 1 "General" .pp It is assumed that the reader is acquainted with the set of system calls and library routines that compose the Berkeley Unix interprocess communication service (IPC). To every extent possible the ARGO transport service is provided by the same IPC mechanisms that support the transport-level services included in the AOS distribution. In some instances, the interface provided by AOS does not support the services required by ISO 8073, so system calls were added to support these services. It is felt that this is superior to modifying existing system calls, in order to avoid the recoding of existing Unix utilities. .pp What follows is a description of the system calls that are used to provide the transport service. According to Unix custom, the return value of a system call is 0 if the call succeeds and -1 if the call fails for any reason. In the latter case, the global variable \fIerrno\fR contains more information about the error that caused the failure. In the descriptions of all the system calls for which this custom is followed, the return value is named \fIstatus\fR. .sh 1 "Connection establishment" .pp Establishing a TP connection is similar to establishing a connection using any other transport protocol supported by Unix. The same system calls are used, and the passive open is required. Some of the parameters to the system calls differ. .pp The following call creates a communication endpoint called a \fIsocket\fR. It returns a positive integer called a \fIsocket descriptor\fR, which will be a parameter in all communication primitives. .(b \fC .TS tab(+); l s s s. s = socket( af, type, protocol ) .T& l l l. +int+s,af,type,protocol; .TE \fR .)b .pp The \fIaf\fR parameter describes the format of addresses used in this communication. Existing formats include AF_INET (DoD Internet addresses), AF_PUP (Xerox PUP-I Internet addresses), and AF_UNIX (addresses are Unix file names, for intra-machine IPC only). TP runs in either the Internet domain or the ISO domain (AF_ISO). When using the Internet domain, the network layer is the DoD Internet IP with Internet-style addresses. The ISO domain uses the ISO network service and ISO-style addresses\**. .(f \**ISO/DP 8348/DAD2 Addendum to the Network Service Definition Covering Network Layer Addressing. .)f Regardless of the address family used, an address takes the general form, .(b \fC .TS tab(+); l s s s. struct sockaddr { .T& l l l l. +u_char+sa_len;+/* length of sockaddr */ +u_char+sa_family;+/* address family */ +char+sa_data[14];+/* space for an address */ }+ .TE \fR .)b .lp .i A sockaddr is no longer required to be precisely 16 bytes long. The allocation of 14 bytes for sa_data is intended for backwards compatibility. .r .sp 1 When viewed as an Internet address, it takes the form .(b \fC .TS tab(+); l s s s. struct sockaddr_in { .T& l l l l. +u_char+sin_len;+/* address length */ +u_char+sin_family;+/* address family */ +u_short+sin_port;+/* internet port */ +struct in_addr+sin_addr;+/* network addr A.B.C.D */ +char+sin_zero[8];+/* unused */ } .TE \fR .)b .sp 1 When viewed as an ISO address, as supplied by the university of wisconsin, it takes the form .(b \fC .TS tab(+); l s s s. struct sockaddr_iso { .T& l l l l. +u_char+siso_len;+/* address length */ +u_char+siso_family;+/* address family */ +u_short+siso_tsuffix;+/* transport suffix */ +struct iso_addr+siso_addr;+/* ISO NSAP addr */ +char+siso_zero[2];+/* unused */ } .TE \fR .)b The address described by a \fIsockaddr_iso\fR structure is a TSAP-address (transport service access point address). It is made of an NSAP-address (network service access point address) and a TSAP selector (also called a transport suffix or transport selector, hereafter called a TSEL). The structure \fIsockaddr_iso\fR contains a 2-byte TSEL. This is for compatibility with Internet addressing. ARGO supports TSELs of length 1-64 bytes. TSELs of any length other than 2 are called \*(lqextended TSELs\*(rq. They are described in detail in the section \fB\*(lqExtended TSELs\*(rq\fR. If extended TSELs are not requested, 2-byte TSELs are used by default. .pp Refer to Chapter Five for more information about ISO NSAP-addresses. .pp .i It is our intent at Berkeley to revamp the sockaddr_iso to use a more natural and uniform model, for ISO addresses. We cannot guarantee this modification to be complete by the time we are ready to have something for NIST to test. We hope to remove this notion of extended TSEL's as soon as possible, certainly by formal beta testing of 4.4. .r Since sockaddr can be 108 bytes long without breaking anything in the current Berkeley kernel, we should be able to eliminate extended TSEL's entirely by providing a sockaddr_iso along the lines of: .(b \fC .TS tab(+); l s s s. struct sockaddr_iso { .T& l l l l. +u_char+siso_len;+/* address length */ +u_char+siso_family;+/* address family */ +u_char+siso_slen;+/* session suffix length */ +u_char+siso_tlen;+/* transport suffix length */ +u_char+siso_nlen;+/* nsap length */ +char+siso_data[22];+/* minimum nsap + tsel */ } .TE \fR .)b .pp The \fItype\fR parameter in the \fIsocket()\fR call distinguishes datagram protocols, stream protocols, sequenced packet protocols, reliable datagram protocols, and "raw" protocols (in other words, the absence of a transport protocol). Unix provides manifest named constants for each of these types. TP supports the sequenced packet protocol abstraction, to which the manifest constant SOCK_SEQPACKET applies. .pp The \fIprotocol\fR parameter is an integer that identifies the protocol to be used. Unix provides a database of protocol names and their associated protocol numbers. Unix also provides user-level tools for searching the database. The tools take the form of library routines. A protocol number for TP has been chosen by the Internet NIC to allow TP to run in the Internet domain, and this has been added to the Unix network protocol database. The standard Internet database tools that serve TCP users can also serve user of TP in the Internet domain, if the TP protocol number is added to the proper Internet database file, \fC/etc/protocols\fR. This change must be made for TP to run in either the Internet or in the ISO domain. The ARGO package contains a set of tools and a database for use with TP in the ISO domain. This set of tools is described in the manual pages \fIisodir(5)\fR and \fIisodir(3)\fR. .pp When a socket is created, it is not given an address. Since a socket cannot be reached by a remote entity unless it has an address, the user must request that a socket be given an address by using the \fIbind()\fR system call: .(b \fC .TS tab(+); l s s s. status = bind( s, addr, addrlen ) .T& l l l. +int+s; +struct sockaddr+*addr; +int+addrlen; .TE \fR .)b .pp The address is expected to be in the format specified by the \fIaf\fR parameter to the \fIsocket()\fR call that yielded the socket descriptor \fIs\fR. If the user passes an address parameter with a zero-valued transport suffix, the transport layer assigns an unused 2-byte transport selector. This is a 4.3 Unix convention; it is not part of any ISO standard. .pp The \fIconnect()\fR system call effects an active open. It is used to establish a connection with an entity that is passively waiting for connection requests, and whose transport address is known. .(b \fC .TS tab(+); l s s s. status = connect( s, addr, addrlen ) .T& l l l. +int+s; +struct sockaddr+*addr; +int+addrlen; .TE \fR .)b .pp The first parameter is a socket descriptor. The \fIaddr\fR parameter is a transport address in the format specified by the \fIaf\fR parameter to the \fIsocket()\fR call that yielded the socket descriptor \fIs\fR. .pp A passive open is accomplished with two system calls, \fIlisten()\fR followed by \fIaccept()\fR. .(b \fC .TS tab(+); l s s s. status = listen( s, queuelen ) .T& l l l. +int+s; +int+queuelen; .TE \fR .)b .pp The \fIqueuelen\fR argument specifies the maximum number of pending connection requests that will be queued for acceptance by this user. Connections are then accepted by the system call \fIaccept()\fR. There is no way to refuse connections. The functional equivalent of connection refusal is accomplished by accepting a connection and immediately disconnecting. .(b \fC .TS tab(+); l s s s. new_s = accept( s, addr, addrlen ) .T& l l l. +int+new_s, s; +struct sockaddr+*addr; +int+addrlen; .TE \fR .)b .pp The \fIaccept()\fR call completes the connection establishment. If a connection request from a prospective peer is pending on the socket described by \fIs\fR, it is removed and a new socket is created for use with this connection. A socket descriptor for the new socket is returned by the system call. If no connection requests are pending, this call blocks. If the \fIaccept()\fR call fails, -1 is returned. The transport address of the entity requesting the connection is returned in the \fIaddr\fR parameter, and the length of the address is returned in the \fIaddrlen\fR parameter. The address associated with the new socket is inherited from the socket on which the \fIlisten()\fR and \fIaccept()\fR were performed. .pp It is possible for the \fIaccept()\fR call to be interrupted by an asynchronous event such as the arrival of expedited data. When system calls are interrupted, Unix returns the value -1 to the caller and puts the constant EINTR in the global variable \fIerrno\fR. This can create problems with the system call \fIaccept()\fR. In the case of incoming expedited data, the interruption does not indicate a problem, but the data may have arrived before the caller has received the new socket descriptor, which is the socket descriptor on which the expedited data are to be received. In order to prevent this problem from occurring, the caller must prevent the issuance of asynchronous indications until the \fIaccept()\fR call has returned. Asynchronous indications are discussed below, in the section titled "Indications from the transport layer to the transport user". .pp It is possible to discover the address bound to a socket with the \fIgetsockname()\fR system call. .(b \fC .TS tab(+); l s s s. status = getsockname( s, addr, addrlen ) .T& l l l. +int+s; +struct sockaddr+*addr; +int+addrlen; .TE \fR .)b .pp If the socket has a peer, that is, it is connected, the system call \fIgetpeername()\fR is used to discover the peer's address. .(b \fC .TS tab(+); l s s s. status = getpeername( s, addr, addrlen ) .T& l l l. +int+s; +struct sockaddr+*addr; +int+addrlen; .TE \fR .)b .lp The names returned by \fIgetsockname()\fR and \fIgetpeername()\fR do not contain extended TSELs. Extended TSELs can be retrieved with the \fIgetsockopt()\fR and \fIsetsockopt()\fR system calls, described below. .pp Unix supports several protocol-independent options and protocol-specific options associated with sockets. These options can be inspected and changed by using the \fIgetsockopt()\fR and \fIsetsockopt()\fR system calls. .(b \fC .TS tab(+); l s s s. status = getsockopt( s, level, option, value, valuelen ) .T& l l l. +int+s, level, option; +char+*value; +int+*valuelen; .TE \fR .)b .(b \fC .TS tab(+); l s s s. status = setsockopt( s, level, option, value, valuelen ) .T& l l l. +int+s, level, option; +char+*value; +int+valuelen; .TE \fR .)b .pp The \fIlevel\fR argument may indicate either that this option applies to sockets or that it applies to a specific protocol. The constants SOL_SOCKET, SOL_TRANSPORT, and SOL_NETWORK are possible values for the \fIlevel\fR argument. The \fIoption\fR argument is an integer that identifies the option chosen. .\" LIST THE OPTIONS HERE The options available to TP users provide the user with the ability to control various TP protocol options including but not limited to TP class, TPDU size negotiated, TPDU format used, acknowledgment and retransmission strategies. For a detail list of the options, see the manual page \fItp(4p)\fR. .sh 1 "Extended TSELs" .pp ARGO supports TSELs of length 1 byte - 64 bytes for sockets bound to addresses in the AF_ISO address family. The ARGO user program uses the \fIgetsockopt()\fR and \fIsetsockopt()\fR system calls to discover and assign extended TSELs. .pp To create a socket with an extended TSEL, the process .ip \(bu 5 opens a socket with \fCsocket(AF_ISO, SOCK_SEQPACKET, ISOPROTO_TP)\fR .ip \(bu 5 binds an NSAP-address to the socket with \fIbind()\fR. The address bound may contain a 2-byte selector (\fIiso_tsuffix\fR). .ip \(bu 5 uses \fIsetsockopt()\fR with the command TPOPT_MY_TSEL, to assign a TSEL to the socket. .ip \(bu 5 calls \fIlisten(), connect()\fR, or any other appopriate system calls to use the socket as desired. .lp To connect to a transport entity that is bound to a TSAP-address with an extended TSEL, the process .ip \(bu 5 opens a socket with \fCsocket(AF_ISO, SOCK_SEQPACKET, ISOPROTO_TP)\fR .ip \(bu 5 uses \fIsetsockopt()\fR, with the command TPOPT_PEER_TSEL, to assign a PEER TSEL to the socket. This TSEL is used by the transport entity for all subsequent connect requests made on this socket, unless the peer TSEL is changed by another call to \fIsetsockopt()\fR employing the command TPOPT_PEER_TSEL. .lp To discover the TSEL of the peer of a connected socket, the process .ip \(bu 5 uses \fIgetsockopt()\fR with the command TPOPT_PEER_TSEL. .lp To discover the TSEL of socket's own address, the process .ip \(bu 5 uses \fIgetsockopt()\fR with the command TPOPT_MY_TSEL. .sh 1 "Data transfer" .pp Earlier BSD-based systems have provided system calls for data transfer having bugs and semantics that are problematic for TP. These should be correct as presented in the test system. The problem was in the manner in which the kernel handled interrupted system calls. The send and receive primitives may be interrupted by signals. A signal is the mechanism used to indicate the presence of expedited data or out-of-band data. If the send primitive as interrupted before completion, the user could not determine how many octets of data were sent. All forms of the existing interface (\fIsend()\fR, \fIrecv()\fR, \fIsendmsg()\fR, \fIrecvmsg()\fR, \fIsendto()\fR, \fIrecvfrom()\fR, \fIwrite()\fR, \fIread\fR, \fIwritev()\fR, and \fIreadv()\fR system calls) return an octet count when the system call completes, and to return a short count if the system call is interrupted. .pp The system calls sendmsg and recvmsg have been revised to make them more convenient for receipt of out of band data. .(b \fC .TS tab(+); l s s s. cc = sendmsg( s, msg, flags ) .T& l l l. +int+s; +istruct msghdr+msg; +unsigned int+flags; .TE \fR .)b .(b \fC .TS tab(+); l s s s. cc = recvmsg( s, msg, flags ) .T& l l l. +int+s; +istruct msghdr+msg; +unsigned int+flags; .TE \fR .)b .pp The reader should now consult the manual page for recvmsg for an explanation of the elements of the msghdr structure, and how the calls may be used to glean user-connection-request data. .pp The \fIflags\fR parameter serves several purposes. The TP specification requires that TSDUs be unlimited in size. System calls cannot pass unlimited amounts of data between the user and the kernel, so there cannot be a one-to-one correspondence between TSDUs and system calls. The \fIflags\fR parameter is used to mark the end-of-TSDU on sending data. When receiving, TP sets this bit in the flags element of the msghdr structure when the end of a TSDU is consumed. This way one TSDU can span several system calls. It is possible for the peer to send an empty TPDU with the end-of-TSDU flag set, in which case the transport user may receive zero octets with the end-of-TSDU flag set. .pp The \fIflags\fR parameter also serves to distinguish data transfer primitives from expedited data transfer primitives. The flag bit MSG_OOB is provided for "out of band data" in the DoD Internet protocols. It is also used to provide the expedited data service of the ISO protocols. The transport layer will deliver one expedited datum (there will be a one-to-one correspondence between expedited TSDUs and XPD TPDUs) at a time. The user must receive the datum before the transport layer will accept more expedited data. Each expedited datum my contain up to 16 octets. .pp .sh 1 "Disconnection" .pp The \fIclose\fR system call will disconnect any association between two TP entities. .(b \fC .TS tab(+); l s s s. status = close( s ) .T& l l l. +int+s; .TE \fR .)b .pp The argument \fIs\fR is a socket descriptor. If a Unix user process terminates, Unix will close all files and sockets associated with the process, which means all transport connections associated with the process will be disconnected. .sh 1 "Indications from the transport layer to the transport user" .pp The above set of system calls allows you to establish a connection, transfer data, and disconnect. The presence or reception of expedited data is indicated by TP setting the MSG_OOB bit in the flags element of the msg structure. A disconnection initiated by the peer or by one of the cooperating TP entities can be signalled by a control message, although we have not yet implemented this. .pp The Unix signal mechanism may be used to provide these service elements, as well. When an expedited data TSDU arrives, the TP may interrupt the user with a SIGURG signal ("urgent condition present on socket"). The user must have previously registered a procedure to handle the signal by using the \fIsigvec()\fR system call or the \fIsignal()\fR library routine provided for that purpose. The signal handler takes the form .(b \fC .TS tab(+); l s s s. int sighandler( signal_number) .T& l l l. +int+signal_number; .TE \fR .)b .pp The \fIsignal_number\fR argument will be the well-known constant SIGURG. There are two reasons for the transport layer to issue a SIGURG: expedited data are present or disconnection was initiated by a transport entity or by the peer. Should the user have more than one transport connection open, another system call is used to determine to which socket(s) the urgent condition applies. This is the \fIselect()\fR system call, described below. .pp When the SIGURG indicates a disconnection, there may be user data from the peer present. TP saves the disconnect data for the user to receive via the \fIgetsockopt()\fR system call, or through the .IR recvmsg () ancillary data mechanism. .\" .\"If the user does not receive the disconnect data before the .\"reference timer expires, the data will be discarded and the .\"socket will be closed. .pp Transport service users may use more than one transport connection at a time. The \fIselect()\fR system call facilitates this. .(b \fC .TS tab(+); l s s s. #include + nfound = select( num_to_scan, recvmask, sendmask, +exceptmask, timeout ) .T& l l l. +int+nfound, num_to_scan; +fd_set+*recvmask, *sendmask, *exceptmask; +time+timeout; .TE \fR .)b .pp This system call takes as parameters a set of masks that specify a subset of the socket descriptors that are in use by the user program. \fISelect()\fR inspects the sockets to see if they have data to be received, can service a send without blocking, or have an exceptional condition pending, respectively. The masks will be set upon return to indicate the socket descriptors for which the respective conditions exist. The \fInum_to_scan\fR argument limits the number of sockets that are inspected. The call will return within the amount of time given in the \fItimeout\fR parameter, or, if the parameter is zero, \fIselect()\fR will block indefinitely. .\" FIGURE .so ../wisc/figs/TS_primitives.nr .pp .CF summarizes the mapping of the transport service primitives to Unix facilities.