Calling select()

Calling select() is tricky. You have to spend some time packaging up the arguments to pass to it, and you have to spend some time unpacking the results.

Suppose you're talking to two other programs, on file descriptors fd1 and fd2.

In that case, you call select() like this. The first line is a declaration of the variable "fds".

fd_set fds;
FD_ZERO(&fds);
FD_SET(fd1, &fds);
FD_SET(fd2, &fds);
if (select(..., &fds, ...))
    ...

The first argument to select() must be the max of all of the file descriptors in the set, plus one. It's as if select() contained a loop like "for (i = 0; i < firstarg; i++)" and looked only at those file descriptors, although of course it need not do exactly this. The rule can seem a bit pointless at present, and the details are in flux. But if you obey the rule, then your program will work now and in future. In computer programming we're guaranteed that if we write programs which obey the rules, then they work; so we obey the rules. So pass the max of all involved file descriptors plus one as the first argument to select().

My examples all calculate the max of all of the file descriptors and put that+1 as the first select() arg. It is sometimes a simple calculation, e.g. in the above case we could just check whether fd1>fd2 and pass in either fd1+1 or fd2+1. But the usual server case needs to find the max of all clients using a loop.
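
The loop for the server case might look like the following. This is only a sketch: the helper name build_read_set and the idea of keeping the clients in a plain int array are assumptions made for illustration, not something taken from the course examples.

```c
#include <assert.h>
#include <sys/select.h>

/* Hypothetical helper: put every descriptor in fdlist[0..n-1] into *fds and
 * return the value to pass as select()'s first argument (max fd plus one).
 * Returns 0 if the list is empty. */
int build_read_set(const int *fdlist, int n, fd_set *fds)
{
    int maxfd = -1;

    FD_ZERO(fds);
    for (int i = 0; i < n; i++) {
        /* FD_SET is only defined for descriptors below FD_SETSIZE. */
        assert(fdlist[i] >= 0 && fdlist[i] < FD_SETSIZE);
        FD_SET(fdlist[i], fds);
        if (fdlist[i] > maxfd)
            maxfd = fdlist[i];
    }
    return maxfd + 1;
}
```

A caller would then write something like "nfds = build_read_set(clients, nclients, &fds);" followed by "select(nfds, &fds, NULL, NULL, NULL)", where clients and nclients are again hypothetical names.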

Something which confuses some students is that "fds" there is an "in/out parameter" -- the data in it already is data going into the select() function, and also data comes out of the select() function via this parameter. I.e. select() both reads it and changes it.
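
Because select() overwrites the set, a set that has been through one call can't simply be reused for the next call. One way to cope is to rebuild it with FD_ZERO and FD_SET every time. Another common idiom, sketched here, is to keep a "master" copy and hand select() a fresh copy each time. The helper name select_on_copy is an assumption for illustration only.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait for read activity on the descriptors in *master.
 * select() works on a copy, so *master still lists every descriptor
 * afterwards, while *ready holds only the ones with read activity.
 * Returns select()'s return value (-1 on error). */
int select_on_copy(int nfds, const fd_set *master, fd_set *ready)
{
    *ready = *master;   /* input: everything we are interested in */
    return select(nfds, ready, NULL, NULL, NULL);   /* output: left in *ready */
}
```

After the call, FD_ISSET is applied to the "ready" set, never to the master set.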

The select() function turns off all the bits which don't have read activity on them. So afterwards, you write something like this:

if (FD_ISSET(fd1, &fds)) {
    len = read(fd1, ...
    ...

and similarly for fd2. And if we were listening for new connections too (as the assignment four server does), we could include listenfd in the set, and do "if (FD_ISSET(listenfd, &fds)) ...accept(...)...".

We must call read() only on file descriptors which have read activity pending; otherwise the read() may block. And unlike the usual case of a call to read(), we don't want to block in a read() in the server, because we might have something else to do for some other file descriptor. So select() is used to block until any of the file descriptors has data or EOF. Then we can safely read (just once!) from that file descriptor without blocking.
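
The "just once" rule can be packaged up as in the following sketch. The helper name read_if_ready and its -2 return convention are made up for this illustration.

```c
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: do exactly one read() on fd, and only if select()
 * left fd in *fds.  Returns the read() result, or -2 if fd was not marked
 * ready and so was not read at all. */
ssize_t read_if_ready(int fd, fd_set *fds, char *buf, size_t size)
{
    if (!FD_ISSET(fd, fds))
        return -2;                  /* reading now might block */
    return read(fd, buf, size);     /* one call only; a second might block */
}
```

If more data is still waiting after that one read(), the next call to select() will report the file descriptor as ready again, so nothing is lost by reading only once.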

The "fds" is a "set" abstract data type, but it was obviously designed by people who didn't usually think about abstract data types. You may find the terminology of "zero", "set", and "clear" to be confusing. It's based on digital logic terminology -- these operations set or clear bits in the word.

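Best is probably to think of them as defined as follows, ignoring the sometimes-poorly-chosen names:

FD_ZERO(&fds): make fds the empty set.
FD_SET(fd, &fds): add fd to the set.
FD_CLR(fd, &fds): remove fd from the set.
FD_ISSET(fd, &fds): test whether fd is a member of the set.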

Mathematically, it is a "set" rather than a "list" because the elements are not in order. The only datum is whether an element is in the set or not; not where it is within the set.
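
A small walk-through of the operations, read as set operations, may help. The function names fd_set_count and fd_set_demo are hypothetical, and the numbers 3 and 5 are arbitrary; they need not be open file descriptors for this purpose.

```c
#include <sys/select.h>

/* Hypothetical helper: count the members of *fds among 0..nfds-1. */
int fd_set_count(fd_set *fds, int nfds)
{
    int count = 0;

    for (int fd = 0; fd < nfds; fd++)
        if (FD_ISSET(fd, fds))
            count++;
    return count;
}

/* Walk through the operations; returns the final number of members. */
int fd_set_demo(void)
{
    fd_set fds;

    FD_ZERO(&fds);      /* fds = {}                                      */
    FD_SET(5, &fds);    /* fds = {5}                                     */
    FD_SET(3, &fds);    /* fds = {3, 5}; insertion order is not recorded */
    FD_SET(5, &fds);    /* still {3, 5}; adding twice changes nothing    */
    FD_CLR(3, &fds);    /* fds = {5}                                     */
    return fd_set_count(&fds, 8);
}
```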

Detecting dropped connections:

If the other side drops the connection, this will count as read activity for the select(). That is, it will be telling you to read from that file descriptor, but then when you do call read(), it will neither block nor read any data -- it will return zero, and indeed zero bytes will have been read. A return value of zero from read() means EOF. The circumstances under which you get EOF are the same as for reading from a pipe: the other end has gone away AND you have finished reading all data already sent.

When the server gets EOF from a client, it will close its fd and otherwise clean up its data structures, thus forgetting about that client.
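
In code, the EOF check might look like this sketch. The helper name service_client is hypothetical, the buffer size is arbitrary, and treating a read() error the same way as EOF is an assumption made to keep the example short; a real server would also have to remove the client from its own data structures.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, to be called only after select() has reported read
 * activity on fd.  Returns 1 if the client is still connected, or 0 if the
 * fd has been closed and the caller should forget about this client. */
int service_client(int fd)
{
    char buf[512];
    ssize_t len = read(fd, buf, sizeof buf);    /* won't block: select() said so */

    if (len > 0) {
        /* ... process the len bytes in buf here ... */
        return 1;
    }
    if (len < 0)
        perror("read");     /* assumption: drop the client on error too */
    /* len == 0 is EOF: the other end has gone away and all data is read. */
    close(fd);
    return 0;
}
```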

And... select() is not specific to sockets. You can use it with any file descriptors. For example, a client could include 0 in the fd set, so that it is select()ing on stdin as well; the select() will then return when the server has sent something OR the user has typed something.
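
Such a client might be sketched as follows. The helper name wait_user_or_server and the two flag names are invented for this illustration, and the user's file descriptor is passed as a parameter so that the same code works with 0 (stdin) or with anything else.

```c
#include <sys/select.h>
#include <unistd.h>

#define USER_READY   1
#define SERVER_READY 2

/* Hypothetical helper: block until userfd or serverfd has read activity.
 * Returns a bitwise OR of USER_READY and SERVER_READY, or -1 on error. */
int wait_user_or_server(int userfd, int serverfd)
{
    fd_set fds;
    int maxfd = (userfd > serverfd) ? userfd : serverfd;
    int result = 0;

    FD_ZERO(&fds);
    FD_SET(userfd, &fds);
    FD_SET(serverfd, &fds);
    if (select(maxfd + 1, &fds, NULL, NULL, NULL) < 0)
        return -1;

    if (FD_ISSET(userfd, &fds))
        result |= USER_READY;
    if (FD_ISSET(serverfd, &fds))
        result |= SERVER_READY;
    return result;
}
```

The client would call "wait_user_or_server(0, serverfd)" and then do one read() from each file descriptor whose flag is set in the result.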