How to initialize a non-blocking socket using only the Winsock library
Hi, I'm sorry for asking what must be a very simple question, but I just can't figure out (I've googled and sifted through MSDN) how to set a socket as non-blocking. Is there a function that sets it as blocking or non-blocking, or is it a field in a data structure that gets passed to a function? I'm sorry, but I just can't figure it out, so if anyone could help me it would be greatly appreciated.
Also, just out of curiosity: is a socket set as blocking/non-blocking when it is created (i.e., when socket() is called), or can it be changed later, say, in between two recv() calls?
As far as I understand [given that I haven't ever done socket programming], you use select() to check whether a socket would block. You can then safely call recv() to fetch some data on a particular socket, knowing that the call will NOT block.
Is that what you wanted to know?
Well, I think you're talking about what happens with non-blocking sockets once they have been initialized. If you call recv() and the call would block, it returns a would-block error instead; you must check for that and move on with the program.
When you create a socket, it is blocking by default. If you want to make it non-blocking, you can call fcntl(), i.e. fcntl(sockfd, F_SETFL, O_NONBLOCK); (on Winsock, the equivalent is ioctlsocket() with the FIONBIO command). However, this is something I don't recommend doing, as it requires you to continually poll the socket, which can kill the CPU.
I highly recommend you use select(), which can monitor a set of sockets; when there is an event on one of them, you can determine which one it is and work with it. You can also give select() a timeout interval, so that if no event occurs on any of the sockets during that interval, it returns and program execution carries on.
Let us know if you need any help using select. It is relatively easy.
Hope this helps
Maybe what you are looking for is the select() function? Here's a link that may help:
In UNIX, process creation is most commonly done by combining two function calls:
- fork() creates a new process which is an exact copy, a clone, of the process that called it. The process that forks is the parent process and the clone is the child process, and the two get each other's process IDs (PIDs), which they will use to communicate and control.
- exec*() replaces the current process with a different process. The asterisk (*) indicates that this is a family of functions whose names have different endings and which differ in how they construct an argument list for passing command-line arguments to the new process being loaded. You will need to read exec*'s documentation for details (eg, Google on man page exec).
The way this is used is that immediately after a fork() there's an if-statement which separates the code for the child and parent processes to run (since the child is a perfect clone of the parent, they both use the return value of the fork() call to tell whether they're the parent or the child, kind of like the dots under the clone's eyelid in Arnold Schwarzenegger's The 6th Day). The child calls exec*() to replace itself with the actual program that the parent wanted to run. But before it calls its replacement, the child will usually perform some tasks to set up parent-child communication, commonly with pipes (see below). Also, any process can fork and create virtually any number of child processes (there are real limits imposed by the capacity of the operating system, but conceptually there's no limit). And child processes can fork and create their own child processes. In fact, all processes running on UNIX and Linux are children, however many generations removed, of Process 1, init -- reportedly, in some versions of Linux init has been replaced by another Process 1, such as systemd. To illustrate how quickly a multitude of processes can be forked, a common problem in writing your first forking experiments is for child processes to continue forking when they're not supposed to, thus creating a rash of processes that you never expected. Depending on the circumstances and whether it's happening to you or somebody else, it can be either very frustrating or quite amusing.
UNIX manages process groups through which it ties child processes to their parents. When a parent process terminates, the operating system also terminates all its child processes. Therefore, when writing the parent you need to have it wait for all its child processes to terminate before it can itself terminate. This is supported by system signals and functions, such as SIGCHLD, wait(), and waitpid().
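The fork-then-if pattern, with the parent waiting on the child's PID, can be sketched like this (the helper name is hypothetical):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that immediately exits with the given code, then
   reap it with waitpid() and return that exit code (-1 on error). */
int fork_and_reap(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;             /* fork failed */
    if (pid == 0)
        _exit(code);           /* child branch: terminate with 'code';
                                  a real child would exec*() here instead */
    int status;                /* parent branch: block until the child exits */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In a real program the child branch would set up pipes and call exec*() rather than exit, but the if-statement structure is the same.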
IPC under UNIX is strongly supported and widely known. A common method is for the parent to create a pair of pipes which the child then attaches to its standard input and output files (stdin and stdout -- if you're a C programmer, then you should know about these already) before it calls exec*(). Other methods include named pipes, message queues, shared memory, and UNIX domain sockets. These methods and the techniques for using them are very well-documented and widely used by UNIX and Linux programmers. In many cases, the opportunity to do this kind of programming is what attracted the programmer to Linux in the first place.
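Here is a minimal sketch of the parent-child pipe technique (the function name is mine; a real use would dup2() the write end onto the child's stdout before calling exec*()):

```c
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

/* Parent creates a pipe and forks; the child writes msg into the pipe
   and exits; the parent reads it into buf. Returns bytes read, -1 on error. */
ssize_t pipe_from_child(const char *msg, char *buf, size_t buflen)
{
    int fds[2];                       /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                   /* child: write the message and exit */
        close(fds[0]);                /* child won't read */
        write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(0);
    }
    close(fds[1]);                    /* parent won't write; read() sees EOF
                                         once the child closes its end */
    ssize_t n = read(fds[0], buf, buflen - 1);
    if (n >= 0)
        buf[n] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);            /* reap the child */
    return n;
}
```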
This technique is widely used in sockets programming under UNIX. A lot of commercial servers use it to handle multiple clients and many UNIX-based books on network programming devote entire chapters to discussions of different design issues with forking servers.
I have to admit that I have zero practical experience with process control under Windows. I will report what I have gleaned from reading and I hope that what I say here is accurate.
Windows (ie, Win32), on the other hand, does not support fork() nor anything exactly like it. Instead, it has a function, CreateProcess(), that directly creates a new process based on the program named in the function call, basically combining fork() and exec*() in a single function call. After that, I'm afraid that it gets rather hazy for me. I know that a process can get another process' handle (equivalent to the PID in UNIX), but I'm not sure how, nor what all it can do with it.
Windows also does not have anything like process groups nor an actual parent-child relationship between a process and the processes that it creates. Rather, it's up to the "parent" to establish that role. There isn't any SIGCHLD signal, but there are two functions, WaitForSingleObject() and WaitForMultipleObjects(), that are used to wait for "child" processes to terminate, as well as for other kinds of events. For example, they are also used later in multithreading for threads to synchronize in their use of common resources (a hint of nightmares to come).
I'm also rather hazy about IPC under Windows. I've seen mention of anonymous pipes, named pipes, and mailslots, but haven't played with them.
Compared to UNIX, this area is not widely known. I've only seen a few books on the subject.
As far as I know, process creation is not used when writing Windows servers. I believe that multithreading is the method of choice there.
Despite the differences in the operating systems, there are some common concepts. When a process is started, it's given the resources it will need; mainly, that's an area of memory and an environment. As a general concept, only that process can access its own memory space; it cannot access the memory space of another process, nor can any other process access its memory space. This makes it more difficult for multiple processes to share data. That is why IPC becomes important.
At the same time, the different processes operate asynchronously, which is to say that they run independently of other processes, such that no process can know where another process is in its execution; ie, it cannot know exactly what another process is doing at a given time. When two processes attempt to access the same resource, such as shared memory (an IPC mechanism in UNIX), then they need to coordinate their activities. This is called synchronization and it gets really important when we move on to multithreading.
Most of these same issues also come up in multithreading, except that the different threads are able to access the same global memory within their common process, which simplifies matters significantly but also complicates things greatly.
Another solution was the development of threads, originally known as "light-weight processes" with traditional processes being considered "heavy-weight" due to the amount of OS support they require. They're kind of like sub-processes; processes within a process. They're light-weight because there's less involved in creating and running them, plus all the threads within a process share that process' memory, thus reducing the demand on the system for resources.
OK, why "threads"? Well, we all know that computers execute programs one instruction at a time, strung one after the other as if on a thread. Let's call that a "thread of execution". The idea of multithreading is to enable a process to have multiple threads of execution. It almost seems like multitasking, except that all these threads are executing in the same process. That means that the memory space belonging to the process is accessible to all the threads in that process. Suddenly life has become a lot easier, and a lot harder, all at once. It's a lot easier to share data and resources, but it's a lot harder to keep those threads from stepping on each other's toes and corrupting those common resources. Freer access requires stronger discipline on the programmer's part. Remember when I said that it demands skill and experience of the programmer? You have to know what you are doing: with great freedom comes great responsibility.
From what I've read, threads started out in UNIX as "light-weight processes". In POSIX UNIX this developed into a library called "pthreads" which is commonly used in UNIX and in Linux. It has even been ported over to 32-bit Windows (Win32) in a package called "pthreads-win32".
In Windows, there's a different history. MS-DOS was most definitely a single-threaded operating system. The 16-bit Windows versions 1.0 through 3.11 (AKA "Win16") just simply ran on top of MS-DOS. Win16 operated on the principle of cooperative multitasking, which meant that the only way you could switch between Windows tasks was if the current task surrendered control to another task, which meant that programmers had to be disciplined in how they wrote their applications so that no part of their application would run for too long and block all other applications, including Windows. Win16 could not support multithreading.
One of the big features of OS/2, which Microsoft developed for IBM circa 1987, was that it supported preemptive multitasking. Windows NT also supported it, from what I understand. UNIX had always used it. In preemptive multitasking, every process is given a time slice, a very short period of time in which to run; the processes take turns running, and when a process' time slice is up the OS interrupts it and passes control to the next process, and so on, such that each process has a chance to run. It wasn't until Windows went 32-bit with Windows 95, making that and subsequent versions known as "Win32", that the mainstream Windows products could operate on preemptive multitasking. This also allowed Windows (Win32, starting with Windows 95) to finally support multithreading.
As to be expected, Windows and UNIX do multithreading differently. However, despite the superficial differences, the concepts are the same and they perform a lot of the same functionalities. Also, the same inherent problems exist that are handled in very similar ways.
In both Windows and UNIX, you write the thread itself as a function whose function header format is predefined. You start the thread by passing its name (which in C is its address) to a system function: pthread_create() in UNIX and CreateThread() in Windows. In Windows you could also use _beginthreadex(); in fact, it is recommended by Johnson M. Hart in Win32 System Programming (2nd Ed, Addison-Wesley, 2001, page 214) where he advises:
Do not use CreateThread; rather, use a special C function, _beginthreadex, to start a thread and create thread-specific working storage for LIBCMT.LIB. Use _endthreadex in place of ExitThread to terminate a thread.
Note: There is a _beginthread function, intended to be simpler to use, but it should be avoided.
In both Windows and UNIX, the thread function takes a single argument, a pointer. The thread creation function passes a pointer to the data being passed to the thread. The thread receives that data pointer as its single argument. A neat trick you can use with this pointer is to define a struct that you fill with all different kinds of data and then pass all that data to the thread through a single pointer. And of course, because you wrote the thread, the thread knows the struct's declaration and therefore will know how to access all that data.
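The struct trick can be sketched in pthreads like this (all names are mine): the struct travels through the single void* argument, and the thread casts it back because it knows the declaration:

```c
#include <pthread.h>

/* A struct lets us pass several pieces of data to a thread
   through its single void* argument. */
struct work {
    int a, b;        /* inputs */
    int sum;         /* output, filled in by the thread */
};

/* Thread function: the predefined signature takes one void* and returns void*. */
static void *adder_thread(void *arg)
{
    struct work *w = arg;    /* we wrote the thread, so we know the struct */
    w->sum = w->a + w->b;
    return NULL;
}

/* Start the thread, wait for it to finish, and return the result. */
int add_in_thread(int a, int b)
{
    struct work w = { a, b, 0 };
    pthread_t tid;
    if (pthread_create(&tid, NULL, adder_thread, &w) != 0)
        return -1;
    pthread_join(tid, NULL);   /* the struct must outlive the thread;
                                  joining here guarantees that */
    return w.sum;
}
```

One caution this illustrates: because only a pointer is passed, the data it points to must stay valid for the thread's whole lifetime.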
Closing a thread is the easiest part: simply reach the end of the thread function. In addition, there are a number of functions for controlling threads and for getting their status.
OK, now comes the trouble.
Just as each function call has its own block of memory on the stack for its local variables, etc, each thread has its own "thread local storage" (TLS). At the same time, like any other function, a thread has direct access to the program's global memory. So instead of going through some exotic IPC procedure, threads can communicate with each other through global variables. Very simple, very direct. Very dangerous.
Here are two principal problems that can arise, both based on the same basic fact that threads run asynchronously:
- Consider the situation where one thread needs a value provided by another thread. The second thread stores the result of its calculations in a global variable that the first thread reads. So how does the first thread know when the value in that global variable is valid? The first thread has no idea when the second thread has completed its task.
- This problem is also based on the fact that a thread can be interrupted at any time. At any time. Here's a scenario to illustrate the problem. One thread reads from a global variable to use the value that another thread has written to it. Let's say that it's an int. In 32-bit systems an int is four bytes long; in 16-bit systems it's two bytes long, but who runs one of those anymore? Either way, this could happen: in the middle of updating that multi-byte value, the writing thread is interrupted and the other thread reads the half-updated value. The reading thread has read a bogus value.
Remember, despite all this multitasking and multithreading, as long as it's all running on a single hardware processor, the computer can only perform one instruction at a time. Two different threads aren't running at the exact same time; they can't on a system with only one hardware processor. Instead, each thread is given a finite amount of time to run, after which it is interrupted right in the middle of whatever it's doing so that control can pass to another thread: preemptive multitasking. It's even that way on a dual- or quad-core machine where you have 2 or 4 processors. Your software has to either be specially written or smart enough to take advantage of the extra processors, and most software is neither. The OS should be smart enough to, but preemption will still be used in the running of most of your code.
That's the crux of the problem. There are some sections of code, called critical sections (keyword alert!) within the code where an operation cannot afford to be interrupted until it has been completed. Like the writing of a multi-byte value. Or the updating of a data buffer.
This is where synchronization comes into play.
These problems are what's called "race conditions", because the different threads are competing against each other for the use of the same resources, effectively "racing" each other. We don't want them to race each other; we want them to synchronize with each other. Maybe not synchronize all the time, but at least at critical moments. In critical sections.
Thus, the solution to race conditions is called "synchronization" (another keyword alert!). And, again, while the details will differ between operating systems and languages, a lot of the concepts and tools are very much the same:
- Thread control. A given thread could be suspended (ie, told not to run) until the conditions arise for it to resume. For example, in our first problem a thread has to wait until a second thread has performed a calculation, so the first thread would be suspended and then resume when the second thread has completed its task. In pthreads, this can be done with pthread_join(), or with a condition variable via pthread_cond_wait().
- A semaphore could be used to count how many threads are using a particular resource -- usually that number is only one. Only when that count is less than a given value would a new thread be allowed to access it. Both UNIX and Windows support semaphores.
Windows implements critical sections with the CRITICAL_SECTION data type. After having declared and initialized a CRITICAL_SECTION, a thread can then signal that it's entering that critical section and that it's leaving that critical section. While one thread is in a critical section, all other threads are locked out and will block until the thread in the critical section signals that it's leaving, whereupon the next thread can enter, etc.
- More commonly, you will want a thread to have mutually exclusive access to a resource, locking out all other threads until it's done. This is done with a mutex (obviously an abbreviation of "mutually exclusive"). Both UNIX and Windows support mutexes, albeit differently. Also, in Windows a mutex can apply across multiple processes, whereas a CRITICAL_SECTION only applies within a single process.
The basic procedure is to create a mutex variable for each shared resource and then to surround all code that accesses that resource with mutex calls. Before a thread attempts to access a shared resource, it locks the mutex, which will deny access to any other thread. And after it's finished with that resource, it unlocks the mutex, which will allow access to other threads. When a thread attempts to lock the mutex and it's already locked, then that thread will be suspended until the other thread unlocks the mutex, at which point the first thread's lock succeeds and that thread then accesses the resource and unlocks the mutex.
Both mutexes and semaphores share the same problems:
- They are based on the honor system. A mutex or semaphore blocks a thread's access only if the thread goes through the mutex/semaphore; if a thread bypasses the mutex/semaphore altogether then there's nothing to stop it. Therefore, the programmer must have the discipline to write all code that accesses a common resource so that it performs the necessary mutex/semaphore calls. Any code that breaks that discipline defeats the purpose of synchronization.
- The programmer must write the critical section code so as to avoid a deadlock (yet another keyword alert!). That is the condition where two threads or processes have each locked a resource that the other needs and must wait for. Since neither thread can release its resource until the other releases its own, they're stuck there and the application is now dead in the water.
Of course, as you learn a specific multithreading system then you will learn that system's methods for synchronization. And some systems have even more methods than I've mentioned here.
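The lock/unlock discipline described above can be sketched in pthreads like this (all names are mine): several threads increment a shared counter, and the mutex makes each read-modify-write atomic with respect to the others:

```c
#include <pthread.h>

/* Shared resource and the mutex that guards it. On the honor system,
   every piece of code that touches 'counter' must go through the mutex. */
static long counter = 0;
static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Each thread increments the shared counter many times; the
   read-modify-write of 'counter' is the critical section. */
static void *increment_thread(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&counter_mutex);   /* lock out other threads */
        counter++;                            /* critical section */
        pthread_mutex_unlock(&counter_mutex); /* let the next thread in */
    }
    return NULL;
}

/* Run nthreads threads of 'iterations' increments each; return the final count. */
long count_with_threads(int nthreads, long iterations)
{
    pthread_t tids[16];
    if (nthreads > 16)
        return -1;
    counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, increment_thread, &iterations);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return counter;
}
```

Without the lock/unlock pair, the final count would intermittently come up short whenever one thread's increment was interrupted mid-update by another's.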
Applying Multithreading to Network Applications:
OK, so now we may ask how we would apply multitasking or multithreading to network apps. Let's start by reviewing why we started down this long path to begin with. In a program with a single thread of execution (the normal situation), a blocking socket causes the entire program to come to an abrupt halt until it receives data to be read. We need the program to be able to continue to operate while waiting for that data to arrive. We need to be able to respond to user input from the keyboard, to receive data from other sockets, and to do whatever else needs to be done (eg, updating a time display on the monitor). This is especially necessary for a server that could be handling multiple clients simultaneously, plus checking for any new clients attempting to connect. Getting blocked by any single socket would be catastrophic for such a server.
The basic strategy of using multithreading in designing such an application is that you give each socket its own thread of execution; if the socket blocks that thread then it causes no problems, because none of the other threads will be affected by it.
A Multithreading Server:
Among my sample servers, I developed a version of my multi-client TCP server that uses multithreading under Windows. I will describe it in the following discussion to illustrate the general design ideas:
- When the program starts up, it's running the main thread. It is my understanding that this is the thread that should be running your user interface and performing the user I/O, and that it is where the Windows WinMain loop needs to be running. This main thread then creates "worker threads" which do the non-user-interface work. Basically, just write the user interface as you normally would.
The basic design of the main thread would be initialization of the application based on the command-line arguments and/or a configuration file, followed by the creation of one or more worker threads, some of which could create more worker threads. Then the main thread settles into a loop where it processes user I/O and possibly interacts with some of the worker threads.
In this server, the main thread reads the command-line argument, which is the echo port to use. Then it does the regular Winsock initialization and initializes the variables for accessing the client list and the output string queue. Then it creates the server's listening socket and starts the AcceptThread, passing the listening socket to it. Finally, the main thread settles into its user I/O loop, in which it looks for keyboard input and checks for output strings to be displayed.
Two possible alternatives would be:
- Let the AcceptThread create its own listening socket and possibly also have it perform all the Winsock setup and initialization.
- For a major project, have the main thread create a ManagerThread which will create and manage the thread hierarchy that forms the entire back-end of the application, leaving the main thread with nothing to do except run the user interface and communicate with the ManagerThread. The user could then type in a command and the main thread would send the command to the ManagerThread who would execute it and send the results back to the main thread to display to the user.
Bear in mind that this ManagerThread idea is ambitious, but it makes sense for a large and major project. A lot of its complexity can be delegated to worker threads working under it; eg, a ThreadManager and an ApplicationManager.
- In simple designs, the first thread created by the main thread would simply be the AcceptThread, which would block on a call to accept() and then, upon receiving an incoming connection, would create a new thread to handle the new client.
This is the approach I took in my server. When the AcceptThread accepts a new connection, it adds the new client socket to a client list and then starts a new ClientThread and passes the new client socket to it. Then it returns to block on accept().
- The idea behind the ClientThread is that we can have several of them at the same time, each one running the echo session with its own client. Then when the client disconnects, the thread performs a graceful shutdown and close, terminating itself.
Again, this is the way that it's done in my server. In addition, the thread reports what's happening by sending output strings to the output string queue -- a critical section that all threads would be attempting to access -- so that the main thread can actually output them.
- When the server is commanded to shut down, it will need to perform the following operations (though the exact details may vary widely):
- Tell the AcceptThread to stop accepting new clients, whereupon it will close its listening socket.
- Tell each ClientThread to terminate its session, whereupon it will shut down its connection, close its socket, and then finally close.
- When all ClientThreads have closed, then the AcceptThread will signal that fact to the main thread and then close itself.
- Finally, the main thread can exit, terminating the application.
Sadly, my server does none of that, but rather simply exits. Mainly, there's the problem that none of the threads can respond to a command to shut down because they're all blocked. We can have the main thread command the threads to terminate, but that would not be a graceful shutdown.
The only solution that comes to my mind at the moment is that we combine techniques; eg, make the threads' sockets non-blocking. select() could be an option, but since we're only dealing with one socket per thread it would be a bit of overkill, though it would still be an option if the thread deals with more than one socket. That way, the threads would be able to periodically pull their heads out of their sockets and check for an incoming command (eg, via a global variable guarded by a mutex) that they'd need to act upon. Nor would it necessarily defeat the purpose of multithreading, since the program can benefit in other ways by being multithreaded; it all depends on your needs and on your design.
A Multithreading Client:
In my echo examples, there wasn't any need for more than a simple echo client. I really couldn't think of a meaningful way to add multithreading to it. But there are a multitude of possible projects where we would want the client to remain free to perform a variety of tasks in addition to communicating with a server.
Let's consider how we might use multithreading in the design of a game client. For the sake of the example, let's assume that the players will each run their own copy of the client and that they will connect to a server via TCP. Among other things, the server will maintain the game state, including the condition of all the players and their locations within the game space. Players can also communicate directly with the other players, either individually, to a small group, or broadcasting to all, via UDP datagrams, though some forms of inter-player communication which affect the game state would need to either be routed through the server or be reported to the server after the fact. In addition, the client could use UDP datagrams to periodically communicate certain information to the server, such as a "heartbeat" message that indicates that the client is still running and is connected; TCP is a robust protocol that is supposed to be able to maintain a connection despite momentary disconnects, which means that it can be difficult to detect when the connection is actually lost (eg, the client computer's network cable is unplugged, or the client suddenly crashes). Another possibility would be to borrow an idea from FTP and have two TCP connections to the server, one for commands and the other for the transfer of data. The point here is that the client can have several sockets to manage.
The client will also have a lot of different tasks to perform; eg (by no means exhaustively):
- User interface to accept and process commands and display responses.
- Updating of client's knowledge of the game state and game space.
- Real-time updating of graphical display to reflect the updating of the client's knowledge of the game state and space.
- Communications with the server.
- Communications with the other players.
Multithread/Multiprocess Server Designs
Reiterating the operation of my simple multithreaded server as a typical example:
- The main thread initializes the server and then creates and starts the AcceptThread.
- The AcceptThread handles the listening socket. When a client connects, it accepts the connection, creating a new client socket for that client, and creates a new ClientThread to service that client through its client socket. There will be one ClientThread for each and every client.
- The ClientThread waits for the client to send it a request, which it services (in the echo service, that means that it echoes the request string back to the client) and then returns to waiting for a request from the client. When the client shuts down the connection, the ClientThread performs its part of the shutdown, closes the socket, and ends itself.
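The three bullets above can be sketched in pthreads with Berkeley sockets (the design carries over to Winsock; all names are mine, and the client list, output queue, and graceful shutdown are omitted for brevity):

```c
#include <pthread.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <stdint.h>

/* ClientThread: echo everything back until the client shuts down,
   then close the client socket and end the thread. */
static void *client_thread(void *arg)
{
    int client = (int)(intptr_t)arg;
    char buf[256];
    ssize_t n;
    while ((n = recv(client, buf, sizeof buf, 0)) > 0)  /* blocks, but only this thread */
        send(client, buf, (size_t)n, 0);                /* echo the request back */
    close(client);                                      /* client disconnected */
    return NULL;
}

/* AcceptThread: block on accept(); start one ClientThread per connection. */
static void *accept_thread(void *arg)
{
    int listener = (int)(intptr_t)arg;
    for (;;) {
        int client = accept(listener, NULL, NULL);
        if (client < 0)
            break;                     /* listener closed: stop accepting */
        pthread_t tid;
        pthread_create(&tid, NULL, client_thread, (void *)(intptr_t)client);
        pthread_detach(tid);           /* let each ClientThread clean up itself */
    }
    return NULL;
}

/* Main-thread setup: create a loopback listener on an OS-chosen port,
   start the AcceptThread, and report the port. Returns the listener fd. */
int start_echo_server(unsigned short *port_out)
{
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0)
        return -1;
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                 /* 0 = let the OS pick a free port */
    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0)
        return -1;
    listen(listener, 8);
    socklen_t len = sizeof addr;
    getsockname(listener, (struct sockaddr *)&addr, &len);
    *port_out = ntohs(addr.sin_port);
    pthread_t tid;
    pthread_create(&tid, NULL, accept_thread, (void *)(intptr_t)listener);
    pthread_detach(tid);
    return listener;
}
```

Closing the listener makes accept() fail, which is the crude shutdown path discussed earlier; a graceful design would signal the ClientThreads as well.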
A simple multiprocess server would function in a similar manner, only instead of creating a thread for each client it would create a process that would use some inter-process communication (IPC) technique to communicate with its parent and with other processes in the server. Again, this would most likely be done under UNIX/Linux, so fork() and pipes would most likely be used.
While these simple servers can perform satisfactorily under light and moderate loads, real-life and commercial applications can easily overtax and overwhelm them. There is overhead to be paid in creating and destroying threads and much more overhead in creating and destroying processes. And even some overhead in creating and closing sockets. So some different design approaches have been devised to address these problems and to speed up server response while reducing the work load on the system.
Some of those design approaches are examined in Lincoln Stein's book, Network Programming with Perl (Addison-Wesley, 2001). The following discussions are based primarily on his examples in that source.
Preforking and Prethreading
One way to avoid loading down the server by creating new processes and threads on the fly is to create them all when the server starts up. This is called either preforking or prethreading, depending on whether you're using multitasking or multithreading. The basic idea is to create upon start-up a pool of threads or processes from which you draw to handle a new client and back to which you return the threads and processes that the clients have disconnected from. That way, you keep reusing the threads and processes instead of constantly creating new ones and destroying them.
Stein's presentation first looks at simple preforking and prethreading, using a web server as an example, and then discusses their problems and suggests an improved "adaptive" method. His approach involved much analysis, which you can read in his book; I'm just going to give a brief presentation so you get a general idea. Also, the general approaches, problems, and solutions are very similar between forking and threading; it's mainly just the specific techniques that differ:
- Simple Forking ("Accept-and-fork"):
- The simple baseline forking server spends most of its time blocking on accept(). When a new connection comes in, it spawns a new child via fork() to process that new connection and goes back to blocking on accept(). The child exists long enough to service the client and then terminates when the client disconnects.
While this works well enough under light to moderate loads, it cannot handle heavier demand for new incoming connections. Since Stein's example application is a web server, in which it is typical for a server to be hit with a rapid succession of short requests, his example would routinely be overtaxed. Spawning and destroying a new process takes time and resources away from the entire system, not just the server itself, because the entire system normally has only one processor that can only do one thing at a time. These bursts of sudden spawning and destroying activity will bog everything down and make that web page very slow to load.
- Simple Preforking:
- Stein iterates through a few versions of this. The basic idea is that when the server starts up, it spawns a predetermined number of child processes. This time, each child process includes the call to accept(), such that the child process runs an infinite loop that blocks on accept() until a client connects to it, services that client, closes the client socket, and goes back to blocking on accept() until the next client connects. This eliminates the system overhead of child-process spawning while trying to service clients. And it eliminates the system overhead of destroying the child processes, because they never get destroyed; rather, they repeatedly get recycled.
In his first iteration, Stein had the parent process terminate after it had created all the child processes; after all, there was nothing else that it had to do, having handed everything off to the child processes. However, problems arose:
- If more connections come in than there are preforked child processes, they cannot be handled, which slows down the server response. But there is also a performance and resource cost to the system for each running process, so there is a practical limit to how many processes we could prefork.
- If a child process crashes or is terminated, there's no way to replace it.
- There's no easy way to terminate the server: each child's PID would need to be discovered and each individual child would need to be explicitly terminated.
- With all those children blocking on accept() on the same listening socket, when a new client tries to connect, all of those children wake up at the same time and compete for that one connection, straining the system. Stein calls this phenomenon "the thundering herd."
- Some OSes will not allow multiple processes to call accept() on the same socket at the same time.
Stein addresses these problems (except for the first) with his second iteration:
- By installing signal handlers and keeping the parent process alive, it can respond to the killing of a child by spawning its replacement. It can also terminate the server by signalling all the children to terminate, detect when they have all terminated, and then exit.
- By "serializing the accept() call", it solves both the "thundering herd" and multiple-accept() problems. Set up a low-system-overhead synchronization mechanism, such as a file-lock on an open file; the one child that gains the lock is allowed to call accept(). As soon as that child accepts a connection, it releases the file-lock, allowing another free child to call accept(), and so on.
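Here is a sketch of what serialized accept() can look like, assuming POSIX flock() as the file-lock mechanism (Stein's code is Perl; locked_accept and the three-child demo below are my own names and simplifications):

```python
import fcntl
import os
import signal
import socket
import tempfile

def locked_accept(listener, lock_path):
    """Serialize accept() across preforked children with an exclusive file
    lock: only the child holding the lock calls accept(), so a new
    connection wakes exactly one child (no "thundering herd")."""
    with open(lock_path, "rb") as f:
        fcntl.flock(f, fcntl.LOCK_EX)       # wait for our turn
        try:
            return listener.accept()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)   # let the next idle child in

if __name__ == "__main__":
    lf = tempfile.NamedTemporaryFile(delete=False)
    lock_path = lf.name
    lf.close()

    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    port = listener.getsockname()[1]

    kids = []
    for _ in range(3):                      # prefork three children
        pid = os.fork()
        if pid == 0:
            while True:                     # recycled forever, never respawned
                conn, _ = locked_accept(listener, lock_path)
                conn.sendall(conn.recv(1024))
                conn.close()
        kids.append(pid)

    for i in range(5):                      # five clients served by 3 children
        c = socket.create_connection(("127.0.0.1", port))
        msg = b"req%d" % i
        c.sendall(msg)
        assert c.recv(1024) == msg
        c.close()

    for pid in kids:                        # parent shuts the pool down
        os.kill(pid, signal.SIGTERM)
        os.waitpid(pid, 0)
    os.unlink(lock_path)
```

Note that the parent here also demonstrates the second-iteration fixes: it stays alive and terminates the server by signalling each child.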
- Adaptive Preforking:
- This is where it gets ambitious and more like a real-world server. We want to have enough children to service all the clients currently connected, plus a few more to immediately handle new clients coming in, but at the same time we don't want too many children sitting idle and wasting system resources. A lot of supply-and-demand scenarios could illustrate this, but one analogy would be a "just in time" approach to a factory inventory system. You want to have enough parts on hand to keep the assembly line running smoothly, but every extra unused part in inventory is wasted capital. You want to minimize the cost of maintaining your inventory while maximizing its ability to feed production. How to do that is a complete field of study in itself. To really simplify it, you work with many factors -- e.g., how long it takes to get a part from the moment you order it (lead time), how many parts get used in a period of time (consumption rate) -- and come up with two important figures: the "high water mark" and the "low water mark" (also the terms Stein uses). If the number of parts in inventory drops to the "low water mark", then you run the risk of running out of parts (which would completely halt production), so it's time to increase the number of parts ordered. But if it rises to the "high water mark", then you have too many parts in inventory and need to reduce the number of parts ordered. As mentioned before, a lot of supply-and-demand systems use high and low water marks to keep the system within an optimal operating range.
In the case of the adaptive preforking server, in addition to the simple preforking server's solutions, the parent process keeps track of the status of its children. Here the "inventory" is the pool of idle children. If so many children become busy that the number of idle ones drops to the "low water mark", the parent spawns more children to handle the increased work load; while the system is busy creating those new processes, there are still a few idle children to immediately handle incoming connections. Then, when the work load drops off and the number of idle children rises to the "high water mark", the parent kills the excess children to reduce wasted system overhead.
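The parent's watermark check can be boiled down to a small decision function. This is only my sketch of the idea, not Stein's code; the batch and floor parameters are invented tuning knobs:

```python
def pool_adjustment(idle, total, low_water, high_water, batch=2, floor=4):
    """One watermark check by the parent, run whenever a child's status
    changes. Positive result: spawn that many children. Negative result:
    kill that many idle children. Zero: the pool is in its optimal range."""
    if idle <= low_water:                          # almost saturated: grow
        return batch
    if idle >= high_water and total - batch >= floor:
        return -batch                              # too much idle waste: shrink
    return 0

# e.g. 1 idle child out of 10, low water mark of 2: time to spawn more.
# pool_adjustment(1, 10, low_water=2, high_water=8) -> 2
```

The floor keeps the pool from shrinking below a minimum size, so there are always a few recycled children standing by for the next burst.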
The increased complexity of the adaptive preforking server (ie, "where it gets ambitious") lies in its need to make more extensive use of inter-process communication (IPC). The parent needs to be kept apprised of each child's current status so that it can detect the high and low water marks. It also needs to be able to command a child to terminate, both when the "high water mark" has been reached and when the server is shutting down.
There are several possible IPC methods that an adaptive preforking server could use. Stein investigates two, pipes and shared memory.
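As a sketch of the pipe option (shared memory being the other), each child can write one status line per state change down a pipe shared with the parent. The function names here are mine, not Stein's, and a real multi-child server would rely on pipe writes up to PIPE_BUF bytes being atomic so children's lines don't interleave:

```python
import os

def report_status(pipe_w, state):
    """Child side: send one newline-terminated status line to the parent."""
    os.write(pipe_w, f"{os.getpid()} {state}\n".encode())

def parse_status(line):
    """Parent side: turn b'1234 busy' into (1234, 'busy')."""
    pid, state = line.split()
    return int(pid), state.decode()

if __name__ == "__main__":
    r, w = os.pipe()
    if os.fork() == 0:                  # child: report busy, then idle, exit
        os.close(r)
        report_status(w, "busy")
        report_status(w, "idle")
        os._exit(0)
    os.close(w)

    status = {}                         # parent's view, which drives the
    with os.fdopen(r, "rb") as pipe:    # high/low water mark decisions
        for line in pipe:               # EOF once all writers have exited
            pid, state = parse_status(line)
            status[pid] = state
    os.wait()
    assert set(status.values()) == {"idle"}
```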
A lot of the approaches and problems in the preceding forking and preforking schemes also apply in general to multithreading.
- Simple Multithreading:
- As with the simple "accept-and-fork" server, the simple threading server's main thread spends most of its time blocking on accept(). When a new connection comes in, it creates a worker thread to service that client and goes back to blocking on accept(). The new thread exists long enough to service the client and then terminates when the client disconnects.
Again, as with the simple "accept-and-fork" server, this constant creation and destruction of worker threads puts an extra load on the system when the server experiences heavy demand. The overhead is not as bad as with spawning and destroying processes (threads were originally called "lightweight processes"), but it is still there and it still has an impact on performance.
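A minimal thread-per-connection sketch (the worker/accept_loop names are mine, and the accept loop is bounded so the demo terminates; a real server would accept forever):

```python
import socket
import threading

def worker(conn):
    """Worker thread: echo until the client hangs up, then terminate."""
    while data := conn.recv(1024):
        conn.sendall(data)
    conn.close()

def accept_loop(listener, n_clients):
    """Main thread: block in accept(), create one worker thread per
    connection (bounded to n_clients so this demo can finish)."""
    threads = []
    for _ in range(n_clients):
        conn, _ = listener.accept()
        t = threading.Thread(target=worker, args=(conn,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()

if __name__ == "__main__":
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    port = listener.getsockname()[1]

    server = threading.Thread(target=accept_loop, args=(listener, 1))
    server.start()
    c = socket.create_connection(("127.0.0.1", port))
    c.sendall(b"hello")
    print(c.recv(1024).decode())        # -> hello
    c.close()
    server.join()
    listener.close()
```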
- Simple Prethreading:
- As with the simple preforking server, upon start-up the main thread creates the listening socket and then creates all the threads, passing the listening socket to each one. Each thread blocks on accept(), services the client that connects to it, then cleans up after the session ends and goes back to blocking on accept(), waiting for the next client. In the meantime, the main thread sits idle with nothing to do, but it cannot exit as the parent did in the first simple preforking example, because exiting the process would kill all the threads.
The same "thundering herd" and multiple-accept() problems exist as in simple preforking, and they are solved in the same manner.
The approach I thought of differs from Stein's in that I would prethread the ClientThread working threads and then have a separate AcceptThread that would pass the new client socket on to an idle thread. My multithreading server anticipates this prethreading approach, though without actually implementing the prethreading itself. My approach would require communication with the threads to keep track of their status, which presages the next approach, adaptive prethreading.
- Adaptive Prethreading:
- Again, this method mirrors that of adaptive preforking, except that the actual implementation is specific to threading.
- Pthreads Thread Pools:
- Another source, Pthreads Programming by Nichols, Buttlar & Farrell (O'Reilly, 1998, the "silkworm book"), contains some sample applications, including an ATM server. In developing the design of the ATM server, they discuss Thread Pools (pages 98 to 107). It's a prethreading example that they examine in detail with C code (as opposed to Stein's Perl listings).
This scheme prethreads a predetermined number of worker threads whose information structs are kept in an array (dynamically allocated on start-up). The work to be performed arrives on a request queue and is accepted by one of the idle threads. When the task is completed, the thread returns to an idle state. When the request queue is empty, all worker threads are idle. When all threads are busy, new requests cannot be honored immediately; whether the submitter blocks, is turned away, or is notified depends on the operational options selected when the pool was created.
Note that this is not an adaptive pool, which would require a more dynamic data structure to hold the thread pool, but that could be done.
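Here is roughly what such a fixed-size pool looks like, as a paraphrase rather than the book's C: their tpool manages its own mutex and condition variables with pthreads, whereas this sketch leans on Python's queue.Queue, and the idle workers are simply the threads blocked in q.get():

```python
import queue
import threading

class ThreadPool:
    """Fixed-size worker pool in the spirit of the silkworm book's thread
    pool (a paraphrase, not their API). Workers block on a shared request
    queue; when the queue is empty, every worker is idle."""

    def __init__(self, num_workers):
        self.q = queue.Queue()          # the request queue
        self.workers = [threading.Thread(target=self._worker)
                        for _ in range(num_workers)]
        for t in self.workers:
            t.start()

    def _worker(self):
        while True:
            job = self.q.get()          # idle = blocked right here
            if job is None:             # shutdown sentinel
                return
            fn, args = job
            fn(*args)                   # service the request, then recycle

    def submit(self, fn, *args):
        self.q.put((fn, args))          # an idle worker will pick this up

    def shutdown(self):
        for _ in self.workers:          # one sentinel per worker
            self.q.put(None)
        for t in self.workers:
            t.join()

if __name__ == "__main__":
    pool = ThreadPool(3)
    results = []
    for i in range(10):
        pool.submit(results.append, i)  # list.append is thread-safe in CPython
    pool.shutdown()
    print(sorted(results))              # -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Note how shutdown becomes trivial here: one sentinel per worker drains the pool without tracking any thread IDs, which is exactly the termination problem the preforking section had to solve with signals.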