Understanding the subtleties of hardware-cache-bus-memory interactions is an essential component of “doing” a kernel for a multiprocessor system.
Book Review
UNIX Systems for Modern Architectures
Author: Curt Schimmel
Publisher: Addison-Wesley
ISBN: 0-201-63338-8
Reviewer: Randolph Bentson (bentson@grieg.seaslug.org)
“What is involved in a multiprocessor version of Linux?” has almost become a “Frequently Asked Question” in the Linux newsgroups. The answer is contained in Curt Schimmel's UNIX Systems for Modern Architectures.
Schimmel can speak from experience on this topic. He worked on Unix systems at AT&T Bell Laboratories and at Silicon Graphics, Inc., and has offered tutorials on symmetric multiprocessor Unix systems at USENIX and UKUUG. This book is an outgrowth of those tutorials.
At first glance, the book seems to offer too much detail about hardware for a programmer. But as one proceeds, one sees that understanding the subtleties of hardware-cache-bus-memory interactions is an essential component of “doing” a kernel for a multiprocessor system.
After a brief (17-page) description of Unix processes, the book devotes another 130 pages to uniprocessor cache systems. I was surprised and delighted to find out how hard it can be to get the right results. Fortunately, folks do seem to have done this right on the systems I've used.
With this foundation well established, the remainder of the book deals with the new domain of multiprocessor systems.
The keys to any such system are protecting shared data and communicating efficiently between processes. Mutual exclusion mechanisms are cast in three forms: short term, medium term, and long term. We are shown how uniprocessor implementations of Unix depended on a single-threaded kernel and interrupt masking to protect shared data and, more importantly, we are shown how these methods are inappropriate for a multiprocessor system.
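To make the point concrete: masking interrupts only stops the local processor from re-entering a critical section, so a second processor can still walk straight in. A short-term lock has to be built on an atomic read-modify-write instead. The following is a minimal sketch of my own using C11 atomics, not code from the book; the names spinlock_t, spin_lock, spin_trylock, spin_unlock, and counter_increment are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical short-term (spin) lock; illustrative names only. */
typedef struct {
    atomic_flag held;
} spinlock_t;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once: test-and-set returns the previous value, so "false"
 * means the flag was clear and this caller now owns the lock. */
static inline bool spin_trylock(spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Busy-wait until the lock is acquired. Reasonable only when the
 * critical section is very short. */
static inline void spin_lock(spinlock_t *l)
{
    while (!spin_trylock(l))
        ; /* spin */
}

/* Release the lock and make the protected writes visible to the
 * next processor that acquires it. */
static inline void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Example of shared data guarded by the lock. */
static spinlock_t counter_lock = SPINLOCK_INIT;
static long shared_counter;

long counter_increment(void)
{
    spin_lock(&counter_lock);
    long v = ++shared_counter;
    spin_unlock(&counter_lock);
    return v;
}
```

A real kernel would typically also mask interrupts on the local processor while holding such a lock, so that an interrupt handler cannot spin on a lock its own processor already holds. That detail is left out of the sketch.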
Schimmel shows how one can build locks for all three levels of mutual exclusion (and points out where they are needed in a typical Unix kernel). Although the master/slave scheme is straightforward to implement, it has much the flavor (and bottlenecks) of a uniprocessor system. The more promising symmetric multiprocessor scheme is not as easy to do correctly. The essence of the problem is finding the right granularity (or size) for the critical sections. Granularity that is either too large or too small can harm system performance. We are shown the analysis that leads to good designs.
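The granularity trade-off can be sketched with a small table of counters. This is my own illustration, not an example from the book, and every name in it (coarse_table, fine_table, NBUCKETS, and so on) is hypothetical. With one lock for the whole table, every update is serialized. With one lock per bucket, updates to different buckets can proceed in parallel, but an operation that spans the table pays for a lock operation on every bucket.

```c
#include <stdatomic.h>

#define NBUCKETS 8

/* Coarse grain: one lock serializes every update to the table. */
struct coarse_table {
    atomic_flag lock;
    long count[NBUCKETS];
};

/* Fine grain: each bucket carries its own lock. */
struct fine_bucket {
    atomic_flag lock;
    long count;
};
struct fine_table {
    struct fine_bucket bucket[NBUCKETS];
};

static void acquire(atomic_flag *f)
{
    while (atomic_flag_test_and_set_explicit(f, memory_order_acquire))
        ; /* spin */
}

static void release(atomic_flag *f)
{
    atomic_flag_clear_explicit(f, memory_order_release);
}

void coarse_init(struct coarse_table *t)
{
    atomic_flag_clear(&t->lock);
    for (int i = 0; i < NBUCKETS; i++)
        t->count[i] = 0;
}

void fine_init(struct fine_table *t)
{
    for (int i = 0; i < NBUCKETS; i++) {
        atomic_flag_clear(&t->bucket[i].lock);
        t->bucket[i].count = 0;
    }
}

/* Every caller contends for the same lock, whatever the key. */
void coarse_add(struct coarse_table *t, unsigned key)
{
    acquire(&t->lock);
    t->count[key % NBUCKETS]++;
    release(&t->lock);
}

/* Callers contend only when their keys land in the same bucket. */
void fine_add(struct fine_table *t, unsigned key)
{
    struct fine_bucket *b = &t->bucket[key % NBUCKETS];
    acquire(&b->lock);
    b->count++;
    release(&b->lock);
}

/* The cost of fine grain: a table-wide operation takes NBUCKETS
 * locks, one at a time, instead of a single lock. */
long fine_total(struct fine_table *t)
{
    long sum = 0;
    for (int i = 0; i < NBUCKETS; i++) {
        acquire(&t->bucket[i].lock);
        sum += t->bucket[i].count;
        release(&t->bucket[i].lock);
    }
    return sum;
}
```

Note that fine_total locks the buckets one after another, so under concurrent updates its result is not a snapshot of the table at a single instant. Whether that matters is part of the design analysis.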
The book concludes with more memory access and caching issues, this time for multiprocessor systems. Some recent RISC chips have memory models which allow stores and loads to be reordered from what the programmer intended, in order to gain performance. We are shown how RISC chips have mechanisms to force the correct results for implementing locks and accessing data in critical sections. Even when memory requests are issued in the order they were programmed, cache consistency is a serious issue in multiprocessor systems. The final chapters of the book address the interactions that must be dealt with by a serious system designer.
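The ordering problem is easy to state in code. If one processor writes some data and then sets a flag, a reordering memory model may let another processor see the flag before the data. The sketch below is my own, written with portable C11 atomics rather than the chip-specific barrier instructions the book covers, and the names payload, ready, publish, and try_consume are hypothetical. The release store and acquire load are what forbid the harmful reordering.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* ordinary, non-atomic shared data */
static atomic_bool ready;  /* static storage: starts out false */

/* Writer: the release store guarantees that the write to payload
 * is visible before any other processor can observe ready == true. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader: the acquire load guarantees that the read of payload is
 * not performed ahead of the read of ready. Returns false if
 * nothing has been published yet. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

On hardware that never reorders these accesses, the two orderings may cost nothing extra. On a weakly ordered chip the compiler is expected to emit whatever barrier instructions the architecture requires.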
This book is written as a textbook, with questions and references at the end of each chapter. Selected questions have answers provided in an appendix. Another appendix summarizes a dozen popular chips found in Unix systems.