Changing my mind about distributed computing made me aware of the sweet aroma of opportunity.
I started writing about computing back in the 1980s. I don't want to say which year, or do the math on how long I've been doing this. It makes me feel old.
I've made a plethora of predictions since then. Some of them left me red-faced. Some were spot-on. And some have not yet been fulfilled, but I still think they're on target.
One of my earliest predictions was rather easy, but it was considered controversial back in the 1980s. I said that it was only a matter of time before we bumped up against the limits of Moore's Law, and that the only viable answer would be parallel processing. Lo and behold, dual-core processors are now common, quad-core processors aren't far behind, and the multicore Cell processor in the PlayStation 3 is just around the corner.
Naturally, the next logical step is clustering or some other means of distributed processing. Here's where I begin to get nervous. When “grid computing” became a buzzword, my knee-jerk reaction was, “No, thanks”. I don't work in a company office anymore, but if I did, I wouldn't want the company offloading processing to my desktop workstation unless I were certain that everything ran in a completely isolated sandbox. Put the grid processes in a chroot environment on Linux, for example. Even then, I'm not sure I'd be happy about the idea. What if I want to do something compute-intensive, and the grid process decides it wants my CPU cycles more than I do? That isn't supposed to happen, but since it's all in the hands of some administrator with his or her own agenda, why should I trust that it won't?
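To be clear, the chroot jail itself is the easy part. Here's a minimal sketch, in Python, of how a grid worker might confine itself before accepting any work; the jail path, the nobody-style UID and GID, and the run_jobs() hook are my own inventions rather than any real grid product's interface:

    import os

    def confine(jail="/var/lib/gridworker/jail", uid=65534, gid=65534):
        """Lock this worker into a chroot jail and give up root for good."""
        os.chroot(jail)   # everything outside the jail disappears from view
        os.chdir("/")     # don't keep an open handle to the old root
        os.setgid(gid)    # drop group privileges (65534 is usually 'nobody')
        os.setuid(uid)    # drop user privileges; there's no way back to root

    if __name__ == "__main__":
        confine()
        # run_jobs()      # hypothetical: poll the grid scheduler for work

A jail like that addresses the isolation worry, but it says nothing about who gets my CPU cycles when I want them.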
It's the lack of control and the fear of security breaches that make me nervous. I've got four computers in my home that nobody ever turns off, and two more for special purposes that I turn on as needed. The two hand-me-down computers my kids use sit idle much of the time, unless my daughter is browsing the Web or my son is playing World of Warcraft. I use a server as a centralized provider of resources such as printers, files and e-mail. It's a very old machine, but it never breaks a sweat given its purpose. All this represents a tremendous amount of wasted processing power, and I'd love to tap into it at home. This is a safe environment, because I'm not talking about exposing my processing power to everyone on the Internet. I'm talking about distributing workloads across local machines.
In principle, however, Sun was right all along when it said, “The network is the computer”. Other companies, such as IBM, worked along the same lines before Sun did, but I don't know of any company that said it better than Sun. “The network is the computer” is a powerful phrase. As long as there is adequate security built into every aspect of distributed processing, it makes perfect sense to provide common services as remote procedure calls and to distribute every conceivable workload across as many computers as you want to make available to the system. If someone could make me feel comfortable about security and control, I'd buy into distributed processing in a big way.
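To put a concrete face on “common services as remote procedure calls”, here's a toy example built on Python's standard xmlrpc module; the word_count service, the host name and the port number are mine, purely for illustration:

    from xmlrpc.server import SimpleXMLRPCServer

    def word_count(text):
        """A trivially distributable 'service' any box on the LAN could offer."""
        return len(text.split())

    server = SimpleXMLRPCServer(("0.0.0.0", 8000))
    server.register_function(word_count)
    server.serve_forever()

    # From any other machine on the network:
    #   from xmlrpc.client import ServerProxy
    #   ServerProxy("http://homeserver:8000/").word_count("the network is the computer")

Plumbing like this has been easy for years; what's missing is the security and control that would make me willing to leave it running on every box in the house.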
Here are the challenges as I see them. First, there's the problem of heterogeneous platforms. How do you distribute a workload across machines with different processors and different operating systems? ProActive is one of several good platform-agnostic distributed computing frameworks (see www-sop.inria.fr/oasis/ProActive). It is 100% pure Java, so it runs on any platform that supports Java. It has a great graphical interface that lets you manage the way you distribute the load of a job. You can literally drag a process from one computer and drop it onto another.
The problem is that a tool like ProActive doesn't lend itself to the way I want to distribute computing. I want it to be as transparent as plugging a dual-core processor into my machine. Unfortunately, you can't get this kind of transparency even if you run Linux on all your boxes. The closest thing to it that I can think of is distcc, which lets you distribute the workload when you compile programs. Even this requires you to have the same version of the compiler (and perhaps some other tools) on all your boxes. If you want this to be a no-brainer, you pretty much have to install the same distro of Linux on all your machines.
The bottom line here is that I smell an opportunity for Linux. I would love to see a project that makes distributed computing on Linux brainlessly transparent and distribution-agnostic. I'm talking about the ability to start up any computation-intensive application and have it automatically distribute the work across other machines on the network that are configured to accept the role of yet another “processor core”. You could make this transparent to the application by building it into the core user-space APIs, and you'd manage it like any other network service. Is this too pie in the sky? I'd love to hear your opinions.
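To give the idea a little more shape, here's a toy version of the worker/queue plumbing I have in mind: machines that opt in pull work off a shared queue owned by whoever started the job. It uses only Python's standard multiprocessing.managers module, and the address and authkey are invented for the example:

    from multiprocessing.managers import BaseManager
    import queue

    task_q = queue.Queue()     # work units handed out to volunteer "cores"
    result_q = queue.Queue()   # finished results coming back

    class CoreManager(BaseManager):
        pass

    CoreManager.register("tasks", callable=lambda: task_q)
    CoreManager.register("results", callable=lambda: result_q)

    if __name__ == "__main__":
        # Run this on the box that owns the job; each volunteer machine
        # connects with the same address and authkey, then loops on
        # tasks().get(), does the work and puts the answer on results().
        mgr = CoreManager(address=("", 50000), authkey=b"home-lan-secret")
        mgr.get_server().serve_forever()

The real project would bury all of this under the core user-space APIs so that applications never know the queue exists; the plumbing isn't the hard part, the security and control are.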