A real life experience in remote upgrading of a Linux PC across the Pacific Ocean.
As a business offering software internationalization services, we operate a small office in western Massachusetts as well as a small sister company in Taipei, Taiwan. We also need to support a distributed software development environment for engineers working remotely. While our bandwidth demands are not great, we do need reliable e-mail, web, news and FTP services. Primarily used to provide connectivity from the inside office to the Internet, the connection has to be available for external access from remote users on a 24 by 7 basis.
The network consists of a variety of UNIX workstations and PCs running Windows 95 and Windows NT, used for software development and support as well as the usual office applications. We use Linux running on Intel-based PCs as our network servers because it provides one of the most cost-effective small business network server solutions. The network uses a private class C Internet address from the 192.168.*.* block, since it is not directly connected to the Internet.
Our internal Linux server, with a Pentium 133MHz processor, an Adaptec 2940 SCSI card, a bunch of SCSI drives and a 4mm DAT tape drive, provides the backbone of our computing infrastructure. As a mail hub, it provides POP3 and SMTP support for mail-client applications running on the network. By running Samba, it acts a network file server for Windows-based PCs. Finally, it provides name resolution services using BIND.
The second Linux server in this “dynamic duo”, an old 486 PC with a 500MB hard disk and a monochrome monitor, is our external gateway machine. It connects us to the Internet through a persistent PPP connection with a static IP address over a 28.8K dial-up phone line to a local Internet service provider. This machine also acts as a dial-in server and as a firewall. It provides an e-mail relay and spam filter to and from the internal mail hub. HTTP, FTP and NNTP proxy services are also provided by this machine to allow internal users access to these Internet resources.
Both Linux machines were running Debian version 1.3. On an Internet firewall machine, you want to have precise control over what software is loaded on the machine. You want the minimum necessary to do the job, no more. Since the machine was to be remotely administered, it was even more important that it be easy to upgrade individual packages as necessary without having to do cold installs for new OS versions. Debian's dselect/dpkg system of handling software packages is ideal in this situation. We could easily select the software required to run the system, knowing that all prerequisite packages were included. Plus, Debian's large collection of software packages included almost everything we needed in its convenient dpkg format.
Debian Linux can be downloaded for free from http://www.debian.org/ or a host of mirror sites. In our case, we purchased a CD-ROM from Linux Software Labs, which also made it easy to add a contribution to the Debian project, whose work we greatly appreciate.
The Taipei office used a Linux gateway to connect to the Internet, but the configuration was quite different. We were issued a block of class C addresses from the Taiwanese ISP which advertised a route to them. The gateway machine was running a publicly accessible FTP server, HTTP server and mail hub, as well as being the primary public name server for our domain in Taiwan—all using a very old version of Caldera Linux.
When the operating systems on the server machines in the U.S. office were upgraded to take advantage of the new features in Linux 2.0, it seemed an ideal time to upgrade the systems in Taiwan, as well as reconfigure the network to more closely match the one in the U.S. office.
While the project seemed straightforward enough, the problem was that the work had to be done from ten thousand miles away across the Pacific Ocean using the Internet.
Upgrading Linux boxes remotely, especially across the ocean, requires some advance planning. Some of the issues we had to deal with were:
Which Linux to use?
What were the security concerns?
Were we going to set up both a private and public side network?
Our choice for the first question was to stay with Debian Linux version 1.3. It was the same version we were running in Massachusetts, so we could essentially install a copy of what was on the U.S. system, reconfigure it for the different names and addresses in Taiwan, and be all set.
Since the upgrade was to be done across the Internet, security was a major concern. We needed a secure connection from the U.S. to Taiwan so that logins and passwords would not be revealed to Internet eavesdroppers and Ethernet sniffers; thus, we chose the Secure Shell (SSH) package. Due to U.S. export restrictions, we could not just upload the software from Massachusetts, so we downloaded the source for the SSH package from a Taiwanese FTP site to the Linux machine in the Taipei office. We then compiled and installed it, so the install/upgrade could proceed in a secure fashion.
While our U.S. setup is required to service only an internal network, our Taiwanese operation decided they needed to set up an area to allow public Web and FTP access. To do this without compromising security for the internal network, things had to be set up a bit differently.
Taiwan's block of Class C addresses, assigned by their ISP, were used by both the internal machines and the firewall. We designed a network setup including a publicly accessible network created using these addresses for use by the public HTTP and FTP servers. The rest of the machines were connected to a private network, once again using addresses from the 192.168.*.* block as in the U.S. office. The firewall machine was then configured with a second Ethernet interface: one to connect the outside PPP connection to the publicly available network and the other to connect the private network. We then used the IP firewalling capabilities of the Linux kernel to keep network traffic where it belonged.
The Taiwan office already had an operational gateway PC named “dragon”. Rather than upgrading it while using it to provide our connection, a second machine, “dolphin” was identified as the new gateway machine. This way, we could be sure the upgrade was successful before putting it in place, and it gave us a fall-back position in case it was not. Since the name and address of dragon were in DNS maps outside of our control, and coordination with the local ISP had proved inconvenient in the past, we had to swap the identities of the machines before proceeding.
As the new dragon would be serving both public and private networks, two network cards were installed. Simple jumper-capable NE-2000 compatible cards were chosen so that their IRQs could be easily configured. In order for our system administrator to log in to dolphin through the Internet, a minimal Caldera Linux system was installed on it. Finally, dolphin was connected to the local network.
Since the new firewall machine was no longer going to act as a mail hub for the network, an existing server running Linux, “elephant”, was nominated. Sendmail and a POP3 server were installed on elephant. Dragon was reconfigured to relay e-mail in and out of the domain rather than acting as a hub. Elephant was also configured to act as the DNS server for the internal network, with dragon as a forwarder, since elephant would no longer be directly connected to the Internet. In turn, dragon was configured to continue acting as primary DNS server for the domain to the outside world while using elephant as its resolver. This way, only publically accessible machine names and addresses would be visible from the Internet, while dragon would continue to be able to resolve all internal addresses, both public and private.
Two concerns arise when doing remote upgrades:
Disruption of Internet access must be avoided as much as possible.
A human being must be present to act as a remote pair of hands in the unlikely event that the new machine was hung or rendered inaccessible or unbootable as the upgrade proceeded.
To avoid disruption, we decided that the upgrade should be done during the weekend in Taiwan. Since a time zone difference of exactly 12 hours exists between Massachusetts and Taipei, it was agreed that the upgrade would start on Friday at 8 PM EST, or 8 AM Saturday in Taiwan. A human would not have to be at the office in Taiwan until 9 AM when the machine was ready to be rebooted.
In advance of all this, gzipped tar files of the root, /usr and /var file systems from the Massachusetts machine were downloaded via FTP to the Taiwan office Friday night Taipei time. The exercise of downloading, building and installing SSH was also accomplished at this time.
Communication between the upgrader in the U.S. and the human sentinel in Taiwan was necessary. To avoid making expensive long-distance telephone calls (although we still ran up a $200+ telephone bill) unless it was necessary, we decided to use computer communication whenever possible. Latency eliminated e-mail as a possible choice. We chose to use talk when it worked and write otherwise.
We started by adding partitions to the disk of the target machine. Three new partitions were created with fdisk in order to hold the new root, /usr and /var file systems. Next, a reboot was needed in order to ensure the new label was in force so that new file systems could be created and the tar files restored. We used rdev to set the new root device in the kernel so that it would be ready to boot the freshly installed operating system. Then we needed to localize the machine, changing the name and address of the machine to match the Taipei office network.
Sometime in the middle of this work, it was noon in Taipei. After sending a warning note to the upgrader in the U.S. that no human would be there for an hour to restart the machine in case of a foul up, the Taiwan staff headed off to lunch.
It took two more hours after the Taipei people came back from lunch before things became almost ready. The DNS maps were copied over from dragon so the machine would be ready to step right in as primary name server for the domain.
At that point, dolphin was rebooted into the newly installed system for the first time—all seemed well. It was also almost 3 AM the next morning in Massachusetts. We were now ready to hook up the new dragon to the Internet.
The first order of business was to switch the names and IP addresses of old and new dragon before performing the physical switchover. The files /etc/hosts, /etc/hostname and /etc/init.d/network all contain references to the hostname and IP addresses that needed to be changed. Once done, the modem was unplugged from old dragon and plugged into new dragon and it was time to go for the gold.
Dragon is connected to the ISP via a dedicated leased line. Its modem is designed for use on a 2/4-wire leased line circuit and is of the type that automatically connects to the ISP whenever the phone line is plugged in.
With bated breath, we waited for the new dragon to connect up. What we got instead were several screens of error messages. Dragon's modem has a large LCD display indicating that the modem was on-line, so the problem had to be in the configuration. It was 4 AM in Massachusetts.
We switched everything back to the way it was, so our upgrader could log in and find the problem. But we now realized that we must send our upgrader off to bed, as he was dozing off while typing. We decided to continue the upgrade the next morning, Taipei time.
Fortunately, it turned out that the problem was quite simple: we had not configured the routing table correctly. After fixing this, the new dragon was able to come up without a hitch and we were able to dispatch our upgrader to bed early that night.
After our upgrader had gone to bed and we had the system up and running, it was time to make sure everyone's web browser and e-mail continued to work. Because the internal network is now on the private IP, the IP addresses of all internal UNIX and Windows computers had to be changed to 192.168.*.*. The web browsers also had to be reconfigured to look for the web proxy server on dragon's new private IP address. Finally, e-mail clients had to be reconfigured to look for the POP3 server from elephant, the new mail server.
As access to the internal network from the Internet is through the use of a one-time password, this particular system had to be checked. Finally, we also wanted to serve web pages from the public side of the network, so a plug was put into the firewall toolkit configuration to the Windows NT machine running IIS (Internet Information Server). For a while, the plug was not working reliably—that is, until we found out we had accidentally messed up the name table. With that fixed, we had all the pieces the Taipei office needed in working order.
We eventually want to replace fwtk with IP masquerading. This makes the network more convenient to access from the inside network. We do have a test network that has it all working, so we will be deploying it shortly in the Massachusetts office. We want to be able to make public multiple web servers for corporate, testing and internal uses. These can be UNIX or Windows NT machines. The IP forwarding facility of the Linux kernel should make this fairly painless.