Virtualization 2.0: Where the Sidewalk Ends

Kevin Epstein

Issue #166, February 2008

With VMware launching an IPO with no peak in sight and XenSource a red-hot acquisition by Citrix, it's clear that server virtualization is the flavor of the moment. But, all hypervisor vendors and users face two additional challenges: performance, and connectivity to local data networks and to storage. The answer? Virtualization 2.0—infrastructure virtualization.

Having been indoctrinated in the ways of virtual machines (VMs) by VMware, where I spent the last four years, I feel like I'm committing heresy when I say that VMs aren't the sole answer to data-center ills. Beyond that, VMs actually can create some additional challenges, such as things to trip over when that moving sidewalk ends.

Don't get me wrong. I like VMware and Xen; I'm learning to believe in Solaris LDOMs on SPARC, and I'm a huge believer in the power of virtual machines. I think that for development, it's hard to beat the utility of having full multi-tier systems, virtually networked together, inside a single physical machine. And, it's certainly convenient to have packaged VMs to trade to other folks for easy replication of scenarios.

But, there are additional challenges faced by all hypervisor vendors (and users): performance, and connectivity to local data networks and to storage.

An immediate aside is probably needed here. No, VMware VI3 doesn't even touch, much less solve, these issues—something that the world is beginning to recognize. Consider the following comments made by some well-known names in the industry:

  • “Be forewarned—as soon as companies deploy wider virtualization, a completely new class of problems will arise. Here the market is still failing to offer qualified solutions, solutions that largely involve more sophisticated automation.”—Alessandro Perilli, Virtualization.Info (world's leading virtualization blog)

  • “The ability to do server repurposing is critical for customers who want to implement a real-time infrastructure.”—Donna Scott, VP and Distinguished Analyst, Gartner Group

  • “Servers virtualized? Great! Now you need to virtualize your entire data center. Virtualization isn't happening only within servers. Infrastructure virtualization applies a software abstraction layer across the entire data center.”—Rachel Chalmers, The 451 Group

In short, it's not enough to have virtualized servers. You need virtual connectivity and real servers too—aka infrastructure virtualization.

But, let's consider the two challenges again: performance, and connectivity to local data networks and storage. The performance challenge is straightforward, so it doesn't need much discussion.

Hypervisors take up CPU cycles, are a potential point of failure (if the hypervisor fails, the whole set of VMs is lost), and due to the round-robin nature of their scheduling, they actually can mask race conditions in tests. Honest hypervisor vendors always will admit this point quite openly—in fact, I'm proud that during my tenure, VMware quite openly stated “don't virtualize applications above a certain threshold CPU, network or disk I/O level”.

The point? To test and run many multi-tier production applications, such as Citrix, SQL Server, SAP and so on, you want to run on bare metal. This, unfortunately, raises the challenge of moving between physical and virtual (P2V and V2P), and it leads straight to the second challenge: networking.

Networking is a more subtle challenge, because it happens outside the box, beyond the physical machine in which a hypervisor (okay, let's be honest, it's an operating system) lives.

Again, it's time for an aside: a hypervisor is an operating system. A hypervisor, like VMware ESX Server or Citrix XenSource Enterprise Server, is installed on a bare-metal server-class computer. The computer is cabled into a LAN (data) network switch and a SAN (storage) switch. The computer is turned on, boots up with the hypervisor and then can run multiple full servers, in virtual machines, each with its own operating system and applications (for example, a Windows/Exchange e-mail server and a Linux/Apache Web server), concurrently on top of that hypervisor operating system.
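To make the aside concrete, here's a minimal sketch of that picture, assuming a Xen host with the libvirt Python bindings installed (the connection URI and the host itself are assumptions for illustration): one physical box, one hypervisor, several full guest servers running concurrently.

    # Minimal sketch: list the guest servers running concurrently on one
    # hypervisor host. Assumes a Xen host reachable through the libvirt
    # Python bindings; the URI is an assumption for illustration.
    import libvirt

    conn = libvirt.open('xen:///')           # connect to the local hypervisor
    for dom_id in conn.listDomainsID():      # IDs of all running domains (VMs)
        dom = conn.lookupByID(dom_id)
        state, max_mem, mem, vcpus, cpu_time = dom.info()
        print("%-24s vCPUs: %d  memory: %d KiB" % (dom.name(), vcpus, mem))
    conn.close()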

Hypervisors, therefore, control events happening in-box, within a computer. The moment data leaves the physical machine, bound out to another physical box via NIC to LAN or to storage via HBA to SAN (or NIC to NAS), it has left the hypervisor's control.

What does this mean or matter? Well, envision your nice multi-tier application—Web servers, app servers, databases—built in VMs and “cabled” together inside a single physical machine, machine A. Works great.

Now, take one of those VMs, and put it on a different physical machine B (running a hypervisor) that's somewhere else.

Unless that physical machine is on the same LAN subnet, with the same SAN access, you've just broken your data center. The server (VM) you just moved is reaching out to LAN and SAN paths that don't physically exist anymore, because machine B doesn't have them.
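The constraint is easy to state in code. The sketch below is hypothetical (the VM and host descriptions, field names and VLAN/LUN values are all invented for illustration), but it captures the check any migration tool would have to make before moving a VM from machine A to machine B:

    # Hypothetical pre-migration check: a VM can move only to a target host
    # that already has the LAN subnets (VLANs) and SAN LUNs the VM was using
    # on its source host. All names and values here are illustrative.

    vm_web01 = {
        "vlans": {10},           # LAN subnet(s) the VM's virtual NICs sit on
        "luns":  {"lun-0042"},   # SAN LUN(s) backing its virtual disks
    }

    machine_b = {
        "vlans": {20, 30},       # subnets physically cabled to machine B
        "luns":  {"lun-0007"},   # LUNs zoned and masked to machine B's HBAs
    }

    def can_migrate(vm, target_host):
        """True only if the target host already has the VM's LAN and SAN paths."""
        return (vm["vlans"] <= target_host["vlans"] and
                vm["luns"]  <= target_host["luns"])

    if not can_migrate(vm_web01, machine_b):
        print("Migration would break connectivity: machine B lacks "
              "the VM's subnets or LUNs.")

With hypervisors alone, the only way to make that check pass is to pre-cable every host identically; the infrastructure virtualization approach described below changes what the target host is connected to instead.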

Not to mention that little part about machine B already having to be running a hypervisor. (And, if you can't get to machine B to turn it on and install that hypervisor, such as in a DR case, then what? Don't say, “oh gosh, that will all be fixed by virtual HBAs and NPIV”, either. NPIV, N_Port ID Virtualization, requires a full data-center rollout, supported across all hardware—got legacy switches, anyone?)

Virtual HBAs make a bad problem worse. Now you have even more bound HBAs. (Heaven forfend that you're allowed to change virtual HBAs like virtual MACs; that would create an even worse problem, because without locking, people would duplicate HBAs and kill off entire SANs by accident.)

But, what's the solution? Ideally, you'd be able to do a few things:

  • Move easily between physical and virtual. Enable four-minute P2V/V2P conversions to increase hypervisor proliferation and adoption. Seamlessly move servers from running on physical machines to running on virtual machines—and back. This functionality provides a safety net for applications not suited for hypervisors, allows unlike hardware to be used for disaster recovery and high availability, and removes potential application vendor-support issues for hypervisors by providing a vehicle to test applications quickly on bare-metal hardware.

  • Move what was running on your physical machines around as easily as what was running on your virtual machines. So, you could remote-start machine B, above. (For bonus points, remote-start hypervisors like ESX Server or Xen too.) This then provides high availability on a per-physical-machine level. Note that the VMware VirtualCenter HA option (and the Xen equivalent) depends on the availability of another physical machine running ESX Server with the right physical network and storage connectivity. Scalent creates such a machine in real time.

  • Take the network topology, LAN and SAN, with you. So, no more SAN configuration adjustments or opening all SAN LUNs to all machines (good grief). Ideally, you'd be able to install hypervisors that could communicate within your existing multi-tier networks and have real-time access to storage to enable features like VMotion. (Again, VMware ESX requires that physical machines share common pools of storage and the same network subnet if they are to share virtual machines in a VMware ESX farm or cluster.) This can present logistical or security concerns when applied to current data-center architectures. A good solution would allow VMware ESX physical machines to be in physically disparate locations—behind layers of LAN switches, in different racks or data centers, connected to storage LUNs as needed—thus allowing a simple ESX farm or cluster creation in existing data-center architectures.

This solution would manifest as infrastructure virtualization (IV) software—effectively, a data-center topology manager. IV software turns on physical computers, sets up the networks between them, sets up their connections to storage and points them at the right software (operating system/application package) to run.

For example, extending the hypervisor example above, at any time IV software could turn on any three physical computers with no software running on them; set computer one to network A/storage LUN 1, running the Windows/Exchange e-mail server package; set computer two to network B/storage LUN 2, running the Linux/Apache Web server; and set computer three to network C/storage LUNs 3 and 4, running a hypervisor with its load of virtual machines.
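What that might look like when driven programmatically: the topology description and helper functions below are invented for illustration (they're not any vendor's real API), but they sketch the declarative style an IV layer works from, with network, storage and boot image assigned to bare metal at activation time.

    # Hypothetical sketch of the three-computer example as a declarative
    # topology. Field names and helper functions are invented for
    # illustration; they are not any vendor's real API.

    def power_on(node):             print("power on   %s" % node)               # e.g., via IPMI/iLO
    def attach_network(node, net):  print("network    %s -> %s" % (node, net))  # VLAN membership
    def attach_storage(node, luns): print("storage    %s -> %s" % (node, luns)) # LUN zoning/masking
    def set_boot_image(node, img):  print("boot image %s -> %s" % (node, img))  # network-boot target

    topology = [
        {"node": "computer-1", "network": "A", "luns": ["LUN-1"],
         "image": "windows-exchange"},
        {"node": "computer-2", "network": "B", "luns": ["LUN-2"],
         "image": "linux-apache"},
        {"node": "computer-3", "network": "C", "luns": ["LUN-3", "LUN-4"],
         "image": "hypervisor-plus-vm-load"},
    ]

    for node in topology:           # activate: power, plumb LAN/SAN, point at software
        power_on(node["node"])
        attach_network(node["node"], node["network"])
        attach_storage(node["node"], node["luns"])
        set_boot_image(node["node"], node["image"])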

How would this look in real life? Let's consider a disaster recovery scenario.

Visualize a data center. There are many physical computers running various operating systems (Windows, Linux, Solaris, AIX, VMware ESX, Xen and so on) on various hardware (x86, SPARC, PowerPC), connected to various data networks and various storage LUNs.

Suddenly, a rack of computers in the corner goes up in flames.

Luckily, this data center's operators are running IV software, as well as their existing automation, virtualization and bare-metal operating systems. That infrastructure virtualization (or server repurposing) software immediately realizes that those physical computers are down, hunts around the data center and finds some computers that are off, or some running low-priority jobs, like print servers, that the IT staff has designated “repurposable”.

The IV software turns on or reboots those physical machines, assigns them the networks and storage connections of the burned machines, and points them at the software they should load.

Unfortunately, there are only 10 new working computers—20 went up in smoke. So, the IV software loads two of the new computers with a hypervisor and runs six of the burned systems as virtual machines on top of those two hypervisor hosts. This is possible because the IV software set up all the necessary network and storage connectivity for those servers and associated them with the physical machines running the hypervisor in real time.
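A rough sketch of that repurposing decision, with the host names, counts and placement policy all invented for illustration: give bare-metal replacements to as many burned workloads as possible, then consolidate a handful of the rest onto hypervisor hosts as VMs.

    # Hypothetical sketch of the DR repurposing logic described above.
    # Host names, counts and policy are invented for illustration only.

    failed_workloads = ["burned-%02d" % i for i in range(1, 21)]  # 20 servers lost
    spare_hosts      = ["spare-%02d" % i for i in range(1, 11)]   # 10 repurposable machines
    VMS_PER_HYPERVISOR = 3        # assumed consolidation ratio per hypervisor host

    # Reserve two spares as hypervisor hosts; the rest become bare-metal replacements.
    hypervisor_hosts = spare_hosts[-2:]
    bare_metal_hosts = spare_hosts[:-2]

    plan = []
    for host, workload in zip(bare_metal_hosts, failed_workloads):
        plan.append((workload, host, "bare metal"))

    # Whatever didn't fit runs as a VM on a hypervisor host; the IV layer
    # wires each one to the LAN/SAN paths of the machine it replaces.
    leftovers = failed_workloads[len(bare_metal_hosts):]
    capacity = VMS_PER_HYPERVISOR * len(hypervisor_hosts)
    for i, workload in enumerate(leftovers[:capacity]):
        plan.append((workload, hypervisor_hosts[i % len(hypervisor_hosts)], "VM"))

    for workload, host, mode in plan:
        print("%-10s -> %-9s (%s)" % (workload, host, mode))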

Pipe dream, you say? Not at all. The scenario just described is alive and well, running on Linux and controlling multiple other operating systems, in some of the world's largest data centers today. And, disaster recovery is only one of the benefits; others include performance, real-world test harness automation and eased hypervisor deployment.

So, enjoy server virtualization. Go deploy hypervisors; they're a good thing. But, don't be caught unaware when the sidewalk ends. There's more to virtualization than servers. Your infrastructure is waiting.

Kevin Epstein is the VP Marketing & Products for Scalent Systems, makers of Infrastructure Virtualization software. Prior to Scalent, he served as a Director for VMware, Inc., from 2002 until 2006, and previously for Inktomi Corporation's Network Products division, RealNetworks, Netscape and others. Kevin holds a BS degree in High Energy Physics from Brown University and an MBA from Stanford University, and is the author of Marketing Made Easy, a popular trade book from Entrepreneur Magazine Press/McGraw Hill.