Running the Numbers

Doc Searls

Issue #69, January 2000

According to Benjamin Disraeli, “There are three kinds of lies: lies, damn lies and statistics.” But, according to Jon “maddog” Hall, “There are three kinds of lies: lies, damn lies and benchmarks.”

Businesspeople and journalists have at least one thing in common—they love numbers. Their appetite for numbers constitutes the enormous demand market that keeps think tanks, research firms and other professional guessticians in business.

The “suits” have a legitimate need for data to make informed business decisions. Good data often mean the difference between life and death for a company. We writers have a need no less compelling, though far less legitimate. We need to tell stories, and numbers make great story material—especially when we turn number-packed spreadsheets into pretty pictures worth a thousand words. Hey, it saves us from writing, and as the graphics people will tell you, readers look at pictures first.

Admit it; you looked at this graphic first, didn't you? Of course you did. Well, the graphic tells a lie. I made up the numbers, the spreadsheet made up the graphic, the artist made it pretty, and here we are, making a point. It's not just that “figures lie and liars figure”. It's that we rely a great deal on both. Take it from an old liar, or as I hate to admit, a PR guy. Let me tell you a true PR story.

The year was 1988, and I was working for a hot networking company that sold more than half the network connectors in its market. Since that market consisted of boxes sold by one company and sales figures for it were easily (though not precisely) estimated, we could figure out our market penetration—or something close to it.

However, there was a hitch: we competed directly with that box company (and lived at their mercy), so we didn't want to release our actual sales figures. Yet we wanted to publicize our success. We knew our story would be easier for editors to accept if our numbers were “objective”. Since editors go to industry research firms for their objective data, we had to get those firms to OEM our numbers for us.

We created a nice “internal” graphic showing our best guess—55% market penetration—and called Analyst A at the biggest research firm. Would he like to know how we were doing? Of course! So we sent him the graph. A few days later, after the firm finished digesting this “fact”, we started referring editors to the firm. Sure enough, new graphs began to appear with stories of our success, all listing the research firm as the “source”.

Were we lying? No. We were simply dropping our best facts into the bottom end of the data food chain, having faith that they would find their way to the top. Now here I am, at the top of the same food chain, and I see very few Linux companies that know how this system works. I think the problem is that techies are too honest and too literal. Most Linux companies are run by techies—guys like you—if we trust our own readership figures (from an “objective” source, of course).

Take an issue like host web servers and operating systems. At Netcraft's site, http://www.netcraft.com/whats/, you can discover, for example, that the Microsoft Network is running Microsoft-IIS/4.0 on NT3 or Windows 95. That information is returned when Netcraft interrogates the site host, copied out of the browser and pasted right into this text. Can we trust Netcraft, or the information it automatically obtains? Our techies here at Linux Journal say “No.” In fact, one techie believes MSN may actually be hosted by UUNET using BSDI. There's no way to tell by using Netcraft's method, because it's too easy for the host to spoof out a wrong answer, just to be perverse.
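For the curious, here is a rough sketch of why a self-reported answer is so easy to fake. It assumes the interrogation amounts to reading the Server header a host sends back with an HTTP response; that is a guess about the general technique, not a description of Netcraft's actual method. The function name and both sample responses are made up for illustration.

```python
def server_banner(raw_response):
    """Return the Server header value from a raw HTTP response, or None."""
    # Headers end at the first blank line; everything after is the body.
    head, _, _ = raw_response.partition("\r\n\r\n")
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            return value.strip()
    return None


# Two hypothetical responses. Nothing in the protocol forces the
# banner to match the software that actually produced the response.
honest = (
    "HTTP/1.0 200 OK\r\n"
    "Server: Apache/1.3.9 (Unix)\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html></html>"
)
spoofed = (
    "HTTP/1.0 200 OK\r\n"
    "Server: Microsoft-IIS/4.0\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html></html>"
)

print(server_banner(honest))
print(server_banner(spoofed))
```

Both responses parse equally well. The reader on the other end has only the host's word for it, which is the techies' objection in a nutshell.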

I still wanted that kind of data, however qualified, so I went ahead and put a chart together with the results of Netcraft's interrogations of the top 25 U.S. hosts (see “Work Still Cut Out” in upFRONT). Highly qualified findings are better than no findings. Some interesting data are there, such as Hotmail, a Microsoft property, running Apache on FreeBSD; and some significant ones, such as Linux running on only two (Real Networks and Go2Net). These findings address the concerns of several parties—regular readers of Linux Journal, BSD advocates, “suits” who need unbiased numbers and others—who feel that Linux Journal should remain as informative and unbiased as possible, while still advocating Linux to the world.

No problem with doing that, but we need the numbers. And the numbers we get from research firms will be no better than the ones those firms obtain from Linux vendors and other involved parties.

Linux IPOs per Moon Phase

Let's look at two of the oldest and largest research firms operating today: Gartner Group and International Data Corp. In recent months, the Linux community has made the most of IDC's very positive numbers, which show Linux as the fastest-growing server operating environment. At the same time, many in the Linux community bristled at Gartner's research findings, which called Linux “hype du jour” (among other unkind things) in its report, “Will Linux Be Viable Competition for Windows Desktops?” A Google lookup finds over 2,000 pages mentioning both Gartner and Linux. Most of those flame Gartner for its cluelessness. One anonymous coward on Slashdot writes, “The reality is, while the Gartners of the world are fudding away, Linux is already installed at all levels. Get a clue, PHBs [bosses]. Now, why would anybody pay Gartner for advice?”

The answer is simple: because they need objective information from somebody, and they're not getting it from you. Flames are easy. Facts are hard. Like them or not, Gartner and IDC traffic in facts or the most educated possible guesses. If you have good hard information on how Linux is changing the business world, let them know. Or skip them, and let us know. We're in the same business. That's what you—our readers—pay us for. No lie.

Doc Searls is the Senior Editor for Linux Journal. He can be reached via e-mail at info@linuxjournal.com.