To Open The Sky

The Front Pages of Christopher P. Winter

Adding a Second CPU to the HP Kayak

Getting Workstation Performance in a PC

When it comes to heavy-duty number-crunching, those in the know have long preferred the traditional workstation over the personal computer. (Here I exclude mainframes and supercomputers, because I'm considering only those systems that a wealthy individual or a small group can afford.) The reason comes down to two words: BANDWIDTH and MEMORY. A PC might have a fast CPU, but if it can't move data to and from memory and out to the display at rapid rates, it's not very effective for computation-intensive applications. Such work also typically involves large data structures, so oodles of random-access memory are a virtual necessity. The PC is only beginning to match traditional workstations in memory capacity. It can get around this lack by swapping data to the hard disk, but performance takes a big hit.

These are the major factors that make the traditional workstation vastly superior to the PC for sheer number-crunching ability. The operating system can make a big difference too; there are reasons that most workstations run some flavor of UNIX.[1] Another difference is system integration, as anyone who has struggled to get a new video card working in an off-the-shelf PC can attest. And finally, workstations tend to use top-quality components, thus achieving better hardware reliability.

All that said, the steady improvement in PC speed, memory size and bandwidth, and operating-system quality is blurring the boundary between the PC and the workstation. Of course, the proof is in the performance. So let's look at some benchmark numbers and see how some PCs (specifically, models using the Pentium 3 and newer processors) stack up against workstations in scientific tasks.[2]

System Name or Description	Number of CPUs	Peak Throughput (MFLOPS)	Memory Bandwidth (TRIAD, MB/s)
SGI Altix 3000	512	3072000.0	125978.5
Cray X1	1	12800.0	3002.9
Apple Power Mac G5	1	8000.0	220.5
HP Integrity rx1620	1	6400.0	628.9
Intel 875 P4-2800	1	5600.0	339.4
ASUS P4T533 P4-2800	1	5600.0	281.1
ASUS SK8N Opteron 248	1	4400.0	585.2
AMD Opteron 248	1	4400.0	393.0
Cray SV1ex	1	2500.0	448.0
Sun Blade 1750	1	1500.0	111.3
Tyan S2518 P3-1266	1	1266.0	63.8
Cray C90	1	960.0	1187.6
IBM RS6000-43P-260	1	800.0	97.0
Tyan P3-800EB	1	800.0	69.3
Intel 440BX-2 P3-650	1	650.0	62.5
SGI Octane 300	1	600.0	56.9
Intel 440BX P3-600	1	600.0	51.1
HP Kayak P2-300	1	300.0	23.5

Note: Benchmarks for the Altix, a supercomputer of Cray lineage, are presented solely to induce envy.
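
For readers unfamiliar with the TRIAD column: it refers to one of the kernels in the STREAM benchmark, which computes a[i] = b[i] + q*c[i] over large arrays and divides the bytes moved by the elapsed time. The following is only a rough Python sketch of that idea, not McCalpin's code. The function name, array size and repeat count are my own choices, and because the loop runs in the Python interpreter, the figure it prints says far more about interpreter overhead than about memory bandwidth.

```python
import time
from array import array


def triad_bandwidth_mb_s(n=200_000, scalar=3.0, repeats=5):
    """Time a[i] = b[i] + scalar * c[i]; return (best MB/s, result array).

    Hypothetical helper for illustration only; real STREAM is compiled
    C or Fortran and uses arrays much larger than the CPU caches.
    """
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    a = array("d", [0.0]) * n
    # Each iteration reads b[i] and c[i] and writes a[i]: 3 doubles of 8 bytes.
    bytes_moved = 3 * 8 * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for i in range(n):
            a[i] = b[i] + scalar * c[i]
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    best = max(best, 1e-12)  # guard against a zero reading from a coarse timer
    # Megabytes are counted as 10**6 bytes.
    return bytes_moved / best / 1e6, a


if __name__ == "__main__":
    mb_s, _ = triad_bandwidth_mb_s()
    print(f"TRIAD sketch: {mb_s:.1f} MB/s (interpreter-bound, not a STREAM result)")
```

The best time over several repeats is used so that a single slow pass does not drag the figure down.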

There are three things to note about the above benchmarks.

The first is that PC performance is approaching the workstation level. In some cases it even exceeds that level. Consider a workstation like the SGI Octane2, which a friend is using for computational fluid dynamics (CFD) work. According to the numbers here, it is surpassed by a Pentium 3 not much faster than the ones I now have. Of course, benchmark numbers can be misleading because they involve so many variables. For example, workstations are typically better at floating-point calculations, which are the heart of CFD. And numbers are never the whole story: The Octane will be more reliable than most PCs, and my friend undoubtedly has a service contract that assures him no downtime longer than a weekend.

The second thing is that, again purely by the numbers, machines like the Cray C90 (a supercomputer of yesteryear) run a poor second to the latest personal computers in peak throughput.[3] But again, there is more to the story. The Cray can use up to 16 processors, and its performance scales accordingly. Note too its greater memory bandwidth, which lets it make good use of those multiple processors. Many PCs can handle two processors, but few can handle more than that. And their performance is limited by I/O bandwidth: witness the continuing push for faster expansion slots (PCI, AGP and now PCI Express) and memory (SDRAM, RDRAM, DDR, DDR2).
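
One way to put a number on that bandwidth advantage is to divide each machine's TRIAD figure by its peak throughput, giving the bytes of memory traffic available per floating-point operation. The sketch below does this for four rows copied from the table. It assumes the TRIAD column is in MB/s, as STREAM normally reports, and the helper name is mine. It is a back-of-the-envelope ratio, not a performance prediction.

```python
# (peak MFLOPS, TRIAD bandwidth) copied from the benchmark table.
# Assumption: the TRIAD column is in MB/s.
SYSTEMS = {
    "Cray C90": (960.0, 1187.6),
    "Intel 875 P4-2800": (5600.0, 339.4),
    "Tyan S2518 P3-1266": (1266.0, 63.8),
    "HP Kayak P2-300": (300.0, 23.5),
}


def bytes_per_flop(peak_mflops, triad_mb_s):
    """Memory bytes deliverable per peak floating-point operation."""
    return triad_mb_s / peak_mflops


if __name__ == "__main__":
    # List the best-balanced machine first.
    ranked = sorted(SYSTEMS.items(), key=lambda kv: -bytes_per_flop(*kv[1]))
    for name, (peak, triad) in ranked:
        print(f"{name:<20} {bytes_per_flop(peak, triad):6.3f} bytes/flop")
```

By this measure the C90 can feed each floating-point operation with more than one byte from memory, while the PCs listed manage well under a tenth of a byte.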

Third, and more specifically related to my goals for the Kayak, these benchmarks confirm that there is life in the old Pentium 3: a system using two 1 GHz P3s should turn in a very respectable performance.
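
As a rough check on that expectation: the Pentium 2 and Pentium 3 rows in the table list a peak throughput equal to the clock speed in MHz, which suggests one floating-point operation per cycle is being counted. On that assumption, and the further optimistic assumption that a second CPU doubles peak throughput, a dual 1 GHz P3 can be placed against the table as sketched below. The function and variable names are mine. Memory bandwidth is shared between the two processors and would not double.

```python
def peak_mflops(clock_mhz, cpus=1, flops_per_cycle=1.0):
    """Nominal peak throughput, assuming perfect scaling across CPUs."""
    return clock_mhz * flops_per_cycle * cpus


# A few single-CPU peak figures copied from the table, for comparison.
TABLE_PEAKS = {
    "Cray SV1ex": 2500.0,
    "Sun Blade 1750": 1500.0,
    "Tyan S2518 P3-1266": 1266.0,
    "HP Kayak P2-300": 300.0,
}

dual_p3 = peak_mflops(1000.0, cpus=2)
beaten = [name for name, peak in TABLE_PEAKS.items() if peak < dual_p3]

if __name__ == "__main__":
    print(f"Dual 1 GHz P3 nominal peak: {dual_p3:.0f} MFLOPS")
    print("Exceeds on paper:", ", ".join(beaten))
```

On these assumptions the upgraded Kayak would have a nominal peak of 2000 MFLOPS, above the Sun Blade 1750 and below the Cray SV1ex.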

[1] Support for multiple processors is one area where UNIX has a long lead over Windows. There are many others. The advantages of Windows are affordability (vs. most UNIX distros) and better peripheral device support.

[2] These numbers come from John McCalpin's pages on the STREAM benchmark, which is designed to test memory bandwidth as well as CPU power.

[3] This, of course, is just the march of progress. Personal computers will continue to get faster, will handle more processors and more memory, will do floating-point better, and so on.