Retro: Benchmarks through the ages
Take a look at PC Pro Issue 150 p144 and you’ll see our rough-and-ready interpretation of how the raw processing power of CPUs has grown since PC Pro’s first issue in 1994. But this is only a basic indication, based on maximum theoretical FLOPS (floating-point operations per second). The figure assumes that the floating-point units can complete a certain number of operations per clock cycle. For early processor designs this was simply one operation per clock, hence a 333MHz Pentium II could, in theory, manage 333MFLOPS (one MFLOPS is one million FLOPS).
Real-world benchmark results, on the other hand, depend on hundreds of other factors. You’ll notice that the theoretical MFLOPS performance didn’t alter between the average Labs-test PCs of issues 100 and 125, because both had 3GHz Pentium 4 processors. The maximum theoretical floating-point performance of any Pentium 4 is simply the maximum number of floating-point operations per clock, multiplied by its clock frequency; the 128-bit SSE registers in a Pentium 4 allow four floating-point operations in a single clock cycle in the ideal case, making for a maximum rating of 12,000MFLOPS (or 12GFLOPS) at 3GHz. But in fact the Pentium 4 variant in our later tests was a superior part for real-world applications, having more cache memory and Intel’s HyperThreading system, which presented the system with a second virtual processor constructed from idle resources.
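To make the arithmetic concrete, here is a minimal Python sketch of the peak-FLOPS sum described above. The function name `peak_mflops` is our own invention for illustration, and the operations-per-clock figures are simply the ideal-case numbers quoted in the text, not measurements.

```python
def peak_mflops(clock_mhz: float, flops_per_clock: int) -> float:
    """Theoretical peak in MFLOPS: clock speed (MHz) x FP operations per clock.

    This is the ideal-case ceiling only; it says nothing about cache,
    memory bandwidth or how well real code keeps the FP units busy.
    """
    return clock_mhz * flops_per_clock


# Figures taken from the article: one operation per clock for the
# Pentium II, four per clock (128-bit SSE, ideal case) for the Pentium 4.
pentium_ii = peak_mflops(333, 1)
pentium_4 = peak_mflops(3000, 4)

print(f"Pentium II 333MHz: {pentium_ii:,.0f} MFLOPS")
print(f"Pentium 4 3GHz:    {pentium_4:,.0f} MFLOPS ({pentium_4 / 1000:g} GFLOPS)")
```

Note that two quite different 3GHz Pentium 4 parts produce exactly the same answer here, which is precisely why the theoretical number is of limited use.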
In practice, of course, real-world performance is about the complete PC, which is what PC Pro’s application-based benchmarks have always been about. In fact, in our very first View from the Labs column in issue 1, inaugural Labs editor Ian Mason talked about application-based benchmarks and the difficulty he’d been having tuning them.
Application benchmarks are the only way to measure true performance, especially today. First, most architectural enhancements in processors – such as branch prediction, out-of-order execution and speculative instruction fetch – rely on the typically non-linear nature of applications. Feeding a processor synthetic tasks that simply repeat the same tight loop over and over again doesn’t simulate that.
Second, of course, the processor is only one of half a dozen or so key components that affect performance. The speed of ancillary components is more important in the real world for many apps – hard disk speed chief among them – and the bandwidth of internal buses and interconnects is a major bottleneck when it comes to shuffling data from processor to memory and graphics card.
With that in mind, how does a machine from several years ago fare against one of today?
PC Pro’s Labs have relied on application benchmarks since day one.
Back in issue 1 of PC Pro, the PCI bus was the height of new technology. Running at 33MHz with a 32-bit bus width, compared with the ageing ISA bus’s lowly 8MHz at 16 bits, it could push around 125MB/sec between a graphics card and main memory. The (now obsolete) AGP port wouldn’t be invented for another three years, and the idea of a serial interface for anything that needed seriously fast transfer rates was pure fantasy; parallel buses were clearly superior, since you could push several bytes per clock down the pipe as opposed to just one measly bit.
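The bus figures above come from the same kind of back-of-an-envelope sum: clock speed multiplied by bus width. A rough Python sketch follows, assuming one full-width transfer on every clock, which is optimistic (ISA in particular needed more than one clock per transfer). The helper name `peak_bus_mb_per_sec` is hypothetical.

```python
def peak_bus_mb_per_sec(clock_mhz: float, width_bits: int) -> float:
    """Peak parallel-bus throughput in MB/sec (millions of bytes per second).

    Assumes one full-width transfer per clock cycle, which is a
    best-case simplification rather than a real-world figure.
    """
    bytes_per_clock = width_bits / 8
    return clock_mhz * bytes_per_clock


pci = peak_bus_mb_per_sec(33, 32)  # 32-bit PCI at 33MHz
isa = peak_bus_mb_per_sec(8, 16)   # 16-bit ISA at 8MHz

print(f"PCI: {pci:g} MB/sec peak")
print(f"ISA: {isa:g} MB/sec peak")
print(f"PCI moves {pci / isa:g}x as much per second in the ideal case")
```

On these assumptions PCI comes out at 132 million bytes per second, which is roughly 126MB if you count a megabyte as 1,048,576 bytes, so it sits close to the "around 125MB/sec" quoted above.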