ATI Radeon 4000 series: full technical details review
A week after the ATI Radeon HD 4850 first broke cover, AMD has released full technical details of the card – and of its high-end brother, the Radeon HD 4870, which is due for release today.
The RV770 GPU that underpins the new series uses the same 55nm process as the RV670 found in the old HD 3000 series. Core speeds are, surprisingly, slightly down: the HD 4850 runs at 625MHz, compared to the 3850’s 666MHz, while the dual-slot HD 4870’s core speed of 750MHz is 25MHz slower than that of the HD 3870.
ATI has, however, made numerous internal improvements to the core, which the manufacturer claims enable it to leapfrog Nvidia’s current high-end cards.
The most dramatic development is a huge increase in the number of stream processors (or shaders) integrated into the core. Where the HD 3870 offered 320 shaders, the HD 4850 and 4870 have 800 shaders apiece, distributed across ten SIMD cores.
As ATI is keen to point out, this enables both cards to exceed 1 teraflops (10^12 floating point operations per second), and represents the greatest amount of parallel computing power ever offered on a single consumer board, beating even the recent dual-GPU PCBs offered by both ATI and Nvidia. In comparison, the Nvidia GeForce GTX 280 offers just 240 stream processors.
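The teraflops figure can be sanity-checked from the shader count and core clocks quoted above. The sketch below is a back-of-envelope estimate, not ATI's own calculation: it assumes each stream processor completes one multiply-add (counted as two floating point operations) per clock, which is not stated in ATI's material, and the function name is purely illustrative.

```python
def peak_teraflops(shaders: int, core_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak throughput in teraflops.

    flops_per_clock=2 is an assumption: one multiply-add per
    stream processor per clock, counted as two operations.
    """
    return shaders * flops_per_clock * core_mhz * 1e6 / 1e12


# Shader counts and core clocks as quoted in the article.
for name, mhz in (("HD 4850", 625), ("HD 4870", 750)):
    print(f"{name}: {peak_teraflops(800, mhz):.2f} teraflops")
```

Under that assumption the HD 4850 lands at roughly 1.0 teraflops and the HD 4870 at roughly 1.2, which is in line with ATI's headline claim.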
ATI has also focused on scalability not just within the core, but across multiple cores: the company’s own figures promise that installing a second GPU in CrossFire mode will deliver a speed boost of between 60% and 90% in various games, including Call of Juarez, S.T.A.L.K.E.R. and Half-Life 2.
ATI is also working with physics simulation specialist Havok to allow its Havok FX physics engine to take advantage of HD 4000 series hardware. The goal is to allow secondary or tertiary graphics cards to perform physics processing duties as well as graphical rendering.
A third area that’s received special attention is texture units. Though the RV770’s 40 texture units represent a big increase over the RV670’s 16 units, it’s a rather smaller count than the 64 offered by the GTX 260 or the 80 found in the GTX 280.
However, by ramping up the bandwidth and rejigging the design so that each unit has its own L1 cache, ATI claims a render rate of 26.1 texels per clock – nearly double the rate of the GTX 280.
Though the emphasis is clearly on gaming, media applications have received an upgrade too: HDMI audio support has been boosted to 7.1, up from the previous generation’s 5.1. The Unified Video Decoder has also been updated, now allowing secondary video streams (such as Blu-ray picture-in-picture extras) to be decoded in parallel with the main stream and composited into a final view, all directly on the GPU. What’s more, the driver now exposes a video transcoding API, enabling third-party applications to use the GPU for various video operations.
The 4850 ships with GDDR3, but the new core also supports GDDR5, which is supplied with the 4870 variant. A stock RAM clock of 1.8GHz over a 256-bit bus gives an effective memory bandwidth of around 115GB/sec. This is lower than the GTX 280’s 142GB/sec, but that figure is achieved by using 1.1GHz GDDR3 over a 512-bit bus. ATI claims that using a narrower bus is one of the ways it has been able to simplify the chip, keeping down costs and heat.
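The bandwidth figures follow from clock speed and bus width. As a rough sketch, assuming both memory types move two transfers per cycle of the quoted clock (an assumption made here for illustration, not something ATI or Nvidia state in these terms), the arithmetic looks like this:

```python
def bandwidth_gb_per_sec(clock_ghz: float, bus_bits: int,
                         transfers_per_clock: int = 2) -> float:
    """Peak memory bandwidth in GB/sec.

    transfers_per_clock=2 is an assumption about how the quoted
    clock relates to the actual data rate.
    """
    bytes_per_transfer = bus_bits / 8
    return clock_ghz * transfers_per_clock * bytes_per_transfer


# Clock speeds and bus widths as quoted in the article.
print(f"HD 4870: {bandwidth_gb_per_sec(1.8, 256):.1f} GB/sec")
print(f"GTX 280: {bandwidth_gb_per_sec(1.1, 512):.1f} GB/sec")
```

This gives about 115GB/sec for the HD 4870. The GTX 280 comes out at about 141GB/sec, a little short of the quoted 142GB/sec, presumably because the 1.1GHz clock is a rounded figure.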