How to build a supercomputer
Last summer, three lorries travelled across the north of England hauling an unusual cargo: almost 500 computers, each individually wrapped in a duvet for protection on the 150-mile journey.
These computers once made up part of a 13-tonne supercomputer in Cheshire and were being moved to their new home at Durham University. Now, after months of painstaking installation, this second-hand supercomputer has become one of the country’s largest machines for astronomy research.
With 4.3 petabytes of storage, it has around a tenth of the storage of the fastest supercomputer in the UK (the Met Office’s Cray XC40), and has a processing power of 166 TeraFlops, compared to Cray’s roughly 7,000 TeraFlops.
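To put those two processing figures side by side, here is a minimal Python sketch. The variable names are illustrative, and the numbers are simply the ones quoted above (the Cray figure is itself approximate).

```python
# Processing power quoted in the article, in TeraFlops.
durham_tflops = 166
cray_tflops = 7_000  # "roughly 7,000", so treat the results as approximate

# How many times more processing power the Cray has.
ratio = cray_tflops / durham_tflops

# The Durham machine's power as a percentage of the Cray's.
share_percent = 100 * durham_tflops / cray_tflops

print(f"The Cray is roughly {ratio:.0f} times more powerful.")
print(f"The Durham machine has about {share_percent:.1f}% of its processing power.")
```

In other words, the gap in processing power (roughly 42 times) is much wider than the gap in storage (roughly 10 times).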
Putting the computer together was a community effort, according to Lydia Heck, who has run the computers at Durham’s Institute for Computational Cosmology for the past 30 years. Researchers, students and professors volunteered to help piece the jigsaw back together, combining some of Durham’s old equipment with the computers transported in the lorries.
This included taking out each of the 2,400 storage disks from eight racks, reinstalling them and then spending weeks trying to find and correct errors.
Why supercomputers are so important
Supercomputers play a crucial role in scientific research because pushing the boundaries of knowledge requires increasingly powerful computers able to perform trillions of calculations each second.
Astronomy, chemistry, biology and weather forecasting all rely on the power of supercomputers. Supercomputers are not just the most powerful computers; they work in a completely different way to normal computers. They perform many calculations at once, known as parallel processing, unlike the computer you are reading this on, which performs each task one at a time.
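The idea of parallel processing can be sketched in a few lines of Python. This is only an illustration of the principle, not how codes on the Durham machine are written: the function names and the choice of four workers are assumptions, and Python threads stand in for the separate processors a real supercomputer would use.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """The work one worker does on its own slice of the data."""
    return sum(x * x for x in chunk)


def serial_total(numbers):
    """One worker handles the whole job from start to finish."""
    return sum_of_squares(numbers)


def parallel_total(numbers, workers=4):
    """Split the job into independent chunks and share them among workers."""
    # Ceiling division so every number lands in exactly one chunk.
    size = max(1, -(-len(numbers) // workers))
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    # Combine the partial answers into the final result.
    return sum(partial_sums)


numbers = list(range(1000))
print(serial_total(numbers) == parallel_total(numbers))
```

Both routes give the same answer. The difference is that the chunks do not depend on one another, so with enough processors they can all be worked on at the same time.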
Typically, they are used for solving complex problems, many of which are exponential, meaning each time you want to add a parameter, you double the power needed for a simulation.
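A short Python sketch shows how quickly that doubling adds up. It takes the rule of thumb above literally (each extra parameter doubles the power needed); the function name is hypothetical and real simulations will not scale this neatly.

```python
def relative_power(extra_parameters):
    """Power needed relative to the original simulation,
    assuming every added parameter doubles the requirement."""
    return 2 ** extra_parameters


for extra in (1, 5, 10, 20):
    print(f"{extra:>2} extra parameters: {relative_power(extra):,} times the power")
```

Under this assumption, ten extra parameters already call for about a thousand times the original computing power, and twenty call for about a million times.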
Computing power has long been a limiting factor in astronomy, for example. In the 1980s, researchers were debating whether or not the solar system moved in a chaotic way, but only when more powerful computers were invented was the debate settled: it is somewhat chaotic.
Durham’s ‘upcycled’ supercomputer is now up and running, contributing to the UK’s national research facility DiRAC, which specialises in research in particle physics, astronomy and cosmology. The new machine runs codes for research into particle physics, nuclear physics, astronomy and cosmology. Researchers from all over the country can apply for time on the machine, which can be accessed remotely from anywhere in the world.
But there is much more to running a supercomputer than meets the eye. The machine requires an uninterruptible power system, water and air-cooling, and has its own diesel power generators for back-up.
Walking into the room that houses the machine, on the ground floor of Durham’s Earth Sciences building, the first sensations to hit are the heat and noise.
“There are more than 500 computers in here, all screaming,” says Heck. Imagine the heat that can be generated from your own desktop computer or laptop, and multiply that by 500.
Cans of nitrogen worth £150,000 are held in the roof, poised and ready to deploy in the event of a fire. Since water would be no match for the energy given off by these machines, the only option would be to starve a fire of oxygen.
Next door lies the newest supercomputer’s little brother, a measly line-up of some 200 computers that is slowly being decommissioned. Each time one of the machines breaks, Heck says, instead of fixing it, the team is just letting it die.
The machine works constantly, and some of the codes running on it last for millions of computing hours, meaning there is no room for a power cut, or even a dip. A generator supplies the computer’s power system, while two diesel generators on campus are there in case of emergency. “Last year we had to use these a few times,” Heck tells me, “but not so much this year, luckily – they make such a racket and don’t smell too pleasant!”
The DiRAC Data Centric HPC system installed at Durham has been enhanced by the deployment of COSMA6, a machine with 8,000 Intel Sandy Bridge cores and 4.3 petabytes of storage. The additional resource was needed to maintain the competitiveness of the research community, and is being used by DiRAC for 12 months from April this year.
“The new HPC system at Durham University is a testament to the skills of all involved in the project who were able to re-install a second-hand cluster, add to it new RAM memory and design a solution that will prove invaluable to the research community”, comments Julian Fielden, Managing Director of computer service provider OCF. “We have a long history of working with Durham University so we’re really pleased to have been involved in such a unique project.”
And none of it would have been possible without Heck. What is clear after spending an afternoon with her and her supercomputers is how much love she has for the machines – a feeling which is contagious. In her decades working at Durham, she has continued to work closely with students. She says it is because “they keep me young.”
Above all, Heck is passionate about inspiring young women to go into engineering, and although students may pass through the corridors and researchers could potentially even use the capability of the supercomputers without meeting Heck, she is a fundamental part of the university and the wider research community.
Images: Abby Beall