Faulty switch drops the BBC from the web

UPDATED 30-03-2011 10:17

Richard Cooper, the BBC's controller of digital distribution, has offered a more in-depth explanation for the outage. See the final paragraphs for details.

The BBC's on-line services had a bit of a breather last night as a failure in the broadcaster's network infrastructure left them inaccessible for around an hour.

Every BBC web-based service was affected by the outage, which caused a certain amount of consternation on social networking services like Twitter, where it spawned the hashtag '#bbcblackout'.

The BBC website and ancillary services such as media-streaming service iPlayer were inaccessible throughout. The cause appears to have been a critical piece of network hardware giving up the ghost.

While most sysadmins will be aware that it's important to avoid creating a single point of failure in network infrastructure, nobody appears to have told the BBC's network elves. The failure of a single switch, it seems, was enough to take down the entire infrastructure - including the corporation's intranet and staff e-mail systems.
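For readers wondering how a single point of failure might be spotted before it bites, here is a minimal sketch. It assumes the network can be described as a simple list of links between named devices. The device names and both topologies are hypothetical illustrations and say nothing about how the BBC's network is actually built.

```python
from collections import defaultdict


def single_points_of_failure(links):
    """Return the devices whose removal would split the network.

    `links` is a list of (device_a, device_b) pairs. This is a
    brute-force check suited to small, hand-drawn topologies.
    """
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    def reachable(start, removed):
        # Walk the graph from `start`, pretending `removed` is dead.
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in graph[node]:
                if neighbour != removed and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        return seen

    nodes = set(graph)
    critical = set()
    for candidate in nodes:
        remaining = nodes - {candidate}
        if not remaining:
            continue
        start = next(iter(remaining))
        # If some surviving device can no longer be reached,
        # the candidate is a single point of failure.
        if reachable(start, candidate) != remaining:
            critical.add(candidate)
    return critical


# Hypothetical topology: everything hangs off one switch.
single_switch = [
    ("hosting-a", "switch-1"),
    ("hosting-b", "switch-1"),
    ("switch-1", "internet"),
]

# Hypothetical topology: a second switch provides an alternative path.
dual_switch = [
    ("hosting-a", "switch-1"),
    ("hosting-a", "switch-2"),
    ("hosting-b", "switch-1"),
    ("hosting-b", "switch-2"),
    ("switch-1", "internet"),
    ("switch-2", "internet"),
]

print(sorted(single_points_of_failure(single_switch)))  # ['switch-1']
print(sorted(single_points_of_failure(dual_switch)))    # []
```

In the first layout the lone switch is flagged, while the second layout survives the loss of any one device. Real networks can also fail through shared power, software or configuration, none of which this sketch models.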

"Love the terse bulletin on last night's BBC web failure," the broadcaster's technology correspondent Rory Cellan-Jones wrote on microblogging service Twitter. "Cause of issue: faulty switch. Services impacted: everything.

"Turning it off and on again worked. Lucky nobody noticed it was down," Cellan-Jones quipped following the restoration of the service last night.

"I'm afraid that last night we suffered multiple failures, with the result that the whole site went down," controller of digital distribution Richard Cooper has revealed in a blog post. "Enough of the systems were restored to bring BBC Online pretty well back to normal by 23:45, and we were fully resilient again by 04:00 this morning.

"For the more technically minded, this was a failure in the systems that perform two functions. The first is the aggregation of network traffic from the BBC's hosting centres to the internet. The second is the announcement of 'routes' onto the internet that allows BBC Online to be 'found.' With both of these having failed, we really were down!"

Cooper says that his team will be taking 'a very hard look' at how similar outages can be prevented in the future.