A closer look at Intel’s Broadwell delay and defect density problems

Intel has just announced that the introduction of its next-generation chips, codenamed Broadwell, will be delayed. The company demoed the 14nm processors at IDF last month, and claims the new chips can cut power consumption a further 30 per cent below Haswell's, thanks to the move from 22nm to 14nm process technology.

Broadwell is also set to be significantly smaller than Haswell – small enough to fit into tablets and other form factors that don't require a fan, and supposedly incorporating a GPU that pushes graphics performance up a further 40 per cent over the current generation.

During its Q3 conference call, Intel CEO Brian Krzanich noted that the issues facing Broadwell are technical (as opposed to marketing related), saying: “It was simply a defect density issue.”

The chip will begin production in the first quarter of 2014. Intel claims that it's "comfortable" with yields, but is still baking fixes and changes into the core to improve them further.

This is unsurprising – but what Intel dismisses as "just a defect density" issue is, in fact, at the very heart of the problems facing modern semiconductor manufacturing.

How defect densities wreck cost curves

As semiconductor nodes shrink, the difficulty of building ever-smaller transistor layouts becomes increasingly acute. The shift to double-patterning can increase defect densities on its own, the fundamental limitations of 193nm lithography are a constant pressure, and the need to ensure ever-higher levels of control over dopant distribution and voltage characteristics is slamming up against the fundamental limits of physics. Defect density is a metric that refers to how many defects are likely to be present per unit area of wafer – typically expressed as defects per square centimetre – which, combined with die size, determines how many working chips each wafer yields.
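To see why defect density dominates the economics, consider the classic Poisson yield model – a standard simplification, not Intel's internal model, and the defect densities below are illustrative numbers rather than anything Intel has disclosed:

```python
import math

def poisson_yield(defect_density, die_area):
    """Expected fraction of good dies under a simple Poisson yield model.

    defect_density: killer defects per square centimetre (illustrative)
    die_area: die size in square centimetres
    """
    return math.exp(-defect_density * die_area)

# A hypothetical 1.5 cm^2 die at two defect densities:
mature = poisson_yield(0.1, 1.5)  # a mature, well-tuned process
early = poisson_yield(0.5, 1.5)   # an early, unrefined process

print(f"mature process yield: {mature:.1%}")  # ~86%
print(f"early process yield:  {early:.1%}")   # ~47%
```

The exponential means small increases in defect density (or die size) destroy yields disproportionately – which is why "simply a defect density issue" is anything but simple.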

It’s important to understand that defects aren’t binary. Chips don’t just work or not work. A chip may work perfectly but consume more power than intended. Imperfect dopant distribution or nanometre-size errors in transistor placement can cause issues related to frequency scaling. The problem with low-power, low-cost cores is that the manufacturer needs to tightly control both binary work/don’t work defects and smaller problems that don’t destroy the processor, but prevent it from hitting power targets.
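The distinction between chips that are dead and chips that merely miss their power targets is what die binning captures. The sketch below is purely illustrative – the bin names and power cut-offs are hypothetical, not Intel's actual test flow:

```python
# Illustrative die-binning sketch (hypothetical bins and power limits):
# a defect that doesn't kill the chip can still knock it out of the
# most power-constrained, and often most strategic, product segments.
def bin_die(functional, power_watts):
    """Assign a die to a hypothetical product bin based on measured power."""
    if not functional:
        return "scrap"      # binary defect: the chip simply doesn't work
    if power_watts <= 5.0:
        return "tablet"     # meets the tightest power target
    if power_watts <= 17.0:
        return "laptop"
    return "desktop"        # works, but only at relaxed power targets

dies = [(True, 4.8), (True, 6.5), (False, 0.0), (True, 20.0)]
print([bin_die(f, p) for f, p in dies])
# → ['tablet', 'laptop', 'scrap', 'desktop']
```

Only one of the four example dies makes the tablet bin, even though three of them "work" – exactly the squeeze described above.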

One way of lowering the impact of defects is to build redundant circuit paths within the processor itself. All manufacturers build in a degree of redundancy, but when manufacturing tolerances are being tightly squeezed, adding redundant circuits also pushes up complexity. A balance must be carefully struck to ensure that the test logic and duplicate structures don't end up exacerbating the problem.

Consider the impact of defects that cumulatively increase CPU TDP by 50 per cent. A 50-75 Watt desktop chip now has a TDP of 75-112 Watts – well within the cooling capabilities of a modern tower. A 17 Watt laptop chip at 50 per cent more TDP can still fit into any chassis capable of handling a 25 Watt-class TDP. But a tablet chip, already borderline at 5 Watts, may be pushed out of that space altogether if it hits the 7.5 Watt mark. With Intel fighting hard to shake the perception that x86 chips are too power-hungry to fit into ARM-competitive form factors, it's imperative that each generation of x86 processors deliver dividends on this front, even if it costs in terms of top-end performance, as it did with Haswell.
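The arithmetic above can be sketched in a few lines. The chassis cooling limits here are rough illustrative figures, not official specifications for any product:

```python
# The same relative power penalty matters far more in a constrained
# form factor. Cooling limits are illustrative assumptions only.
chassis_limits = {"desktop tower": 130.0, "laptop": 28.0, "tablet": 6.0}
nominal_tdp = {"desktop tower": 75.0, "laptop": 17.0, "tablet": 5.0}

for form_factor, tdp in nominal_tdp.items():
    worst_case = tdp * 1.5  # defects add 50 per cent to power draw
    verdict = "fits" if worst_case <= chassis_limits[form_factor] else "over budget"
    print(f"{form_factor}: {worst_case:.1f} W -> {verdict}")
```

The desktop and laptop parts survive the penalty; the tablet part, at 7.5 Watts against a fanless-class budget, does not – the scenario the article describes.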

All of these goals make sense, but the chips have to yield well before Intel can drop them into place.

Expect similar announcements in years to come

Intel's troubles in this area should be considered a bellwether for the industry. It's not that companies will stop advancing, but the pace of next-generation transitions is going to slow as manufacturers struggle to push products through an increasingly uncooperative supply chain. From extreme ultraviolet lithography to the 450mm wafer transition, some of the best engineers on the planet are trying to build equipment that can continue scaling, even as the cost per square millimetre of silicon increases at 20nm for the first time ever.

With GlobalFoundries and TSMC still ramping 20nm, Intel's 14nm delay shouldn't impact the company's roadmaps or the lead it has opened up over its competitors. TSMC is working to ramp 20nm and 16nm FinFETs simultaneously, with the former debuting in 2014 and the latter launching in the 2016 timeframe. GlobalFoundries, Samsung, and IBM are pushing ahead with plans for a hybrid 14-20nm process, in which chips would marry 14nm front-end manufacturing with 20nm interconnects. The result (if it works) would be a chip with 14nm-style power consumption and performance, but a 20nm die size.

GlobalFoundries hasn't issued firm guidance on when it expects to start ramping 20nm, but 2014 is the generally accepted date, with the 14nm technology coming along one or two years thereafter. In both cases, slowdowns and delays could impact customers or the foundries themselves – not because of any inherent flaw, but because scaling has become so difficult. Moore's law's long-term prognosis may be gloomy, but there are still options for boosting enthusiast performance.