Intel Ivy Bridge Architecture Breakdown

by Hugo Jobling, 24 Apr 2012

Intel's tick-tock development strategy has proved its worth over the years. Each tick brings an existing architecture to a new manufacturing process, while the tock that follows uses that now-established process as the base for a new architecture. Bringing as it does the transition to a new 22nm manufacturing process, Ivy Bridge is ostensibly a tick on Intel's development cycle, but with this refresh Intel's engineers found enough time to turn Ivy Bridge into more of a tick-and-a-half, delivering tock-like improvements ahead of the normal schedule.

The CPU core itself has been given a small clock-for-clock performance boost, but the real improvements come in the integrated controllers and the integrated graphics. Ivy Bridge will be the first Intel CPU line-up to offer native USB 3.0 support, which debuts alongside PCI Express Gen 3.0. The biggest change, however, is the overhaul of the integrated graphics, which, although still no replacement for a dedicated card, offer a decent performance boost over Sandy Bridge and pave the way for further improvements in the future.

Arguably, the real boost Ivy Bridge brings is in power efficiency. At the low end, the Core i5-3470T has a TDP of just 35W. The top-end Ivy Bridge Core i7-3770K has the same 3.5GHz base clock and 3.9GHz turbo maximum as the Sandy Bridge Core i7-2700K, but with TDPs of 77W and 95W respectively, it's clear that Ivy Bridge is a much less power-hungry platform. This, coupled with a few other changes, also means that Ivy Bridge CPUs should be able to overclock better than their Sandy Bridge predecessors. For details of just how Ivy Bridge performs when paired with a Z77 chipset, I direct you to Leo's performance review of the Core i7-3770K.

Ivy Bridge processors use the same LGA-1155 socket as Sandy Bridge and are, therefore, backwards-compatible with the previous generation of chipsets. However, although this provides the stepping-stone upgrade path of dropping a new CPU into an existing system, a number of Ivy Bridge's features are only unlocked with the 7-series chipset that Intel is launching alongside this new CPU range. All 7-series motherboards offer four USB 3.0 ports and a pair of 6Gb/s SATA ports natively, and most board makers are providing more 6Gb/s SATA ports and USB 3.0 ports via third-party controllers. Z77 and H77 support Intel's Smart Response Technology, which enables an SSD to be used as a cache for a mechanical drive, but Z75 doesn't; all three chipsets combined with an Ivy Bridge CPU provide 16 PCI Express Gen 3.0 lanes in varying configurations.

The dawn of 22nm

Intel's 22nm manufacturing process marks the first commercial products using tri-gate (or 3D) transistors, which the company has been working on for some 10 years. Although simply continuing to shrink its existing transistor technology to 22nm would have been possible, the move from planar to tri-gate transistors delivers a much more significant improvement. As well as boosts to performance and power efficiency, the move to tri-gate transistors also boosted the transistor density of Intel's 22nm process to around twice that of 32nm. Where a Sandy Bridge processor packed approximately 1.16 billion transistors into a 212mm^2 die, Ivy Bridge's 160mm^2 die features close to 1.4 billion transistors.
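As a rough check on those figures, the short Python sketch below works out the average transistor density of each die from the numbers quoted above. It treats the whole die as uniform, which is a simplification, and the names used are purely illustrative.

```python
# Figures quoted in the article; transistor counts are approximate.
CHIPS = {
    "Sandy Bridge (32nm)": {"transistors": 1.16e9, "die_mm2": 212},
    "Ivy Bridge (22nm)": {"transistors": 1.4e9, "die_mm2": 160},
}


def density_per_mm2(transistors: float, die_mm2: float) -> float:
    """Average transistors per square millimetre across the whole die."""
    return transistors / die_mm2


densities = {name: density_per_mm2(**spec) for name, spec in CHIPS.items()}
for name, value in densities.items():
    print(f"{name}: {value / 1e6:.2f} million transistors per mm^2")

ratio = densities["Ivy Bridge (22nm)"] / densities["Sandy Bridge (32nm)"]
print(f"Whole-die density ratio: {ratio:.2f}x")
```

On these numbers the whole-die ratio comes out at roughly 1.6x rather than 2x. One plausible reason is that a finished die mixes logic, cache and I/O, which need not all shrink by the same amount.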

The 50,000-foot view of how these new transistors work isn't overly complex. In both tri-gate and planar transistors, the on and off state is determined by whether a gate allows current to flow through a dielectric. In planar transistors this dielectric material sits on a silicon substrate, and the gate is then layered over the dielectric, giving a two-dimensional contact point that limits the area over which the gate can control current flow. Tri-gate transistors improve upon this by raising the dielectric up on a vertical fin, around which the gate is then wrapped.

The benefit is two-fold: when the transistor is in its on state, a greater surface area is available for current to flow through, and when the transistor is off, electrons are forced out of the conducting channel faster and more forcefully. This improved switching characteristic means that Intel's 22nm tri-gate transistors operate up to 37 per cent faster than its 32nm planar transistors at low voltages, with a still worthwhile 18 per cent improvement at higher voltages. The improved performance at lower voltages also gives Intel the option to reduce operating power by more than 50 per cent with the same performance as on its 32nm process.

Although Ivy Bridge undoubtedly benefits from this 22nm tri-gate transistor underpinning, Intel's future mobile platforms are going to see a particular boost. Power efficiency isn't so much of an issue on the desktop, but in processors designed for laptops, tablets and mobile phones, it is arguably the priority concern. Although other processor manufacturers are working on similar technology, Intel is a number of years ahead of the curve, and will likely have time to establish a considerable foothold in the mobile market currently dominated by ARM, should it choose to do so.

Core changes

The majority of big changes with Ivy Bridge have been made outside of the core, but Intel's engineers still found time to provide some clock-for-clock performance improvements. Instruction set updates add support for conversion between 16-bit floating point and 32-bit single precision floating point numbers, enabling memory savings at the cost of fidelity; new instructions for ring-3 access to FS and GS base registers, offering improved access for thread-local storage; and MOVSB/STOSB operations to augment the existing MOVS/STOS instructions, which abstract away architecture specific optimisation.
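To illustrate the memory-versus-fidelity trade-off of 16-bit floats, here is a minimal Python sketch using the standard library's struct module, whose 'e' format code is IEEE 754 half precision. It performs the conversion in software and says nothing about whether the interpreter uses Ivy Bridge's new instructions; the function name is an illustrative choice.

```python
import struct


def to_half_and_back(value: float) -> float:
    """Round-trip a value through IEEE 754 half precision (16 bits)."""
    packed = struct.pack("<e", value)  # 'e' = binary16, two bytes
    return struct.unpack("<e", packed)[0]


original = 3.14159
restored = to_half_and_back(original)

print(len(struct.pack("<f", original)), "bytes as single precision")
print(len(struct.pack("<e", original)), "bytes as half precision")
print("value after round trip:", restored)
print("round-trip error:", abs(original - restored))
```

Storage is halved, but only around three significant decimal digits survive, which is why the format suits data such as textures or sensor samples better than general arithmetic.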

Ivy Bridge adds a pair of security-focussed features in the form of a Digital Random Number Generator (DRNG) and Supervisor Mode Execution Protection (SMEP). The DRNG enables Ivy Bridge processors to produce standards-compliant random numbers, for use in applications that require them - the canonical example being cryptography - with a throughput of up to 3Gb/s. SMEP, meanwhile, protects against a certain type of vulnerability that can enable malicious programs to execute code at restricted privilege levels - the caveat is that this protection must be enabled.

Updates to the way Ivy Bridge CPUs handle caching aim to improve performance in a number of edge cases. The Adaptive Fill Policy looks for applications streaming large amounts of data and stops them from throwing other data out of the CPU caches; a Quad-Age Least Recently Used (QLRU) policy has been implemented to allow finer-grained control over what data should be purged from the cache first; Dynamic Realtime Prefetch throttling reduces the number of prefetch requests made when bandwidth is constrained, ensuring that the CPU doesn't add pressure in high-load situations; and channel hashing works to improve the distribution of DRAM accesses over the available channels.
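Intel's exact replacement logic isn't described here, so the following Python sketch is only a toy model of the general idea: each cache line carries one of four ages, reused lines are made young again, and streaming data is inserted already "old" so it is evicted first. The class name, the insertion ages and the streaming flag are all assumptions made for illustration.

```python
class QuadAgeSet:
    """One cache set whose lines carry an age from 0 (youngest) to 3 (evict first)."""

    MAX_AGE = 3

    def __init__(self, ways: int):
        self.ways = ways
        self.ages = {}  # tag -> age

    def access(self, tag, streaming: bool = False) -> bool:
        """Return True on a hit, False on a miss (the line is then filled)."""
        if tag in self.ages:
            self.ages[tag] = 0  # a reused line becomes the youngest
            return True
        if len(self.ages) >= self.ways:
            self._evict()
        # Hypothetical fill policy: streaming data enters at the oldest age.
        self.ages[tag] = self.MAX_AGE if streaming else 1
        return False

    def _evict(self) -> None:
        # Age every line until at least one reaches MAX_AGE, then drop it.
        while self.MAX_AGE not in self.ages.values():
            for tag in self.ages:
                self.ages[tag] += 1
        victim = next(t for t, age in self.ages.items() if age == self.MAX_AGE)
        del self.ages[victim]


cache = QuadAgeSet(ways=4)
for tag in ("a", "b", "c"):
    cache.access(tag)  # first touch fills the line
    cache.access(tag)  # second touch is a hit and resets its age

for block in range(100):
    cache.access(("stream", block), streaming=True)

survivors = [tag for tag in ("a", "b", "c") if tag in cache.ages]
print("working set still cached:", survivors)
```

In this model the 100 streamed blocks only ever displace one another, so the three reused lines stay resident. Filling the stream at the normal age instead would gradually push the working set out.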

Power management, memory, PCI Express

Ivy Bridge offers a small improvement in power consumption with the introduction of DDR I/O power gating. This works by turning off the memory controller when not in use, stopping power leakage, and Intel claims around 70mW of power saving in DVD playback at the cost of a little latency. It doesn't sound like much but over the course of watching a two-hour film that could have a noticeable impact on battery life in a laptop. Also aiding in power saving is a new ability called Power Aware Interrupt Routing (PAIR), a configurable option which toggles the CPU between two different states of handling interrupts: power or performance. In performance mode, powered down cores will be woken up to handle interrupt instructions if needed, whereas in power mode, the CPU will favour using already active cores, at the cost of a slight processing overhead.
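The difference between the two PAIR modes can be sketched as a simple core-selection policy. This is a hypothetical Python model, not Intel's routing logic: the Core fields, the load metric and the least-loaded tie-break are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class Core:
    core_id: int
    awake: bool
    load: float  # 0.0 (idle) to 1.0 (fully busy)


def route_interrupt(cores, mode):
    """Pick a core for an interrupt under a 'power' or 'performance' policy."""
    if mode not in ("power", "performance"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "power":
        # Prefer cores that are already awake; wake one only if none are.
        awake = [core for core in cores if core.awake]
        candidates = awake or cores
    else:
        # Any core may be chosen, including one that is powered down.
        candidates = cores
    target = min(candidates, key=lambda core: core.load)
    target.awake = True
    return target


cores = [Core(0, True, 0.6), Core(1, True, 0.3), Core(2, False, 0.0)]
print("power mode ->", route_interrupt(cores, "power").core_id)
print("performance mode ->", route_interrupt(cores, "performance").core_id)
```

In power mode the interrupt lands on the less busy of the two active cores and core 2 stays asleep; in performance mode the idle core 2 is woken to handle it.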

Taking advantage of the performance characteristics of the underlying 22nm tri-gate transistors, select Ivy Bridge processors will offer a number of different TDP configurations, which OEMs and users can switch between. A CPU could, for example, offer a 'TDP down' mode for when operating on battery power and a 'TDP up' mode for when plugged in, with the CPU adjusting voltages and clock speeds to suit. Although making those changes manually has been possible previously, Intel now guarantees the performance characteristics at these different TDP levels.

Ivy Bridge, like Sandy Bridge, offers two DDR3 channels, but now officially supports memory at up to 1,600MT/s, up from 1,333MT/s, and overclocked speeds of 2,800MT/s, up from 2,133MT/s. A new 200MHz frequency step has also been added, providing more options when matching RAM and CPU clock speeds. When overclocking on the CPU side, the same base clock limitations as Sandy Bridge apply, but the maximum core-to-base clock ratio (the multiplier) is raised from 57 to 63, and can now be changed from within the operating system, although how useful that ability will prove is debatable.
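To put those memory speeds and multipliers in context, the Python sketch below computes theoretical peak figures. It assumes the quoted memory speeds are transfer rates, a 64-bit DDR3 channel, and a nominal 100MHz base clock (the last is not stated in this article); real-world throughput and achievable clocks will be lower.

```python
BUS_WIDTH_BYTES = 8    # each DDR3 channel is 64 bits wide
CHANNELS = 2           # dual-channel, as described above
BASE_CLOCK_MHZ = 100   # assumed nominal base clock


def peak_bandwidth_gb_s(transfers_mt_s: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes)."""
    return transfers_mt_s * 1e6 * BUS_WIDTH_BYTES * CHANNELS / 1e9


def ceiling_core_clock_ghz(multiplier: int) -> float:
    """Core clock implied by a multiplier at the assumed base clock."""
    return BASE_CLOCK_MHZ * multiplier / 1000


for speed in (1333, 1600, 2133, 2800):
    print(f"{speed}MT/s: {peak_bandwidth_gb_s(speed):.1f} GB/s peak")

for multiplier in (57, 63):
    print(f"x{multiplier}: {ceiling_core_clock_ghz(multiplier):.1f}GHz ceiling")
```

The multiplier ceilings work out to 5.7GHz and 6.3GHz, which are limits on what can be selected rather than clocks a given chip will necessarily reach.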

Ivy Bridge brings with it an updated PCI Express controller, which now offers Gen 3.0 support natively. Gen 3.0 support is layered on top of the Gen 2.0 support implemented by Sandy Bridge. Intel's implementation officially runs at up to 8GT/s, which gives almost twice the usable bandwidth of PCI Express Gen 2.0, and it has been tested at more than 12GT/s. Intel also implemented PCI Express' bandwidth management and Active State Power Management support. According to Uncore Lead Architect Rob Milstrey, that wasn't the easiest task, but the decision was made early on that Ivy Bridge would support the complete PCI Express Gen 3.0 specification.
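The jump from Gen 2.0 to Gen 3.0 is larger than the raw transfer rates suggest, because the line encoding changed as well. The Python sketch below uses the standard figures of 5GT/s with 8b/10b encoding for Gen 2.0 and 8GT/s with 128b/130b encoding for Gen 3.0; it ignores packet and protocol overhead, so treat the results as upper bounds. The function name is illustrative.

```python
def lane_bandwidth_mb_s(gt_per_s: float, payload_bits: int, total_bits: int) -> float:
    """Usable bandwidth of one lane in one direction, in MB/s, after line encoding."""
    return gt_per_s * 1e9 * payload_bits / total_bits / 8 / 1e6


gen2 = lane_bandwidth_mb_s(5.0, 8, 10)      # 8b/10b: 20 per cent overhead
gen3 = lane_bandwidth_mb_s(8.0, 128, 130)   # 128b/130b: about 1.5 per cent overhead

print(f"Gen 2.0: {gen2:.0f} MB/s per lane, {gen2 * 16 / 1000:.1f} GB/s for x16")
print(f"Gen 3.0: {gen3:.0f} MB/s per lane, {gen3 * 16 / 1000:.1f} GB/s for x16")
print(f"Gen 3.0 / Gen 2.0: {gen3 / gen2:.2f}x")
```

A 1.6x rise in transfer rate therefore becomes close to a 2x rise in usable bandwidth per lane.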

Graphics

This leaves only the biggest change that Ivy Bridge brings over Sandy Bridge to discuss: the updated GPU. Although Intel won't say exactly how much of the increased transistor count of Ivy Bridge is dedicated to the GPU, comparing die shots shows Ivy Bridge has a much larger percentage of its area dedicated to graphics. Despite this increase in GPU size, Ivy Bridge's 22nm process enables the GPU to deliver either the same performance with half the power draw or double the performance-per-watt compared to Sandy Bridge. In practical terms this will most benefit laptops, where the power savings will be very welcome.

As well as improving performance, the architecture of Ivy Bridge's IGP is also intended to be more easily scalable in future incarnations, by splitting the GPU into five of what Intel calls domains. First is the Global Assets domain, which deals with geometry and setup, and also features a hardware tessellation unit, required to deliver the DirectX 11 support this GPU provides. Second, the Slice Common houses the rasterizer, pixel back-ends and an L3 cache. A GPU-specific L3 cache wasn't included in Sandy Bridge, but Ivy Bridge's GPU uses it to reduce traffic on the ring bus that runs between the GPU and CPU.

Domain three, called the Slice, contains the GPU shaders, or Execution Units (EUs), along with caches, texture samplers and a media sampler. The number of EUs is a fairly good indicator of the likely performance of the IGP - Sandy Bridge's top-end had 12 EUs, while Ivy Bridge gets 16. Image quality is also said to be improved, thanks to better anisotropic filtering. Domain four then deals with fixed-function media and codec tasks, while domain five drives the IGP's displays. Ivy Bridge ups Sandy Bridge's two-display maximum to three, although the combinations possible will depend on what outputs motherboard makers provide.

As well as this rejig of how the GPU is laid out, Intel has also taken the time to improve the performance of the components within, increasing the instructions per clock the graphics can process, in a further boost to both performance and efficiency. In the right circumstances, Ivy Bridge's GPU is significantly faster than Sandy Bridge's, and even in typical use an improvement of up to 60 per cent is expected.

Also of note is that Intel's power management takes account of the load on both the CPU and the GPU while sticking within its TDP, such that when the CPU is sitting idle, more power is made available to the GPU should it need it, and vice versa. None of the improvements Intel has made to the IGP with Ivy Bridge will let it replace a dedicated card for high-performance tasks such as heavy GPGPU number crunching or, of course, gaming, but what Ivy Bridge's graphics do accomplish is to render (no pun intended) low-end dedicated cards all but redundant.

Summary

As might be expected of a tick development, Ivy Bridge's improvements aren't sufficient to make it a compelling upgrade option for those with an existing Sandy Bridge system. The real success of Ivy Bridge is that Intel has now set itself up well for the future, with a successful transition to 22nm and the much improved tri-gate transistors that process uses. Once mobile Ivy Bridge CPUs start making their way into laptops, the benefits the update brings will become far more noticeable than on the desktop, where energy efficiency isn't usually as much of a concern as outright performance.
