The best way to understand the features and capabilities of the Cortex-A15 is to keep the chip’s heritage firmly in mind. Each iteration of the Cortex-A family has increased CPU execution efficiency and performance, but always with the goal of remaining in a low-power envelope. The Cortex-A15 is meant to continue this tradition, but it’s also the first chip ARM has ever explicitly aimed at the low-end server market.
The logical evolution of Cortex-A9
The A15 is exactly what you’d expect from a company that wants to target very different usage scenarios with the same processor. It’s a 32-bit chip that supports 40-bit physical addressing, multiple power domains, and hardware-level virtualisation, and it adds several new instructions to the ARMv7 ISA. The L1 cache is twice as wide (128-bit, up from 64-bit). The Cortex-A15 can decode three instructions per clock cycle (the A9 could only decode two), and it can issue eight micro-ops per cycle compared to the A9’s four.
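To put the 40-bit physical addressing figure in perspective, here is a rough back-of-the-envelope sketch. It assumes byte-addressable memory, and the function and variable names are ours, purely for illustration.

```python
# Illustrative arithmetic only: compare the memory reachable with
# 32-bit and 40-bit physical addresses (byte-addressable memory assumed).

GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(address_bits: int) -> int:
    """Return the number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


# A plain 32-bit physical address reaches 4 GiB.
gib_32bit = addressable_bytes(32) // GIB

# A 40-bit physical address reaches 1,024 GiB, i.e. 1 TiB.
gib_40bit = addressable_bytes(40) // GIB

print(f"32-bit physical addressing: {gib_32bit} GiB")
print(f"40-bit physical addressing: {gib_40bit} GiB")
```

In other words, the wider physical address lets a 32-bit core sit in front of far more RAM than the usual 4GB ceiling, which matters much more for servers than for phones.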
The A15’s branch predictor is more advanced than the A9’s, it can execute a greater range of instructions out-of-order, and it can execute 128-bit NEON/SIMD instructions in a single cycle. It can execute a pair of Load/Store commands simultaneously and supports multiple clock domains.
These enhancements make the Cortex-A15 significantly more powerful than the Cortex-A9; AnandTech’s recent benchmarks of a new Chromebook from Samsung show it outpacing Intel’s dual-core/quad-threaded Atom as well. This is unsurprising. Virtually all of the improvements Intel has made to Atom since it launched the core in 2008 have been on the power efficiency side of the equation – the chip’s performance has scarcely budged.
One of the Cortex-A15’s most important tools isn’t actually part of the CPU. Server chips require sophisticated linkages and cross-communication capabilities, but it didn’t make sense to build those components into the base CPU design when the average smartphone/tablet would never need them. ARM’s CoreLink CCI-400 (Cache Coherent Interconnect) is a separate block of silicon that connects the CPUs, MMUs, graphics, Ethernet, and memory controller.
The CCI-400 is integral to ARM’s plans to push into servers. Each component attaches to the CCI using a 128-bit data path, and the unit is designed to run at 50 per cent of the clock speed of the Cortex-A15s. With Cortex-A15 clocks of 1GHz to 2.5GHz, that works out to a CCI clock speed of between 500MHz and 1.25GHz. ARM is talking up the CCI-400 as a key component of big.LITTLE, its pairing scheme that combines Cortex-A15 and Cortex-A7 CPUs on the same SoC to maximise execution efficiency, but we expect the interconnect will primarily be used to connect dense server SoCs.
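The clock arithmetic above, plus a ceiling on what a single 128-bit port could move, can be sketched as follows. The CPU clock range is inferred from the 500MHz to 1.25GHz figures in this article, and the one-transfer-per-cycle assumption is ours, so treat the bandwidth numbers as theoretical upper bounds rather than measured or ARM-quoted figures. All names are hypothetical.

```python
# Back-of-the-envelope sketch, not ARM-published data.

def cci_clock_mhz(cpu_clock_mhz: float, ratio: float = 0.5) -> float:
    """Interconnect clock, given that the CCI runs at 50 per cent of the CPU clock."""
    return cpu_clock_mhz * ratio


def peak_port_bandwidth_gb_s(clock_mhz: float, width_bits: int = 128) -> float:
    """Theoretical peak for one port in GB/s.

    Assumption (ours): one full-width transfer per clock cycle, no overhead.
    """
    bytes_per_cycle = width_bits / 8
    return clock_mhz * 1e6 * bytes_per_cycle / 1e9


# CPU clocks implied by the article's 500MHz to 1.25GHz CCI range.
for cpu_mhz in (1000, 2500):
    cci_mhz = cci_clock_mhz(cpu_mhz)
    bandwidth = peak_port_bandwidth_gb_s(cci_mhz)
    print(f"CPU {cpu_mhz} MHz -> CCI {cci_mhz:.0f} MHz, "
          f"up to {bandwidth:.1f} GB/s per 128-bit port")
```

Real-world throughput would land below these ceilings once protocol overhead and contention between the attached components are taken into account.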
Putting the A15 in context
Based on the figures we’ve seen to date, the Cortex-A15 is one of the fastest – possibly the fastest – mobile architecture currently on the market. Much depends on implementation; smartphone/tablet performance can vary considerably depending on cache size, RAM speed, the number of memory channels, and of course the power envelope.
The Cortex-A15 will show up in mobile phones, but ARM’s own presentation slides put a major emphasis on tablets, notebooks, NAS devices, routers, and servers. Among phones, the A15 will likely be confined to high-end models for now. Even there, battery life will take a hit, if ARM’s guidance on frequency is anything to go by.
For now, the best way to think about the Cortex-A15 is as the architecture that puts the greatest emphasis on performance, with a corresponding hit to power consumption. It may not be as flexible as the Cortex-A9, Krait, or even Atom, but it could rack up huge wins in tablets and low-end Windows RT notebooks, where batteries are larger.
Does the Cortex-A15 threaten Chipzilla’s mobile plans?
There are people who will look at the performance gap between Atom and the Cortex-A15 and conclude that ARM has won the day – game, set, and match. This is inaccurate. The current 32nm Atom parts (Medfield and Clover Trail) draw significantly less power than the old 45nm hardware. More importantly, no one will be squeezing a 1.7GHz Cortex-A15 into a phone any time soon.
Intel’s first efforts in phones and tablets have been aimed at carving out a competitive space for itself in midrange markets. The Cortex-A15, in contrast, is a high-end part. It won’t face serious competition from either x86 manufacturer until Intel launches its 22nm Atom refresh and AMD’s 28nm Kabini debuts. Both of these events are scheduled for mid-2013.
We expect ARM’s various partners to make more noise around the Cortex-A15 in servers, cloud devices, and tablets/notebooks than smartphones. It’s not a threat to Intel’s enterprise interests at the moment, but it could present Santa Clara with a problem long term if ultra-dense servers, such as those produced by AMD/SeaMicro, start gnawing into the traditional x86 market.
For more on ARM and what lies further down the line, check out our performance analysis of the ARM Cortex-A57 and A53 vs. the Cortex-A8, A9, A15 and A7.