Beyond mere megapixels: The smartphone camera of the future

Smartphone cameras have come a long way – moving from being a convenient way to share a mediocre snapshot, to near pro-quality image capture tools in the right hands. While the old benchmark of resolution seems to have topped out, innovation is accelerating in many areas of mobile camera technology, so we can look forward to continuing upgrades as new phones are released.

Bigger, better pixels

After years of racing towards higher megapixel counts, camera vendors have finally come to realise that more isn’t always better. More pixels packed into the same size sensor means smaller pixels. Smaller pixels typically have more noise than their larger brethren – they capture fewer photons in a given time period, so each stray photon or electron contributes a proportionally larger amount of noise. Tiny pixels also run closer to the diffraction limits of optics – particularly the inexpensive kind found in phones – so the added resolution isn’t really all it’s cracked up to be. In some high resolution cameras, a 50 per cent increase in pixel resolution only equates to an effective resolution boost of around 10 per cent.
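The link between pixel size and noise can be illustrated with a rough Python sketch. It assumes the pixel is limited only by photon shot noise, which follows Poisson statistics, so the signal-to-noise ratio is the square root of the photon count. The pixel areas and the FLUX value are illustrative assumptions rather than measurements from any real sensor, and read noise and the other electron noise mentioned above are ignored.

```python
import math

def shot_noise_snr(pixel_area_um2, photons_per_um2):
    """SNR of a pixel limited only by photon shot noise (Poisson statistics)."""
    photons = pixel_area_um2 * photons_per_um2  # expected photons collected
    noise = math.sqrt(photons)                  # Poisson standard deviation
    return photons / noise                      # equals sqrt(photons)

FLUX = 250.0  # hypothetical photons per square micrometre per exposure

large = shot_noise_snr(4.0, FLUX)  # assumed 4 square micrometre pixel
small = shot_noise_snr(1.0, FLUX)  # assumed 1 square micrometre pixel
print(f"large pixel SNR: {large:.1f}")
print(f"small pixel SNR: {small:.1f}")
print(f"ratio: {large / small:.2f}")
```

Under these assumptions, quadrupling the pixel area doubles the signal-to-noise ratio, because the ratio grows with the square root of the light collected.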

HTC has led the way in the retro effort to go back to fewer, larger pixels. Its 4-megapixel Ultrapixel cameras feature sensor sites that have three times the surface area of rival 13-megapixel cameras – 4 square micrometres versus 1.3. In a somewhat odd move, Nokia has also swerved from offering the uber-resolution 41-megapixel Pureview 808 to trumpeting the “good enough” 8.7-megapixel resolution of its new flagship, the Lumia 920. In exchange, the Lumia 920 picks up amazing low light performance, and is going head-to-head with the HTC One in that category.

Unlike HTC’s Ultrapixels, there doesn’t seem to be any one technical change or breakthrough that gives the 920 its excellent low light performance. Instead it seems to be a combination of high fill factor from its back-illuminated sensor, better optical image stabilisation, a Zeiss “low light optimised” f/2 lens, and lots of fancy noise reduction and image processing done by the phone immediately after the capture.

Faster, cheaper focusing

Autofocus (AF) has been a major source of irritation for smartphone and point-and-shoot camera owners alike. Never fast enough to capture quickly moving action, AF speed has been one of the features that keeps DSLR makers in business. Smartphone makers are moving to change that. HTC’s ImageChip 2, for example, offers 200ms full-range autofocus, a fraction of the second or more that previous phones needed.

Traditionally, camera phone autofocus modules have relied on voice coil motors (VCMs) to move lens elements in order to focus. Sending a current through a coil causes the elements to move towards a magnet. Removing the current allows a spring to pull the elements back in the other direction. The VCM is used to move the focus a little bit at a time, with an image captured and evaluated at each step. A typical camera can have as many as ten or twenty steps between near and far extreme focus, which can easily mean a second or more of total time to acquire focus. No spring chicken, VCM technology dates back to the first telephone, and comes with baggage in the form of heat, noise, and lack of precision.
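The step-and-evaluate loop described above can be sketched in a few lines of Python. This is a simplified illustration, not any vendor's firmware: the sharpness function is a toy stand-in for a real contrast measurement, and the 50ms per step is an assumed figure chosen so that a 20-step sweep takes one second.

```python
def sweep_autofocus(sharpness_at, steps, ms_per_step):
    """Full sweep: step the lens, score each frame, keep the sharpest position."""
    best_step, best_score = 0, float("-inf")
    for step in range(steps):
        score = sharpness_at(step)  # capture a frame and measure its contrast
        if score > best_score:
            best_step, best_score = step, score
    return best_step, steps * ms_per_step

def toy_sharpness(step):
    """Hypothetical lens whose sharpness peaks at step 13."""
    return -(step - 13) ** 2

best, elapsed_ms = sweep_autofocus(toy_sharpness, steps=20, ms_per_step=50)
print(best, elapsed_ms)  # 13 1000
```

Because every step costs a capture and an evaluation, total focus time scales with the number of steps, which is why faster actuators translate so directly into faster focusing.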

Microelectromechanical systems (MEMS) technology offers the promise of improving AF speed, while reducing the size and power consumption of camera modules. MEMS devices use an electrostatic charge to pull two comb-shaped surfaces together, much like a solid-state version of what happens in a VCM. DigitalOptics Corporation (DOC) has created a MEMS-based AF system that requires only one moving lens element (instead of moving the entire lens assembly like many VCM systems). MEMS also allows DOC to create thinner camera modules, helping to make thinner smartphones a possibility. DOC claims that its MEMS-based system reduces lens tilt during autofocus, which in turn reduces image distortions including vignetting.

DOC is planning to sell a 5.1mm tall, 8-megapixel camera module with its MEMS-based AF technology (which it calls mems|cam) to Chinese smartphone makers. Its high-end solution doesn’t appear to come cheap though: the list price for 10,000 units of its camera module with included ISP is $25 (£16) per module. DOC claims its solution offers similar AF speeds to those of the nippy HTC One, with peak power consumption of only 1mW for the AF functionality. Faster AF speed also makes some interesting post-processing options available. DOC has demoed its mems|cam module capturing six frames with different focus depths in quick succession, allowing Lytro-like DOF fiddling after the fact – at least if not much moves in the frame during the exposures.
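One way such after-the-fact refocusing could work is to pick, from the bracketed frames, the one that is sharpest at the point the user taps. The Python sketch below illustrates that idea only; it is not DOC's algorithm. The two tiny greyscale frames are made-up stand-ins for a six-frame focus bracket, and the contrast measure is deliberately crude.

```python
def local_contrast(frame, x, y):
    """Sum of absolute differences between a pixel and its 4 neighbours."""
    height, width = len(frame), len(frame[0])
    total = 0
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            total += abs(frame[y][x] - frame[ny][nx])
    return total

def refocus(frames, x, y):
    """Return the index of the bracketed frame that is sharpest at (x, y)."""
    return max(range(len(frames)), key=lambda i: local_contrast(frames[i], x, y))

# Toy frames: the first is sharp on the left, the second is sharp on the right.
near_focus = [[0, 90, 50, 50],
              [90, 0, 50, 50]]
far_focus = [[45, 45, 0, 90],
             [45, 45, 90, 0]]
frames = [near_focus, far_focus]

print(refocus(frames, 0, 0))  # 0: the near-focus frame wins on the left
print(refocus(frames, 3, 1))  # 1: the far-focus frame wins on the right
```

The approach only holds up if the frames line up pixel for pixel, which is why movement during the burst spoils the effect.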

Lower end phones have typically not even bothered with autofocus, instead saving the dollar or so in cost that a typical voice-coil-based AF element adds. Startup LensVector is hoping to address that need with its low cost Liquid Crystal (LC) AF element. By cleverly re-aligning the orientation of the liquid crystal molecules in a gradient, its AF element can change the refractive index of different areas of the lens, effectively changing the focus. While no faster than its voice coil competition, it is substantially less expensive and operates silently.

HDR: Post process your images before you take them

The relatively small photo sites in camera phone sensors (even the large Ultrapixels are smaller than the pixels in high end point-and-shoot or DSLR cameras) restrict their dynamic range. As a result, backlit photos or photos combining sun and shade don’t look natural when processed. Either the shaded areas lack all detail or the bright areas are completely burned out. High-dynamic range (HDR) photography combines two or more images with different exposures, then uses a process called tone mapping to fit the merged result into a displayable range, to try to take the “best of both” images and create a single image more accurately reflecting the original scene.
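As a rough illustration of the idea, the Python sketch below works in two steps: it merges a short and a long exposure into one radiance estimate, then tone maps that estimate back into the 0 to 255 range. It is a minimal sketch and not the pipeline of any particular phone. It assumes a linear sensor response, uses a simple weighting that trusts mid-tones over clipped values, and applies the global curve L / (1 + L) for tone mapping. The pixel values and exposure times are invented for the example.

```python
def hat_weight(value):
    """Trust mid-tones most; near-clipped values get the lowest weight."""
    return min(value, 255 - value) + 1  # +1 keeps every weight non-zero

def merge_exposures(frames, exposure_times):
    """Estimate relative scene radiance per pixel from linear 8-bit frames."""
    radiance = []
    for samples in zip(*frames):
        weights = [hat_weight(v) for v in samples]
        weighted = sum(w * v / t
                       for w, v, t in zip(weights, samples, exposure_times))
        radiance.append(weighted / sum(weights))
    return radiance

def tone_map(radiance):
    """Compress radiance into 0-255 with the global curve L / (1 + L)."""
    return [round(255 * value / (1 + value)) for value in radiance]

# Two hypothetical pixels: one in deep shade, one in bright sun.
short_frame = [2, 100]   # exposure time 1: shade is nearly black
long_frame = [16, 255]   # exposure time 8: sun is clipped at 255
radiance = merge_exposures([short_frame, long_frame], exposure_times=[1, 8])
display = tone_map(radiance)
print(radiance, display)
```

In this toy case the shaded pixel draws mostly on the long exposure and the sunlit pixel mostly on the short one, so neither end of the scene is lost.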

For many years, HDR could only be done after the snap had been taken, with processing software on a computer, until Apple introduced in-phone HDR with the iPhone 4S and changed all that. That announcement was only the beginning of what is possible for intelligent image processing right in the phone at the moment of capture. Now phones are being introduced with not just HDR still image capture, but full-time HDR video. The bottleneck for new features of this type has been that they have needed to be custom-coded by the phone vendor and rely on the image signal processor (ISP) chip to do the work. Nvidia is smashing through that limit with its new Chimera architecture, which will be available starting with its Tegra 4 family of processors.

Beyond Lytro: Nvidia Chimera pushes the computational photography envelope

All that most people know about computational photography is that it includes fancy maths and allows Lytro users to refocus their images after the fact. Those tricks with focus are only the tip of the iceberg for what high powered processors coupled with computational photography software can achieve with the right image data.

Nvidia has taken a bold step by developing an entire architecture for computational photography. Its Chimera architecture allows the sensor, ISP, CPU, and GPU to all work together to do real-time processing of captured frames. Better shared memory access, harnessing of the GPU cores, and an open application interface enable an entirely new class of mobile device camera capabilities.

The first application using Chimera is full-time HDR for both still and video photography in the upcoming Tegra 4 family of mobile processors. Having HDR integrated helps not only with high contrast images, but also to reduce the awful glare that low end lenses like those found in smartphones exhibit when pointed at bright surfaces or light sources. Because the HDR processing is done at the lowest level of the architecture, the multiple images are captured very close together in time, reducing the ghosting artifacts found in existing phone camera HDR solutions.

By unleashing the horsepower of the GPU during image capture (Nvidia claims up to 100 billion operations per second currently), features formerly only found on high end cameras, like real-time object tracking, will become available on smartphones. With Chimera, real-time panoramas will be possible by simply sweeping the camera in any direction. Best shot selection will also be a natural application for Chimera.

Other vendors are putting together systems with many of these capabilities, but what makes Chimera unique is that it has an open interface. This interface allows other companies to write plug-ins that have access to the low-level data straight off the sensor, coupled with the computing power of the ISP, GPU and CPU.

While it remains to be seen whether Google and Microsoft allow these programming interfaces to shine through in stock Android or Windows RT, there will certainly be an opening for custom camera applications integrated with homebrew ROM versions. Chimera is already part of the Tegra 4 announcement, and is open enough to support this type of advanced functionality.

What will the ultimate smartphone camera look like?

Putting all these innovations together will take a few years, but it’s inevitable. Combining a Lytro-like light field sensor with a high powered architecture like Chimera will make amazing photo effects and post-processing possible in real-time, in the phone. MIT’s Camera Culture team and startups Pelican, Heptagon, and Rebellion are all working on the light field sensor component – as are Apple and HTC (it’s expected, anyway).

Google certainly doesn’t want to be left out. Hiring computational photography guru, Pelican adviser, SynthCam creator, and Stanford professor Marc Levoy to work on its mobile photography architecture is just one indication of how serious it is. To quote Google’s senior vice president Vic Gundotra: “We are committed to making Nexus phones insanely great cameras. Just you wait and see.”

Sensor architecture will also continue to advance, with stacked sensors allowing greater on-chip innovation. Expect zero-lag global shutters (shutters which read out the entire frame at once, eliminating motion artifacts) to become commonplace. Vendors aren’t content with autofocus, either: real zooms will soon start to be available. Add-on lenses will also increase in functionality, providing true wide-angle and telephoto capabilities for our mobile devices. Rumours for the Nexus 5 even include the possibility of a camera module co-branded with Nikon. The only question will be whether anyone will need a point-and-shoot any more once these innovations come to smartphones.

Image Credit: Pelican Imaging