As we approach Microsoft’s and Sony’s announcements for their next generation game consoles, the rumour mill is kicking into high gear. The latest rumour has the next Xbox, codenamed Durango – also known as the Xbox 720 – performing natural language recognition in a similar way to Apple’s Siri. While the Xbox 360 currently has some voice recognition through the Kinect, it’s limited in usefulness by the restrictive implementation of the actual voice controls.
After the rise of motion controls in the last generation, it’s clear that the consoles aren’t just going to compete on performance anymore. Both Nintendo and Sony have bet on touchscreens to different extents, so moving the SmartGlass technology forward isn’t going to be enough of a distinguishing feature for the next generation Microsoft hardware. While the Xbox 360 is currently the king of consoles in the US, it’s not doing nearly as well in the rest of the world. It’s going to need a hook of some sort, and this voice functionality just might be a big part of what Microsoft has up its sleeve.
Microsoft has had some fairly good ideas with the Kinect up to this point, but the hardware has some serious technical constraints. Originally, the Kinect was supposed to have a better camera and standalone image processing, but that was tossed out for the final version. Sadly, this meant developers were hamstrung by the hardware limitations. It’s not just the voice controls that are stilted and wonky. Still, Microsoft has been able to ship 18 million units, so the Kinect has been an unqualified financial success as an add-on.
The Verge has sources claiming that the next Xbox will be able to perform speech-to-text, natural language recognition, and wake-on-voice tasks. If Durango ships paired with a next-gen Kinect that offers substantially better voice and motion detection, Microsoft could have something special here.
There are roughly 75 million Xbox 360s in the wild, but only a small fraction of those have a Kinect paired with them. If every Durango console has a next-gen Kinect, developers will actually be incentivised to create games that take full advantage of those features instead of haphazardly slapping on Kinect functionality. Financially speaking, reaching an entire user base is much more compelling than reaching the fraction that went out to buy a peripheral device.
Information is still scarce, so we don’t know all of the implementation details quite yet. The consoles will undoubtedly be much more powerful than current phones and tablets, so it seems plausible that the next-gen Xbox could handle the speech recognition on the hardware itself instead of sending it to the cloud like Siri does. It’s not a foregone conclusion, though, and having everything pass through servers does have benefits. It allows for better analysis of how people use the service, and offers on-the-fly updating to the software. On the other hand, a network round trip is no guarantee of reliability: even with an iPhone sitting next to a wireless router, Siri still sometimes chokes on very basic commands. With the current always-on DRM rumours, don’t be surprised if the Xbox 720 needs a constant network connection in either scenario.
Whether it’s handled in the cloud or locally, better speech recognition is a safe bet going forward from Microsoft. Let’s hope that this time around, Redmond knocks it out of the park. It’d be nice to see the Xbox team finally deliver on the full potential of the Kinect concept, since they certainly didn’t do that with the current hardware.
For more on the next Xbox’s potential, check out our piece on IllumiRoom, and to keep up to date with all the latest rumours and news on both Sony and Microsoft next-gen fronts, see our “all you need to know” round-ups for the Xbox 720 and PlayStation 4.