AppleTV Needs a Kinect Style Camera
The AppleTV needs a camera that, at a minimum, replicates the Kinect’s functionality.
I’m not pretending this is a revolutionary idea. I am quite sure that, in Apple’s unrelenting march to expand its halo of products, there is an entire lab of people in Cupertino working on this (to get an idea of how much Apple has in the pipeline, visit the folks at Patently Apple). The battle for the living room has begun, and two of the leading warriors are Apple and Microsoft. At stake are billions of dollars via control of the entertainment experience in the living room, and these two companies already have a long-standing grudge.
Apple can make an even better Kinect.
The beauty of Apple products is that, through an amazing User Interface (UI), things that are complicated seem very simple, even for those who are unfamiliar with computers. Two examples:
Routers and Back-ups
Both are necessary, finicky and a pain. They are also intimidating for novices. Apple has made products for both that are idiot-proof (Time Capsule and Airport Express/Extreme). I have both an Express and an Extreme, and they only take moments to set up. They aren’t the best routers I have ever owned, but they are stable, never have to be rebooted and are a breeze to set up. Almost anyone can figure these out without a manual.
iPad Photo Album
My second example is the photo browsing functionality in the iPad (and iPhone). Sometimes I hand computing products to people without explaining how they work. I like to see how intuitive they are (the devices, not the people). You can learn a lot by watching someone use a new interface, whether it be an OS, an App, a peripheral or a website. So, when my mother asked to see my wedding photos, I handed her the iPad with the album open, but with no explanation of how to use it. Immediately, and without any instruction, she “knew” that brushing her finger back and forth would flip the “pages” of the album. Very shortly thereafter, she realized she could pinch, expand and manipulate her view. Soon she was browsing through other albums and had grasped the whole UI.
In the above examples, Apple took products that were already well established in the computing world (back-ups, routers and photo albums) and made them so much better. They elevated the dull and commonplace to something elegant. I fully believe they can do this in the living room as well.
Why would people use this?
With regard to Microsoft’s Kinect, I think everyone understands this is one of the biggest wins a tech company has had in the living room in a long time. Microsoft’s “controller free gaming and entertainment experience” is the fastest-selling consumer electronics device (Wikipedia article here). As of March 2011, 10 million units had shipped. It’s bringing a higher-end gaming experience to users who need a more intuitive interface (and previously would have used the Wii). It also helps cement the Xbox as the Entertainment Centre for the living room.
At about the same time, Apple began having its own living room win with the AppleTV (you can read more about my thoughts on it here). Like most recent Apple products, it’s intuitive, graceful and “just works”. Apple’s brilliance with the AppleTV wasn’t just in the UI, which by Apple standards I find to be mediocre. It was in the price. It’s $99; people buy cables that cost more than that. One of the things it lacks most, though, is a FaceTime interface. I think about my parents living 6 hours away from my niece and nephew and how they would love to sit in the living room and interact with them on their TV. Currently, my mother uses Skype for this purpose. I love that she gets so much use out of her little Netbook, but it’s not natural. People converse in living rooms, not hunched over a laptop in a home office or den.
Now imagine a $75 peripheral: a tiny camera that sits very subtly on top of your TV. With FaceTime enabled, families could connect in a way that encourages longer, better conversations in a more natural setting. Quite literally, when using FaceTime on the AppleTV, you would be welcoming people into your living room. People’s living rooms aren’t just comfortable; they also reflect what owners want visitors to see and are an expression of their personalities. Home offices, dens and bedrooms generally are not. So instead of seeing a big pile of books and unfolded clothes in the background, your friends and family would see a couch, a painting and perhaps a sunny window. Moreover, your “guests” wouldn’t just see you. The great flaw (for families) in today’s video chatting solutions is that they are really for one-to-one contact. The cameras and devices are made to capture one person, sitting very close. They are not at all suitable for a couple sitting on a couch with their child(ren) and talking to grandparents.
Think it’s far-fetched that Apple would do something like this? They have the scope: the iPhone is the world’s most popular camera now (read a bit more about that here). That means a lot of experience in integrating cameras, an understanding of how people use them and a pre-existing user base for FaceTime.
What would this camera do?
Apple isn’t in the game of playing catch-up. A camera (even if it’s amazing) isn’t going to deliver a mind-blowing experience. What would, however, is a camera with functionality similar to the Kinect’s. One of the greatest flaws of the AppleTV UI is that the remote is very limited. It’s brilliant for your regular select/fast forward/play style commands. However, its shortcomings become apparent as soon as a person wants to start inputting more information (trying to search for a movie in Netflix by its name, for example). A Kinect style webcam could assist with this in two ways:
1. Gesture and voice based controls. People love this about the Kinect. This is the minimum functionality that Apple would need to include. Microsoft has demonstrated it can be done and created the demand; now it would just be up to Apple to improve how it works.
2. An “in the air” keyboard. Essentially, the camera would track your fingers as you type on an imaginary keyboard. As on the iPad and iPhone, when typing is required, a keyboard would slide onto the screen. Then your hands would appear on the screen (being tracked by the camera). This is important because people have a very hard time typing without feedback. On a regular keyboard, this is done through touch and sound (clicking). On an iPad, it’s through sound and sight (clicking and seeing the letters pop up). In the case of the AppleTV camera, a user would hold their hands in the air and type (or alternatively use a coffee table), while seeing their hands on-screen and hearing the clicks.
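To make the “in the air” keyboard idea a little more concrete, here is a rough sketch in Python of the core logic: mapping a tracked fingertip to a key on the on-screen keyboard, and registering a “press” only at the moment the finger pushes forward past a threshold. Everything in it is my own assumption for illustration: the Fingertip type, the normalized coordinates, the PRESS_DEPTH value and the simplified left-aligned layout are all hypothetical, a single finger is tracked, and no real Kinect or Apple API is used. It assumes some tracking system has already turned camera frames into fingertip positions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical, simplified layout: three left-aligned rows of letter keys.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEYS_PER_ROW = 10     # every row is divided into ten equal key slots
PRESS_DEPTH = 0.05    # assumed forward "tap" distance that counts as a press


@dataclass
class Fingertip:
    """One tracked fingertip in one camera frame (hypothetical format)."""
    x: float      # 0.0 (left edge) up to 1.0 (right edge) of the keyboard
    y: float      # 0.0 (top edge) up to 1.0 (bottom edge) of the keyboard
    depth: float  # how far the finger has pushed forward from its rest position


def key_under(tip: Fingertip) -> Optional[str]:
    """Return the key the fingertip is hovering over, or None if there is none."""
    if not (0.0 <= tip.x < 1.0 and 0.0 <= tip.y < 1.0):
        return None  # finger is outside the keyboard area
    row = ROWS[int(tip.y * len(ROWS))]
    col = int(tip.x * KEYS_PER_ROW)
    return row[col] if col < len(row) else None  # shorter rows have empty slots


def typed_text(frames: Iterable[Fingertip]) -> str:
    """Turn a sequence of frames into typed characters.

    A key is emitted only on the frame where the finger first crosses
    PRESS_DEPTH, so holding a finger down does not repeat the letter.
    """
    text = []
    was_down = False
    for tip in frames:
        is_down = tip.depth >= PRESS_DEPTH
        if is_down and not was_down:
            key = key_under(tip)
            if key is not None:
                text.append(key)  # a real UI would also play the click here
        was_down = is_down
    return "".join(text)


if __name__ == "__main__":
    frames = [
        Fingertip(0.55, 0.5, 0.00),  # hovering over "h"
        Fingertip(0.55, 0.5, 0.08),  # pushes forward: press
        Fingertip(0.55, 0.5, 0.09),  # still held: no repeat
        Fingertip(0.55, 0.5, 0.00),  # released
        Fingertip(0.75, 0.1, 0.07),  # moves to "i" and presses
    ]
    print(typed_text(frames))  # prints "hi"
```

The point where the key is appended is where the feedback described above would happen: the on-screen key lights up and the click plays, so the user knows the press was registered.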
This “airboard” would open up a world of possible applications for the AppleTV. It’s severely limited now (and will be even when Apple opens an AppleTV App store), because the remote is not robust enough to handle complex input. To a certain extent, this will be solved through a better version of the Remote App, but that relies on households sharing iPads or iPhones (or each person having one).
This solution would provide an input device that could never be lost, would never need batteries and would be easy for anyone to understand. In short, the perfect remote.
Update: Microsoft has just rolled a lot of what I mention above into the Xbox Kinect’s functionality for Netflix. You can read more here.