Sunday, October 23, 2011

Siri Will Need To Do More

Apple has released Siri for its latest iPhone, which lets you order your phone around better than any previous voice recognition system, thanks to an amazing understanding of context. Of course, mobile phones are utter context machines, knowing so much about where you are, who you are, and how you will pay for things. But Siri also has an inkling of your intent: the context of what you want to do, carried from one command to the next.

(Incidentally, I'd really like an AIer to write a comparison of Siri, formerly a DARPA project to create a voice-driven personal assistant by giving it an understanding of the world, and Cyc, a god-knows-what-it-is-now project to give computer programs smarts by giving them an understanding of the world.)

Right now the reports from the field are that, when Siri has a connection and can properly use big computers on the Internet to decode your voice, users feel like they have a whole new relationship to their iPhones. They feel empowered and in control in a whole new way, dictating messages and asking questions as if they had been using computers this way all their lives. One field report casually used the word "hate" to describe working with the phone without Siri. But Siri is constrained: it only works with the basic functions of an iPhone, such as making a call, setting reminders, writing a text, and looking things up. And we ask our phones to do so much more these days.
Nokia N9 Press Shot

Just check Helen Keegan's writings on The Trouble With Apps to see how the sheer number of things we can do with our smartphones is breaking down our ability to use them. Or another example: I remember seeing this first press shot of the Nokia N9 and thinking, in rapid succession:

  1. My, that's gorgeous.
  2. How the hell am I going to get what I want done with all those little icons?
Basically, we can't find out how to do what we want to do, and this is a growing problem on every smartphone. The apps revolution is now at the point where the current model is broken by too much choice: which app does exactly what we want instead of the other 7 that kind of do it, where do I get it, how do I find it again, how does it work?

(As an aside: this is now true for almost every area of connected computing. From eBay to Amazon, it has become impossible to find what we really want, instead of just an approximation of features, unless brands or word of mouth or professional mediators help us. News needs aggregators with a political slant, like blogs and newspapers, to be manageable, and those then get aggregated in meta-publications. Netflix spends tons of resources trying to make a better recommendation engine, which ends up being tweaks on two other recommendation algorithms combined. We need better ways to let the systems know what we want and like so they can find it for us, even things we did not consider.)

Will Siri help iPhone users? It would require opening up apps on an intention level, making apps able to declare to the phone "I can do this", like "I can play Words With Friends", "I can edit a document", "I can make a restaurant reservation". That requires a lot of careful thought from an app maker, and it can become a nuclear war between apps when they try to game the system by declaring they can do something they can only do halfway. There are some really awfully unethical app makers out there. But it pretty much has to be done somehow: Siri is too much of an advantage to keep limited.
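To make the "I can do this" idea a bit more concrete, here is a minimal sketch in Python of what such a declaration system might look like. This is purely my own illustration, not anything Apple has announced: the `IntentRegistry` class, the intent names, and the apps "TableBooker" and "DinnerDeals" are all made up for the example.

```python
class IntentRegistry:
    """Hypothetical registry: maps an intent name to the apps that claim it."""

    def __init__(self):
        self._handlers = {}  # intent name -> list of app names, in declaration order

    def declare(self, app, intent):
        """An app announces 'I can do this' for a single intent."""
        self._handlers.setdefault(intent, []).append(app)

    def resolve(self, intent):
        """Return every app that claims the intent (empty list if none do)."""
        return list(self._handlers.get(intent, []))


registry = IntentRegistry()
registry.declare("Words With Friends", "play_words_with_friends")
# Two made-up apps both claim the same intent.
registry.declare("TableBooker", "make_restaurant_reservation")
registry.declare("DinnerDeals", "make_restaurant_reservation")

candidates = registry.resolve("make_restaurant_reservation")
unclaimed = registry.resolve("edit_document")
```

Note that the registry only knows what each app claims. With two apps declaring the same intent, nothing here tells the phone which one does the job properly and which only does it halfway, and that is exactly where the gaming would start.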

So now that all technology is evolving to emulate the expectations set by Star Trek, a couple of predictions:


  • We need Comm Badges so we do not even need to take the phone out of our pocket to make a hands-free call with Siri. I am thinking of something in the current iPod Nano form factor that you just touch and hold while giving a Siri command, which it then forwards over Bluetooth. People in public places are about to become a whole lot more irritating.
  • If Apple makes a TV, it will have a webcam built in, and we will be able to tell our iPhones to move the FaceTime video call from our iPhones to "On screen". The deeper the voice you say it with, the faster it happens, and suddenly grandma can see the whole family who were watching something, while their show is properly paused.