When does voice interface to computers become primary?

When do you expect voice to become the primary channel of communication
with computers?

-- Claudio Gatti, December 22, 1996


Never. That's not to slam Victor Zue, MIT's speech recognition wizard, whose research group is down the hall from my office. Victor's software works amazingly well, especially when it has some context (e.g., making airline reservations). But a computer that has to be told what to do is like a Web site where they require registration: lacking in subtlety. I expect my environment to be permeated with computers. They will notice that I always draw the blinds when the sun starts to hit my computer screen and start to do it automatically. They will notice that I'm in bed together with my dog Alex and turn down the heat. They will notice the kinds of Web pages I bookmark and the products that I buy with my credit card. I will be communicating with computers all day every day without being conscious of the fact.

I don't necessarily think this is a great thing. Probably most of these computers will be rather intrusive. But that is how I think the landscape will look.

-- Philip Greenspun, December 22, 1996

Let's hope the answer really is never, and let's hope that the "intelligent adaptive behaviour" that Philip describes never happens either. There is little that is so annoying as intrusive, unwanted "helpfulness". It is bad enough that a bloody paper clip interferes all the time in MS Word. I want my computing tools (or at least the interface) to be as direct and obedient as a hammer, saw, chisel, plane, axe, spade or screwdriver.


-- Tom Rose, November 14, 2001

Computers as they currently are -- never.

Most of the work done in front of a computer today consists of manipulating various objects on the screen in one way or another. It is easily done with a mouse, and it can be done with a keyboard with some effort.

I don't see anyone in his right mind sitting in front of a PC a la Blade Runner and saying "Switch to the next window... next... there. Now move that cursor left... more... more... there. Select an area of 100 by 100 pixels down and to the right... extend it right... more... more... there". Such actions take 1-2 seconds with a mouse and a minute with voice.

Compared to mouse movements or typing on the keyboard, voice is terribly redundant. It evolved to communicate with other humans via an extremely noisy channel. In the case of our interaction with computers there's virtually no noise, and all the complexity of our voice is just unnecessary overhead. An atavism, may I say.

-- Alexei Kornienkov, November 15, 2001

Like other respondents: primary? - never.

Other points (cumbersome, redundant) are right, I think, but I suspect that most of all, it's just not private enough (not simply for snooping - simple embarrassment too). For blind people it could be excellent, but most people mainly want better kinds of keyboard and screen, and to write letters and search for info silently.

So I think improved keyboards offer big opportunities closer to hand than voice, followed by thought-to-computer interfaces further away. Training a computer to recognise an individual's speech patterns will turn out to be not that different, in all the important difficulties, from training that computer to recognise, from some reasonably short distance (cf. those US Navy thought-interface experiments in the 1980s), that user's intention to say or write something.

Voice will probably turn out to have been an arid middle ground - much harder than better keyboards, and not that much easier than thought-recognition.

-- Mark Griffith, November 15, 2001

I personally would love to see it as an option! You know that it's like everything else. People are used to doing things one way, but as technology advances, perhaps it won't take a minute to tell the computer to do something a mouse can do in seconds. After all, even if it does, you'll notice that in futuristic visions like Star Trek: The Next Generation, sometimes they still default to using the LCARS interface instead of pure voice. I think that representation is not pure fantasy; I think it really could be that seamless if people would stop being such whiny nincompoops and be willing to put the effort into using the mouse and keyboard a tad less. :-)

-- TNR TNR, November 9, 2005