Bringing It All Together
Why is the VoiceXML approach important? First,
the phone is important. There are over
1.5 billion phones in use, far more than
there are Internet-connected computers.
Phones are easy to use and don't need
to be booted up. Telephone networks are
much more reliable than data networks.
Mobile phones are reaching high penetration
rates too: unlike notebook computers and
many PDAs, mobile phones are highly portable,
inexpensive, and have long battery lives.
Mobiles are a natural match for location-based
applications. They can be used while driving
(though not always safely).
Second, voice is important on the phone. Voice
has always been the natural mode of communication
for phones. Even though some mobiles have
WAP/XHTML browsers, their small screens
and keypads make micro browsers hard to
use, especially while driving. The i-mode
system is more compelling, though it shares
the same limitations.
But there are advantages to combining visual
browsing and voice browsing. For instance, complex
information is hard to remember when spoken to the
user, but easy to remember if it is presented in a
persistent visual form. And some misrecognitions of
spoken input are easy to correct with keypad entry.
Therefore, we should soon begin to see multi-modal
applications deployed alongside pure visual applications
and pure voice applications.
Third, the Internet is important to voice applications:
- Voice application development is easier because
VoiceXML is a high-level, domain-specific
markup language, and because voice applications
can now be constructed with plentiful,
inexpensive, and powerful web application
development tools.
- Voice applications are now far easier to deploy.
No longer must they reside on a special-purpose
voice server in a proprietary "walled
garden": they can be placed anywhere
on the Internet and accessed from any
VoiceXML-compliant voice server.
- Applications can be cleanly structured into service
logic on the web server and presentation
logic in VoiceXML pages delivered to
the voice browser. This has many advantages,
not the least of which is that a common
application back end on the web server
can serve up different types of presentation
logic based on the user's device. Because the
back end is written once and shared, this
factoring can yield substantial savings in
development and maintenance effort.
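The split between service logic and presentation logic described in the last point can be sketched in a few lines. This is a minimal illustration, not a prescribed design: the function names, the sample forecast data, the VoiceXML 2.0 namespace, and the idea of detecting a voice browser from its User-Agent string are all assumptions made for the example.

```python
from xml.sax.saxutils import escape


def get_forecast(city):
    # Service logic (hypothetical): a real application would query a
    # database or another web service here.
    data = {"Chicago": "sunny, high of 75"}
    return data.get(city, "unavailable")


def render_voicexml(city, forecast):
    # Presentation logic for a voice browser: a one-prompt VoiceXML page.
    # Version 2.0 is an assumption; adjust to what your voice server supports.
    return (
        '<?xml version="1.0"?>\n'
        '<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">\n'
        "  <form>\n"
        "    <block>\n"
        f"      <prompt>The forecast for {escape(city)} is {escape(forecast)}.</prompt>\n"
        "    </block>\n"
        "  </form>\n"
        "</vxml>\n"
    )


def render_html(city, forecast):
    # Presentation logic for a visual browser: the same data as HTML.
    return (
        "<html><body>"
        f"<p>Forecast for {escape(city)}: {escape(forecast)}</p>"
        "</body></html>"
    )


def handle_request(city, user_agent):
    # The shared back end runs once; only the rendering step depends on
    # the device. Matching on "voicexml" in the User-Agent is an assumed
    # convention for this sketch, not something real browsers guarantee.
    forecast = get_forecast(city)
    if "voicexml" in user_agent.lower():
        return "application/voicexml+xml", render_voicexml(city, forecast)
    return "text/html", render_html(city, forecast)
```

Calling `handle_request("Chicago", "ExampleVoiceXMLBrowser/1.0")` would return a VoiceXML page, while the same call with a desktop browser's User-Agent would return HTML. In both cases `get_forecast` is untouched, which is the point of the factoring.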
Finally, voice, and therefore VoiceXML, is important
for web devices other than the phone.
For example, a voice-actuated "universal
remote" could have an on-board voice
browser and VoiceXML content generated
from all the devices in its vicinity.
You could walk into your family room,
pull the remote from your shirt pocket,
press its push-to-talk button and say
"stereo: off; television: what action
movies are playing?"