In a show like Star Trek, it’s easy to get lost in some of the more ambitious tech that illustrates humanity’s future. There’s the replicator, a machine that makes basically anything you want. The transporter is another good one. NASA would kill to get its hands on just one warp drive. But there’s another piece of tech that lies in the background, and it’s much more important to our lives today: the ship’s computer.
In every Star Trek series, captains and crewmen bark orders to a faceless computer, and those orders are executed with 100 percent accuracy (barring any mechanical malfunctions, that is). Even when the Klingons are attacking, the computer never misunderstands commands—human or otherwise.
While complex military maneuvers might still call for a traditional graphical user interface, Star Trek makes the case that the future of personal computing is all about the voice user interface (VUI).
And at Google I/O, its annual developer conference, Google went full Star Trek today, revealing new AI tricks headed to Android Q, the latest version of the software that’ll find its way onto billions of devices. Arguably, it’s the first time we’ve truly glimpsed the promise of voice-activated interfaces in real life, and what they could mean for the future of tech design.
It’s a long-accepted idea that the very best user interfaces are the ones that feel the most natural. Perhaps no one understood this better than Steve Jobs, who in 2007 rejected the stylus-driven smartphones of the day and instead relied on the “digital styluses” that nature gave us, all 10 of them.
But this is only one instance in a steady evolution of fitting technology to our natural human inclinations. While our fingers navigated our phones, digital pens returned to enable better drawing and note-taking, fingerprints became our lock buttons, and even our facial expressions were mapped onto Apple’s Animoji.
“Voice assistants represent the third key UI and technology platform shift of the past three decades,” says Harvard Business Review. “Web pages gave us ‘click’… smartphones introduced ‘touch’…these transitions required consumers to learn a new language…the shift to voice doesn’t require any training.”
It’s our voices that will really change the way we think about computing, and it’s why Amazon has at least 10,000 employees working on Alexa and Google has outspent other companies on AI research by more than $3 billion.
After all, complex language is what separates us from every other species on the planet. It’s unique to us and it’s our most powerful natural tool for communication. So it only makes sense that eventually it would also become the best means to communicate with our devices.
Just like Captain Picard tasking the ship’s computer with a string of complex actions in near-real time, all of us will be able to do the same with our phones and laptops using just our voices.
The Reality of Voice Computing
“What if we could bring the AI that powers the Assistant right onto your phone?” asked Scott Huffman, Google’s VP of engineering for the Assistant. “What if the Assistant was so fast at processing your voice that tapping to operate your phone would almost seem slow?”
This, of course, addresses one of VUI’s many limitations compared to the traditional graphical user interface. With billions of different voices, the deep subtleties of human language, and the added processing lag of speech recognition, VUI works for one-off Google queries, but feels nearly unusable if you’re trying to get real work done.
And that’s where Google’s so-called “Next Generation Assistant” comes in.
“Running on device, [Google Assistant] can process and understand requests in real time,” said Huffman. “And deliver the answers up to ten times faster.”
After making this bold claim, Huffman invited a fellow (human) assistant to walk through Google Assistant’s new tricks. The Google AI blazed through several apps, completing tasks like “open my calendar,” “what’s the weather,” and “book a Lyft to my hotel.”
The Assistant wasn’t just able to answer quickly; it could also respond to a string of commands, each one providing context for the next. That means you don’t have to say “OK Google” a million times, because it retains the context of your previous questions.
Other demonstrations showed off the new Assistant’s ability to send texts—sans fingers—and, even more impressively, email. The Assistant was able to differentiate between actions like “set subject as” or “send it” and the actual text of the email itself, providing a deeper understanding of words than the Assistant has ever had.
Deeper language comprehension and real-time processing may seem like fun parlor tricks, but they’re complex abilities that our impressive human brains often take for granted. These improvements help evolve mobile AI from a novel way to run a Google search into something much more.
A UI for Everyone
One of the biggest drawbacks of the past three decades of technological interfaces is that they’re inherently exclusive.
They created a digital divide between the young, who were raised in a world of smartphones, and the old, for whom computing was a learned skill. And while most of us can use a computer with a mouse and keyboard, the disabled community has been unfairly marginalized.
But everyone can speak, even people who, at first glance, appear as if they can’t. Google is trying to make voice computing as accessible as possible with Project Euphonia, an initiative to make all voices, no matter what, understandable.
Of course, Google’s dream is still just that: a dream. The new and improved Google AI will make its way to devices this fall, but that doesn’t mean a brave new world of voice will immediately ensue.
Tech events are always suspiciously flawless, and real-world use often differs from these highly manicured onstage experiences. But the vision is coming sharply into focus as more and more tools are created that will one day replace the old era of screens and silence.
Five years ago, talking to your phone seemed abnormal, even creepy. But soon it could become as easy as ordering a ship’s computer to fire photon torpedoes.
Originally posted on Popular Mechanics