Your AI is ready to talk: voice tech is getting conversational

September 19, 2018

Those voices aren’t in your head: voice technology is taking Silicon Valley by storm.

As one of the hottest topics at TechCrunch Disrupt 2018 in San Francisco (where, for the first time, each keynote presentation was transcribed automatically in real time), voice technology dominated two panels. Executives from top tech companies, including Google and Amazon, discussed aspects of voice technology ranging from uses to ethical implications.

Cathy Pearl, Head of Conversation Design Outreach at Google, said that years after leaving the conversational systems space, she was drawn back into it because of the “whole different use case.” Instead of being built to keep humans away from other humans to save money, Pearl said, voice technology today can be used to connect people in more efficient ways, especially for people with disabilities.

“I think these voice systems present this new opportunity to allow people into a world that perhaps they didn’t have access to before,” Pearl said. “We’re still in the infancy of this technology, but I see so much potential and I’m really excited about it.”

The Sutardja Center for Entrepreneurship and Technology (SCET) is also making strides in the field, with its Data Lab giving students the chance to work on interactive technology. SCET’s Alexander Fred-Ojala recently became a recipient of the Alexa Innovation Fellowship, which provides funding to support student startups in artificial intelligence (AI) and voice technology.

At the panel, in response to a question about whether constantly recording people is legal, AISense CEO Sam Liang compared voice technology to Facebook collecting data on its users or to security cameras recording public spaces. Liang said he anticipates the system remaining active continuously, which could help people record and remember the minute details of their lives.

Cell phones have microphones and are “more easily hackable,” yet people are comfortable carrying them around all day, while they tend to be suspicious of a smart speaker in their home. Pearl said that transparency is an important part of making people more comfortable with speaking to machines.

“Part of it is perception, but perception is more important than reality,” Pearl said. “So it’s important to us to make sure that we address that and don’t try to dismiss it, but really listen to what people are concerned about.”

At a separate panel, Clinc CEO Jason Mars said that developers are currently working on making the interaction between human and machine more conversational. Mars described how the technology is working towards taking a loose, casual statement — for example, “Hey man, I remember I was in California last year, and I dropped a bunch of cash on groceries. Could you just sum that up real quick and give it to me?” — and producing helpful, tangible results for the user.

“There’s a lot of salient bits of information in that messy utterance,” Mars said. “And our system would be able to extract all of the bits of information relevant to answer your question cleanly and crisply. It’s divine design from scratch.”
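What Mars describes is, in broad strokes, intent and slot extraction: pulling a handful of structured fields out of a free-form sentence. The short Python sketch below is a hypothetical, rule-based illustration of that idea only; the function names, keyword lists, and patterns are assumptions made for this example, not Clinc’s actual system or API.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of intent/slot extraction from a messy utterance.
# A production conversational system would use trained language-understanding
# models rather than keyword lists and regular expressions.

@dataclass
class ParsedQuery:
    intent: str                 # what the user wants done
    category: Optional[str]     # spending category, e.g. "groceries"
    location: Optional[str]     # e.g. "California"
    time_range: Optional[str]   # e.g. "last year"

KNOWN_CATEGORIES = {"groceries", "gas", "rent", "travel"}
KNOWN_LOCATIONS = {"california", "new york", "texas"}

def parse_utterance(text: str) -> ParsedQuery:
    lowered = text.lower()

    # Intent: "sum that up" -> aggregate spending
    intent = "sum_spending" if re.search(r"\bsum\b", lowered) else "unknown"

    # Slots: pick out the first known category / location mentioned
    category = next((c for c in KNOWN_CATEGORIES if c in lowered), None)
    location = next((l for l in KNOWN_LOCATIONS if l in lowered), None)

    # Time range: a few common relative expressions
    time_match = re.search(r"\b(last year|last month|this year|this month)\b", lowered)
    time_range = time_match.group(1) if time_match else None

    return ParsedQuery(intent, category, location, time_range)

if __name__ == "__main__":
    utterance = ("Hey man, I remember I was in California last year, and I dropped "
                 "a bunch of cash on groceries. Could you just sum that up real "
                 "quick and give it to me?")
    print(parse_utterance(utterance))
    # ParsedQuery(intent='sum_spending', category='groceries',
    #             location='california', time_range='last year')
```

Run on the utterance from the panel, the sketch recovers the spending category, the location, and the time range; a real system would also resolve relative dates like “last year” against a calendar and query the user’s transaction history to produce the requested total.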

Alex Smola, director of machine learning at Amazon, said he believed that voice technology is moving towards enhancing processes already in use, such as issuing voice commands while driving or an app that tells you out loud whether your garage door is closed.

Smola also discussed the importance of keeping entire systems “debuggable” to avoid harmful biases that can emerge from large-scale data collection, while Mars stressed constantly reevaluating the reasons for using that data.

“Voice and conversational experiences (are) a means to an end, and not an end in and of itself,” Mars said. “The overarching thing about technology is we want to reduce complexity in our lives and improve our quality of life. That’s it.”