Alexa and Siri Won't Work in Noisy Environments

Matthew Goldey

July 17, 2018

Voice assistants are everywhere, especially in people’s homes. We use Alexa, Siri, and Google Home to control our lights, answer trivia questions, and play music. Notably, sales of voice assistant devices have more than doubled in the last year.1

Beyond the home, more people are using voice commands on their phones. By 2020, nearly half of all internet searches are predicted to be voice driven.2 Why type when you can talk? For short phrases, speech recognition is now three times faster than a human typing.3

Not far behind the consumer market, the business world is catching on. Unlike households (well, some households), business environments are often noisy. What’s more, daily activities are not transferable from business to business. A financial trader wants to pull up a chart of Apple’s stock price. A first responder needs to look up warrants for an address. A logistics professional needs to pull up orders from the last three months. The noise and the variety of tasks make developing skills for these workplaces challenging.

What’s a skill?

To help people, voice assistants have specific “skills” they can understand. These skills have an “intent” and “entities” related to that “intent”. You might tell your Amazon Echo “Alexa, turn on the bedroom lights”. The “intent” of the voice command is what Alexa should do. The entities are the action “turn on” and the object “bedroom lights.”
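To make the intent/entity split concrete, here is a minimal sketch in Python of how a parsed command might be represented. The names `ParsedCommand`, `parse_lights_command`, and the intent label `control_lights` are hypothetical, chosen for illustration; they are not part of any Alexa or Scribe API.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ParsedCommand:
    """What the assistant should do (intent) and the details it needs (entities)."""
    intent: str
    entities: Dict[str, str] = field(default_factory=dict)


# Hypothetical pattern for a single skill: an action followed by an object.
_LIGHTS_PATTERN = re.compile(
    r"\b(turn on|turn off)\s+(?:the\s+)?(.+?lights?)\b", re.IGNORECASE
)


def parse_lights_command(text: str) -> Optional[ParsedCommand]:
    """Return the intent and entities for a lights command, or None if no match."""
    match = _LIGHTS_PATTERN.search(text)
    if match is None:
        return None
    return ParsedCommand(
        intent="control_lights",  # assumed intent name
        entities={
            "action": match.group(1).lower(),
            "object": match.group(2).lower(),
        },
    )


command = parse_lights_command("Alexa, turn on the bedroom lights")
print(command)
```

For the example command, the entities come out as the action "turn on" and the object "bedroom lights", matching the description above.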

These skills often fail in noisy environments. Let’s say you were building a skill to react to the command:

Get me directions to 55 West Monroe Street

Behind each voice command is a voice-to-text transcription engine. In practice, this is at most 95% accurate, and 75% or less accurate in noisy environments. You may get transcripts back like this:

Get me directions to 55 West Monroe Street
Get me directions to 55 West Rose Street
Give me directions to 55 ... Monroe...
Give me directions to 55 Western Row Street

If you build a skill with the entities direction and address, your skill can miss the address – or find a wrong address – if you only rely on the most likely transcript.
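The failure mode can be sketched in a few lines of Python. The regular expression and the function `extract_address` below are illustrative assumptions, not how any particular assistant extracts addresses; the transcripts are the four shown above.

```python
import re
from typing import List, Optional

# Illustrative pattern: a street number, one or more capitalized words, then "Street".
ADDRESS_PATTERN = re.compile(r"\b(\d+(?:\s+[A-Z][a-z]+)+\s+Street)\b")


def extract_address(transcript: str) -> Optional[str]:
    """Pull an address out of a single transcript, or return None if none is found."""
    match = ADDRESS_PATTERN.search(transcript)
    return match.group(1) if match else None


transcripts: List[str] = [
    "Get me directions to 55 West Monroe Street",
    "Get me directions to 55 West Rose Street",
    "Give me directions to 55 ... Monroe...",
    "Give me directions to 55 Western Row Street",
]

# A skill that sees only one of these transcripts gets only one of these results.
results = [extract_address(t) for t in transcripts]
for transcript, address in zip(transcripts, results):
    print(f"{transcript!r} -> {address!r}")
```

Only the first transcript yields the intended address. The second and fourth yield a wrong address, and the third yields nothing at all.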

Enter Scribe Discovery Engine

We built Scribe Discovery to help developers write skills for noisy environments.

Scribe’s real-time transcription engine provides hundreds of possible transcripts.

Scribe ranks these by how much they sound like the audio it heard. Discovery searches the rich output to find the targeted terms.

In our example, Discovery finds not just the target address, but also these other possible addresses:

55 West Monroe Street
5 West Monroe Street
55 West Monroe
55 Monroe
5 Monroe
55 West Rose
55 Western
5 Western
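The general idea of searching every ranked transcript, rather than only the top one, can be sketched as follows. This is not Scribe Discovery's implementation or API: the functions `collect_address_candidates` and `pick_known_address`, the ranking order of the transcripts, and the set of known addresses are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Optional, Set

# Illustrative pattern: a street number followed by one or more capitalized words.
CANDIDATE_PATTERN = re.compile(r"\b\d+(?:\s+[A-Z][a-z]+)+")


def collect_address_candidates(ranked_transcripts: List[str]) -> Dict[str, int]:
    """Map each candidate address to the best (lowest) rank at which it appears."""
    best_rank: Dict[str, int] = {}
    for rank, transcript in enumerate(ranked_transcripts):
        for candidate in CANDIDATE_PATTERN.findall(transcript):
            best_rank.setdefault(candidate, rank)
    return best_rank


def pick_known_address(candidates: Dict[str, int], known: Set[str]) -> Optional[str]:
    """Return the best-ranked candidate that the skill recognizes as a real address."""
    for candidate in sorted(candidates, key=candidates.get):
        if candidate in known:
            return candidate
    return None


# Assumed ranking for illustration: the top transcript contains the wrong street.
ranked = [
    "Get me directions to 55 West Rose Street",
    "Get me directions to 55 West Monroe Street",
    "Give me directions to 55 ... Monroe...",
    "Give me directions to 55 Western Row Street",
]
known_addresses = {"55 West Monroe Street"}  # hypothetical address book

candidates = collect_address_candidates(ranked)
print(candidates)
print(pick_known_address(candidates, known_addresses))
```

With only the top transcript, the skill would see "55 West Rose Street". Searching the whole list surfaces "55 West Monroe Street" as well, and the skill can choose it because it matches an address it knows.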

As a developer, you can build skills as if transcripts contained no mistakes. Discovery sorts through all the possible transcripts for you, uncovering 10-25% more entities than are present in the most likely transcript alone, a clear advantage in noisier environments.

Check out the video below to see Discovery in action.

Interested in learning more? Check out our documentation on Discovery.
