Breakthrough Voice Extraction™ Technology Enhances Voice Interface Performance -- Even in Extreme Noise Conditions. Doubles the accuracy of automatic speech recognition in noisy environments and dramatically improves far-field voice quality in smartphones and wearables.
No Hands Needed
The 4 x 4 mm, low-power, digital Whisper Voice Chip enables wearable electronics to perform accurate speech recognition and to provide more natural voice quality. What’s more, it provides a true hands-free, noise-cancelled speech-recognition interface that works nearly everywhere – regardless of environmental noise – addressing the key pain point about which consumers complain today.
The key to Whisper’s extreme accuracy is Kopin’s Voice Extraction™ Filter.
Feeding Whisper’s exceedingly clean linear voice signal into a speech engine results in significant improvements in speech-recognition rates and a more natural-sounding voice to the far-end listener. (In tests, Whisper maintained an accuracy rate of 93 percent in a 95 dB noise environment, compared to competitive products, which maintained an accuracy rate of just 40 percent or less.)
The accompanying chart compares the performance of smart glasses with the Whisper chip against the automatic speech recognition (ASR) and noise cancellation technologies found in two popular devices: a leading Bluetooth earphone and a leading smartphone. While the Whisper chip’s performance remains consistent as noise levels increase, the earphone’s performance begins degrading at 75 decibels (the amount of noise associated with a car interior or dishwasher), while the smartphone’s ASR performance starts to drop at approximately 85 decibels (the amount of noise associated with a restaurant).
For manufacturers, the Whisper Chip is simple to implement, sitting between the microphones and speech engine. It works with the leading operating systems, processors and speech recognition engines. Key features and benefits include:
Better ASR performance – Voice Extraction Filter dramatically increases the accuracy of existing speech recognition engines in noisy environments, whether the processing resides on the local device or in the cloud.
Clearer human-to-human voice – Sounds natural and ‘clean’ even in noisy environments.
Faster speech engine response – Whisper enables a faster response because the clean voice signal requires less processing.
Enhanced privacy – There is no need to yell into your device, even in a noisy environment, so only the person or machine you intend to address hears you.
Tunable – parameters can be adjusted to optimize for different applications and different microphone and speaker configurations.
The Whisper Chip is completely different from other audio chips: it is an all-digital solution that runs at only 16 MHz, consumes less than 12 mW of power and replaces the CODEC – no ADC or DAC is needed. It is also compact (4 x 4 mm) and accepts up to four digital microphone inputs. The technology embedded in the Whisper Chip is protected by more than 20 patents and patents pending.
A completely different approach to noise. We use dynamic AI:
- Sample acoustic environment 16,000 times per second
- Dynamic analysis of noise and voice activity
- Voice Extraction Filter to ‘extract’ voice without distortion
- More than 20 patents issued and pending
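To make the idea of per-frame noise and voice-activity analysis concrete, here is a minimal, generic sketch in Python. It is not Kopin's Voice Extraction Filter, whose internals are not described here; it swaps in a simple energy-threshold detector with an adaptive noise floor. Only the 16,000 samples-per-second rate comes from the text above. The 10 ms frame length, the threshold ratio, the smoothing factor and all function names are assumptions chosen for illustration.

```python
import math
import random

SAMPLE_RATE_HZ = 16_000   # from the text: 16,000 samples per second
FRAME_MS = 10             # assumption: 10 ms analysis frames
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 160 samples per frame


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def detect_voice(samples, threshold_ratio=4.0, floor_smoothing=0.9):
    """Return one boolean per frame: True where energy exceeds the noise floor.

    The noise floor is a running average updated only on frames judged
    to be noise, so it tracks slow changes in the background level.
    """
    flags = []
    noise_floor = None
    for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_LEN):
        energy = frame_energy(samples[start:start + FRAME_LEN])
        if noise_floor is None:
            noise_floor = energy  # assumption: the first frame is noise only
        is_voice = energy > threshold_ratio * noise_floor
        if not is_voice:
            noise_floor = (floor_smoothing * noise_floor
                           + (1 - floor_smoothing) * energy)
        flags.append(is_voice)
    return flags


def demo_signal(seed=0):
    """Half a second of low-level noise with a louder 200 Hz tone in the middle."""
    rng = random.Random(seed)
    samples = []
    for n in range(SAMPLE_RATE_HZ // 2):
        noise = rng.uniform(-0.05, 0.05)
        in_burst = 3200 <= n < 4800  # samples covering frames 20..29
        tone = 0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE_HZ) if in_burst else 0.0
        samples.append(noise + tone)
    return samples


flags = detect_voice(demo_signal())
```

On this synthetic signal the detector flags frames 20 through 29, the frames that contain the tone, and leaves the noise-only frames unflagged. A plain energy threshold like this one breaks down once the noise is as loud as the voice, which is the regime the text says the Voice Extraction Filter is designed for.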
Other systems use a “physics” approach to suppress noise signals and boost voice signals – this inherently introduces distortions that speech engines cannot process. Kopin’s audio solution outperforms everything we have tested it against, once the parameters are tuned for the device and the application.
Whisper Voice Solution: An AI Approach
Existing Speech Engine or Other Audio Applications