Consider, for example, a mobile app for grocery shopping. What every smartphone owner could use is a voice-enabled version that accepts spoken input instead of compelling the user to fumble on a tiny keyboard, and that actually understands and responds to what has been said. Using an online grocery store’s app, shoppers could just rattle off their shopping list and let the app search for the items and fill the cart. Then they could command the app to proceed to checkout, and voila! Without a single keystroke, the chore is done (leaving aside the wait for home delivery). No fumbling, no worn-out thumbs. Nothing beats hands-free voice input, wouldn’t you agree?
Can an app really be so user-friendly? Welcome to the talky online world populated by the likes of Google Voice and Siri, Apple’s personal assistant app. Technology to voice-enable mobile apps is already a reality.
GoVivace Inc. of McLean, VA has designed an advanced Automatic Speech Recognition engine that is specifically suited for voice-enabled mobile apps. Why’s that?
The key to building reliable and robust voice-enabled mobile apps is to construct a comprehensive application grammar and vocabulary: technical jargon for the set of pre-specified possibilities that the app consults to understand speech input. The more inclusive the grammar, the more the app will understand, and the more intelligent it will seem to the user!
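To make the idea of a grammar and vocabulary concrete, here is a minimal sketch in Python. It is only an illustration, not GoVivace's actual grammar format: the word lists, the single "add <quantity> <unit> of <item>" command shape, and the names `build_grammar` and `in_grammar` are all assumptions made for this example.

```python
import re

# Hypothetical vocabulary for a grocery app (illustrative only).
QUANTITIES = {"one": 1, "two": 2, "three": 3}
UNITS = {"packet", "packets", "bottle", "bottles", "box", "boxes"}
ITEMS = {"oreo cookies", "whole milk", "orange juice"}


def build_grammar(quantities, units, items):
    """Compile the command shape 'add <quantity> <unit> of <item>' into a regex."""
    def alternatives(words):
        # Longest alternatives first, so longer phrases are tried before shorter ones.
        ordered = sorted(words, key=len, reverse=True)
        return "|".join(re.escape(w) for w in ordered)

    pattern = (
        rf"add (?P<quantity>{alternatives(quantities)}) "
        rf"(?P<unit>{alternatives(units)}) "
        rf"of (?P<item>{alternatives(items)})"
    )
    return re.compile(pattern)


GRAMMAR = build_grammar(QUANTITIES, UNITS, ITEMS)


def in_grammar(utterance):
    """Return True if the utterance is one of the pre-specified possibilities."""
    return GRAMMAR.fullmatch(utterance.lower().strip()) is not None
```

With these assumed word lists, `in_grammar("Add two packets of Oreo cookies")` is accepted, while a request for an item missing from the vocabulary is rejected. That is why a more inclusive grammar makes the app seem smarter.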
Say the user asks the voice-enabled mobile app of an online grocery store to add two packets of Oreo’s stuffed chocolate chip cookies to the shopping cart. A number of things happen behind the scenes. As with Siri or any other voice-enabled mobile app, the audio stream representing the spoken input is compressed and sent to a waiting farm of servers. Those servers are also notified of the context in which the input was spoken. Putting together the context and the input, the servers quickly adjust their language model to suit the situation and then convert the audio into text. The servers recognize that “two packets” is the quantity and “Oreo’s stuffed chocolate chip cookies” is the name of the item.
Essentially, the item is looked up in the app’s grammar and vocabulary, which represent the hundreds or even thousands of possible inputs the servers may have to process, and finally the cart is updated. The process involves many steps, but everything happens so quickly that the user doesn’t notice them and happily goes on shopping.
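The last part of that pipeline (after the audio has already been converted to text) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the catalog, the SKU codes, and the function names `parse_command` and `handle_utterance` are hypothetical, and a real system would handle far more phrasings.

```python
import re

# Hypothetical number words and catalog; the SKU codes are made up.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
CATALOG = {
    "oreo's stuffed chocolate chip cookies": "SKU-0001",
    "whole milk": "SKU-0002",
}


def parse_command(text):
    """Split 'add <quantity> <unit> of <item>' into (quantity, item), or None."""
    # Normalize case, whitespace, and typographic apostrophes.
    normalized = text.lower().strip().replace("\u2019", "'")
    match = re.fullmatch(r"add (\w+) (\w+) of (.+)", normalized)
    if match is None:
        return None
    quantity = NUMBER_WORDS.get(match.group(1))
    item = match.group(3)
    if quantity is None or item not in CATALOG:
        return None  # not covered by the app's vocabulary
    return quantity, item


def handle_utterance(text, cart):
    """Look the item up and update the cart; return True on success."""
    parsed = parse_command(text)
    if parsed is None:
        return False
    quantity, item = parsed
    sku = CATALOG[item]
    cart[sku] = cart.get(sku, 0) + quantity
    return True
```

For instance, starting from an empty `cart = {}`, calling `handle_utterance("Add two packets of Oreo's stuffed chocolate chip cookies", cart)` adds a quantity of two under the assumed SKU, while an item outside the catalog leaves the cart unchanged.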
The performance of voice-enabled mobile apps also depends on the quality of the speech recognition engine. Ideally, the engine must be capable of understanding natural language and adapting to variations in voice quality and spoken content. At the same time, it should be easy to use and integrate.
GoVivace’s Automatic Speech Recognition uses both grammars and a statistical language model to understand natural language, which helps build highly precise voice-enabled mobile apps.
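How a grammar and a statistical language model might work together can be sketched as follows. This is not GoVivace's implementation, whose internals are not described here. It is one plausible arrangement, assuming a tiny made-up training corpus, a grammar represented as a plain set of accepted sentences, and an add-one-smoothed bigram model. The recognizer is assumed to supply several candidate transcripts; candidates the grammar accepts are preferred, and the language model ranks whatever remains.

```python
import math
from collections import Counter

# Tiny made-up corpus; a real model would be trained on far more text.
CORPUS = [
    "add two packets of cookies",
    "add one bottle of milk",
    "add three boxes of cereal",
    "proceed to checkout",
]
# Hypothetical grammar: fixed commands the app must recognize exactly.
GRAMMAR = {"proceed to checkout"}


def train_bigrams(sentences):
    """Count bigrams and their left contexts, with sentence boundary markers."""
    contexts, bigrams, vocab = Counter(), Counter(), {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Log-probability of a sentence under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )


def pick_transcript(candidates, grammar, model):
    """Prefer grammar-accepted candidates; rank the pool by language-model score."""
    accepted = [c for c in candidates if c in grammar]
    pool = accepted or candidates
    return max(pool, key=lambda c: log_prob(c, model))


MODEL = train_bigrams(CORPUS)
```

Given the acoustically similar candidates "add to packets of cookies" and "add two packets of cookies", neither is in the assumed grammar, so the bigram scores decide and the second wins because "add two" appears in the training text. The grammar gives precision on fixed commands, and the statistical model gives flexibility on free-form requests.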
So, employ our voice recognition technology in your business or organization to enhance data processing and, more importantly, customer engagement and satisfaction.