Application Enablers

In addition to high-performance speech recognition engines, SRI develops software-based systems that enable advanced functionality for a broad range of speech-based applications.

Spoken Address Capture: 
Capturing address information is one of the most common yet time-consuming tasks in such applications as automated attendants and in-car telematics. Conventional systems employ a series of voice prompts for each address field, as well as confirmation prompts. SRI's Spoken Address Capture (SAC) system accepts an entire spoken address at once, the way a person would naturally say it to another person. SRI's SAC system combines speech recognition technology with database lookup techniques to provide a friendlier, faster, and more accurate one-step entry process.
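
As a rough illustration of how recognition output and database lookup can be combined, the sketch below matches a recognizer's n-best hypotheses against a list of valid addresses and keeps the best-scoring entry. This is not SRI's implementation: the function names, the scoring rule (recognizer confidence multiplied by word-level similarity), the threshold, and the sample addresses are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical address database; a real system would query a postal or map database.
ADDRESS_DB = [
    "12 main street springfield illinois",
    "12 maine street springfield illinois",
    "120 main street shelbyville illinois",
]

def similarity(a: str, b: str) -> float:
    """Word-level similarity between two strings, in the range 0.0 to 1.0."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()

def resolve_address(nbest, database=ADDRESS_DB, min_score=0.6):
    """Pick the database entry that best matches any recognizer hypothesis.

    nbest: list of (hypothesis_text, recognizer_confidence) pairs.
    Returns the matching entry, or None if nothing scores at least min_score.
    """
    best_entry, best_score = None, 0.0
    for text, confidence in nbest:
        for entry in database:
            # Assumed scoring rule: confidence weighted by similarity.
            score = confidence * similarity(text.lower(), entry)
            if score > best_score:
                best_entry, best_score = entry, score
    return best_entry if best_score >= min_score else None

# A misrecognized top hypothesis is corrected by the database match.
nbest = [
    ("12 main street spring field illinois", 0.9),
    ("12 main street springfield illinois", 0.8),
]
resolved = resolve_address(nbest)
```

Because every candidate is checked against known addresses, a hypothesis that the recognizer ranks first but that matches no real address can lose to a lower-ranked hypothesis that does.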

Content Search by Voice: 
Mobile communications, wireless computing, and digital devices put a world of information at consumers' fingertips, but accessing that information on portable devices can be difficult. SRI's Content Search by Voice (CSV) system enables one-step, voice-based retrieval of data from vast content databases, to search for or purchase music, ringtones, books, movies, or any other item that can be catalogued in a database.

Phrase-based and Free-form Translation: 
In fields such as emergency and relief services, healthcare, fire and public safety, and law enforcement, the need to communicate quickly with speakers of other languages is critical, yet human translators are not always available. SRI has developed a range of systems that provide high-accuracy speech-to-speech translation on portable devices, such as the Phraselator™ unidirectional speech translation system, which can translate thousands of phrases from English or Spanish into a foreign language. SRI's newest translation technology provides full spontaneous speech-to-speech translation, integrating speech recognition, speech synthesis, and natural language processing technology.

Data Entry and Form-filling: 
Mobile workforces, such as salespeople, field technicians, health care workers, or manufacturing and logistics personnel, often need to enter information into their organizations' IT systems. To enable easy data entry on mobile computing devices, SRI's DynaForm data entry and form-filling tool allows developers to build data entry templates on Windows Mobile devices, for more natural data entry and an alternative to stylus-based input.

Audio Mining, Transcription, and Time Alignment: 
The field of Internet search is expanding beyond text and images to include audio and video content. Making audio and video content searchable requires the capability to transcribe speech and time-align it with the original source material. SRI's Decipher™ large-vocabulary, speaker-independent, continuous speech recognition system transcribes and indexes broadcast audio, or the audio contained in broadcast video, to enable search in the rapidly growing audio/video field.
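
To illustrate why time alignment matters for search, the minimal sketch below stores each recognized word with its start time, so that a query returns positions in the source audio and not just a yes-or-no match. It does not use the Decipher API; the function names and the sample transcript are hypothetical.

```python
from collections import defaultdict

def build_index(aligned_words):
    """Map each word to the list of times (in seconds) at which it was spoken.

    aligned_words: list of (word, start_seconds) pairs from a recognizer.
    """
    index = defaultdict(list)
    for word, start in aligned_words:
        index[word.lower()].append(start)
    return index

def find_phrase(aligned_words, phrase):
    """Return the start time of every occurrence of a multi-word phrase."""
    target = phrase.lower().split()
    if not target:
        return []
    words = [w.lower() for w, _ in aligned_words]
    times = [t for _, t in aligned_words]
    n = len(target)
    return [
        times[i]
        for i in range(len(words) - n + 1)
        if words[i:i + n] == target
    ]

# Hypothetical time-aligned transcript of a broadcast clip.
transcript = [
    ("the", 0.0), ("weather", 0.4), ("today", 0.9),
    ("the", 5.2), ("weather", 5.5), ("tomorrow", 6.0),
]
index = build_index(transcript)
hits = find_phrase(transcript, "the weather")
```

A player could then seek directly to each returned time in the original audio or video.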

Distributed Speech Recognition: 
To date, speech recognition systems have been deployed in two ways: on a remote server or preloaded on a mobile device. Either approach has forced makers of mobile phones, PDAs, PCs, and consumer and automotive electronics products to accept tradeoffs. To eliminate design sacrifices, SRI has created a third mode of deploying speech recognition: DynaSpeak with Distributed Speech Recognition (DSR). With DSR, a user's speech is preprocessed on the user device and transmitted over a low-bandwidth channel to a full-featured server-side system. The benefits are numerous: higher-quality audio capture, lower cost per device, and centralized management of speech applications.
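
The bandwidth saving from on-device preprocessing can be shown with a toy example. The sketch below is not the DynaSpeak front end: it assumes 8 kHz audio, 25 ms frames taken every 10 ms, and a single log-energy value per frame quantized to one byte, where a real front end would send a vector of spectral features per frame. All names and parameters are hypothetical.

```python
import math

def frame_log_energies(samples, frame_len=200, hop=80):
    """Return one log-energy feature per frame.

    With 8 kHz audio, frame_len=200 is 25 ms and hop=80 is 10 ms (assumed values).
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        feats.append(math.log(energy + 1e-10))  # small offset avoids log(0)
    return feats

def quantize(feats, lo=-25.0, hi=25.0):
    """Map each feature to one byte (0-255) for transmission."""
    scale = 255.0 / (hi - lo)
    return bytes(min(255, max(0, round((f - lo) * scale))) for f in feats)

# One second of a synthetic 440 Hz tone sampled at 8 kHz.
samples = [int(1000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(8000)]

payload = quantize(frame_log_energies(samples))
raw_bytes = len(samples) * 2       # 16-bit PCM would need 2 bytes per sample
feature_bytes = len(payload)       # one byte per frame in this toy scheme
```

Even with a realistic feature vector per frame, the device sends far fewer bytes than raw audio, while the server keeps the full recognition models.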

Director and ActiveX Interfaces: 
Developers of computer-aided learning and training systems often use multimedia authoring environments such as Macromedia Director. SRI has made it easy for Director developers to incorporate speech recognition into their projects by creating a speech recognition Xtra that provides access to EduSpeak's recognition, recording, playback, and pronunciation scoring functions. Developers of distance learning products, or those who use Internet-based models to deliver educational software and training, can incorporate speech recognition through EduSpeak's ActiveX control, which enables speech recognition functions within a Microsoft Internet Explorer browser.
