Useful for artificial intelligence Part-6, AI, Artificial Intelligence
6 Dictation Grammars
Dictation grammars come closest to the ultimate goal of a speech recognition system that takes natural spoken input and transcribes it as text. Dictation grammars are used for free text entry in applications such as email and word processing.
A Recognizer that supports dictation provides a single DictationGrammar which is obtained from the recognizer's getDictationGrammar method. A recognizer that supports the Java Speech API is not required to provide a DictationGrammar. Applications that require a recognizer with dictation capability can explicitly request dictation when creating a recognizer by setting the DictationGrammarSupported property of the RecognizerModeDesc to true (see Section 4.2 for details).
A DictationGrammar is more complex than a rule grammar but, fortunately, is often easier to use. This is because the DictationGrammar is built into the recognizer, so most of the complexity is handled by the recognizer and hidden from the application. However, recognition of a dictation grammar is typically more computationally expensive and less accurate than recognition of simple rule grammars.
The DictationGrammar inherits its basic functionality from the Grammar interface. That functionality is detailed in Section 6.4 and includes grammar naming, enabling, activation, committing and so on.
As with all grammars, changes to a DictationGrammar need to be committed before they take effect. Commits are described in Section 4.2.
In addition to the specific functionality described below, a DictationGrammar is typically adaptive. In an adaptive system, a recognizer improves its performance (accuracy and possibly speed) by adapting to the style of language used by a speaker. The recognizer may adapt to the specific sounds of a speaker (the way that speaker says words). Equally important for dictation, a recognizer can adapt to a user's normal vocabulary and to the patterns of those words. Such adaptation (technically known as language model adaptation) is part of the recognizer's implementation of the DictationGrammar and does not affect an application. The adaptation data for a dictation grammar is maintained as part of a speaker profile (see Section 6.9).
The DictationGrammar extends and specializes the Grammar interface by adding the following functionality:
Indication of the current textual context,
Control of word lists.
The following methods provided by the DictationGrammar interface allow an application to manage word lists and text context.
Table 6-5 DictationGrammar interface methods
Name               Description
setContext         Provide the recognition engine with the preceding and following textual context.
addWord            Add a word to the DictationGrammar.
removeWord         Remove a word from the DictationGrammar.
listAddedWords     List the words that have been added to the DictationGrammar.
listRemovedWords   List the words that have been removed from the DictationGrammar.
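As a rough illustration of how an application might use the word-list methods in Table 6-5, the sketch below pushes a user's custom vocabulary into a grammar without adding the same word twice. To keep it runnable without a speech engine, WordListGrammar is a hypothetical stand-in that borrows only the four method names from the table; its parameter and return types are assumptions. InMemoryWordList and VocabularySync are likewise invented for this example, and the rule that an add cancels an earlier remove (and vice versa) is an assumed bookkeeping model, not documented recognizer behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the word-list part of DictationGrammar.
// Method names come from Table 6-5; the signatures are assumed.
interface WordListGrammar {
    void addWord(String word);
    void removeWord(String word);
    String[] listAddedWords();
    String[] listRemovedWords();
}

// Toy in-memory implementation so the sketch runs without a recognizer.
// Assumption: adding a word cancels an earlier removal of it, and vice versa.
class InMemoryWordList implements WordListGrammar {
    private final Set<String> added = new LinkedHashSet<>();
    private final Set<String> removed = new LinkedHashSet<>();

    public void addWord(String word) {
        // If the word was previously removed, just undo that removal.
        if (!removed.remove(word)) {
            added.add(word);
        }
    }

    public void removeWord(String word) {
        // If the word was previously added, just undo that addition.
        if (!added.remove(word)) {
            removed.add(word);
        }
    }

    public String[] listAddedWords() {
        return added.toArray(new String[0]);
    }

    public String[] listRemovedWords() {
        return removed.toArray(new String[0]);
    }
}

class VocabularySync {
    // Adds every wanted word that the grammar does not already list as added.
    // Returns the words that were actually passed to addWord, in order.
    static List<String> addMissingWords(WordListGrammar grammar, List<String> wanted) {
        Set<String> known = new HashSet<>(Arrays.asList(grammar.listAddedWords()));
        List<String> newlyAdded = new ArrayList<>();
        for (String word : wanted) {
            if (known.add(word)) { // true only the first time a word is seen
                grammar.addWord(word);
                newlyAdded.add(word);
            }
        }
        return newlyAdded;
    }
}
```

With a real recognizer, the same helper would be pointed at the engine's DictationGrammar through a thin adapter, and the changes would still need to be committed before taking effect, as noted earlier in this section.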
6.1 Dictation Context
Dictation recognizers use a range of information to improve recognition accuracy. Learning the words a user speaks and the patterns of those words can substantially improve accuracy.
Because patterns of words are important, context is important. The context of a word is simply the set of surrounding words. As an example, consider the following sentence: "If I have seen further it is by standing on the shoulders of Giants" (Sir Isaac Newton). If we are editing this sentence and place the cursor after the word "standing", then the preceding context is "...further it is by standing" and the following context is "on the shoulders of Giants...".
Given this context, the recognizer is able to more reliably predict what a user might say, and greater predictability can improve recognition accuracy. In this example, the user might insert the word "up" but is less likely to insert the word "JavaBeans".
Through the setContext method of the DictationGrammar interface, an application should tell the recognizer the current textual context. Furthermore, if the context changes (for example, due to a mouse click to move the cursor) the application should update the context.
Different recognizers process context differently. The main consideration for the application is the amount of context to provide to the recognizer. As a minimum, a few words of preceding and following context should be provided. However, some recognizers may take advantage of several paragraphs or more.
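One way an editor might derive the two context strings is to split its text at the cursor and keep a bounded number of words on each side. The sketch below does this for the Newton sentence used above. ContextWindow, its around method, and the maxWords limit are hypothetical names introduced for illustration; they are not part of the Java Speech API, and whitespace splitting is only a stand-in for whatever word boundaries an application actually uses.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the Java Speech API.
final class ContextWindow {
    final String preceding;
    final String following;

    private ContextWindow(String preceding, String following) {
        this.preceding = preceding;
        this.following = following;
    }

    // Splits text at the caret offset and keeps at most maxWords
    // whitespace-separated words on each side of the caret.
    static ContextWindow around(String text, int caret, int maxWords) {
        if (caret < 0 || caret > text.length()) {
            throw new IllegalArgumentException("caret out of range: " + caret);
        }
        if (maxWords < 0) {
            throw new IllegalArgumentException("maxWords must be non-negative");
        }
        String before = text.substring(0, caret).trim();
        String after = text.substring(caret).trim();
        return new ContextWindow(lastWords(before, maxWords), firstWords(after, maxWords));
    }

    // Splits on runs of whitespace; an empty string yields no words.
    private static String[] words(String s) {
        return s.isEmpty() ? new String[0] : s.split("\\s+");
    }

    private static String lastWords(String s, int n) {
        String[] w = words(s);
        int from = Math.max(0, w.length - n);
        return String.join(" ", Arrays.copyOfRange(w, from, w.length));
    }

    private static String firstWords(String s, int n) {
        String[] w = words(s);
        int to = Math.min(n, w.length);
        return String.join(" ", Arrays.copyOfRange(w, 0, to));
    }
}
```

With the caret placed directly after "standing" and maxWords set to 5, preceding is "further it is by standing" and following is "on the shoulders of Giants". An application would pass these two strings to the string form of setContext and recompute them whenever the cursor moves. A caret in the middle of a word splits that word across the two sides, which a real editor may prefer to handle differently.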
There are two setContext methods:
void setContext(String preceding, String following);
void setContext(String preceding[], String following[]);
The first form takes plain text context strings. The second form takes arrays of tokens and should be used when the result tokens returned by the recognizer are available. Internally, the recognizer processes context as tokens, so providing tokens makes the use of context more efficient and more reliable because the recognizer does not have to guess the tokenization.
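To make the token form concrete, the sketch below keeps a document as a list of tokens (for example, tokens taken from earlier recognition results) and hands the tokens on either side of the caret to the array form of setContext. ContextSink is a hypothetical stand-in that declares only the two setContext signatures quoted above, and TokenDocument with its methods is an assumed document model invented for this example; a real application would call the recognizer's DictationGrammar instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in exposing only the two setContext forms quoted above.
interface ContextSink {
    void setContext(String preceding, String following);
    void setContext(String[] preceding, String[] following);
}

// Hypothetical document model that stores text as recognizer tokens.
final class TokenDocument {
    private final List<String> tokens = new ArrayList<>();
    private int caret = 0; // index of the token the caret sits before

    // Inserts tokens at the caret and leaves the caret after them.
    void insert(String... newTokens) {
        for (String token : newTokens) {
            tokens.add(caret++, token);
        }
    }

    // Moves the caret so that it sits before the token at tokenIndex.
    void moveCaret(int tokenIndex) {
        if (tokenIndex < 0 || tokenIndex > tokens.size()) {
            throw new IndexOutOfBoundsException("caret index: " + tokenIndex);
        }
        caret = tokenIndex;
    }

    // Up to max tokens immediately before the caret (max must be non-negative).
    String[] preceding(int max) {
        int from = Math.max(0, caret - max);
        return tokens.subList(from, caret).toArray(new String[0]);
    }

    // Up to max tokens immediately after the caret (max must be non-negative).
    String[] following(int max) {
        int to = Math.min(tokens.size(), caret + max);
        return tokens.subList(caret, to).toArray(new String[0]);
    }

    // Sends the current context using the token-array form of setContext.
    void publishContext(ContextSink sink, int max) {
        sink.setContext(preceding(max), following(max));
    }
}
```

Because the tokens are stored exactly as they were received, nothing has to be re-split from plain text. An application would typically call publishContext after each insertion and after each cursor move.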