Useful for Artificial Intelligence, Part 8

8     Recognizer Properties

A speech engine has both persistent and run-time adjustable properties. The persistent properties are defined in the RecognizerModeDesc, which includes properties inherited from EngineModeDesc. The persistent properties are used in the selection and creation of a speech recognizer. Once a recognizer has been created, the same property information is available through the getEngineModeDesc method of a Recognizer (inherited from the Engine interface).

A recognizer also has eight run-time adjustable properties, listed in Table 6-10. Applications get and set these properties through RecognizerProperties, which extends the EngineProperties interface. The RecognizerProperties for a recognizer are provided by the getEngineProperties method that the Recognizer inherits from the Engine interface. For convenience, a getRecognizerProperties method is also provided in the Recognizer interface to return a correctly cast object.
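The difference between the two accessors can be sketched as follows. This is a minimal illustration, not the real API: the nested interfaces are local stand-ins that only mirror the names used in the text, and CastSketch and newDemoRecognizer are hypothetical names introduced for the example.

```java
class CastSketch {
    // Local stand-ins that mirror the names in the text. They are assumptions
    // made for this sketch; the real interfaces have many more members.
    interface EngineProperties { }

    interface RecognizerProperties extends EngineProperties {
        float getConfidenceLevel();
    }

    interface Engine {
        EngineProperties getEngineProperties();
    }

    interface Recognizer extends Engine {
        RecognizerProperties getRecognizerProperties();
    }

    // Hypothetical recognizer that returns the same properties object
    // from both accessors.
    static Recognizer newDemoRecognizer() {
        final RecognizerProperties props = () -> 0.5f;
        return new Recognizer() {
            public EngineProperties getEngineProperties() { return props; }
            public RecognizerProperties getRecognizerProperties() { return props; }
        };
    }

    public static void main(String[] args) {
        Recognizer rec = newDemoRecognizer();

        // Through the inherited Engine method, an explicit downcast is needed.
        RecognizerProperties viaCast =
                (RecognizerProperties) rec.getEngineProperties();

        // Through the convenience method, the object is already correctly typed.
        RecognizerProperties direct = rec.getRecognizerProperties();

        System.out.println(viaCast == direct);
        System.out.println(direct.getConfidenceLevel());
    }
}
```

Both calls reach the same object in this sketch; the convenience method only saves the application from writing the cast.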

The get and set methods of EngineProperties and RecognizerProperties follow the JavaBeans conventions with the form:

Type getPropertyName();

void setPropertyName(Type);

A recognizer can choose to ignore unreasonable values passed to a set method, or can enforce upper and lower bounds.
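The two ways of handling an unreasonable value can be sketched with a small property holder. This is an assumption-laden illustration: PropertySketch is a hypothetical class, not the RecognizerProperties interface, and which policy a given engine applies to which property is up to that engine. The 0.5 second starting time-out is also an assumed value.

```java
class PropertySketch {
    // Hypothetical engine-side storage for two properties.
    private float confidenceLevel = 0.5f;  // default named in Table 6-10
    private float completeTimeout = 0.5f;  // seconds; assumed starting value

    // JavaBeans-style getter: Type getPropertyName()
    public float getConfidenceLevel() {
        return confidenceLevel;
    }

    // JavaBeans-style setter: void setPropertyName(Type)
    // Policy 1: enforce bounds by clamping into [0.0, 1.0].
    public void setConfidenceLevel(float value) {
        if (Float.isNaN(value)) {
            return; // NaN cannot be clamped meaningfully, so it is ignored
        }
        confidenceLevel = Math.max(0.0f, Math.min(1.0f, value));
    }

    public float getCompleteTimeout() {
        return completeTimeout;
    }

    // Policy 2: silently ignore an unreasonable value (negative or NaN).
    public void setCompleteTimeout(float seconds) {
        if (Float.isNaN(seconds) || seconds < 0.0f) {
            return;
        }
        completeTimeout = seconds;
    }

    public static void main(String[] args) {
        PropertySketch props = new PropertySketch();
        props.setConfidenceLevel(1.7f);   // out of range: clamped
        props.setCompleteTimeout(-2.0f);  // unreasonable: ignored
        System.out.println(props.getConfidenceLevel());
        System.out.println(props.getCompleteTimeout());
    }
}
```

Because a set call may be ignored or adjusted, an application that cares about the effective value should read it back with the matching get method.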

Table 6-10 Run-time Properties of a Recognizer
Property  Description  

ConfidenceLevel  float value in the range 0.0 to 1.0. Results are rejected if the engine is not confident that it has correctly determined the spoken text. A value of 1.0 requires the recognizer to have maximum confidence in every result, so more results are likely to be rejected. A value of 0.0 requires only low confidence, so fewer results are rejected. The recognizer's default is 0.5.  

Sensitivity  float value between 0.0 and 1.0. A value of 0.5 is the default for the recognizer. 1.0 gives maximum sensitivity, making the recognizer responsive to quiet input but also more susceptible to background noise. 0.0 gives minimum sensitivity, requiring the user to speak loudly and making the recognizer less sensitive to background noise. Note: some recognizers set the gain automatically during use, or through a setup "Wizard". On these recognizers, the sensitivity adjustment should be used only in cases where the automatic settings are not adequate.  

SpeedVsAccuracy  float value between 0.0 and 1.0. 0.0 provides the fastest response. 1.0 maximizes recognition accuracy. 0.5 is the default value for the recognizer, which the manufacturer determines as the best compromise between speed and accuracy.  

CompleteTimeout  float value in seconds that indicates the minimum period between the moment a speaker stops speaking (silence starts) and the moment the recognizer finalizes a result. The complete time-out is applied when the speech prior to the silence matches an active grammar (cf. IncompleteTimeout). 

A long complete time-out value delays the result and makes the response slower. A short time-out may lead to an utterance being broken up inappropriately (e.g. when the user takes a breath). Complete time-out values are typically in the range of 0.3 seconds to 1.0 seconds.  

IncompleteTimeout  float value in seconds that indicates the minimum period between the moment a speaker stops speaking (silence starts) and the moment the recognizer finalizes a result. The incomplete time-out is applied when the speech prior to the silence does not match an active grammar (cf. CompleteTimeout). In effect, this is the period the recognizer will wait before rejecting an incomplete utterance. The IncompleteTimeout is typically longer than the CompleteTimeout.
  
ResultNumAlternatives  integer value indicating the preferred maximum number of N-best alternatives in FinalDictationResult and FinalRuleResult objects (see Section 7.9).

Returning alternatives requires additional computation. Recognizers do not always produce the maximum number of alternatives (for example, because some alternatives are rejected), and the number of alternatives may vary between results and between tokens. A value of 0 or 1 requests that no alternatives be provided, only a best guess.  

ResultAudioProvided  boolean value indicating whether the application wants the recognizer to provide audio with FinalResult objects. Recognizers that do not provide result audio can ignore this call. (See Result Audio for details.)  

TrainingProvided  boolean value indicating whether the application wants the recognizer to support training with FinalResult objects.  
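The interplay of the CompleteTimeout, IncompleteTimeout and ConfidenceLevel rows in Table 6-10 can be sketched as a pair of decision helpers. This is a simplified model under stated assumptions, not how any particular engine works: TimeoutSketch, onSilence and isAccepted are hypothetical names, and the sketch assumes the engine scores each result on the same 0.0 to 1.0 scale as ConfidenceLevel.

```java
class TimeoutSketch {
    enum Decision { KEEP_LISTENING, FINALIZE_RESULT, REJECT_INCOMPLETE }

    // Hypothetical helper: decides what to do after a stretch of silence.
    // The complete time-out applies when the speech so far matches an active
    // grammar; otherwise the incomplete time-out applies.
    static Decision onSilence(float silenceSeconds,
                              boolean matchesActiveGrammar,
                              float completeTimeout,
                              float incompleteTimeout) {
        if (matchesActiveGrammar) {
            return silenceSeconds >= completeTimeout
                    ? Decision.FINALIZE_RESULT
                    : Decision.KEEP_LISTENING;
        }
        return silenceSeconds >= incompleteTimeout
                ? Decision.REJECT_INCOMPLETE
                : Decision.KEEP_LISTENING;
    }

    // Hypothetical helper: a result is accepted only if its score reaches the
    // configured confidence level (assumed 0.0 to 1.0 score scale).
    static boolean isAccepted(float resultScore, float confidenceLevel) {
        return resultScore >= confidenceLevel;
    }

    public static void main(String[] args) {
        float completeTimeout = 0.5f;    // within the typical 0.3 to 1.0 s range
        float incompleteTimeout = 1.0f;  // typically longer than completeTimeout

        // 0.6 s of silence after a complete phrase, then after a partial one.
        System.out.println(onSilence(0.6f, true, completeTimeout, incompleteTimeout));
        System.out.println(onSilence(0.6f, false, completeTimeout, incompleteTimeout));

        // The same score against a lenient and a strict confidence level.
        System.out.println(isAccepted(0.6f, 0.5f));
        System.out.println(isAccepted(0.6f, 0.9f));
    }
}
```

The sketch shows why the incomplete time-out is usually the longer of the two: a speaker who pauses mid-phrase gets extra time to finish before the utterance is rejected.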


