Useful for artificial intelligence Part-5

Artificial Intelligence


5.1     Rule Definitions

A rule grammar is defined by a set of rules. These rules are defined by logical combinations of tokens to be spoken and references to other rules. The references may refer to other rules defined in the same rule grammar or to rules imported from other grammars.

Rule grammars follow the style and conventions of grammars in the Java Speech Grammar Format (defined in the Java Speech Grammar Format Specification). Any grammar defined in the JSGF can be converted to a RuleGrammar object. Any RuleGrammar object can be printed out in JSGF. (Note that conversion from JSGF to a RuleGrammar and back to JSGF will preserve the logic of the grammar but may lose comments and may change formatting.)

Since the RuleGrammar interface extends the Grammar interface, a RuleGrammar inherits the basic grammar functionality described in the previous sections (naming, enabling, activation etc.).

The easiest way to load a RuleGrammar, or a set of RuleGrammar objects, is from a Java Speech Grammar Format file or URL. The loadJSGF methods of the Recognizer perform this task. If multiple grammars must be loaded (where a grammar references one or more imported grammars), loading by URL is most convenient. The application must specify the base URL and the name of the root grammar to be loaded.


Recognizer rec;
URL base = new URL("http://www.acme.com/app");
String grammarName = "com.acme.demo";

// loadJSGF is the Recognizer method named in the text above
Grammar gram = rec.loadJSGF(base, grammarName);

The recognizer converts the base URL and grammar name to a URL using the same conventions as ClassLoader (the Java platform mechanism for loading class files). By converting the periods in the grammar name to slashes ('/'), appending a ".gram" suffix and combining with the base URL, the location is "http://www.acme.com/app/com/acme/demo.gram".
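
The naming convention above can be sketched in plain Java. This is a minimal illustration, not the recognizer's actual implementation: the class and method names (GrammarLocationSketch, grammarLocation) are hypothetical, and adding a "/" after the base when one is missing is an assumption made so that the result matches the location quoted above.

```java
import java.net.URI;

public class GrammarLocationSketch {

    // Hypothetical helper: maps a base location and a grammar name to the
    // location of the grammar file, following the convention described in the text.
    static URI grammarLocation(URI base, String grammarName) {
        // Periods in the grammar name become path separators, plus a ".gram" suffix
        String relativePath = grammarName.replace('.', '/') + ".gram";

        // Assumption: treat the base as a directory even without a trailing slash
        String baseText = base.toString();
        if (!baseText.endsWith("/")) {
            baseText += "/";
        }
        return URI.create(baseText + relativePath);
    }

    public static void main(String[] args) {
        URI base = URI.create("http://www.acme.com/app");
        System.out.println(grammarLocation(base, "com.acme.demo"));
        // prints http://www.acme.com/app/com/acme/demo.gram
    }
}
```

An imported grammar would be located the same way, starting from its own grammar name.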

If the demo grammar imports sub-grammars, they will be loaded automatically using the same location mechanism.

Alternatively, a RuleGrammar can be created by calling the newRuleGrammar method of a Recognizer. This method creates an empty grammar with a specified grammar name.

Once a RuleGrammar has been loaded, or has been created with the newRuleGrammar method, the following methods of a RuleGrammar are used to create, modify and manage the rules of the grammar.

Table 6-1 RuleGrammar methods for Rule management
Name  Description
setRule  Assign a Rule object to a rulename.
getRule  Return the Rule object for a rulename.
getRuleInternal  Return a reference to the recognizer's internal Rule object for a rulename (for fast, read-only access).
listRuleNames  List known rulenames.
isRulePublic  Test whether a rulename is public.
deleteRule  Delete a rule.
setEnabled  Enable and disable this RuleGrammar or rules of the grammar.
isEnabled  Test whether a RuleGrammar or a specified rule is enabled.

Any of the methods of RuleGrammar that affect the grammar (setRule, deleteRule, setEnabled etc.) take effect only after they are committed (as described in Section 4.2).

The rule definitions of a RuleGrammar can be considered as a collection of named Rule objects. Each Rule object is referenced by its rulename (a String). The different types of Rule object are described in Section 5.3.

Unlike most collections in Java, the RuleGrammar is a collection that does not share objects with the application. This is because recognizers often need to perform special processing of the rule objects and store additional information internally. The implication for applications is that a call to setRule is required to change any rule. The following code shows an example where changing a rule object does not affect the grammar.


RuleGrammar gram;

// Create a rule for the word blue
// Add the rule to the RuleGrammar and make it public
RuleToken word = new RuleToken("blue");
gram.setRule("ruleName", word, true);

// Change the word
word.setText("green");

// getRule returns blue (not green)
System.out.println(gram.getRule("ruleName"));

To ensure that the changed "green" token is loaded into the grammar, the application must call setRule again after changing the word to "green". Furthermore, for either change to take effect in the recognition process, the changes need to be committed (see Section 4.2).
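
The copy behaviour described above can be modelled without a recognizer. In the sketch below, Token and Grammar are hypothetical stand-ins written for this illustration (they are not the javax.speech classes); the only assumption carried over from the text is that the grammar keeps its own copy of each rule, so a later change to the application's object is not seen until setRule is called again.

```java
import java.util.HashMap;
import java.util.Map;

public class CopyOnSetSketch {

    /** Hypothetical stand-in for a mutable token rule. */
    static class Token {
        private String text;
        Token(String text) { this.text = text; }
        void setText(String text) { this.text = text; }
        String getText() { return text; }
        Token copy() { return new Token(text); }
    }

    /** Hypothetical stand-in for a grammar that never shares rule objects. */
    static class Grammar {
        private final Map<String, Token> rules = new HashMap<>();

        // Store a private copy, so later changes by the caller are not seen
        void setRule(String name, Token rule) { rules.put(name, rule.copy()); }

        // Hand out a copy as well, so callers cannot modify the stored rule
        Token getRule(String name) {
            Token stored = rules.get(name);
            return stored == null ? null : stored.copy();
        }
    }

    /** Returns the text held by the grammar after the application edits its token. */
    static String demo(boolean callSetRuleAgain) {
        Grammar gram = new Grammar();
        Token word = new Token("blue");
        gram.setRule("ruleName", word);

        word.setText("green");              // changes only the application's object
        if (callSetRuleAgain) {
            gram.setRule("ruleName", word); // pushes the change into the grammar
        }
        return gram.getRule("ruleName").getText();
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // prints blue
        System.out.println(demo(true));  // prints green
    }
}
```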

5.2     Imports

Complex systems of rules are most easily built by dividing the rules into multiple grammars. For example, a grammar could be developed for recognizing numbers. That grammar could then be imported into two separate grammars that define dates and currency amounts. Those two grammars could then be imported into a travel booking application, and so on. This type of hierarchical grammar construction is similar in many respects to object-oriented design and shares the advantage of easy reuse of grammars.

An import declaration in JSGF and an import in a RuleGrammar are most similar to the import statement of the Java programming language. Unlike a "#include" in the C programming language, the imported grammar is not copied; it simply becomes referenceable. (A full specification of import semantics is provided in the Java Speech Grammar Format specification.)

The RuleGrammar interface defines three methods for handling imports as shown in Table 6-2.

Table 6-2 RuleGrammar import methods
Name  Description
addImport  Add a grammar or rule for import.
removeImport  Remove the import of a rule or grammar.
getImports  Return a list of all imported grammars or all rules imported from a specific grammar.

The resolve method of the RuleGrammar interface is useful in managing imports. Given any rulename, the resolve method returns an object that represents the fully-qualified rulename for the rule that it references.

5.3     Rule Classes

A RuleGrammar is primarily a collection of defined rules. The programmatic rule structure used to control Recognizers follows exactly the definition of rules in the Java Speech Grammar Format. Any rule is defined by a Rule object. It may be any one of the Rule classes described in Table 6-3. The exceptions are the RuleParse class, which is returned by the parse method of RuleGrammar, and the Rule class, which is an abstract class and the parent of all other Rule objects.

Table 6-3 Rule objects
Name  Description
Rule  Abstract root object for rules.
RuleName  Rule that references another defined rule. JSGF example: <ruleName>
RuleToken  Rule consisting of a single speakable token (e.g. a word). JSGF examples: elephant, "New York"
RuleSequence  Rule consisting of a sequence of sub-rules. JSGF example: buy <number> shares of <company>
RuleAlternatives  Rule consisting of a set of alternative sub-rules. JSGF example: green | red | yellow
RuleCount  Rule containing a sub-rule that may be spoken optionally, zero or more times, or one or more times. JSGF examples: <color>*, [optional]
RuleTag  Rule that attaches a tag to a sub-rule. JSGF example: {action=open}
RuleParse  Special rule object used to represent results of a parse.

The following is an example of a grammar in Java Speech Grammar Format. The "Hello World!" example shows how this JSGF grammar can be loaded from a text file. Below we consider how to create the same grammar programmatically.


grammar com.sun.speech.test;

public <test> = [a] test {TAG} | another <rule>;
<rule> = word;

The following code shows the simplest way to create this grammar. It uses the ruleForJSGF method to convert partial JSGF text to a Rule object. Partial JSGF is defined as any legal JSGF text that may appear on the right hand side of a rule definition - technically speaking, any legal JSGF rule expansion.


Recognizer rec;

// Create a new grammar
RuleGrammar gram = rec.newRuleGrammar("com.sun.speech.test");

// Create the <test> rule
Rule test = gram.ruleForJSGF("[a] test {TAG} | another <rule>");
gram.setRule("test", // rulename
             test,   // rule definition
             true);  // true -> make it public

// Create the <rule> rule
gram.setRule("rule", gram.ruleForJSGF("word"), false);

// Commit the grammar
rec.commitChanges();

5.3.1     Advanced Rule Programming

In advanced programs there is often a need to define rules using the set of Rule objects described above. For these applications, using rule objects is more efficient than creating a JSGF string and using the ruleForJSGF method.

To create a rule by code, the detailed structure of the rule needs to be understood. At the top level of our example grammar, the <test> rule is an alternative: the user may say something that matches "[a] test {TAG}" or say something matching "another <rule>". The two alternatives are each sequences containing two items. In the first alternative, the brackets around the token "a" indicate it is optional. The "{TAG}" following the second token ("test") attaches a tag to the token. The second alternative is a sequence with a token ("another") and a reference to another rule ("<rule>").

The code to construct this Grammar follows (this code example is not compact - it is written for clarity of details).


Recognizer rec;

RuleGrammar gram = rec.newRuleGrammar("com.sun.speech.test");

// Rule we are building
RuleAlternatives test;

// Temporary rules
RuleCount r1;
RuleTag r2;
RuleSequence seq1, seq2;

// Create "[a]"
r1 = new RuleCount(new RuleToken("a"), RuleCount.OPTIONAL);

// Create "test {TAG}" - a tagged token
r2 = new RuleTag(new RuleToken("test"), "TAG");

// Join "[a]" and "test {TAG}" into a sequence "[a] test {TAG}"
seq1 = new RuleSequence(r1);
seq1.append(r2);

// Create the sequence "another <rule>";
seq2 = new RuleSequence(new RuleToken("another"));
seq2.append(new RuleName("rule"));

// Build "[a] test {TAG} | another <rule>"
test = new RuleAlternatives(seq1);
test.append(seq2);

// Add <test> to the RuleGrammar as a public rule
gram.setRule("test", test, true);

// Provide the definition of <rule>, a non-public RuleToken
gram.setRule("rule", new RuleToken("word"), false);

// Commit the grammar changes
rec.commitChanges();

5.4     Dynamic Grammars

Grammars may be modified and updated. The changes allow an application to account for shifts in the application's context, changes in the data available to it, and so on. This flexibility allows application developers considerable freedom in creating dynamic and natural speech interfaces.

For example, in an email application the list of known users may change during the normal operation of the program. The <sendEmail> command,

<sendEmail> = send email to <user>;

references the <user> rule, which may need to be changed as new email arrives. This code snippet shows the update and commit of a change in users.


Recognizer rec;
RuleGrammar gram;

String names[] = {"amy", "alan", "paul"};
Rule userRule = new RuleAlternatives(names);

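// setRule also takes an isPublic flag; false assumes <user> is only reached through <sendEmail>
gram.setRule("user", userRule, false);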

// apply the changes
rec.commitChanges();

Committing grammar changes can, in certain cases, be a slow process. It might take a few tenths of a second or up to several seconds. The time to commit changes depends on a number of factors. First, recognizers have different mechanisms for committing changes, making some recognizers faster than others. Second, the time to commit changes may depend on the extent of the changes - more changes may require more time to commit. Third, the time to commit may depend upon the type of changes. For example, some recognizers optimize for changes to lists of tokens (e.g. name lists). Finally, faster computers make changes more quickly.

The other factor which influences dynamic changes is the timing of the commit. As Section 4.2 describes, grammar changes are not always committed instantaneously. For example, if the recognizer is busy recognizing speech (in the PROCESSING state), then the commit of changes is deferred until the recognition of that speech is completed.
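
When the list of names comes from application data, one option is to build partial JSGF text and pass it to the ruleForJSGF method described in Section 5.3, instead of constructing the RuleAlternatives object directly. The following sketch covers only the text-building step in plain Java. The class and method names are hypothetical, and quoting names that contain spaces (so that each is treated as a single token, as in the "New York" example of Table 6-3) is a design assumption rather than a requirement.

```java
import java.util.List;
import java.util.stream.Collectors;

public class UserRuleTextSketch {

    // Hypothetical helper: turns a list of names into partial JSGF text
    // of the form "amy | alan | paul".
    static String alternativesFor(List<String> names) {
        if (names.isEmpty()) {
            // An empty string is not a usable rule expansion, so fail early
            throw new IllegalArgumentException("at least one name is required");
        }
        return names.stream()
                // Assumption: a name with a space is quoted to keep it as one token
                .map(name -> name.contains(" ") ? "\"" + name + "\"" : name)
                .collect(Collectors.joining(" | "));
    }

    public static void main(String[] args) {
        System.out.println(alternativesFor(List.of("amy", "alan", "paul")));
        // prints amy | alan | paul
    }
}
```

The resulting text would then go through ruleForJSGF and setRule, followed by a commit, in the same way as the earlier examples.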

5.5     Parsing

Parsing is the process of matching text to a grammar. Applications use parsing to break down spoken input into a form that is more easily handled in software. Parsing is most useful when the structure of the grammars clearly separates the parts of spoken text that an application needs to process. Examples are given below of this type of structuring.

The text may be in the form of a String or array of String objects (one String per token), or in the form of a FinalRuleResult object that represents what a recognizer heard a user say. The RuleGrammar interface defines three forms of the parse method - one for each form of text.

The parse method returns a RuleParse object (a descendant of Rule) that represents how the text matches the RuleGrammar. The structure of the RuleParse object mirrors the structure of rules defined in the RuleGrammar. Each Rule object in the structure of the rule being parsed against is mirrored by a matching Rule object in the returned RuleParse object.

The difference between the structures comes about because the text being parsed defines a single phrase that a user has spoken whereas a RuleGrammar defines all the phrases the user could say. Thus the text defines a single path through the grammar and all the choices in the grammar (alternatives, and rules that occur optionally or occur zero or more times) are resolvable.

The mapping between the objects in the rules defined in the RuleGrammar and the objects in the RuleParse structure is shown in Table 6-4. Note that except for the RuleCount and RuleName objects (marked with "**"), the objects in the parse tree are of the same type as the rule object being parsed against, but the internal data may differ.

Table 6-4 Matching Rule definitions and RuleParse objects
Object in definition  Matching object in RuleParse
RuleToken  Maps to an identical RuleToken object.
RuleTag  Maps to a RuleTag object with the same tag and with the contained rule mapped according to its rule type.
RuleSequence  Maps to a RuleSequence object with identical length and with each rule in the sequence mapped according to its rule type.
RuleAlternatives  Maps to a RuleAlternatives object containing a single item which is the one rule in the set of alternatives that was spoken.
RuleCount **  Maps to a RuleSequence object containing an item for each time the rule contained by the RuleCount object is spoken. The sequence may have a length of zero, one or more.
RuleName **  Maps to a RuleParse object with the name in the RuleName object being the fully-qualified version of the original rulename, and with the Rule object contained by the RuleParse object being an appropriate match of the definition of RuleName.

As an example, take the following simple extract from a grammar. The public rule, <command>, may be spoken in many ways. For example, "open", "move that door" or "close that door please".


public <command> = <action> [<object>] [<polite>];
<action> = open {OP} | close {CL} | move {MV};
<object> = [<this_that_etc>] (window | door);
<this_that_etc> = a | the | this | that | the current;
<polite> = please | kindly;

Note how the rules are defined to clearly separate the segments of spoken input that an application must process. Specifically, the <action> and <object> rules indicate how an application must respond to a command. Furthermore, anything said that matches the <polite> rule can be safely ignored, and usually the <this_that_etc> rule can be ignored too.

The parse for "open" against <command> has the following structure which matches the structure of the grammar above.


RuleParse(<command> =
  RuleSequence(
    RuleParse(<action> =
      RuleAlternatives(
        RuleTag(
          RuleToken("open"), "OP")))))

The match of the <command> rule is represented by a RuleParse object. Because the definition of <command> is a sequence of 3 items (2 of which are optional), the parse of <command> is a sequence. Because only one of the 3 items is spoken (in "open"), the sequence contains a single item. That item is the parse of the <action> rule.

The reference to <action> in the definition of <command> is represented by a RuleName object in the grammar definition, and this maps to a RuleParse object when parsed. The <action> rule is defined by a set of three alternatives (RuleAlternatives object) which maps to another RuleAlternatives object in the parse but with only the single spoken alternative represented. Since the phrase spoken was "open", the parse matches the first of the three alternatives which is a tagged token. Therefore the parse includes a RuleTag object which contains a RuleToken object for "open".

The following is the parse for "close that door please".


RuleParse(<command> =
  RuleSequence(
    RuleParse(<action> =
      RuleAlternatives(
        RuleTag(
          RuleToken("close"), "CL")))
    RuleSequence(
      RuleParse(<object> =
        RuleSequence(
          RuleSequence(
            RuleParse(<this_that_etc> =
              RuleAlternatives(
                RuleToken("that"))))
          RuleAlternatives(
            RuleToken("door")))))
    RuleSequence(
      RuleParse(<polite> =
        RuleAlternatives(
          RuleToken("please"))))
  ))

There are three parsing issues that application developers should consider.

Parsing may fail because there is no legal match. In this instance the parse methods return null.
There may be several legal ways to parse the text against the grammar. This is known as an ambiguous parse. In this instance the parse method will return one of the legal parses but the application is not informed of the ambiguity. As a general rule, most developers will want to avoid ambiguous parses by proper grammar design. Advanced applications will use specialized parsers if they need to handle ambiguity.
If a FinalRuleResult is parsed against the RuleGrammar and the rule within that grammar that it matched, then it should successfully parse. However, it is not guaranteed to parse if the RuleGrammar has been modified or if the FinalRuleResult is a REJECTED result.
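
As a rough illustration of how an application might consume a parse, the sketch below walks a tree shaped like the parse of "open" shown earlier and collects the tags it finds. The Node class is a hypothetical stand-in written for this example, not one of the javax.speech Rule classes, and the null check mirrors the first issue above (a failed parse returns null).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ParseTagSketch {

    /** Hypothetical stand-in for a parse tree node; not a JSAPI class. */
    static class Node {
        final String kind;          // for example "RuleTag" or "RuleToken"
        final String value;         // tag text, token text or rulename; may be null
        final List<Node> children;

        Node(String kind, String value, Node... children) {
            this.kind = kind;
            this.value = value;
            this.children = Arrays.asList(children);
        }
    }

    /** Collects tag strings in depth-first order; a null (failed) parse yields no tags. */
    static List<String> collectTags(Node parse) {
        List<String> tags = new ArrayList<>();
        if (parse != null) {
            walk(parse, tags);
        }
        return tags;
    }

    private static void walk(Node node, List<String> tags) {
        if (node.kind.equals("RuleTag")) {
            tags.add(node.value);
        }
        for (Node child : node.children) {
            walk(child, tags);
        }
    }

    /** Builds a tree with the same shape as the parse of "open" against <command>. */
    static Node parseOfOpen() {
        return new Node("RuleParse", "command",
                new Node("RuleSequence", null,
                        new Node("RuleParse", "action",
                                new Node("RuleAlternatives", null,
                                        new Node("RuleTag", "OP",
                                                new Node("RuleToken", "open"))))));
    }

    public static void main(String[] args) {
        System.out.println(collectTags(parseOfOpen())); // prints [OP]
        System.out.println(collectTags(null));          // prints []
    }
}
```

Because the grammar attaches tags only to the <action> alternatives, the collected list is enough to decide which command was spoken, while anything matched by <polite> is ignored without extra code.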
