The Semantics Modules
Thus far, we have focused on constructing a parser that produces a constituent analysis for sentences. It can parse simple declarative, interrogative and imperative sentences, and also handle limited types of lexical and structural ambiguities. If we are going to use our parser to produce natural language statements that refer to our miniature world, we will need a way to reverse the process so that the program can use the grammar to generate responses. Parsing and production are complementary pieces of natural language processing. The first is commonly referred to as Natural Language Understanding, while the second is referred to as Natural Language Production. Our focus in this and subsequent weeks will be on the task of getting our program to produce responses that are relevant to the miniature world.
Lexicalization
The first task we face is to turn our parser’s output back into words. Our parser operates by first translating an input sentence into a string of code that marks the part of speech, person number and subcategorization frame of each word in the sentence. The parser then groups these codes into syntactic constituents and checks the features that mark syntactic relations between specific constituents such as the subject and verb. The first problem that we need to solve is how to convert our parsed constituent structure back into the original words of the input sentence. I will label this process relexicalization since we originally used lexicalization processes (stemming and lexical look up) to convert the input string of words into their corresponding parts of speech.
The parse string that our program produces already contains all of the information we need for relexicalization. Each element of the parse string is separated by a space from the other elements of the parse string in the same order as the words in the original sentence. Our program already uses one array to keep track of the words in the input sentence (@words). It is a simple matter to build another array that parallels the @words array, but contains the part of speech information in our parse. Let’s call this new array @parsed for the parsed constituents that we will store in it. We can use Perl’s split function to store the parsed constituent data in @parsed in the same way that we used the split function to store the original words in @words, namely
@parsed = split / /, $parse[$j];
Now we need a way to build a new parse string out of the original words in the sentence, but with the constituent boundaries marked in the proper places. This procedure will allow our parser to identify which words correspond to the subject and which words correspond to the other sentence constituents. This information is also available in our parse string in the form of the constituent labels that mark the beginning of each phrasal constituent (i.e., NP, VP, PP, CP or AUX). Our relexicalization procedure can add words from the @words array to a new @relex array until we find a constituent label in the $parse[$j] string. At this point our procedure can add a left bracket to the @relex array and start a new phrasal constituent. I provide the relex subroutine in Figure 11.1 that accomplishes this task.
Figure 11.1 The relexicalization subroutine
sub relex {
@con = ''; # constituent array
my $con = 0; # constituent count
my $pattern = "[NVPC]P";
my @parsed = split / /, $_[0];
if ( $parsed[0] eq '' ) {
shift( @parsed ); }
my @words = @{ $_[1] };
foreach $parsed (@parsed) {
my $word = shift( @words );
if ( $parsed =~ /($pattern)/ || $parsed =~ /(AUX)/ ) {
$con++;
$con[$con] = $con[$con] . $word;
}
else {
$con[$con] = $con[$con] . ' ' . $word;}
} # end foreach
} # end sub relex
This procedure uses Perl’s shift function to get the word from the @words array that corresponds to each $parsed constituent in the @parsed array. The shift function extracts the first element from an array. If our @words array holds the elements ‘an, egg, is, in, the, bowl’, then the command shift(@words) would first extract the word ‘an’.
The relex subroutine stores the words that correspond to each phrasal constituent in the array @con. The phrasal constituents in @con will correspond to their order in the original sentence so that the subject, if there is one, will be the first element in the @con array (i.e., $con[1]), and the auxiliary, if there is one, will be the second element in the @con array. If our parse string has the form
$parse[$j] = "NP[$1] VP[[$12] NP[$15] PP[[P] NP[$21]]]";
we can build a new parse with the original words by adding the lines
relex($parse[$j]);
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]] PP[$con[4] NP[$con[5]]]]";
The first parse string contains the parts of speech and phrasal constituent boundaries. The call to the relex subroutine creates the @con array of relexicalized constituents. The final line creates a new parse string by replacing the parts of speech with their corresponding words from the sentence. The call to the print command will then output a parsed string that contains the original words in the input sentence with an indication of the phrasal boundaries.
Thematic Roles
The constituent information contained in the @con array also provides an excellent source of semantic information about our sentence. The major syntactic constituents in a sentence correspond to the major semantic roles or thematic elements of the sentence. The subject of most declarative sentences usually provides information about the Agent–the entity responsible for initiating the action. The verb provides information about the action, the object provides information about the Theme, or entity affected by the action, and the prepositional phrase provides information about the Location of the action. We can preserve this thematic data by assigning the relevant relexicalized constituents to a set of thematic variables. In the case of the previous example, we find the following thematic relations.
Variable Constituent
$Subject $con[1]
$Action $con[2]
$Object $con[3]
$Location $con[5]
The thematic variables will provide the basis for the computer’s response to questions about the objects in its miniature world. Once again, all of these modifications will be made to the Parse module. I provide the revised Parse module in Figure 11.2.
Figure 11.2 The relexicalization Parse module
#!usr/local/bin/perl
# Parse8.pm
# Works with relex2.pl
# implements relexicalization of Parse5
package Parse8;
use Exporter;
our @ISA = ('Exporter');
our @EXPORT = qw( &parse &print );
#The Grammar
my $NP = "(DET(\.[0-3]); )?(ADJ\.; )*N(\.[0-3]?)?;";
my $PP = "P(?:\.; )($NP)";
my $NPwh = "(DET(\.[0-3]); )?(ADJ )*Nwh(\.3);( $PP)*";
my $NP2 = "$NP( $PP)*";
my $V1 = "(V(\.[0-3])(;[:np_]*)) ?($NP)? ?";
my $V2 = "(V(\.[0-3])(;[:np_]*)) ?";
my $Swh = "([NP]wh(\.[2-3]))?(?:; )?(AUX(\.[0-3])(;[:nvp_]*)) ($NP2)? ?";
sub parse {
our @parse = '';
my @string = @{ $_[0] };
my @words = @{ $_[1] };
$i = 0; # initialize the string index (lexical ambiguity)
foreach $string (@string) {
chop($string);
$j = 0; # initialize the parse index (structural ambiguity)
my $string1 = $string;
my $string2 = $string1;
my $string3 = $string2;
#SVOP
if ( $string1 =~s/^($NPwh) ($V1)// || $string1 =~ s/^($NP2)? ?($V1)// ) { # VP with object NP
my $subject = $1; my $object = $16;
my $det_agr = $3; my $n_agr = $5; my $v_agr = $14;
my $subcat = $15;
#There's another verb!
if ( $string1 =~m/$V2/ ) {
$parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";
}
#A question?
elsif ( $subject eq 'Nwh.3;' ) {
$parse[$j] = "NP[$subject] VP[V$v_agr" . NP ($object) . PP ($string1) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[1]; $Action = $con[2];
if ($object eq '' && $string1 eq '') {
$parse[$j] = "NP[$con[1]] VP[$con[2]]\n";
}
elsif ($object ne '' && $string1 eq '') {
$Object = $con[3];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]]]\n";
}
elsif ($object eq '' && $string1 ne '') {
$Location = $con[4];
$parse[$j] = "NP[$con[1]] VP[$con[2] PP[$con[3] NP[$con[4]]]\n";
}
elsif ($object ne '' && $string1 ne '') {
$Object = $con[3]; $Location = $con[5];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]] PP[$con[4] NP[$con[5]]]\n";
}
}
#Declaratives & Imperatives
else {
$parse[$j] = NP ($subject) . " VP[V$v_agr" . NP ($object) . PP ($string1) . "]\n";
relex($parse[$j], \@words);
if ( $subject ne '' ) {
$Subject = $con[1]; $Action = $con[2];
if ($object eq '' && $string1 eq '') {
$parse[$j] = "NP[$con[1]] VP[$con[2]]\n";
}
elsif ($object ne '' && $string1 eq '') {
$Object = $con[3];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]]]\n";
}
elsif ($object eq '' && $string1 ne '') {
$Location = $con[4];
$parse[$j] = "NP[$con[1]] VP[$con[2] PP[$con[3] NP[$con[4]]]\n";
}
elsif ($object ne '' && $string1 ne '') {
$Object = $con[3]; $Location = $con[5];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]] PP[$con[4] NP[$con[5]]]\n";
}
} # if overt subject
else {
$Action = $con[1];
if ($object eq '' && $string1 eq '') {
$parse[$j] = "VP[$con[1]]\n";
}
elsif ($object ne '' && $string1 eq '') {
$Object = $con[2];
$parse[$j] = "VP[$con[1] NP[$con[2]]]\n";
}
elsif ($object eq '' && $string1 ne '') {
$Location = $con[3];
$parse[$j] = "VP[$con[1] PP[$con[2] NP[$con[3]]]\n";
}
elsif ($object ne '' && $string1 ne '') {
$Object = $con[2]; $Location = $con[4];
$parse[$j] = "VP[$con[1] NP[$con[2]] PP[$con[3] NP[$con[4]]]\n";
}
} # end else
}
agreechk($det_agr, $n_agr, $v_agr);
subcatchk($subcat, $object, $string1);
}
else {
$parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";
}
$j = $j + 1;
#SVO
if ( $string2 =~s/^($NPwh) ($V2)// || $string2 =~s/^($NP2)? ?($V2)// ) { # plain VP
my $subject = $1; my $object = $string2;
my $det_agr = $3; my $n_agr = $5; my $v_agr = $14;
my $subcat = $15;
if ( $string2 =~m/$V2/ ) {
$parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";
}
#who moved the egg?
elsif ( $subject eq 'Nwh.3;' && $string2 =~ /^$NP/ ) {
$parse[$j] = "NP[$subject] VP[V$v_agr" . NP ($string2) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[1]; $Action = $con[2]; $Object = $con[3];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]]]\n";
}
#what is in the egg?
elsif ( $subject eq 'Nwh.3;' && $string2 =~ /^$PP/ ) {
$parse[$j] = "NP[$subject] VP[V$v_agr" . PP ($string2) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[1]; $Action = $con[2]; $Location = $con[4];
$parse[$j] = "NP[$con[1]] VP[$con[2] PP[$con[3] NP[$con[4]]]\n";
}
#A PP verb complement?
elsif ( $string2 =~ /^$PP/ ) {
$parse[$j] = NP ($subject) . " VP[V$v_agr" . PP ($string2) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[1]; $Action = $con[2]; $Location = $con[4];
$parse[$j] = "NP[$con[1]] VP[$con[2] PP[$con[3] NP[$con[4]]]\n";
}
#No direct object?
elsif ( $string2 !~ /$NP/ ) {
$parse[$j] = NP ($subject) . " VP[V$v_agr]\n";
relex($parse[$j], \@words);
$Subject = $con[1]; $Action = $con[2];
$parse[$j] = "NP[$con[1]] VP[$con[2]]\n";
}
#Declaratives & Imperatives
else {
$parse[$j] = NP ($subject) . " VP[V$v_agr" . NP ($string2) . "]\n";
relex($parse[$j], \@words);
if ( $subject ne '' ) {
$Subject = $con[1]; $Action = $con[2];
if ($object eq '' ) {
$parse[$j] = "NP[$con[1]] VP[$con[2]]\n";
}
elsif ($object ne '' && $string2 eq '') {
$Object = $con[3];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]]]\n";
}
elsif ($object ne '' ) {
$Object = $con[3]; $Location = $con[5];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3] PP[$con[4] NP[$con[5]]]]]\n";
}
} # if overt subject
else {
$Action = $con[1];
if ($object eq '' ) {
$parse[$j] = "VP[$con[1]]\n";
}
elsif ($object ne '' ) {
$Object = $con[2];
$parse[$j] = "VP[$con[1] NP[$con[2] PP[$con[3] NP[$con[4]]]]]\n";
}
} # end else
}
agreechk($det_agr, $n_agr, $v_agr);
subcatchk($subcat, $string2);
}
else {
$parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";
}
$j = $j + 1;
# a question?
if ( $string3 =~ s/$Swh// ) {
my $comp = $1; my $aux = $3; my $subject = $6;
my $det_agr = $8; my $n_agr = $10; my $v_agr = $4;
my $subcat = $5;
# what do you see?
if ( $subject ne '' && $comp eq 'Nwh.3' && $string3 =~ s/$V2// ) {
$parse[$j] = "CP[$comp] AUX[$aux]" . NP($subject) . " VP[V$v_agr" . PP ($string3) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[3]; $Action = $con[4]; $Object = $con[1]; $Location = $con[6];
$parse[$j] = "CP[$con[1]] AUX[$con[2]] NP[$con[3]] VP[$con[4] PP[$con[5] NP[$con[6]]\n”;
agreechk($8, $10, $4); }
# is the bowl on the table?
elsif ( $comp eq '' && $subject ne '' && $subject =~ /$PP/ ) {
$subject =~ s/($NP)//;
$location = $subject; $subject = $1;
$parse[$j] = "AUX[$aux] NP[$subject]" . PP ($location) . "\n";
relex($parse[$j], \@words);
$Subject = $con[2]; $Action = $con[1]; $Location = $con[4];
$parse[$j] = "AUX[$con[1]] NP[$con[2]] PP[$con[3] NP[$con[4]]]\n";
agreechk($det_agr, $n_agr, $v_agr); }
# where is the bowl?
elsif ( $subject ne '' && $string3 eq '' && $comp eq 'Pwh.2' ) {
$parse[$j] = "CP[$comp] AUX[$aux]" . NP($subject) . "\n";
relex($parse[$j], \@words);
$Subject = $con[3]; $Action = $con[2]; $Location = $con[1];
$parse[$j] = "CP[$con[1]] AUX[$con[2]] NP[$con[3]]\n";
agreechk($8, $10, $4); }
# where did you put the eggs?
elsif ( $subject ne '' && $comp eq 'Pwh.2' && $string3 =~ s/$V2// ) {
my $object = $string3;
$parse[$j] = "CP[$comp] AUX[$aux]" . NP($subject) . " VP[V$v_agr" . NP ($string3) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[3]; $Action = $con[4]; $Object = $con[5]; $Location = $con[1];
$parse[$j] = "CP[$con[1]] AUX[$con[2]] NP[$con[3]] VP[$con[4] NP[$con[5]]]\n";
agreechk($8, $10, $4); }
# what is in the bowl?
elsif ( $subject eq '' && $string3 ne '' && $comp eq 'Nwh.3' ) {
$parse[$j] = "CP[$comp] AUX[$aux]" . PP ($string3) . "\n";
relex($parse[$j], \@words);
$Subject = $con[1]; $Action = $con[2]; $Location = $con[3];
$parse[$j] = "CP[$con[1]] AUX[$con[2]] PP[$con[3] NP[$con[4]]]\n";
agreechk($6, $2, $4);}
else {
$parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";
}
}
else {
$parse[$j] = " SENTENCE FAILS GRAMMAR CHECK!\n";
}
&print;
$i = $i + 1; # increment the string index
} #end foreach string
} #end sub parse
sub NP { # parse NP
my $string = shift;
my $parse;
if ( $string =~ s/($NP)// ) {
$parse = " NP[$1" . PP ($string) . "]"; # call PP
}
} # end sub NP
sub PP { # parse PP
my $string = shift;
my $parse;
if ( $string !~ s/(P\.;) ($NP)// ) { # end recursion
return;
}
else { # PP recursion
$parse = " PP[$1 NP[$2" . PP($string) . "]]";
}
} # end sub PP
# Check agreement
sub agreechk {
# NP Agreement Check
my($det_agr, $n_agr, $v_agr) = @_;
if ( ( $det_agr eq '.3' ) && ( $n_agr ne '.0' ) && ( $n_agr ne $det_agr ) ) {
$parse[$j] = $parse[$j] . " NP PARSE FAILS AGREEMENT CHECK!\n";
return;}
else {
$np_agr = $n_agr; }
# Subject-Verb Agreement Check
if ( $np_agr ne '.3' && $np_agr ne '.0' && $v_agr eq '.3' ) {
$parse[$j] = $parse[$j] . " PARSE FAILS SUBJ-VERB AGREEMENT CHECK!\n";
return; }
elsif (( $np_agr eq '.3' ) && ( $v_agr ne '.0' ) && ( $v_agr ne '.3' )) {
$parse[$j] = $parse[$j] . " PARSE FAILS SUBJ-VERB AGREEMENT CHECK!\n";
return; }
} # end sub agreechk
# check subcategory restrictions
sub subcatchk {
my($subcat, $object, $pp) = @_;
if ( ($subcat eq ";_np" || $subcat eq ";_np:_pp:") && $object eq '' ) {
$parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }
elsif ( ($subcat eq ";_pp" || $subcat eq ";:_np:_pp") && $pp eq '' ) {
$parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }
elsif ( $subcat eq ";_np_pp" && ( $object eq '' || $pp eq '' ) ) {
$parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }
} # end sub subcatchk
sub relex {
@con = ''; # constituent array
my $con = 0; # constituent count
my $pattern = "[NVPC]P";
my @parsed = split / /, $_[0];
if ( $parsed[0] eq '' ) {
shift( @parsed ); }
my @words = @{ $_[1] };
foreach $parsed (@parsed) {
my $word = shift( @words );
if ( $parsed =~ /($pattern)/ || $parsed =~ /(AUX)/ ) {
$con++;
$con[$con] = $con[$con] . $word;
}
else {
$con[$con] = $con[$con] . ' ' . $word;}
} # end foreach
} # end sub relex
sub print {
print "\nstring[$i] is: $string\n";
my $j = 0;
for ( $j = 0; $j <= 2; $j = $j + 1 ) {
print " parse[$j] = $parse[$j]"; }
@parse = '';
} #end sub print
return 1;
World Creation
To respond intelligently to questions about its miniature world, our program will require some information about the entities in its realm. We need to provide the computer with such information as the size, color, location and function of the objects as well as information about the specific responses that correspond to questions or commands. Much of this information can be supplied in the program’s lexicon.
We can begin with the objects. We have worked with a restricted set of objects to this point in order to focus on the syntactic side of parsing. Considering the egg, bowl and table, we can begin with a typical set of locative relationships where the egg rests in the bowl which sits on the table. These relationships illustrate the functions of containment and support that are basic spatial relations between entities. The egg must be smaller than the bowl to fit inside it, and the table must provide support to smaller objects on its surface. We can further assume some limits to the types of objects we can manipulate. We can move an egg, and can probably move a table, but we would be hard pressed to move a floor. We should add these properties to the entries for the objects in our lexicon.
These properties will require a more complex lexical structure than the flat structure that we have been using to this point. Perl’s hash structure makes it easy to look up values for the items in the hash. We need a way to create a hash for each of the objects in our lexicon so that we could easily access its size, color and location. For example, the egg hash might look something like that shown in Figure 11.3.
Figure 11.3 An egg hash
egg => {pos => "N/3/",
color => "brown",
loc => "bowl",
size => "2",
shape => "round", },
Here, each of the egg’s features correspond to the hash’s keys which point to their specific values. We can easily access the egg’s color by checking the value of the variable $egg{color}. The hash structure makes it possible to give each object its own set of properties. We do not have to provide each object with the same set of keys in our hash. The only problem is that our objects are themselves keys in our original %lex hash. We will require a hash of hashes structure to keep track of our objects and their corresponding properties. Figure 11.4 illustrates a hash of hashes structure for a few of our objects.
Figure 11.4 A hash of hashes lexicon
%lex = (
"egg" => {pos => "N/3/",
color => "brown",
loc => "bowl",
size => "2",
shape => "round", },
"bowl" => {pos => "N/3/",
color => "blue",
loc => "table",
size => "3",
shape => "square",
contains => "yes", },
"table" => {pos => "N/3/",
color => "brown",
loc => "floor",
size => "4",
shape => "square", },
)
Using this hash structure, we can access information about the egg’s size via the scalar variable $lex{“egg”}{size}, and access the word’s part of speech information via the variable $lex{$word}{pos}. Now that we have a set of objects with their own semantic features, we can begin to consider how the computer might manipulate each object.
Verb semantics
Commands to the program to manipulate its objects provide a good place to consider the semantic interpretation of verbs. Commands will have a general semantic form that instructs the computer to do something to some object (or group of objects). The particular outcome will vary according to the nature of the action and properties of the object. The general semantic form and the specific result can be factored into separate semantic modules. The general form provides an interpretation of the complete sentence. This task falls under the heading of propositional semantics in linguistics. The task of providing a specific outcome belongs to the realm of lexical semantics. The division between propositional semantics and lexical semantics provides a guide to constructing program modules that implement these levels of semantic analysis. We can build a propositional semantics module that implements the imperative instruction to ACT on the OBJECT. The specific outcome of this instruction will be guided by the instructions stored in the verb’s lexical entry and the properties of the object stored in the object’s lexical entry. I provide an example of a propositional semantics module for commands in Figure 11.5.
Figure 11.5 A propositional semantics module for commands
sub imperative {
$Action = $_[0];
$Object = $_[1];
$Loc = $_[2];
if ( $Object =~ s/the // ) {
$objdet = "the "; }
else { $objdet = ''; }
if ( $Loc =~ s/the // ) {
$locdet = "the "; }
else { $locdet = ''; }
if(exists $lex{$Action}{action}){
$lex{$Action}{action}->($Object); }
else{$response="I don't understand."}
} # end sub imperative
The heart of the command subroutine relies on the statement
$lex{$Action}{action}->($Object);
This statement looks up the {action} instruction stored in the lexical entry of the verb stored in the $Action variable. The {action} instruction is actually a subroutine that implements the command with the object from the variable $Object. The statement passes the object variable as an argument to the subroutine {action} in the verb’s lexical entry. The actual outcome is then just a matter of subroutine processing. The $Action and $Object variables, of course, were thematic arguments produced in the relexicalization process.
All that is required, now, are lexical entries for the verbs. I provide a lexical entry for the verb put in Figure 11.6. This entry has three parts: 1. part of speech information, 2. a test to see if the object can be moved, and 3. the instructions for moving the object. The test to see if the object can be moved checks to see if the object’s size is less than 4. If so, the object can be moved; if not, the program prints the message that the object is too heavy. Fixed objects such as floors do not have a size feature, and generate the message that they cannot be moved. The action subroutine first checks the value of the variable $canput to see if the object can be moved. If the value is ‘yes’, the action subroutine changes the value of the object’s location to the value of the location variable $Loc generated by relexicalization. The action subroutine also generates a reply that states the new location of the object. If the object is too big to fit in the location, the subroutine will complain.
Figure 11.6 Lexical entry for put
"put" => { pos => "V/2/_np_pp",
can=>sub{
if(exists $lex{$_[0]}{size}){
if($lex{$_[0]}{size}<4){
$canput="yes"}
else{$canput="The $_[0] is too heavy!"}}
else{$canput="You can't move a $_[0]"}
return $canput},
action=>sub{
if($lex{put}{can}->($_[0])eq"yes"){
if( $lex{$Object}{size} < $lex{$Loc}{size} ){
$lex{$Object}{loc} = $Loc;
if ( $lex{$Loc}{contains} eq 'yes' ) {
$reply = "Ok, $objdet$Object is in the $lex{$Object}{loc}."; }
else { $reply = "Ok, $objdet$Object is on the $lex{$Object}{loc}."; }
} # end if < size
else {$reply = "The $Object is too big to fit on $locdet$Loc."; }
}
else{$reply=$lex{put}{can}->($_[0])}}},
By combining lexical semantics with propositional semantics in this fashion, our program increases both the scope of the commands it can process and the specific outcomes it generates.
Locating Objects
Now that we have a procedure for moving objects, we should create a procedure that tells us where the objects are. Fortunately, we can use the imperative subroutine that we just examined to keep track of object locations as well. To answer the question ‘Where is the bowl?’, we can send information about the action (‘is’), the object (‘bowl’) and the location (‘where’) to the imperative subroutine. The subroutine will then check the action feature for the verb is. We only need to add an instruction to the verb’s action feature that tells the computer to look up the location of the object when the location is questioned by where. Figure 10.7 provides the necessary lexical semantics for the verb is. It checks to see if the question contained the word what or where. If the question begins ‘what is’, the action subroutine assumes the user is asking what things are in a particular location. If the question begins ‘where is’, the action subroutine assumes the user is asking about the location of a particular object. Otherwise, the subroutine assumes the question is of the form ‘Is the x in/on the y’. The action procedure then constructs an appropriate reply.
Figure 11.7 Lexical entry for the verb is
"is" => { pos => "AUX/3/_np_pp;V/3/_pp",
action=>sub{
# where is the block
if ( $Loc eq 'where' ) {
my $locat = $lex{$Object}{loc};
$reply = "The $Object $Action $lex{$locat}{prep} the $Loc.";
} # end where is
# what is on the table?
elsif ( $Object eq 'what' ) {
my @match = grep { $Loc =~ $lex{$_}{loc} } @things;
if ( $match[1] eq '' ) {
$reply = "The @match $Action $lex{$Loc}{prep} $locdet$Loc."; }
else {
my $last = pop( @match );
my $set = '';
foreach my $item ( @match ) {
$set = $set . "$item, ";
}
$reply = "The $set" . "and $last are $lex{$Loc}{prep} $locdet$Loc."; }
} # end what is
elsif ( $Loc =~ /$lex{$Object}{loc}/ ) {
$reply = "yes"; }
else { $Loc = $lex{$Object}{loc};
$reply = "No, the $Object $Action $lex{$Loc}{prep} the $Loc.";
} # end else
}},
Creating appropriate responses to questions about the location of objects requires some consideration of how objects are located with respect to other objects. We typically put things IN bowls and ON tables. A simple way to keep track of this detail is to add a preposition feature to each object to signal whether the object can support (‘on’) or contain (‘in’) other objects. The ‘is’ subroutine can then access this feature through the variable $lex{$Loc}{prep}, as shown in the statement
$reply = "The $Object $Action $lex{$Loc}{prep} the $Loc.";
We face another problem when we ask a question such as ‘What is on the table?’ If there is more than one object on the table, we would expect the computer to name all of the objects. The grep function in Perl provides a quick way to search through an array of objects to find the ones that satisfy our condition. I created the object array @things for this purpose. We can then apply the grep function to each object in this array and test to see if the object’s location matches the location in the question stored in the variable $Loc. The grep function adds all of the objects with the proper location to the array @match. All of these tasks can be accomplished with the statement
my @match = grep { $Loc =~ $lex{$_}{loc} } @things;
The routine then generates the reply that we found these things in the specified location.
Modifiers
In addition to simple objects such as the bowl and the table, the program should respond to questions about objects with modifiers such as the red block or the blue bowl. The syntactic parse already analyzes noun phrases containing adjectives, but these adjectives have a semantic effect as well. They suggest that objects with each of these descriptions are present in our miniature world. In order to add color modifiers to our world, we will need to add objects with these properties. We can begin by adding a red block and a blue block. We cannot simply add these colors to the lexical entry for block, however. We would end up with one block that was red AND blue, which is not what we want in this case. The simple solution is to add the items ‘red block’ and ‘blue block’ to our lexical hash. This allows us to keep the two blocks separate and give them their own individual identities.
It is important to remember that we have to supply a separate identity for each item in our miniature world. We could assign each item an arbitrary number. We would then refer to each object with names like ‘item12’ and ‘item56’. If we added two similar objects to our world, such as two red blocks, we would still need a separate designation for each. The two identical objects would still have separate physical locations and separate identities. We could ask for the red block on the left or right. If the blocks had different sizes, we could refer to them as ‘the big red block’ and ‘the little red block’. If the blocks were identical, we would need identifiers such as ‘red block 1’ and ‘red block 2’. With separate identifiers, we can instruct the program to put one red block on top of the other, or move a specific red block off the table.
The introduction of the red and blue blocks leads to an interesting complication for our program–how to deal with plural objects. When the program encounters a command such as “Where are the blocks?” we would like it to respond with the location of each of the blocks. Since our current program lacks a lexical entry for the word are we can simply add the necessary lines of code to address this question. Figure 11.8 provides an example.
Figure 11.8. A lexical entry for are
"are" => { pos => "AUX/3/_np_pp;V/3/_pp",
action=>sub{
$stem = $Object;
substr($stem, -1) = ''; # remove the -s suffix
my @objects = grep { $_ =~ $stem } @things;
foreach (@objects) {
$loc{$_} = $lex{$_}{loc};
$locat = $loc{$_};
}
my $loc1 = $loc{$objects[0]}; my $loc2 = $loc{$objects[1]};
# where are the blocks
if ( $Loc eq 'where' ) {
if ( $loc{$objects[0]} eq $loc{$objects[1]} ) {
$reply = "The $Object $Action $lex{$loc{$objects[0]}}{prep} the $loc{$objects[0]}.";
}
else {
$reply = "The $objects[0] is $lex{$loc1}{prep} the $loc{$objects[0]},\n and the $objects[1] is $lex{$loc2}{prep} the $loc{$objects[1]}.\n";
}
} # end where is
elsif ( $Loc eq $loc1 && $loc1 eq $loc2 ) {
$reply = "Yes"; }
elsif ( $loc1 eq $loc2 ) {
$reply = "No, the $Object $Action $lex{$locat}{prep} the $locat."; }
else {
$reply = "No, the $objects[0] is $lex{$loc1}{prep} the $loc{$objects[0]},\n and the $objects[1] is $lex{$loc2}{prep} the $loc{$objects[1]}.\n";
} # end else
}},
This entry contains two separate blocks of code. The first processes the word for the object by removing the plural inflection /-s/, looking up all of the words in the lexicon that contain the object, and then looking up the locations for each of these objects. If the user asks “Where are the blocks?” the program removes the plural inflection from the word blocks and then searches the lexical hash for all of the keys that contain the word block. This search yields the entries ‘red block’ and ‘blue block’. The program can then look up the locations for the red block and blue block and report them back to the user.
While this code handles plural responses for the verb are, it violates the basic organizational goals of the project in that it puts code for object processing in the lexical entry of a verb. Using the same approach for our other verbs would require adding the same object processing code to each of the verb entries. This approach misses an obvious generalization in that the program should handle the processing of plural objects just once and leave the rest of processing to the individual verb entries. Figure 11.9 provides the code for the imperative subroutine that includes processing for plural objects.
Figure 11.9 Imperative subroutine for plural objects
sub imperative {
$Action = $_[0];
$Object = $_[1];
$Loc = $_[2];
if ( $Object =~ s/the // ) {
$objdet = "the "; }
else { $objdet = ''; }
if ( $Loc =~ s/the // ) {
$locdet = "the "; }
else { $locdet = ''; }
if(exists $lex{$Action}{action}){
if ( substr($Object, -1) eq 's' ) {
$stem = $plural = $Object;
substr($stem, -1) = ''; # remove the -s suffix
@objects = grep { $_ =~ $stem } @things;
foreach (@objects) {
$Object = $_;
$loc{$_} = $lex{$_}{loc};
$lex{$Action}{action}->($Object);
}
}
else {
$lex{$Action}{action}->($Object); }
}
else{$reply="I don't know how to \’$Action the $Object\’."}
} # end sub imperative
The subroutine checks to see if the object ends in /s/. If so, it removes the /s/ and creates the array @objects that contains all of the hash entries that contain the target. The program looks up the location of each of these items and stores the result in the hash %loc. Finally, the subroutine invokes the action separately for each object. This process can then be combined with the new entry for are shown in Figure 11.10. The new entry basically searches to see if the two objects are located in the same place and outputs a separate response for each contingency.
Figure 11.10 The revised lexical entry for are
"are" => { pos => "AUX/3/_np_pp;V/3/_pp",
action=>sub{
# where are the blocks
if ( $Loc eq 'where' ) {
if ( $loc{$objects[0]} eq $loc{$objects[1]} ) {
$reply = "The $plural $Action $lex{$loc{$objects[0]}}{prep} the $loc{$objects[0]}.";
}
else {
$reply = "The $objects[0] is $lex{$loc{$objects[0]}}{prep} the $loc{$objects[0]},\n and the $objects[1] is $lex{$loc{$objects[1]}}{prep} the $loc{$objects[1]}.\n";
}
} # end where is
elsif ( $Loc eq $loc{$objects[0]} && $Loc eq $loc{$objects[1]} ) {
$reply = "Yes"; }
elsif ( $Loc ne $loc{$objects[0]} && $loc{$objects[0]} eq $loc{$objects[1]} ) {
$reply = "No, the $plural $Action $lex{$loc{$objects[0]}}{prep} the $loc{$objects[0]}."; }
elsif ( $Loc ne $loc{$objects[0]} && $Loc ne $loc{$objects[1]} ) {
$reply = "No, the $objects[0] is $lex{$loc{$objects[0]}}{prep} the $loc{$objects[0]},\n and the $objects[1] is $lex{$loc{$objects[1]}}{prep} the $loc{$objects[1]}.\n";}
elsif ( $Loc ne $loc{$objects[0]} ) {
$reply = "No, the $objects[0] is $lex{$loc{$objects[0]}}{prep} the $loc{$objects[0]}."; }
else {
$reply = "No, the $objects[1] is $lex{$loc{$objects[1]}}{prep} the $loc{$objects[1]}.\n";
} # end else
}},
Figure 11.11 provides a revised lexical entry for the put. The only change from the previous entry for put is the action for plural objects. I use the variable $plural created in the imperative subroutine to identify plural object contexts.
Figure 11.11 The revised lexical entry for put
"put" => { pos => "V/2/_np_pp",
can=>sub{
if($lex{$_[0]}{size}<6){
if($lex{$_[0]}{size}<4){
$canput="yes"}
else{$canput="The $_[0] is too heavy!"}}
else{$canput="You can't move the $_[0]"}
return $canput},
action=>sub{
if($lex{put}{can}->($_[0])eq"yes"){
if( $lex{$_[0]}{size} < $lex{$Loc}{size} ){
$lex{$_[0]}{loc} = $Loc;
if ( $plural eq '' ) {
$reply = "Ok, $objdet$_[0] is $lex{$Loc}{prep} the $Loc."; }
else {
$reply = "Ok, $objdet$plural are $lex{$Loc}{prep} the $Loc."; }
} # end if < size
else {$reply = "The $_[0] is too big to fit $lex{$Loc}{prep} $locdet$Loc."; }
}
else{$reply=$lex{put}{can}->($_[0])}}},
Figure 11.12 provides the comple program
Figure 11.12 A block world program
#!/usr/local/bin/perl
# block6c.pl
# Produces a block world with 'are'
use English;
use Stem3;
use Ambi3;
# Initialize parameters
my @string = ''; my $word = ''; @Object = ''; @Loc = '';
my %pos = (); my %number = (); my %subcat = (); my %loc = ();
my %action = (); my %can = ();
@cat = ''; #the parsed words
@things = ("bowl", "red block", "blue block", "egg", "table", "floor");
# Block 1. Set the scene
print "\nWelcome to the miniature block world.\n";
print "\nThis world has a room with a table in it. On the table\n";
print "is a block and a bowl. The bowl contains an egg. Type in\n";
print "an instruction to manipulate the objects, but be careful!\n";
print "Don't break anything!\n";
# Block 2. The Lexicon
%lex = (
"I" => { pos => "N/1/", loc => "room", },
"me" => { pos => "N/2/", },
"found" => { pos => "V/0/_np:_pp:", },
"find" => { pos => "V/2/_np:_pp:", },
"move" => { pos => "V/2/_np:_pp:",
can=>sub{
if($lex{$_[0]}{size}<6){
if($lex{$_[0]}{size}<4){
$canmove="yes"}
else{$canmove="The $_[0] is too heavy!"}}
else{$canmove="You can't move a $_[0]"}
return $canmove},
action=>sub{
if($lex{move}{can}->($_[0])eq"yes"){
$lex{$_[0]}{loc} = "floor";
if ( $plural eq '' ) {
$reply = "Ok, consider the $_[0] moved."; }
else {
$reply = "Ok, consider the $plural moved."; }
}
else{$reply=$lex{move}{can}->($_[0])}}},
"moved" => { pos => "V/2/_np:_pp:", },
"put" => { pos => "V/2/_np_pp",
can=>sub{
if($lex{$_[0]}{size}<6){
if($lex{$_[0]}{size}<4){
$canput="yes"}
else{$canput="The $_[0] is too heavy!"}}
else{$canput="You can't move the $_[0]"}
return $canput},
action=>sub{
if($lex{put}{can}->($_[0])eq"yes"){
if( $lex{$_[0]}{size} < $lex{$Loc}{size} ){
$lex{$_[0]}{loc} = $Loc;
if ( $plural eq '' ) {
$reply = "Ok, $objdet$_[0] is $lex{$Loc}{prep} the $Loc."; }
else {
$reply = "Ok, $objdet$plural are $lex{$Loc}{prep} the $Loc."; }
} # end if < size
else {$reply = "The $_[0] is too big to fit $lex{$Loc}{prep} $locdet$Loc."; }
}
else{$reply=$lex{put}{can}->($_[0])}}},
"give" => { pos => "V/2/_np_pp;V/2/_np_np", },
"gave" => { pos => "V/0/_np_pp", },
"block" => {pos => "N/3/",object => "category",color => "red",
loc => "table",size => 2,shape => "square", },
"red block" => {pos => "N/3/",color => "red",prep => "on",
loc => "table",size => 2,shape => "square", },
"blue block" => {pos => "N/3/",color => "blue",prep => "on",
loc => "table",size => 3,shape => "square", },
"egg" => {pos => "N/3/",color => "brown",loc => "bowl",
size => 2,shape => "round", },
"table" => {pos => "N/3/",color => "brown",prep => "on",
loc => "floor",size => 4,shape => "square", },
"bowl" => {pos => "N/3/",color => "blue",prep => "in",
loc => "table",size => 3,shape => "square", },
"floor" => {pos => "N/3/",color => "yellow",prep => "on",
loc => "ground",size => 7,shape => "square", },
"a" => { pos => "DET/3/", },
"an" => { pos => "DET/3/", },
"the" => { pos => "DET/0/", },
"has" => { pos => "V/3/_np", },
"red" => { pos => "ADJ//", },
"blue" => { pos => "ADJ//", },
"on" => { pos => "P//", },
"in" => { pos => "P//", },
"to" => { pos => "P//", },
"with" => { pos => "P//", },
"at" => { pos => "P//", },
"it" => { pos => "N/3/", },
"you" => { pos => "N/2/", },
"anyone" => { pos => "N/3/", },
"we" => { pos => "N/2/", },
"is" => { pos => "AUX/3/_np_pp;V/3/_pp",
action=>sub{
# where is the block
if ( $Loc eq 'where' ) {
$reply = "The $_[0] $Action $lex{$lex{$_[0]}{loc}}{prep} the $lex{$_[0]}{loc}.";
} # end where is
# what is on the table?
elsif ( $Object eq 'what' ) {
my @match = grep { $Loc =~ $lex{$_}{loc} } @things;
if ( $match[1] eq '' ) {
$reply = "The @match $Action $lex{$Loc}{prep} $locdet$Loc."; }
else {
my $last = pop( @match );
my $set = '';
foreach my $item ( @match ) {
$set = $set . "$item, ";
}
$reply = "The $set" . "and $last are $lex{$Loc}{prep} $locdet$Loc."; }
} # end what is
elsif ( $Loc =~ /$lex{$_[0]}{loc}/ ) {
$reply = "yes"; }
else { $Loc = $lex{$_[0]}{loc};
$reply = "No, the $_[0] $Action $lex{$Loc}{prep} the $Loc.";
} # end else
}},
"am" => { pos => "AUX/1/_np_pp;V/1/_pp", },
"are" => { pos => "AUX/3/_np_pp;V/3/_pp",
action=>sub{
# where are the blocks
if ( $Loc eq 'where' ) {
if ( $loc{$objects[0]} eq $loc{$objects[1]} ) {
$reply = "The $plural $Action $lex{$loc{$objects[0]}}{prep} the $loc{$objects[0]}.";
}
else {
$reply = "The $objects[0] is $lex{$loc{$objects[0]}}{prep} the $loc{$objects[0]},\n and the $objects[1] is $lex{$loc{$objects[1]}}{prep} the $loc{$objects[1]}.\n";
}
} # end where is
elsif ( $Loc eq $loc{$objects[0]} && $Loc eq $loc{$objects[1]} ) {
$reply = "Yes"; }
elsif ( $Loc ne $loc{$objects[0]} && $loc{$objects[0]} eq $loc{$objects[1]} ) {
$reply = "No, the $plural $Action $lex{$loc{$objects[0]}}{prep} the $loc{$objects[0]}."; }
elsif ( $Loc ne $loc{$objects[0]} && $Loc ne $loc{$objects[1]} ) {
$reply = "No, the $objects[0] is $lex{$loc{$objects[0]}}{prep} the $loc{$objects[0]},\n and the $objects[1] is $lex{$loc{$objects[1]}}{prep} the $loc{$objects[1]}.\n";}
elsif ( $Loc ne $loc{$objects[0]} ) {
$reply = "No, the $objects[0] is $lex{$loc{$objects[0]}}{prep} the $loc{$objects[0]}."; }
else {
$reply = "No, the $objects[1] is $lex{$loc{$objects[1]}}{prep} the $loc{$objects[1]}.\n";
} # end else
}},
"where" => { pos => "Pwh/2/", },
"who" => { pos => "Nwh/3/", },
"what" => { pos => "Nwh/3/", },
"which" => { pos => "DET/0/", },
"did" => { pos => "AUX/0/_np_vp;V/0/_np", },
"do" => { pos => "AUX/2/_np_vp;V/2/_np", },
"does" => { pos => "AUX/3/_np_vp;V/3/_np", },
"can" => { pos => "AUX/0/_vp", },
"go" => { pos => "V/2/_pp", },
"goes" => { pos => "V/3/_pp", },
"have" => { pos => "V/2/_np", },
"has" => { pos => "V/3/_np", },
"saw" => { pos => "V/0/:_np:", },
"see" => { pos => "V/2/:_np:", },
);
foreach $word (keys %lex) {
$lex{$word}{pos} =~ /(.*)\/(.*)\/(.*)/ ;
$pos{$word} = $1;
$number{$word} = $2;
$subcat{$word} = $3;
} # end foreach word
#The Grammar
$NP = "(DET(\.[0-3]); )?(ADJ\.; )*N(\.[0-3]?)?;";
$PP = "P(?:\.; )($NP)";
$NPwh = "(DET(\.[0-3]); )?(ADJ )*Nwh(\.3);( $PP)*";
$NP2 = "$NP( $PP)*";
$V1 = "(V(\.[0-3])(;[:np_]*)) ?($NP)? ?";
$V2 = "(V(\.[0-3])(;[:np_]*)) ?";
$Swh = "([NP]wh(\.[2-3]))?(?:; )?(AUX(\.[0-3])(;[:nvp_]*)) ($NP2)? ?";
print "Please type a sentence\n\n";
chop( $input = <> ); #Get the string from the standard input
# Main program loop
until ( $input eq 'thanks' ) {
@words = split / /, $input;
my @copy = stem(\@words, \%lex, \%pos, \%number, \%subcat);
my @string = ambilex(\@copy);
parse(@string);
print "\n";
chop( $input = <> );
$plural = '';
} #end until
print "Any time!\n";
#################### Subroutines #######################
sub parse {
our @parse = '';
my @string = @_;
$i = 0; # initialize the string index (lexical ambiguity)
foreach $string (@string) {
chop($string);
$j = 0; # initialize the parse index (structural ambiguity)
my $string1 = $string;
my $string2 = $string1;
my $string3 = $string2;
#SVOP
if ( $string1 =~s/^($NPwh) ($V1)// || $string1 =~ s/^($NP2)? ?($V1)// ) { # VP with object NP
my $subject = $1; my $object = $16;
my $det_agr = $3; my $n_agr = $5; my $v_agr = $14;
my $subcat = $15;
#There's another verb!
if ( $string1 =~m/$V2/ ) {
$parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";
}
#A question?
elsif ( $subject eq 'Nwh.3;' ) {
$parse[$j] = "NP[$subject] VP[V$v_agr" . NP ($object) . PP ($string1) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[1]; $Action = $con[2];
if ($object eq '' && $string1 eq '') {
$parse[$j] = "NP[$con[1]] VP[$con[2]]\n";
}
elsif ($object ne '' && $string1 eq '') {
$Object = $con[3];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]]]\n";
}
elsif ($object eq '' && $string1 ne '') {
$Location = $con[4];
$parse[$j] = "NP[$con[1]] VP[$con[2] PP[$con[3] NP[$con[4]]]\n";
}
elsif ($object ne '' && $string1 ne '') {
$Object = $con[3]; $Location = $con[5];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]] PP[$con[4] NP[$con[5]]]\n";
}
}
#Declaratives & Imperatives
else {
$parse[$j] = NP ($subject) . " VP[V$v_agr" . NP ($object) . PP ($string1) . "]\n";
relex($parse[$j], \@words);
if ( $subject ne '' ) {
$Subject = $con[1]; $Action = $con[2];
if ($object eq '' && $string1 eq '') {
$parse[$j] = "NP[$con[1]] VP[$con[2]]\n";
}
elsif ($object ne '' && $string1 eq '') {
$Object = $con[3];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]]]\n";
}
elsif ($object eq '' && $string1 ne '') {
$Location = $con[4];
$parse[$j] = "NP[$con[1]] VP[$con[2] PP[$con[3] NP[$con[4]]]\n";
}
elsif ($object ne '' && $string1 ne '') {
$Object = $con[3]; $Location = $con[5];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]] PP[$con[4] NP[$con[5]]]\n";
}
} # if overt subject
else {
$Action = $con[1];
if ($object eq '' && $string1 eq '') {
$parse[$j] = "VP[$con[1]]\n";
}
elsif ($object ne '' && $string1 eq '') {
$Object = $con[2];
$parse[$j] = "VP[$con[1] NP[$con[2]]]\n";
}
elsif ($object eq '' && $string1 ne '') {
$Location = $con[3];
$parse[$j] = "VP[$con[1] PP[$con[2] NP[$con[3]]]\n";
}
elsif ($object ne '' && $string1 ne '') {
$Object = $con[2]; $Location = $con[4];
$parse[$j] = "VP[$con[1] NP[$con[2]] PP[$con[3] NP[$con[4]]]\n";
}
imperative ($Action, $Object, $Location);
} # end else
}
agreechk($det_agr, $n_agr, $v_agr);
subcatchk($subcat, $object, $string1);
}
else {
$parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";
}
$j = $j + 1;
#SVO
if ( $string2 =~s/^($NPwh) ($V2)// || $string2 =~s/^($NP2)? ?($V2)// ) { # plain VP
my $subject = $1; my $object = $string2;
my $det_agr = $3; my $n_agr = $5; my $v_agr = $14;
my $subcat = $15;
if ( $string2 =~m/$V2/ ) {
$parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";
}
#who moved the egg?
elsif ( $subject eq 'Nwh.3;' && $string2 =~ /^$NP/ ) {
$parse[$j] = "NP[$subject] VP[V$v_agr" . NP ($string2) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[1]; $Action = $con[2]; $Object = $con[3];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]]]\n";
}
#what is in the egg?
elsif ( $subject eq 'Nwh.3;' && $string2 =~ /^$PP/ ) {
$parse[$j] = "NP[$subject] VP[V$v_agr" . PP ($string2) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[1]; $Action = $con[2]; $Loc = $con[4];
$parse[$j] = "NP[$con[1]] VP[$con[2] PP[$con[3] NP[$con[4]]]\n";
}
#A PP verb complement?
elsif ( $string2 =~ /^$PP/ ) {
$parse[$j] = NP ($subject) . " VP[V$v_agr" . PP ($string2) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[1]; $Action = $con[2]; $Loc = $con[4];
$parse[$j] = "NP[$con[1]] VP[$con[2] PP[$con[3] NP[$con[4]]]\n";
}
#No direct object?
elsif ( $string2 !~ /$NP/ ) {
$parse[$j] = NP ($subject) . " VP[V$v_agr]\n";
relex($parse[$j], \@words);
$Subject = $con[1]; $Action = $con[2];
$parse[$j] = "NP[$con[1]] VP[$con[2]]\n";
}
#Declaratives & Imperatives
else {
$parse[$j] = NP ($subject) . " VP[V$v_agr" . NP ($string2) . "]\n";
relex($parse[$j], \@words);
if ( $subject ne '' ) {
$Subject = $con[1]; $Action = $con[2];
if ($object eq '' ) {
$parse[$j] = "NP[$con[1]] VP[$con[2]]\n";
}
elsif ($object ne '' && $string2 eq '') {
$Object = $con[3];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3]]]\n";
}
elsif ($object ne '' ) {
$Object = $con[3]; $Loc = $con[5];
$parse[$j] = "NP[$con[1]] VP[$con[2] NP[$con[3] PP[$con[4] NP[$con[5]]]]]\n";
}
} # if overt subject
else {
$Action = $con[1];
if ($object eq '' ) {
$parse[$j] = "VP[$con[1]]\n";
}
elsif ($object ne '' ) {
$Object = $con[2]; $Loc = $con[4];
$parse[$j] = "VP[$con[1] NP[$con[2] PP[$con[3] NP[$con[4]]]]]\n";
}
} # end else
}
agreechk($det_agr, $n_agr, $v_agr);
subcatchk($subcat, $string2);
}
else {
$parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";
}
$j = $j + 1;
# a question?
if ( $string3 =~ s/$Swh// ) {
my $comp = $1; my $aux = $3; my $subject = $6;
my $det_agr = $8; my $n_agr = $10; my $v_agr = $4;
my $subcat = $5;
# what do you see?
if ( $subject ne '' && $comp eq 'Nwh.3' && $string3 =~ s/$V2// ) {
$parse[$j] = "CP[$comp] AUX[$aux]" . NP($subject) . " VP[V$v_agr" . PP ($string3) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[3]; $Action = $con[4]; $Object = $con[1]; $Loc = $con[6];
$parse[$j] = "CP[$con[1]] AUX[$con[2]] NP[$con[3]] VP[$con[4] PP[$con[5] NP[$con[6]]]]\n";
agreechk($8, $10, $4); }
# is the bowl on the table?
elsif ( $comp eq '' && $subject ne '' && $subject =~ /$PP/ ) {
$subject =~ s/($NP)//;
$location = $subject; $subject = $1;
$parse[$j] = "AUX[$aux] NP[$subject]" . PP ($location) . "\n";
relex($parse[$j], \@words);
$Subject = $con[2]; $Action = $con[1]; $Loc = $con[4];
$parse[$j] = "AUX[$con[1]] NP[$con[2]] PP[$con[3] NP[$con[4]]]\n";
agreechk($det_agr, $n_agr, $v_agr); }
# where is the bowl?
elsif ( $subject ne '' && $string3 eq '' && $comp eq 'Pwh.2' ) {
$parse[$j] = "CP[$comp] AUX[$aux]" . NP($subject) . "\n";
relex($parse[$j], \@words);
$Object = $con[3]; $Action = $con[2]; $Loc = $con[1];
$parse[$j] = "CP[$con[1]] AUX[$con[2]] NP[$con[3]]\n";
agreechk($8, $10, $4);
imperative ($Action, $Object, $Loc); }
# where did you put the eggs?
elsif ( $subject ne '' && $comp eq 'Pwh.2' && $string3 =~ s/$V2// ) {
my $object = $string3;
$parse[$j] = "CP[$comp] AUX[$aux]" . NP($subject) . " VP[V$v_agr" . NP ($string3) . "]\n";
relex($parse[$j], \@words);
$Subject = $con[3]; $Action = $con[4]; $Object = $con[5]; $Loc = $con[1];
$parse[$j] = "CP[$con[1]] AUX[$con[2]] NP[$con[3]] VP[$con[4] NP[$con[5]]]\n";
agreechk($8, $10, $4); }
# what is in the bowl?
elsif ( $subject eq '' && $string3 ne '' && $comp eq 'Nwh.3' ) {
$parse[$j] = "CP[$comp] AUX[$aux]" . PP ($string3) . "\n";
relex($parse[$j], \@words);
$Object = $con[1]; $Action = $con[2]; $Loc = $con[4];
$parse[$j] = "CP[$con[1]] AUX[$con[2]] PP[$con[3] NP[$con[4]]]\n";
agreechk($6, $2, $4);
imperative ($Action, $Object, $Loc); }
else {
$parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";
}
}
else {
$parse[$j] = " SENTENCE FAILS GRAMMAR CHECK!\n";
}
&print;
$i = $i + 1; # increment the string index
} #end foreach string
} #end sub parse
sub NP {
my $string = shift;
my $parse;
if ( $string =~ s/($NP)// ) {
$parse = " NP[$1" . PP ($string) . "]";
}
} # end sub NP
sub PP {
my $string = shift;
my $parse;
if ( $string !~ s/(P\.;) ($NP)// ) {
return;
}
else {
$parse = " PP[$1 NP[$2" . PP($string) . "]]";
}
} # end sub PP
# Check agreement
sub agreechk {
# NP Agreement Check
my($det_agr, $n_agr, $v_agr) = @_;
if ( ( $det_agr eq '.3' ) && ( $n_agr ne '.0' ) && ( $n_agr ne $det_agr ) ) {
$parse[$j] = $parse[$j] . " NP PARSE FAILS AGREEMENT CHECK!\n";
return;}
else {
$np_agr = $n_agr; }
# Subject-Verb Agreement Check
if ( $np_agr ne '.3' && $np_agr ne '.0' && $v_agr eq '.3' ) {
$parse[$j] = $parse[$j] . " PARSE FAILS SUBJ-VERB AGREEMENT CHECK!\n";
return; }
elsif (( $np_agr eq '.3' ) && ( $v_agr ne '.0' ) && ( $v_agr ne '.3' )) {
$parse[$j] = $parse[$j] . " PARSE FAILS SUBJ-VERB AGREEMENT CHECK!\n";
return; }
} # end sub agreechk
# check subcategory restrictions
sub subcatchk {
my($subcat, $object, $pp) = @_;
if ( ($subcat eq ";_np" || $subcat eq ";_np:_pp:") && $object eq '' ) {
$parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }
elsif ( ($subcat eq ";_pp" || $subcat eq ";:_np:_pp") && $pp eq '' ) {
$parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }
elsif ( $subcat eq ";_np_pp" && ( $object eq '' || $pp eq '' ) ) {
$parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }
} # end sub subcatchk
sub relex {
@con = ''; # constituent array
my $con = 0; # constituent count
my $pattern = "[NVPC]P";
my @parsed = split / /, $_[0];
if ( $parsed[0] eq '' ) {
shift( @parsed ); }
my @words = @{ $_[1] };
foreach $parsed (@parsed) {
my $word = shift( @words );
if ( $parsed =~ /($pattern)/ || $parsed =~ /(AUX)/ ) {
$con++;
$con[$con] = $con[$con] . $word;
}
else {
$con[$con] = $con[$con] . ' ' . $word;}
} # end foreach
} # end sub relex
sub imperative {
$Action = $_[0];
$Object = $_[1];
$Loc = $_[2];
if ( $Object =~ s/the // ) {
$objdet = "the "; }
else { $objdet = ''; }
if ( $Loc =~ s/the // ) {
$locdet = "the "; }
else { $locdet = ''; }
if(exists $lex{$Action}{action}){
if ( substr($Object, -1) eq 's' ) {
$stem = $plural = $Object;
substr($stem, -1) = ''; # remove the -s suffix
@objects = grep { $_ =~ $stem } @things;
foreach (@objects) {
$Object = $_;
$loc{$_} = $lex{$_}{loc};
$lex{$Action}{action}->($Object);
}
}
else {
$lex{$Action}{action}->($Object); }
}
else{$reply="I don't know how to $Action the $Object."}
} # end sub imperative
sub print {
print "\nstring[$i] is: $string\n";
my $j = 0;
for ( $j = 0; $j <= 2; $j = $j + 1 ) {
print " parse[$j] = $parse[$j]"; }
print "$reply\n";
$reply = '';
@parse = '';
} #end sub print