Natural Language Production


      So far I have confined our discussion to declarative sentences in order to simplify the construction of a parsing program that handles such basic chores as checking agreement and subcategorization features. This week we address the parsing of interrogative and imperative sentences as an illustration of how easily we can adapt the basic parsing program to new sentence modalities.


Yes/No Questions


      Within our miniature world, it would be helpful to ask such questions as ‘Is the egg in the bowl?’ or ‘Are you on the table?’ These are yes/no questions and differ from declarative sentences in inverting the subject noun phrase and the copula. The declarative sentence ‘The bowl is on the table’ can be transformed into the yes/no question ‘Is the bowl on the table?’ The corresponding change we must make to our grammar is equally simple. In addition to our rules for declarative sentences


      $S1 = "($NP2) ($V1)";

      $S2 = "($NP2) ($V2)";


we require a new rule for yes/no questions


      $Sinv = "(AUX(\.[0-3])(;[:np_]*))(?: )($NP)(?: )($PP)*";


This rule restricts the inversion process in yes/no questions to the set of auxiliary verbs, which requires a new part of speech (AUX) in the lexicon. I made a simple change to the lexicon by adding an auxiliary entry for each form of the verb to be. The new lexical entries for these forms are


      "is" => "AUX/3/_np;V/3/_pp",

      "am" => "AUX/1/_np;V/1/_pp",

      "are" => "AUX/2/_np;V/2/_pp",


These additions enable the verb to be to function as an auxiliary in yes/no questions as well as a main verb in copular sentences. If we find additional uses for auxiliary verbs, our expanded set of rules will automatically apply to the verb to be.
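      To see how one lexical entry can supply both an auxiliary and a main verb reading, the following sketch splits each entry into its separate readings. The routine name readings and the tag layout POS.number;subcategory are assumptions inferred from the grammar rules; this is not the code of the ambilex routine.

```perl
use strict;
use warnings;

# The three new entries from the text.
my %lex = (
    "is"  => "AUX/3/_np;V/3/_pp",
    "am"  => "AUX/1/_np;V/1/_pp",
    "are" => "AUX/2/_np;V/2/_pp",
);

# Hypothetical helper: turn "POS/number/subcat;POS/number/subcat" into a
# list of tags of the assumed form "POS.number;subcat".
sub readings {
    my ($entry) = @_;
    my @tags;
    foreach my $reading ( split /;/, $entry ) {
        # The limit of 3 keeps an empty subcategory field, as in "N/0/".
        my ( $pos, $number, $subcat ) = split m{/}, $reading, 3;
        push @tags, "$pos.$number;$subcat";
    }
    return @tags;
}

foreach my $word ( sort keys %lex ) {
    print "$word: ", join( " | ", readings( $lex{$word} ) ), "\n";
}
```

Each form of to be yields two tags under these assumptions, one beginning with AUX and one beginning with V, so the same word can feed either the inversion rule or the declarative rules.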

      The only other change we will need to make is to add a new routine that applies the yes/no question rule to the input. We can simply extend our parsing rules by matching the input string against the yes/no question structure with the statement


      $string =~ /$Sinv/


If the input string matches the yes/no question structure, our program will output the parse


      $parse[$j] = "AUX[$1] NP[$subject]" . PP ($pp) . "\n";
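      As a rough, self-contained illustration of the match and the parse it produces, the sketch below applies the inversion rule to a hand-tagged version of ‘Is the bowl on the table?’ The tagged string format and the names pp_bracket and parse_yes_no are assumptions made for the example. The patterns are written with qr// rather than double-quoted strings so that \. remains a literal dot.

```perl
use strict;
use warnings;

# Grammar fragments from the text, written with qr// so that \. stays an
# escaped (literal) dot inside the compiled pattern.
my $NP   = qr/(DET(\.[0-3]); )?(ADJ )*N(\.[0-3]?)?;/;
my $PP   = qr/P(?:\.; )($NP)/;
my $Sinv = qr/(AUX(\.[0-3])(;[:np_]*))(?: )($NP)(?: )($PP)*/;

# Bracket a run of prepositional phrases, mirroring the recursive PP routine.
sub pp_bracket {
    my ($string) = @_;
    return '' unless defined $string && $string =~ s/(P\.;) ($NP)//;
    my ( $prep, $np ) = ( $1, $2 );    # save the captures before recursing
    return " PP[$prep NP[$np" . pp_bracket($string) . "]]";
}

# Return the bracketed parse of a tagged yes/no question, or undef.
sub parse_yes_no {
    my ($tagged) = @_;
    return undef unless $tagged =~ $Sinv;
    my ( $aux, $subject, $pp ) = ( $1, $4, $9 );
    return "AUX[$aux] NP[$subject]" . pp_bracket($pp);
}

# Hand-tagged input (assumed format) for 'Is the bowl on the table?'
my $question = "AUX.3;_np DET.0; N.0; P.; DET.0; N.0;";
print parse_yes_no($question), "\n";
```

For this input the sketch prints AUX[AUX.3;_np] NP[DET.0; N.0;] PP[P.; NP[DET.0; N.0;]]. A tagged declarative string contains no AUX tag, so parse_yes_no returns undef for it.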


We can use the unmodified agreement and subcategorization checking subroutines to check this new parse with the statements


      agreechk($det_agr, $n_agr, $v_agr);

      subcatchk($subcat, $subject, $string3);
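      The agreement tests that these calls rely on can be tried in isolation. The sketch below restates the conditions of agreechk as a function that returns a message instead of appending to the parse array. The name agreement_error and the return convention are assumptions for the example, not part of the Parse module.

```perl
use strict;
use warnings;

# Return the first agreement error for a parse, or '' when the parse passes.
# The tests mirror agreechk: a '.0' feature never triggers a failure.
sub agreement_error {
    my ( $det_agr, $n_agr, $v_agr ) = map { defined $_ ? $_ : '' } @_;

    # Determiner-noun agreement inside the subject NP
    if ( $det_agr eq '.3' && $n_agr ne '.0' && $n_agr ne $det_agr ) {
        return "NP PARSE FAILS AGREEMENT CHECK!";
    }
    # A subject that is neither '.3' nor '.0' cannot take a '.3' verb
    if ( $n_agr ne '.3' && $n_agr ne '.0' && $v_agr eq '.3' ) {
        return "PARSE FAILS SUBJ-VERB AGREEMENT CHECK!";
    }
    # A '.3' subject needs a '.3' or '.0' verb
    if ( $n_agr eq '.3' && $v_agr ne '.0' && $v_agr ne '.3' ) {
        return "PARSE FAILS SUBJ-VERB AGREEMENT CHECK!";
    }
    return '';
}

# 'Is the bowl ...?' passes; 'Is you ...?' and 'Are it ...?' do not.
print "the bowl/is: [", agreement_error( '.0', '.0', '.3' ), "]\n";
print "you/is:      [", agreement_error( undef, '.2', '.3' ), "]\n";
print "it/are:      [", agreement_error( undef, '.3', '.2' ), "]\n";
```

Because the subject and auxiliary features are passed in the same order as in a declarative sentence, the inverted word order of the question makes no difference to the check.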


Thus, the only modules that require an update are the main program and the package Parse. The lexical changes to the main program are trivial. For completeness I provide the revised main program in Figure 9.1 and the revised Parse module in Figure 9.2.



Figure 9.1. A program that parses yes/no questions


#!/usr/local/bin/perl

# quest4.pl

# demonstrate yes/no questions in a top-down parser


use English;

use Stem2;

use Ambi3;

use Parse4;


my @string = (); my $word = '';

my %pos = (); my %subcat = ();


# lexicon arranged as pos/number/subcategory

%lex = ("I" => "N/1/",

     "you" => "N/2/",

     "it" => "N/3/",

     "block" => "N/0/",

     "egg" => "N/0/",

     "table" => "N/0/;V/2/_np",

     "bowl" => "N/0/",

     "floor" => "N/0/",

     "found" => "V/0/_np:_pp:",

     "has" => "V/3/_np",

     "is" => "AUX/3/_np;V/3/_pp",

     "am" => "AUX/1/_np;V/1/_pp",

     "are" => "AUX/2/_np;V/2/_pp",

     "get" => "V/2/_np:_pp:",

     "got" => "V/0/_np:_pp:",

     "give" => "V/2/_np:_pp:;V/2/_np_np",

     "gave" => "V/0/_np:_pp:;V/0/_np_np",

     "move" => "V/2/_np;N/0/",

     "put" => "V/2/_np_pp",

     "see" => "V/2/:_np:",

     "saw" => "V/0/:_np:",

     "on" => "P//",

     "in" => "P//",

     "with" => "P//",

     "a" => "DET/3/",

     "an" => "DET/3/",

     "the" => "DET/0/");


foreach $word (keys %lex) {

   $lex{$word} =~ /(.*)\/(.*)\/(.*)/ ;

   $pos{$word} = $1;

   $subcat{$word} = $3;

} # end foreach word


print "Please type a sentence\n\n";

chomp( $input = <> );        #Get the string from the standard input (chomp removes only the newline)


# Main program loop

until ( $input eq 'thanks' ) {

   print "\n";


   @words = split / /, $input;


   @words = stem(\@words, \%lex, \%pos, \%subcat); # stem analyzes inflectional morphology


   my @string = ambilex(\@words); # ambilex constructs a different string for ambiguous words


   parse(\@string); # parse produces a parse for each string of terminal elements


   chomp( $input = <> );            #Get another string from the standard input

} #end until



Figure 9.2. The revised Parse module


#!/usr/local/bin/perl

# Parse4.pm

# Works with quest4.pl

# implements syntactic module with yes/no questions


package Parse4;

use Exporter;

our @ISA = ('Exporter');

our @EXPORT = qw( &parse &print );


#The Grammar


my $NP = "(DET(\.[0-3]); )?(ADJ )*N(\.[0-3]?)?;";

my $PP = "P(?:\.; )($NP)";

my $NP2 = "$NP( $PP)*";

my $V1 = "(V(\.[0-3])(;[:np_]*))(?: )?($NP)?(?: )?";

my $V2 = "(V(\.[0-3])(;[:np_]*))(?: )?";

my $Sinv = "(AUX(\.[0-3])(;[:np_]*))(?: )($NP)(?: )($PP)*";


sub parse {


   our @parse = '';

   my @string = @{ $_[0] };


   $i = 0; # initialize the string index (lexical ambiguity)

   foreach $string (@string) {

      chop($string);

      $j = 0; # initialize the parse index (structural ambiguity)

      my $string1 = $string;

      my $string2 = $string1;

      my $string3 = $string2;

      if ( $string1 =~s/($NP2) ($V1)// ) {                # VP with object NP

         my $subject = $1; my $object = $16;

         my $det_agr = $3; my $n_agr = $5; my $v_agr = $14;

         my $subcat = $15;

         if ( $string1 =~m/$V2/ ) {

            $parse[$j] = " SENTENCE FAILS GRAMMAR CHECK!\n";

         }

         else {

            $parse[$j] = NP ($subject) . " VP[V$v_agr" . NP ($object) . PP ($string1) . "]\n";

         }

         agreechk($det_agr, $n_agr, $v_agr);

         subcatchk($subcat, $object, $string1);

      }

      else {

         $parse[$j] = " SENTENCE FAILS GRAMMAR CHECK!\n";

      }

      $j = $j + 1;

 

      if ( $string2 =~s/($NP2) ($V2)// ) { # plain VP

         my $subject = $1;

         my $det_agr = $3; my $n_agr = $5; my $v_agr = $14;

         my $subcat = $15;

         if ( $string2 =~m/$V2/ ) {

            $parse[$j] = " SENTENCE FAILS GRAMMAR CHECK!\n";

         }

         elsif ( $string2 =~ /^$PP/ ) { #A PP verb complement?

            $parse[$j] = NP ($subject) . " VP[V$v_agr" . PP ($string2) . "]\n"

         }

         elsif ( $string2 !~ /$NP/ ) { #No direct object?

            $parse[$j] = NP ($subject) . " VP[V$v_agr]\n";

         }

         else { #A direct object

            $parse[$j] = NP ($subject) . " VP[V$v_agr" . NP ($string2) . "]\n";

         }

         agreechk($det_agr, $n_agr, $v_agr);

         subcatchk($subcat, $string2, $string2);

      }

      else {

         $parse[$j] = " SENTENCE FAILS GRAMMAR CHECK!\n";

      }

      $j = $j + 1;


      if ( $string3 =~ /$Sinv/ ) {                   # yes-no question?

         my $subject = $4; my $pp = $9;

         my $det_agr = $6; my $n_agr = $8; my $v_agr = $2;

         my $subcat = $3;

         $parse[$j] = "AUX[$1] NP[$subject]" . PP ($pp) . "\n";

         agreechk($det_agr, $n_agr, $v_agr);

         subcatchk($subcat, $subject, $string3);

      }

      else {

         $parse[$j] = " SENTENCE FAILS GRAMMAR CHECK!\n";

      }


      &print;


      $i = $i + 1; # increment the string index


   } #end foreach string


} #end sub parse

 

sub NP {                           # parse NP

   my $string = shift;

   my $parse;

   if ( $string =~ s/($NP)// ) {

      $parse = " NP[$1" . PP ($string) . "]";     # call PP

   }

} # end sub NP


sub PP {                         # parse PP

   my $string = shift;

   my $parse;

   if ( $string !~ s/(P\.;) ($NP)// ) { # end recursion

      return;

   }

   else {                           # PP recursion

      $parse = " PP[$1 NP[$2" . PP($string) . "]]";

   }

} # end sub PP


# Check agreement

sub agreechk {


# NP Agreement Check

   my($det_agr, $n_agr, $v_agr) = @_;

   if ( ( $det_agr eq '.3' ) && ( $n_agr ne '.0' ) && ( $n_agr ne $det_agr ) ) {

      $parse[$j] = $parse[$j] . " NP PARSE FAILS AGREEMENT CHECK!\n";

      return;}

   else {

      $np_agr = $n_agr; }

               

# Subject-Verb Agreement Check

   if ( $np_agr ne '.3' && $np_agr ne '.0' && $v_agr eq '.3' ) {

      $parse[$j] = $parse[$j] . " PARSE FAILS SUBJ-VERB AGREEMENT CHECK!\n";

      return; }


   elsif (( $np_agr eq '.3' ) && ( $v_agr ne '.0' ) && ( $v_agr ne '.3' )) {

      $parse[$j] = $parse[$j] . " PARSE FAILS SUBJ-VERB AGREEMENT CHECK!\n";

      return; }


} # end sub agreechk


# check subcategory restrictions

sub subcatchk {

   

   my($subcat, $object, $pp) = @_;

   if ( ($subcat eq ";_np" || $subcat eq ";_np:_pp:") && $object eq '' ) {

      $parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }

   elsif ( ($subcat eq ";_pp" || $subcat eq ";:_np:_pp") && $pp eq '' ) {

      $parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }

   elsif ( $subcat eq ";_np_pp" && ( $object eq '' || $pp eq '' ) ) {

      $parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }

        

} # end sub subcatchk


sub print {


   print "\nstring[$i] is: $string\n";   # $string is the current loop variable; parse's lexical @string is not visible here


   my $j = 0;

   for ( $j = 0; $j <= 2; $j = $j + 1 ) {

      print " parse[$j] = $parse[$j]"; }

   @parse = '';

} #end sub print


return 1;



Wh Questions


      Now that we have a program that can parse yes/no questions, it is a simple matter to extend it to wh questions. Wh questions will enable the program to parse such questions as ‘Who put the bowl on the table?’, ‘Where is the egg?’ or ‘What do you see?’ We can approach this problem in the same way we approached the yes/no questions by adding a new sentence structure for each of the wh questions. The result would be a set of phrase structure rules like


      $Sinv = "(AUX(\.[0-3])(;[:np_]*))(?: )($NP)(?: )($PP)*";

      $Swhp = "(Pwh\.2)(?:; )(AUX(\.[0-3])(;[:np_]*))(?: )($NP2)";

      $Swhn = "(Nwh(\.3))(?:; )(AUX(\.[0-3])(;[:np_]*))(?: )($PP)*";


The rule for $Swhp would parse such sentences as ‘Where is the table?’ while the rule for $Swhn would handle such sentences as ‘What is in the bowl?’ and ‘Who put the egg on the table?’ We would also have to check whether the where questions were missing the postverbal prepositional phrase and whether the who and what questions were missing the postverbal noun phrases, in addition to checking for subject-verb agreement.

      This approach would amount to a form of pattern matching since we are just adding new sentence patterns without taking advantage of the generalizations that exist across the questions. The where and what questions can be combined using the more general pattern


      wh-word AUX NP? PP?


If the wh-word is where, our grammar would have to ensure that the postverbal prepositional phrase is not present, whereas if the wh-word is who or what, our grammar would have to ensure that the postverbal noun phrase is not present.
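      These two restrictions can be stated as a small check. The sketch below is only an illustration: the routine name wh_constraint_error and its messages are hypothetical, and it assumes the wh-word tags ‘Pwh.2’ for where and ‘Nwh.3’ for who and what.

```perl
use strict;
use warnings;

# Hypothetical check for the pattern  wh-word AUX NP? PP?
# $wh is the wh-word tag; $np and $pp are the postverbal phrases
# ('' when the phrase is absent).
sub wh_constraint_error {
    my ( $wh, $np, $pp ) = @_;
    return "where question with a postverbal PP"
        if $wh eq 'Pwh.2' && $pp ne '';
    return "who/what question with a postverbal NP"
        if $wh eq 'Nwh.3' && $np ne '';
    return '';
}

# 'Where is the table?' has an NP but no PP, so it passes.
print "[", wh_constraint_error( 'Pwh.2', 'DET.0; N.0;', '' ), "]\n";
# 'What is in the bowl?' has a PP but no NP, so it passes.
print "[", wh_constraint_error( 'Nwh.3', '', 'P.; DET.0; N.0;' ), "]\n";
# A where question followed by a PP is rejected.
print "[", wh_constraint_error( 'Pwh.2', 'DET.0; N.0;', 'P.; DET.0; N.0;' ), "]\n";
```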

      Once we have taken the question pattern this far, it is easy to see how we can combine the wh questions with the yes/no questions in one general phrase structure rule. We just have to make the initial wh-word optional. The Perl statement to do this is


      $Swh = "([NP]wh(\.[2-3]))?(?:; )?(AUX(\.[0-3])(;[:np_]*))(?: )($NP)?(?: )?($PP)*";


The ([NP]wh(\.[2-3]))? part of this statement checks to see if the parse string begins with either ‘Pwh.2’ (the where question) or ‘Nwh.3’ (the who and what questions). The question mark at the end of this pattern indicates that the wh-word is optional. This part is followed by an obligatory auxiliary verb. The final part of the pattern looks for the optional noun and prepositional phrases. We can use this pattern to match all of the yes/no and wh questions we have looked at so far. We only need to add the correct rules for ensuring that the obligatory phrases are present and that subject-verb agreement is correct. All of these changes only affect the Parse module. I provide the new Parse module in Figure 9.3, where the pattern is written with $NP2 so that the noun phrase and any following prepositional phrases are captured together.
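      The following sketch tries the general question pattern on three hand-tagged strings. The tagged input format, the routine name question_parts, and the leading ^ anchor are assumptions added for the example. The patterns are again written with qr// so that \. remains a literal dot.

```perl
use strict;
use warnings;

my $NP  = qr/(DET(\.[0-3]); )?(ADJ )*N(\.[0-3]?)?;/;
my $PP  = qr/P(?:\.; )($NP)/;
# The general question rule from the text, anchored at the start (assumption).
my $Swh = qr/^([NP]wh(\.[2-3]))?(?:; )?(AUX(\.[0-3])(;[:np_]*))(?: )($NP)?(?: )?($PP)*/;

# Return a hash reference with the pieces of a tagged question, or undef.
# Optional pieces that did not match are returned as ''.
sub question_parts {
    my ($tagged) = @_;
    return undef unless $tagged =~ $Swh;
    my %parts = (
        wh  => defined $1  ? $1  : '',
        aux => $3,
        np  => defined $6  ? $6  : '',
        pp  => defined $11 ? $11 : '',
    );
    return \%parts;
}

foreach my $tagged (
    "AUX.3;_np DET.0; N.0; P.; DET.0; N.0;",    # Is the bowl on the table?
    "Pwh.2; AUX.3;_np DET.0; N.0;",             # Where is the table?
    "Nwh.3; AUX.3;_np P.; DET.0; N.0;",         # What is in the bowl?
    )
{
    my $p = question_parts($tagged);
    print "wh=[$p->{wh}] aux=[$p->{aux}] np=[$p->{np}] pp=[$p->{pp}]\n";
}
```

The same pattern covers all three questions. Only the combination of empty and non-empty pieces differs, and that combination is what the parse routine has to inspect.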


Figure 9.3. The revised Parse module for questions


#!/usr/local/bin/perl

# Parse5.pm

# Works with quest5.pl

# implements wh questions and do support


package Parse5;

use Exporter;

our @ISA = ('Exporter');

our @EXPORT = qw( &parse &print );


#The Grammar


my $NP = "(DET(\.[0-3]); )?(ADJ )*N(\.[0-3]?)?;";

my $PP = "P(?:\.; )($NP)";

my $NPwh = "(DET(\.[0-3]); )?(ADJ )*Nwh(\.3);( $PP)*";

my $NP2 = "$NP( $PP)*";

my $V1 = "(V(\.[0-3])(;[:np_]*)) ?($NP)? ?";

my $V2 = "(V(\.[0-3])(;[:np_]*)) ?";

my $Swh = "([NP]wh(\.[2-3]))?(?:; )?(AUX(\.[0-3])(;[:nvp_]*)) ($NP2)? ?";


sub parse {


   our @parse = '';

   my @string = @{ $_[0] };


   $i = 0; # initialize the string index (lexical ambiguity)

   foreach $string (@string) {

      chop($string);

      $j = 0; # initialize the parse index (structural ambiguity)

      my $string1 = $string;

      my $string2 = $string1;

      my $string3 = $string2;


      #SVOP

      if ( $string1 =~s/^($NPwh) ($V1)// || $string1 =~ s/^($NP2) ($V1)// ) {        # VP with object NP

         my $subject = $1; my $object = $16;

         my $det_agr = $3; my $n_agr = $5; my $v_agr = $14;

         my $subcat = $15;

         #There's another verb!

         if ( $string1 =~m/$V2/ ) {

            $parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";

         }

         #A question?

         elsif ( $subject eq 'Nwh.3;' ) {

            $parse[$j] = "NP[$subject] VP[V$v_agr" . NP ($object) . PP ($string1) . "]\n";

         }

         #Declaratives

         else {

            $parse[$j] = NP ($subject) . " VP[V$v_agr" . NP ($object) . PP ($string1) . "]\n";

         }

         agreechk($det_agr, $n_agr, $v_agr);

         subcatchk($subcat, $object, $string1);

      }

      else {

         $parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";

      }

      $j = $j + 1;

 

      #SVO

      if ( $string2 =~s/^($NPwh) ($V2)// || $string2 =~s/^($NP2) ($V2)// ) { # plain VP

         my $subject = $1; my $object = $string2;

         my $det_agr = $3; my $n_agr = $5; my $v_agr = $14;

         my $subcat = $15;


         if ( $string2 =~m/$V2/ ) {

            $parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";

         }

         #who moved the egg?

         elsif ( $subject eq 'Nwh.3;' && $string2 =~ /^$NP/ ) {

            $parse[$j] = "NP[$subject] VP[V$v_agr" . NP ($string2) . "]\n";

         }

         #what is in the egg?

         elsif ( $subject eq 'Nwh.3;' && $string2 =~ /^$PP/ ) {

            $parse[$j] = "NP[$subject] VP[V$v_agr" . PP ($string2) . "]\n";

         }

         #A PP verb complement?

         elsif ( $string2 =~ /^$PP/ ) {

            $parse[$j] = NP ($subject) . " VP[V$v_agr" . PP ($string2) . "]\n"

         }

         #No direct object?

         elsif ( $string2 !~ /$NP/ ) {

            $parse[$j] = NP ($subject) . " VP[V$v_agr]\n";

         }

         #Declaratives

         else {

            $parse[$j] = NP ($subject) . " VP[V$v_agr" . NP ($string2) . "]\n";

         }

         agreechk($det_agr, $n_agr, $v_agr);

         subcatchk($subcat, $string2, $string2);   # pass the remainder as both object and PP, as in Parse4

      }

      else {

         $parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";

      }

      $j = $j + 1;


      # a question?

      if ( $string3 =~ s/$Swh// ) {

         my $comp = $1; my $aux = $3; my $subject = $6;

         my $det_agr = $8; my $n_agr = $10; my $v_agr = $4;

         my $subcat = $5;

         # what do you see?

         if ( $subject ne '' && $comp eq 'Nwh.3' && $string3 =~ s/$V2// ) {

             $parse[$j] = "CP[$comp] AUX[$aux]" . NP($subject) . " VP[V$v_agr" . PP ($string3) . "]\n";

             agreechk($det_agr, $n_agr, $v_agr); }

         # is the bowl on the table?

         elsif ( $comp eq '' && $subject ne '' && $subject =~ /$PP/ ) {

            $subject =~ s/($NP)//;

            $location = $subject; $subject = $1;

            $parse[$j] = "AUX[$aux] NP[$subject]" . PP ($location) . "\n";

            agreechk($det_agr, $n_agr, $v_agr); }

         # did you move the bowl?

         elsif ( $comp eq '' && $aux ne '' && $subject ne '' && $string3 =~ s/$V2// ) {

             $parse[$j] = "AUX[$aux]" . NP($subject) . " VP[V$v_agr" . NP ($string3) . "]\n";   # close the VP bracket

             agreechk($det_agr, $n_agr, $v_agr); }

         # where is the bowl?

         elsif ( $subject ne '' && $string3 eq '' && $comp eq 'Pwh.2' ) {

             $parse[$j] = "CP[$comp] AUX[$aux]" . NP($subject) . "\n";

             agreechk($det_agr, $n_agr, $v_agr); }

         # where did you put the eggs?

         elsif ( $subject ne '' && $comp eq 'Pwh.2' && $string3 =~ s/$V2// ) {

             my $object = $string3;

             $parse[$j] = "CP[$comp] AUX[$aux]" . NP($subject) . " VP[V$v_agr" . NP ($string3) . "]\n";

             agreechk($det_agr, $n_agr, $v_agr); }

         # what is in the bowl?

         elsif ( $subject eq '' && $string3 ne '' && $comp eq 'Nwh.3' ) {

             $parse[$j] = "CP[$comp] AUX[$aux]" . PP ($string3) . "\n";

             agreechk($6, $2, $4);}

         else {

            $parse[$j] = "SENTENCE FAILS GRAMMAR CHECK!\n";

         }

      }

      else {

         $parse[$j] = " SENTENCE FAILS GRAMMAR CHECK!\n";

      }


      &print;


      $i = $i + 1; # increment the string index


   } #end foreach string


} #end sub parse

 

sub NP {                           # parse NP

   my $string = shift;

   my $parse;

   if ( $string =~ s/($NP)// ) {

      $parse = " NP[$1" . PP ($string) . "]";     # call PP

   }

} # end sub NP


sub PP {                          # parse PP

   my $string = shift;

   my $parse;

   if ( $string !~ s/(P\.;) ($NP)// ) { # end recursion

      return;

   }

   else {                             # PP recursion

      $parse = " PP[$1 NP[$2" . PP($string) . "]]";

   }

} # end sub PP


# Check agreement

sub agreechk {


# NP Agreement Check

   my($det_agr, $n_agr, $v_agr) = @_;

   if ( ( $det_agr eq '.3' ) && ( $n_agr ne '.0' ) && ( $n_agr ne $det_agr ) ) {

      $parse[$j] = $parse[$j] . " NP PARSE FAILS AGREEMENT CHECK!\n";

      return;}

   else {

      $np_agr = $n_agr; }

               

# Subject-Verb Agreement Check

   if ( $np_agr ne '.3' && $np_agr ne '.0' && $v_agr eq '.3' ) {

      $parse[$j] = $parse[$j] . " PARSE FAILS SUBJ-VERB AGREEMENT CHECK!\n";

      return; }


   elsif (( $np_agr eq '.3' ) && ( $v_agr ne '.0' ) && ( $v_agr ne '.3' )) {

      $parse[$j] = $parse[$j] . " PARSE FAILS SUBJ-VERB AGREEMENT CHECK!\n";

      return; }


} # end sub agreechk


# check subcategory restrictions

sub subcatchk {

   

   my($subcat, $object, $pp) = @_;

   if ( ($subcat eq ";_np" || $subcat eq ";_np:_pp:") && $object eq '' ) {

      $parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }

   elsif ( ($subcat eq ";_pp" || $subcat eq ";:_np:_pp") && $pp eq '' ) {

      $parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }

   elsif ( $subcat eq ";_np_pp" && ( $object eq '' || $pp eq '' ) ) {

      $parse[$j] = $parse[$j] . " PARSE FAILS SUBCAT CHECK!\n"; }

        

} # end sub subcatchk


sub print {


   print "\nstring[$i] is: $string\n";


   my $j = 0;

   for ( $j = 0; $j <= 2; $j = $j + 1 ) {

      print " parse[$j] = $parse[$j]"; }

   @parse = '';

} #end sub print


return 1;



Commands


      The question now arises of how we can extend our Perl grammar to parse imperative sentences. Imperative sentences such as ‘Get the egg’ or ‘Put the bowl on the table’ differ from declarative sentences in the lack of an overt subject. The verb in imperative sentences is also restricted to a second person agreement form. The obvious way to deal with the missing subject noun phrase in imperative sentences is to make the subject optional. We simply change the declarative sentence structures to


      $S1 = "($NP2)?(?: )?($V1)";

      $S2 = "($NP2)?(?: )?($V2)";


      We also need to change the rules in the parse subroutine so that they check for missing subjects. The subject-verb agreement check should also be amended so that it only allows missing subjects to occur with second person verb forms. I will leave these changes as an exercise.
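      As a starting point for the exercise, the sketch below shows one possible way to combine the optional subject with the second person restriction. It is not a complete solution. The routine name sentence_type, its return values, the ^ anchor, and the hand-tagged input strings are all assumptions, and the verb pattern is kept in parentheses so that the verb features stay in the same capture groups as in the declarative rule.

```perl
use strict;
use warnings;

my $NP  = qr/(DET(\.[0-3]); )?(ADJ )*N(\.[0-3]?)?;/;
my $PP  = qr/P(?:\.; )($NP)/;
my $NP2 = qr/$NP( $PP)*/;
my $V1  = qr/(V(\.[0-3])(;[:np_]*))(?: )?($NP)?(?: )?/;
# Declarative rule with an optional subject, anchored at the start (assumption).
my $S1  = qr/^($NP2)?(?: )?($V1)/;

# Classify a tagged string. A missing subject is accepted only when the
# verb carries the second person feature '.2'.
sub sentence_type {
    my ($tagged) = @_;
    return 'no match' unless $tagged =~ $S1;
    my $subject = defined $1 ? $1 : '';
    my $v_agr   = $14;    # verb agreement, the same group as in the declarative rule
    return 'declarative' if $subject ne '';
    return $v_agr eq '.2' ? 'imperative' : 'missing subject';
}

print sentence_type("V.2;_np:_pp: DET.0; N.0;"),      "\n";    # Get the egg
print sentence_type("N.2; V.2;_np:_pp: DET.0; N.0;"), "\n";    # You get the egg
print sentence_type("V.3;_np DET.0; N.0;"),           "\n";    # Has the egg
```

Under these assumptions the three calls report an imperative, a declarative, and a missing subject in turn. The remaining work is to route the 'missing subject' case into the agreement failure message used by the rest of the parser.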