Miniature Worlds


            Next to the pattern matching operation, lists are one of the most useful features in computer programs. Programmers have to provide the computer with efficient ways to store information, and lists answer this basic need. The technical term for a list on a computer is array, but don’t be put off by terminology. You use arrays on computers in the same way you use ordinary lists at home. I compare an ordinary grocery shopping list and a Perl array in Figure 4.1.


Figure 4.1 Two types of lists


 

Shopping List     Perl Array

eggs              @groceries = ("eggs", "juice", "milk", "chocolate");
juice
milk
chocolate

 



            The Perl example is actually a Perl statement that declares the array @groceries and adds elements to the array. Since this is a computer, we have to be very careful how we make such declarations. It might help to compare the array statement with the variable assignments that we made in the previous chapter. Perl uses a dollar sign ($) to mark scalar variables and an at sign (@) to mark arrays. The computer assumes that variables will be place holders for numbers or strings, while arrays will hold information about a list of things. One of the nice features of Perl is that a large variety of different things can be kept in arrays. The statement in Figure 4.1 adds a list of strings to an array; the following statements fill an array from scalar variables and from other arrays.


@groceries = ($dairy, $beverage, $dessert);

@groceries = (@breakfast, @lunch, @dinner);


As these statements show, it is also possible to declare an array from a list of variables or arrays. Perl copies their values into a single flat list. This flexibility makes arrays extremely useful for storing information.
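
A minimal sketch of the second statement, written in the same style as the chapter's listings. The array names come from the statement above, but their contents are invented for illustration. It suggests that Perl copies the elements of each array into @groceries, so the result is one flat list of five strings rather than a list of three lists.

```perl
# Hypothetical meal lists; the contents are made up for illustration.
@breakfast = ("eggs", "juice");
@lunch     = ("soup");
@dinner    = ("rice", "beans");

# Perl copies the elements of the three arrays into one flat list.
@groceries = (@breakfast, @lunch, @dinner);

$count = $#groceries + 1;          # last index plus one = number of elements
print "There are $count items: @groceries\n";
```

Run as written, this prints "There are 5 items: eggs juice soup rice beans".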

            The discussion of arrays also shows that we must be very careful to distinguish between different kinds of variables in our programs. At the most concrete level we have literals, which refer to a single piece of information or constant. Numeric literals are plain numbers, either with or without a decimal point (e.g., 1, 567, 3.14159). String literals are any characters that appear between quotation marks (e.g., ‘Now is the’, “time”). Empty quotes denote a null or empty string. At the next level come the scalar variables that begin with a dollar sign. You can think of scalar variables as a box for holding a literal. Scalar variables only hold one literal at a time, either a numeric or string literal. Arrays provide another level of information storage in that they can hold any number of literals, scalars or arrays. The best way to become familiar with these different levels is to see how they are used in Perl programs.

            Storing information in an array is only half the fun; we also need a way to get the information out of an array. Assume that we have declared the array @groceries containing a set of string literals. Each element in an array can be referenced by its index number. The index numbers for array elements in Perl start at zero and go up. We can print the second element of our @groceries array with the statement


print "$groceries[1]";


This statement prints the word “juice”. Remember that 0 indexes the first element in an array and 1 indexes the second element. You should also note the way Perl uses parentheses to declare the elements in an array, but uses square brackets around the index number for the array. Also note the way we switched implicitly from the array @groceries to the scalar variable $groceries[1]. Perl provides the option of referring to a single element in an array with a scalar variable ($groceries[1]) or to the whole array (@groceries). The following statement would print all the elements in the array, one after another with nothing between them.


print @groceries;
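
The difference between referring to one element and to the whole array can be seen in a short sketch. The last line goes one step beyond the text: it places the array inside double quotes, where Perl separates the elements with spaces.

```perl
@groceries = ("eggs", "juice", "milk", "chocolate");

print "$groceries[1]\n";     # one element, referenced as a scalar: juice
print @groceries;            # the whole array, with nothing between the elements
print "\n";
print "@groceries\n";        # inside double quotes the elements are separated by spaces
```

The three print statements that mention the array produce "juice", "eggsjuicemilkchocolate" and "eggs juice milk chocolate" respectively.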


            We can also print our grocery list one element at a time using a program loop. A program loop repeats an operation until its job is finished. An endless loop would keep repeating its instructions until the end of time, using up computer resources and electricity. To avoid such disasters, programmers insert a test in program loops to tell the computer when to halt. An example of a for loop that solves our printing problem for @groceries would be:


for ( $index = 0; $index < 4; $index = $index + 1 ) {

   print "$groceries[$index], ";

} #end for loop


This loop makes use of the scalar variable $index to reference the elements of the array @groceries sequentially. $index takes on values from 0 to 3, and thus the program repeats four times, printing out a different element of the array each time. The angle bracket < in the statement $index < 4 is our test to make sure $index is less than 4. It is possible that you will not always know what the last index of the array might be. Perl has a handy, if obscure, notation for such situations. The variable $#groceries contains the index value of the last element on the list. We could use this variable to print out our grocery list as follows:


for ( $index = 0; $index <= $#groceries; $index = $index + 1 ) {

   print $groceries[$index];

} #end for loop


Here we test to see that the index variable is less than or equal to the index of the last element in the array.
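
The loop can be tried out with a small sketch that numbers each element. The variables $list and $last are hypothetical names introduced here for illustration; they are not part of the chapter's programs.

```perl
@groceries = ("eggs", "juice", "milk", "chocolate");

$list = "";                                   # collects the numbered elements as text
for ( $index = 0; $index <= $#groceries; $index = $index + 1 ) {
   $list = $list . "$index: $groceries[$index]\n";
} #end for loop

$last = $#groceries;                          # index of the last element
print $list;
print "Last index: $last\n";
```

Because the test uses $#groceries rather than a fixed number, the loop keeps working if elements are later added to or removed from the array.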

            There are many more features of lists to discuss, but before we get too far along an example of a Perl program that uses arrays to keep track of language data would seem in order. Imagine asking a child what they have in their hand. I will assume that they are holding mangos, apples, and peaches. The Perl program in Figure 4.2 responds to some simple questions about such holdings.


Figure 4.2 A Perl possessions program


#!/usr/local/bin/perl

# possess.pl

# This program responds to possession queries

# It uses the array @hand to keep track of the computer's possessions

@hand = ("mangos", "apples", "peaches");

$total = $#hand + 1;


#$pattern1 = 'how many things do you have';

#$pattern2 = 'do you have a';

#$pattern3 = 'how many ';


# Request a prompt from the user

print "What do you want to know? \n";

chop( $question = <> );                      #Get the question from the standard input

$question = lc $question;


# Main program loop

until ( $question eq 'thanks' ) {


# Test 1. How many things do you have?

   if ( $question =~ m/how many things do you have/ ) {

      print "I have $total things in my hand.\n";

   } #end if


# Test 2. Do you have a ...?

   elsif ( $question =~ m/do you have a/ ) {

      $found = 0;                      # set to 1 when an item in @hand is mentioned

      for ( $index = 0; $index <= $#hand; $index = $index + 1 ) {

         if ( $question =~ m/$hand[$index]/ ) {

            print "Yes, I have a $hand[$index] in my hand.\n";

            $found = 1;

         }

      } #end for loop


      if ( $found == 0 ) {

         print "No I don\'t.\n";

      }

   } #end elsif2


# Test 3. What do you have?

   elsif ( $question =~ m/what do you have/ ) {

      print "I have a $hand[0], a $hand[1], and a $hand[2].\n";

   } #end elsif3


# If everything else fails         

   else {

      print "I don\'t understand that question.\n";

      print "Please ask me a different question.\n";

   }


chop( $question = <> );                      #Get another question from the standard input

$question = lc $question;


} #end Main program loop



            This program may look a little more complicated than the programs in chapter 3, but it actually has a rather simple structure. It contains three main blocks. The first block tells the system where to find the Perl interpreter and declares the @hand array. It also computes the total number of elements in the array, which is one more than the value of $#hand (the index of the last element in the array). The second block asks the user for a question. The third block runs a series of tests on the user’s question and produces the relevant reply. The third block constitutes the main program loop. It begins with an until loop that looks for a key word in the input. The until loop does not place any restrictions on the number of times the loop is repeated. This property makes the until (and while) loop ideal for interactive programs. The until loop repeats the instructions between the following pair of curly braces. Remember to add a right curly brace at the end of the until loop to avoid a programming error. You should also offer the user a way out of the loop. The possess.pl program exits the loop (thereby ending the program) when the user types “thanks”.

            All of the tests that the program uses in the third block are just pattern matches. They succeed if the text in the pattern is a subset of the text in the user’s question. The pattern in Test 3 succeeds if the user asks ‘What do you have?’ or ‘What do you have in your hands?’ It is always good to build in some flexibility in processing language. The conditional tests using the if and elsif operators in the until loop provide the computer with an unambiguous set of tests, including the condition where all of the patterns fail to match the input. It is always a good idea to prompt the user for a different question if the user’s input does not match any of the preprogrammed routines. The else statement at the end of the main program loop takes care of this situation.

            Test 2 is the most complicated test in the program. It is triggered when the user asks a question like ‘Do you have a banana in your hand?’ Responding to this question requires a test to see if the user asked a ‘Do you have a ... ?’ type of question. If this is the case, the program has to use a subtest to discover what object was referred to in the question. The possess.pl program checks to see if any of the elements in the array @hand were mentioned in the question. If so, the program produces a positive reply. If not, the program issues a denial.
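
The subtest can be sketched on its own, assuming a fixed question in place of keyboard input. The variable $reply is a hypothetical name introduced here: it starts out holding the denial and is overwritten only when an element of @hand is mentioned, so exactly one reply is printed.

```perl
@hand = ("mangos", "apples", "peaches");
$question = "do you have apples in your hand?";   # a fixed question instead of keyboard input

$reply = "No I don't.\n";                         # the denial, kept unless an item matches

for ( $index = 0; $index <= $#hand; $index = $index + 1 ) {
   if ( $question =~ m/$hand[$index]/ ) {         # is this item mentioned in the question?
      $reply = "Yes, I have $hand[$index] in my hand.\n";
   }
} #end for loop

print $reply;
```

With the question shown, the sketch prints "Yes, I have apples in my hand." Changing the question to ask about bananas leaves $reply untouched, and the denial is printed instead.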

            Possess.pl provides a good example of routines that reference different properties of an array. It should provide you with an idea of the usefulness of arrays in storing information for computer programs. It can also provide the beginning of a miniature world program in which the computer responds to questions about its world. The most famous of such programs is Terry Winograd’s blocks world program, Shrdlu, which responded to commands to move various geometrical shapes (http://hci.stanford.edu/~winograd/shrdlu/). Winograd’s program, unlike Eliza, actually responded to questions about its world (see Figure 4.3). Of course, it was limited to a small set of issues related to this miniature world. No one has yet succeeded in bringing such programs into a larger world.


Figure 4.3 A sample Shrdlu dialogue


Human: HAD YOU TOUCHED ANY PYRAMID BEFORE YOU PUT THE GREEN ONE ON THE LITTLE CUBE?

Shrdlu: YES, THE GREEN ONE.

Human: WHEN DID YOU PICK IT UP?

Shrdlu: WHILE I WAS STACKING UP THE RED CUBE, A LARGE RED BLOCK AND A LARGE GREEN CUBE.

Human: WHY?

Shrdlu: TO GET RID OF IT.

Human: WHY DID YOU DO THAT?

Shrdlu: TO CLEAN OFF THE RED CUBE.

Human: WHY DID YOU CLEAR OFF THAT CUBE?

Shrdlu: TO PUT IT ON A LARGE GREEN CUBE.

Human: WHY DID YOU DO THAT?

Shrdlu: TO STACK UP THE RED CUBE, A LARGE RED BLOCK AND A LARGE GREEN CUBE.

Human: WHY DID YOU DO THAT?

Shrdlu: BECAUSE YOU ASKED ME TO.



            We can begin to see how to enlarge the world of possess.pl if we imagine that in addition to asking what is currently in the computer’s possession, we would be able to add or subtract objects from the computer’s clutches. Such an addition to the program requires that we find a way to update the information in the @hand array. Imagine giving the computer a new object to hold, say pears. We might try changing the @hand array with a statement like


@hand = "pears";


If you try entering this statement as a Perl program you will discover that it doesn’t do what we hoped it would. Instead, it redefines the @hand array as containing a single element, “pears”. We need a way to add “pears” to our list of items rather than creating a new list that just contains pears.

            Fortunately, Perl has several ways to update lists. The easiest way to do this for our purposes would be to make use of a slice. By specifying a slice of an array, we can add or replace elements in that array. A slice names one or more index positions of an array. We can use the following statement to add pears to our computer’s fruit basket.


@hand[$index + 1] = ("pears");


I am using the scalar variable $index to hold the index of the last element in the array. Since the @hand array has three elements, $index has the value 2. (Remember the index begins at position zero.) Our statement tells the computer to add another position to its list of possessions and use this position to store the word “pears”. The result would be equivalent to declaring the array as:


@hand = ("mangos", "apples", "peaches", "pears");
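
A minimal sketch of the slice assignment, assuming that $index is first set to the index of the last element with $#hand:

```perl
@hand = ("mangos", "apples", "peaches");

$index = $#hand;                    # index of the last element: 2
@hand[$index + 1] = ("pears");      # a one-element slice just past the end of the array

print "@hand\n";
```

This prints "mangos apples peaches pears". The single-element form from the summary at the end of the chapter, $hand[$index + 1] = "pears";, would store the word in the same position.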


            We can use the Perl operator splice to delete one or more elements from our lists. If we wanted to jettison the peaches from our expanded list, we could use the statement:


splice (@hand, 2, 1);


Processing this statement produces a list with the elements “mangos”, “apples”, “pears”. The first position in the statement specifies the name of the array to be spliced. The second position specifies the index at which the splicing starts, while the final position shows how many elements will be removed from the array. The splice operator can be used to add new elements to a list as well. The following statement would add grapes:


splice (@hand, $index, 0, "grapes");


This example uses the fourth position in the splice operator to specify the new element that is added to our array. The third position tells the computer not to delete any elements from the array.
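
Both uses of splice can be tried together in a short sketch. It assumes that $index still has the value 2, so the new element is placed in front of whatever currently sits at index 2.

```perl
@hand = ("mangos", "apples", "peaches", "pears");

splice (@hand, 2, 1);                  # remove 1 element starting at index 2
print "@hand\n";                       # mangos apples pears

$index = 2;
splice (@hand, $index, 0, "grapes");   # remove nothing, insert "grapes" at index 2
print "@hand\n";                       # mangos apples grapes pears
```

To add the new fruit at the end of the list instead, the second position would have to be one more than the last index, that is, $#hand + 1.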

            Before slicing and splicing our array, we have to find a way to extract the relevant information from the user’s response. We will assume that users will use the phrase “Have some ____” to offer a new kind of fruit to the computer. We need to detect the initial part of the offer as well as a way to extract the item being offered so that we can add it to the computer’s inventory. We can use the pattern matching technique to take care of the first problem, but we will have to employ another Perl tool to solve the second problem. This new tool is known as variable binding since we want to bind (record) information in a variable that we can then reference in our slice and splice operations on arrays. Assuming that the user’s response is in the variable $question, we can use the following pattern in our matching statement:


elsif ( $question =~ m/have some (\w+)/ )


            This statement is similar to the previous examples of pattern matching except that it contains a backslash w and plus in parentheses. The backslash tells the computer that the following character has a predefined function rather than its literal use as the letter w. The combination backslash w plus (\w+) tells the computer to match the word following the phrase “have some”. The parentheses enclosing \w+ tell the computer to associate this word with an implicit variable. Perl uses the variables $1, $2, $3 and so on to refer to implicit variables. When our pattern matches the user’s input, Perl will store the word that follows the phrase “have some” in the implicit variable $1. We can then add this word to our fruit array with the statement:


@hand[$index + 1] = ($1);


I find the use of implicit variables to record parts of a pattern match downright handy. The program in Figure 4.4 provides a simple example of pattern matching with variable binding.


Figure 4.4 pattern matching with variable binding


#!/usr/local/bin/perl

# bind.pl

# This program demonstrates variable binding with implicit variables

# It ends when the user types "thanks"


# Block 1. Request a prompt from the user

print "What are you offering me? \n";


# Block 2. Process the user's question


chop( $question = <> );                      #Get the question from the standard input

$question = lc $question;


# Main program loop

until ( $question eq 'thanks' ) {


#Test 1. See what's being offered

if ( $question =~ m/have some (\w+)/ ) {

   print "Thanks for the $1.\n";

} #end if


chop( $question = <> );                      #Get another question from the standard input

$question = lc $question;

} #end until 
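
A pattern can contain more than one pair of parentheses, in which case Perl fills $1, $2 and so on from left to right. The following sketch assumes a fixed offer in place of keyboard input; $first and $second are hypothetical names used to save the two bound words.

```perl
$question = "have some pears and some grapes";   # a fixed offer instead of keyboard input

if ( $question =~ m/have some (\w+) and some (\w+)/ ) {
   $first  = $1;                    # word matched by the first pair of parentheses
   $second = $2;                    # word matched by the second pair of parentheses
   print "Thanks for the $first and the $second.\n";
} #end if
```

Copying $1 and $2 into ordinary variables straight away is a cautious habit, since the next successful pattern match replaces their contents.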



            Now let’s combine the slicing and variable binding operations with the possess.pl program so that it will allow the user to add new kinds of fruit to the computer’s inventory. I named the resulting program possess2.pl and it appears in Figure 4.5. The change from the first version of the program is very slight, but quite effective. We can begin to think about making further additions to our fruit salad.



Figure 4.5 A demonstration of array updating


#!/usr/local/bin/perl

# possess2.pl

# This program responds to possession queries and allows fruit updates

# It ends when the user types "thanks"

# It uses the array @hand to keep track of the computer's possessions


# Block 1. Initialize the array

@hand = ("mangos", "apples", "peaches");


# Block 2. Request a prompt from the user

print "What do you want to know? \n";


# Block 3. Process the user's question


chop( $question = <> );                      #Get the question from the standard input

$question = lc $question;


# Main program loop

until ( $question eq 'thanks' ) {


$number = $#hand;

$total = $#hand + 1; # the total number of items in the array


#Test 1. See if question is 'how many things do you have?'

   if ( $question =~ m/how many things do you have/ ) {

      print "I have $total things in my hand.\n";

   } #end test 1


#Test 2. See if question is 'what do you have?'

   elsif ( $question =~ m/what do you have/ ) {

      print "I have @hand \n";

   } #end test 2


#Test 3. See if question is 'do you have a ___?'

   elsif ( $question =~ m/do you have / ) {

      $found = 0;                      # set to 1 when an item in @hand is mentioned

      for ( $index = 0; $index <= $number; $index = $index + 1 ) {

         if ( $question =~ m/$hand[$index]/ ) {

            print "Yes, I have some $hand[$index] in my hand.\n";

            $found = 1;

         }

      } #end for loop


      if ( $found == 0 ) {

         print "No I don\'t.\n";

      }

   } #end test 3


#Test 4. See what's being offered

   elsif ( $question =~ m/have some (\w+)/ ) {

      @hand[$number + 1] = ($1);

      print "Thanks for the $1.\n";

   } #end test 4


# If all tests fail

   else {

      print "I don\'t understand that question.\n";

      print "Please ask me a different question.\n";

   }


chop( $question = <> );                      #Get another question from the standard input

$question = lc $question;


} #End main loop



            If you have run this program on your computer you should have a lot of ideas about how to improve it. I know I would be curious about how many mangos or apples the computer was holding. The easiest way to make such information accessible to the computer would be to create a hash or associative array. As useful as regular arrays might be, they have their limitations. One important limitation of an array is that you need to know the index number of an element in the array in order to reference that element. This is not a major problem when we cycle through all the elements of an array in printing the array, but it does mean that we cannot check directly to see if apples are on the list. A hash allows us to reference an element by name rather than by number, so hashes provide a very handy tool for manipulating language data. Hashes in Perl begin with the percent sign % to distinguish them from scalar variables and arrays. We can use the following statements to define a hash named %hand that keeps track of the number of each fruit the computer is holding:


%hand = ();


$hand{'mangos'} = 2;

$hand{'apples'} = 3;

$hand{'peaches'} = 6;


We can also define the hash with the statement


%hand = ('mangos' => 2, 'apples' => 3, 'peaches' => 6);


The second approach demonstrates how Perl hashes can be viewed as arrays with each position holding a pair of elements. Perl even has a reserved name for each member of the pair in a hash. The first member is the key since it unlocks the information in the hash. The second member is the value since this element often holds a value or attribute associated with the key. Perl allows access to the keys and values in a hash by using these terms as operators on the hash, as in the following statements:


@fruits = keys %hand;

@number = values %hand;


            We can now look up the number of each kind of fruit by name, e.g., $hand{mangos}. This reference behaves like a scalar variable and has the value 2. To find out all the kinds of fruit the computer has in hand, we can use a foreach loop and the keys function. The following example demonstrates the use of the foreach loop to access each element of the hash:


foreach $fruit (keys %hand) {

   print "I\'m holding $hand{$fruit} $fruit \n";

};         #end foreach
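
The foreach loop can also do some arithmetic while it visits each key. In the sketch below, $kinds and $total are hypothetical names, and the sort in front of keys is an addition to the loop above: Perl does not promise any particular order for the keys of a hash, so sorting them makes the output predictable.

```perl
%hand = ('mangos' => 2, 'apples' => 3, 'peaches' => 6);

$kinds = keys %hand;                # assigned to a scalar, keys gives the number of pairs
$total = 0;

foreach $fruit (sort keys %hand) {  # sort is an addition: it fixes the order of the keys
   print "I\'m holding $hand{$fruit} $fruit\n";
   $total = $total + $hand{$fruit}; # add up the values as we go
} #end foreach

print "That makes $total pieces of $kinds kinds of fruit.\n";
```

The loop reports the apples, mangos and peaches in alphabetical order, and the last line prints "That makes 11 pieces of 3 kinds of fruit."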


            Now it is time to put the hash to work in our possessions program. The example in Figure 4.6 will respond to the same questions as possess2.pl, but in addition, keeps track of how many tokens of each type of fruit the computer is holding.


Figure 4.6 Using a hash to track fruit information


#!/usr/local/bin/perl

# hash.pl

# This program responds to possession queries and allows fruit updates

# It ends when the user types "thanks"

# It uses the hash %hand to keep track of the computer's possessions


# Block 1. Initialize the hash

%hand = ('mangos' => 2, 'apples' => 3, 'peaches' => 6);


# Block 2. Request a prompt from the user

print "What do you want to know? \n";


# Block 3. Process the user's question


chop( $question = <> );                      #Get the question from the standard input

$question = lc $question;


# Main program loop

until ( $question eq 'thanks' ) {


#Test 1. See if question is 'how many things do you have?'

   if ( $question =~ m/how many things do you have/ ) {

      $total = keys %hand;

      print "I have $total things in my hand.\n";

   } #end test 1


#Test 2. See if question is 'what do you have?'

   elsif ( $question =~ m/what do you have/ ) {

      @fruit = keys %hand;

      print "I have @fruit \n";

   } #end test 2


#Test 3. See if question is 'do you have ___?'

   elsif ( $question =~ m/do you have (\w+)/ ) {

      if (exists $hand{$1} ) {

         print "Yes, I have some $1 in my hand.\n";

      } #end if

      else {

         print "Yes, I have no $1. I have no $1 today.\n";

      } #end else

   } #end test 3


#Test 4. See if question is ‘how many ___?'

   elsif ( $question =~ m/how many (\w+)/ ) {

      print "I have $hand{$1} $1 \n";

   } #end test 4


#Test 5. See what's being offered

   elsif ( $question =~ m/have some (\w+)( and some (\w+))?/ ) {

      $fruit = $1;

      print "Thanks. How many $fruit can you give me?\n";

      $number = <>;

      if ( $number =~ m/(\d+)/ ) {

         $hand{$fruit} = $1;

      }

   } #end test 5


#Test 6. Give something

   elsif ( $question =~ m/give me a (\w+)/ ) {

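      $fruit = $1.'s';                          # try the plural in -s first (mangos, apples)

      if ( not exists $hand{$fruit} ) {

         $fruit = $1.'es';                      # otherwise use the plural in -es (peaches)

      }

      $hand{$fruit} = $hand{$fruit} - 1;

      print "Here you go. Now I have $hand{$fruit} $fruit\n";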

   } #end test 6


# If all tests fail

   else {

      print "I don\'t understand that question.\n";

      print "Please ask me a different question.\n";

   }


chop( $question = <> );                      #Get another question from the standard input

$question = lc $question;


} #End main loop



            Admittedly this program is not very deep, but it provides examples of some tools that we can use to create much more complex worlds. As a final illustration of uses for hashes I present a program in Figure 4.7 that responds to questions about the characters in the game Clue. The object of the game is to be the first player to guess which character used a specific weapon in a specific room to commit a murder. The program reads in information about the location and weapon associated with each character from the data file clue.txt that is shown in Figure 4.8. The three columns in this file must be separated by tabs, since the program splits each line at the tab character. The program saves this information in the hashes %weapon and %room. These hashes enable us to see if Colonel Mustard is holding the revolver, for example, with the Perl statement


'revolver' eq $weapon{mustard}



Figure 4.7 A Clue world


#!/usr/local/bin/perl

# clue.pl

# This program responds to possession and location queries

# It ends when the user types "thanks"

# It uses the hashes %weapon and %room to keep track of the objects

%weapon = (); %room = ();


# Block 1. Read in data file

open ITEMS, "< clue.txt" or die "Cannot open clue.txt: $!\n";


while (chop( $input = <ITEMS> )) {

   ($name, $weap, $rm) = split /\t/, $input;

   $weapon{$name} = $weap; $room{$name} = $rm;

   print "$name, $weapon{$name}, $room{$name}.\n";

} #end while


# Block 2. Request a prompt from the user

print "Hello Dave. What can I do for you? \n";


chop( $question = <> );                      #Get the question from the standard input

$question = lc $question;


# Main program loop

until ( $question eq 'thanks' ) {


#Test 1. See if question is 'Where is someone?'

if ( $question =~ m/where is (\w+)/ ) {

   print "$1 is in the $room{$1}.\n";

} #end Test 1


#Test 2. See if question is 'What does someone have?'

elsif ( $question =~ m/what does (\w+) have/) {

   print "$1 has the $weapon{$1}.\n";

} #end Test 2


#Test 3. See if question is ‘Is someone in somewhere with something?'

elsif ( $question =~ m/is (\w+) in the (\w+) with (a|the) (\w+)/ ) {


   if ($2 eq $room{$1} and $4 eq $weapon{$1}) {

      print "Yes, $1 is in the $2 with a $4\n";

   }


   elsif ($2 eq $room{$1}) {

      print "Well, $1 is in the $2.\n";

   }


   else {

      print "No, $1 is not in the $2.\n";

   }


} #end Test 3


# If all tests fail

else {

   print "I don\'t understand that question. Ask a different question.\n";

}


chop( $question = <> );                      #Get another question from the standard input

$question = lc $question;


} #End main loop



Figure 4.8 The Clue data file


mustard     candlestick     study

plum        lead_pipe       lounge

peacock     knife           dining_room

scarlet     rope            billiard_room

green       wrench          conservatory

white       revolver        ballroom


            The program responds to such simple questions as ‘Where is X?’, ‘What does X have?’, and ‘Is X in room Y with weapon Z?’ There are many more elements that we can add to this game, but for the moment you should experiment with this simple version to make sure that you understand the purpose of each statement.
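
One way to experiment without the data file is to fill the two hashes from a list of strings. The sketch below assumes a hypothetical array @lines that holds two records copied from Figure 4.8, with the tabs written as \t; the variable $answer is likewise introduced only for illustration.

```perl
# Two records copied from Figure 4.8, with explicit tab characters (\t).
@lines = ("mustard\tcandlestick\tstudy", "white\trevolver\tballroom");

%weapon = (); %room = ();

foreach $input (@lines) {
   ($name, $weap, $rm) = split /\t/, $input;     # cut the record at each tab
   $weapon{$name} = $weap; $room{$name} = $rm;   # file both facts under the character's name
} #end foreach

# Is Colonel Mustard holding the revolver?
if ( 'revolver' eq $weapon{mustard} ) {
   $answer = "Yes, mustard has the revolver.";
}
else {
   $answer = "No, mustard has the $weapon{mustard}.";
}

print "$answer\n";
```

With these two records the sketch prints "No, mustard has the candlestick." Because both hashes use the character's name as the key, one name is enough to look up either the weapon or the room.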



Summary of Perl statements


Code letters

\n        new line

\t         tab

\w+     word


Variable types

$scalar            a scalar variable, e.g., $scalar = "my string";

@array            an array variable, e.g., @array = ("one", "two", "three");

%hash             a hash variable, e.g., %hash = ("one" => 5, "two" => 3, "three" => 7);


$array[$index] = "element"

$hash{$key} = $value


Comparing strings

 

$x == $y                     check to see if the numerical value of x is equal to the value of y

$x <= $y                     check to see if the numerical value of x is less than or equal to y

$one eq $two              check to see if the string value of $one is identical to $two

(x|y)                            match x or y and bind value to implicit variable $1

x and y                                    the logical and; both x and y are true


($first, $second, $third) = split /\t/, $input; split the information in the variable $input by tab stops and store each part in the variables $first, $second, $third respectively


Pattern matching

$line =~ m/x/              search for x in $line

$line =~ m/(x)/           search for x in $line and bind it to the implicit variable $1

$line =~ s/x/y/            substitute y for x in $line


Looping constructs

for ( condition) { do something }

foreach ( item in array) { do something }

until ( condition ) { do something }

while ( condition ) { do something }


Operators


+ Addition operator (x + y produces the sum of x and y)

. Concatenation operator (x . y produces the string xy)

= Assignment operator (x = y assigns the value of y to x)

=~ Binding operator (x =~ y applies the pattern match or substitution y to the string x)

# Comment operator (anything following # is a comment)


Logical Operators

foreach (x)      Checks for each instance of x

while (x)         Checks while x holds


<> default input device (i.e., the keyboard)


Commands

print "x";         Display x on the computer screen


Functions

lc(x)                Convert all upper case characters in x to lower case characters

chop(x)           Delete the last character from x

chomp(x)        Delete a trailing newline from x, if there is one

open x, "< y";  Open the file y for input through the filehandle x

close x            Close the file x


Pattern Matching

x =~ m/y/        Test if y matches a subpart of x

x =~ s/y/z/       Replace an instance of y in x with z


Modifiers

m                    Match

s                      Substitute

g                      Global, i.e., all instances

[]                     Enclose options, e.g., [xyz] allows a choice between x or y or z
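

The difference between the functions chop and chomp listed above can be seen in a short sketch. The variable names are hypothetical. The sketch applies each function to a string that ends in a newline and to one that does not.

```perl
$with_newline    = "thanks\n";
$without_newline = "thanks";

chomp( $a1 = $with_newline );       # removes the trailing newline, leaving: thanks
chomp( $a2 = $without_newline );    # finds no newline to remove, leaving: thanks
chop(  $b1 = $with_newline );       # removes the last character (the newline): thanks
chop(  $b2 = $without_newline );    # removes the last character (the s): thank

print "$a1 $a2 $b1 $b2\n";
```

The sketch prints "thanks thanks thanks thank". The programs in this chapter use chop on keyboard input, which works because each line the user types ends in a newline.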