Pronoun Resolution
People rely on many sources of information to arrive at the correct interpretation of a pronoun. Jurafsky & Martin (2000) describe several of these sources.
Selectional restrictions
Many verbs restrict the types of objects they can take. A pronoun that occurs with such a verb probably refers to an entity that meets the verb’s selectional restrictions. Consider the following sentences:
I found a bowl on the table.
I filled it with candy.
Here the verb fill selects objects that can be used as containers. We are more likely to use a bowl than a table as a container, so the bowl is the most likely referent for the pronoun it.
The main difficulty in using selectional restrictions for pronoun resolution arises in metaphorical contexts. Consider the following example:
Tom has a new sound card for his computer.
It fills his need for material possessions.
Practically anything can be used to fill abstract needs and desires, so a system has to detect when the discussion has shifted to such an abstract level.
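In the literal case, selectional restrictions can be approximated by checking candidate referents against the semantic types a verb allows. The Python sketch below illustrates the idea for the bowl/table example; the verb lexicon, type labels, and function name are hypothetical illustrations, not drawn from any real resource.

```python
# Hypothetical sketch: filter pronoun candidates by a verb's
# selectional restrictions. Lexicon and type labels are illustrative.

SELECTIONAL_RESTRICTIONS = {
    "fill": {"container"},  # literal sense: the object of "fill" is a container
}

ENTITY_TYPES = {
    "bowl": {"container", "artifact"},
    "table": {"furniture", "artifact"},
}

def filter_candidates(verb, candidates):
    """Keep only candidates whose type satisfies the verb's restrictions."""
    allowed = SELECTIONAL_RESTRICTIONS.get(verb)
    if allowed is None:  # unknown verb: no filtering possible
        return candidates
    return [c for c in candidates if ENTITY_TYPES.get(c, set()) & allowed]

# "I filled it with candy." -- candidates from the previous sentence:
print(filter_candidates("fill", ["bowl", "table"]))  # -> ['bowl']
```

As the metaphorical example shows, a filter like this only works while the discussion stays literal.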
World knowledge
Hearers often rely on their knowledge of the real world to resolve a pronoun. Consider this example:
Tom bought a new sound card for the computer.
It still needs several more upgrades.
Here, we are most likely to interpret the pronoun as referring to the computer rather than to the sound card, since computers are the sort of thing that is constantly in need of upgrades. Computer upgrades are probably not part of the lexical meaning of the word computer; this knowledge comes instead from interacting with computers in the real world.
Recency
All else being equal, we are more likely to tie a pronoun to an object mentioned in the most recent sentence. See the following example:
Sally got a blue case for her computer.
Mabel’s computer has a red case.
I like it better.
In this example, there is a strong preference to interpret the pronoun it as referring to the case on Mabel’s computer rather than the case on Sally’s.
Grammatical Relation
Listeners display the following preference hierarchy for pronoun interpretation:
Subject > Object > Oblique
Consider the following example:
Tom went to the computer store with Ralph.
He found a sound card for his computer.
There is a slight preference for Tom over Ralph as the referent for the pronoun he in the second sentence. This preference hierarchy can be overridden fairly easily, though. Consider the following example:
Tom took Ralph to the computer store.
He found a sound card for his computer.
In this example the action of taking someone somewhere implies that the person who was taken was the one in need of new equipment for the computer. There would not be a good reason to mention Ralph if Tom were the only person interested in sound cards.
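The hierarchy above amounts to a default tie-breaking rule: when nothing else decides, prefer the candidate that filled the higher-ranked grammatical role. A minimal sketch, with role labels and candidate encoding assumed for illustration:

```python
# Rank candidate referents by the Subject > Object > Oblique hierarchy.
# This is only a default preference; semantic factors can override it.

ROLE_RANK = {"subject": 0, "object": 1, "oblique": 2}

def rank_candidates(candidates):
    """Sort (name, role) pairs, most preferred first."""
    return sorted(candidates, key=lambda pair: ROLE_RANK[pair[1]])

# "Tom went to the computer store with Ralph."
ranked = rank_candidates([("Ralph", "oblique"), ("Tom", "subject")])
print(ranked[0][0])  # -> Tom
```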
Repeated Mention
Pronouns are more likely to refer to entities that are mentioned repeatedly in previous sentences. Consider the following example:
Mabel took Sally to the computer store to look at sound cards.
She didn’t like the noise her current card was producing.
The sales clerk showed her a card she could afford.
In this example, listeners prefer to give the pronouns the same referent (= Sally) throughout the entire sequence. English lacks a formal distinction between pronouns that signals when the pronoun referent has changed, so the default interpretation is to maintain the same referent over a sequence of sentences.
Verb Semantics
Some verbs emphasize specific arguments over others, and this can affect pronoun resolution. Consider the following sentences:
Mabel promised to ask Sally. She didn’t understand directions.
Mabel hesitated to ask Sally. She didn’t understand directions.
There is a strong preference for Mabel to serve as the referent of the pronoun in the first sentence while Sally is strongly preferred in the second sentence. The verb hesitate focuses attention on the object of the embedded clause while the verb promise emphasizes the action of its subject. Local effects such as verb semantics can override preferences due to recency or repeated mention.
Algorithms for pronoun resolution
Lappin & Leass (1994) describe one algorithm for resolving pronoun reference. Their algorithm uses a simple weighting scheme that balances the effects of recency and sentence structure; it does not employ semantic factors. The algorithm works by adding a discourse referent for each new entity mentioned in the discourse. It calculates the degree of salience for each entity by summing the weights from a table of salience factors. The salience factors and their weights are shown in the following table.
Table 1. Salience factors and weights in Lappin & Leass (1994)
Salience Factor | Weight
Sentence recency | 100
Subject emphasis | 80
Existential emphasis | 70
Direct object emphasis | 50
Indirect and oblique argument emphasis | 40
Non-adverbial emphasis | 50
Head noun emphasis | 80
Their algorithm cuts the weights in half for each succeeding sentence in the discourse. Entities referred to in the first sentence will be assigned weights corresponding to those listed in Table 1. Entities mentioned in the next sentence will again be assigned the weights listed in Table 1, but will add half of the weights, if any, derived from the first sentence. In this manner, the algorithm balances the effects of grammatical role and recency.
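As a sketch, the weighting scheme can be expressed directly in Python. The factor names mirror Table 1, but the feature encoding and function names are my own illustration, not Lappin & Leass’s implementation.

```python
# Sketch of the Lappin & Leass salience scheme: a mention contributes
# the weights of the factors it satisfies, and every referent's
# accumulated salience is halved at each new sentence.

WEIGHTS = {
    "recency": 100,
    "subject": 80,
    "existential": 70,
    "direct_object": 50,
    "indirect_oblique": 40,
    "non_adverbial": 50,
    "head_noun": 80,
}

def mention_weight(factors):
    """Sum the weights of the salience factors a mention satisfies."""
    return sum(WEIGHTS[f] for f in factors)

def new_sentence(salience):
    """Halve every referent's accumulated salience at a sentence boundary."""
    return {ref: w / 2 for ref, w in salience.items()}

# Mabel as subject of "Mabel took Sally to the computer store.":
print(mention_weight({"recency", "subject", "non_adverbial", "head_noun"}))  # -> 310
```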
We can apply Lappin & Leass’ algorithm to the following example:
Mabel took Sally to the computer store.
She didn’t like the noise in her current card.
The sales clerk showed her a new card.
Applying the algorithm to the first sentence produces the following discourse referents and weights.
Referent | Rec | Subj | Exist | Obj | Indirect | Non-adv | Head N | Total
Mabel | 100 | 80 | | | | 50 | 80 | 310
Sally | 100 | | | 50 | | 50 | 80 | 280
computer store | 100 | | | | 40 | | 80 | 220
There are no pronouns to interpret in the first sentence, so the next step is to halve all of the weights from the first sentence. The second sentence contains two pronouns. The algorithm first eliminates any potential referents that do not agree in number or gender; this eliminates the computer store as a referent.

The pronoun she occurs in the subject position, so it is assigned the weight 100+80+50+80=310. The noun noise occurs in the object position, so it gets the weight 100+50+50+80=280. The noun card occupies an adverbial position, so it receives the weight 100+40+80=220. The possessive pronoun her refers back to the subject of the same sentence, so it does not contribute a new discourse referent or salience weight. These values are added to the halved values from the first sentence. Since Mabel had the highest weight from the previous sentence, the algorithm adds the weight for the pronoun she to the reduced weight for Mabel.
Referent | Referring Phrases | Weight
Mabel | {Mabel, she} | 155+310=465
Sally | {Sally} | 140
computer store | {computer store} | 110
noise | {noise} | 280
card | {card} | 220
The final sentence adds the weight 100+80+50+80=310 for the subject sales clerk, 100+50+50+80=280 for the direct object her, and 100+40+50+80=270 for the indirect object new card. Since Mabel had the highest weight from the previous sentence, the algorithm adds the weight for the pronoun her in the last sentence to the reduced weight for the referent Mabel. The final weights are shown below:
Referent | Referring Phrases | Weight
Mabel | {Mabel, she, her} | 233+280=513
Sally | {Sally} | 70
computer store | {computer store} | 55
noise | {noise} | 140
card | {card} | 110
sales clerk | {sales clerk} | 310
new card | {new card} | 270
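The halving-and-summing updates in these tables can be checked mechanically. The sketch below replays the three sentences with the same numbers; rounding the halved weight up at the third sentence (465/2 -> 233) is inferred from the 233 shown in the final table.

```python
import math

# Sentence 1: "Mabel took Sally to the computer store."
salience = {"Mabel": 310, "Sally": 280, "computer store": 220}

# Sentence 2: halve, then add the new mention weights; the pronoun
# "she" (310) is credited to Mabel, the current leader.
salience = {r: w / 2 for r, w in salience.items()}
salience["Mabel"] += 310          # she -> Mabel
salience["noise"] = 280
salience["card"] = 220
assert salience["Mabel"] == 465.0

# Sentence 3: halve again, rounding up (465/2 -> 233); "her" (280)
# is credited to Mabel, still the leader.
salience = {r: math.ceil(w / 2) for r, w in salience.items()}
salience["Mabel"] += 280          # her -> Mabel
salience["sales clerk"] = 310
salience["new card"] = 270

print(salience["Mabel"])  # -> 513
```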
There are several points worth noting about Lappin & Leass’ algorithm. It uses syntactic constraints to resolve pronoun interpretation within a sentence. It relies on number and gender features to prune potential referents across sentences. It captures the recency effect by halving the contributions from previous sentences. Finally, it builds in the effect of the different grammatical roles by assigning different weights to those positions. While the algorithm did not produce a successful interpretation in this example, Lappin & Leass report 86% accuracy when it was applied to a corpus of computer training manuals. The weights may need to be adjusted for other genres or languages.
References
Jurafsky, D. & Martin, J. H. 2000. Speech and Language Processing. Upper Saddle River, NJ: Prentice Hall.
Lappin, S. & Leass, H. 1994. An algorithm for pronominal anaphora resolution. Computational Linguistics 20(4): 535-561.