Understanding how humans process the subtlety of language is crucial to recreating the ability to understand natural language in computers. Dr. George Miller investigates the cognitive processes of resolving the vagueness in human language.
Originally published March 2001 at Impacts Magazine. Published on KurzweilAI.net May 15, 2001.
Most people are unaware how vague and ambiguous human languages really are, so they are disappointed when computers fail to understand linguistic communication. They are surprised when an information retrieval system gives responses that seem unrelated to their search word. They can’t understand why question answering should be so hard for a machine. And they are really upset by the low quality of machine translations. As communication grows increasingly important, the computer’s linguistic limitations become increasingly frustrating. As more and more documents are stored in computers, the machines’ inability to understand the information they hold restricts their usefulness to both business and government. Computers are not to blame for this situation. Language itself is at the root of it.
I am the kind of psychologist who studies the basic cognitive processes of the human mind, the cognitive processes that support sensation and perception, learning and memory, problem solving and reasoning, and especially those characteristically human cognitive processes that support speech and language. As a psychologist, I say that I study the mind, but my private conceit is that I really study the brain, for surely it is the brain that performs all those processes that enable us to develop and maintain our knowledge of the world and of ourselves.
If I accomplish nothing else in this story, I hope I will persuade you that human language is so vague and ambiguous that only a very clever brain could possibly understand it.
The Nature of the Problem
The problem begins with words, which I shall take to be the smallest meaningful units of language. I am going to assume that we already understand how words are recognized as units in the flow of spoken sound. Speech recognition is still a challenging topic for research, but let’s assume that this perceptual part of the process of understanding speech has been solved–that we already have an adequate theory of how individual words are recognized.
The first question is how we assign meaning to the spoken words. To take one example from thousands that are available, consider the noun “triangle.” As philosophers pointed out long ago, the noun “triangle” is hopelessly vague. Without further explanation we don’t know whether the triangle is acute or obtuse, oblique or right-angled, scalene or isosceles or equilateral, and we have no idea what color it is or where it is or how big it is or how it is oriented. So the word “triangle” is referentially vague.
Moreover, the noun “triangle” is ambiguous, in the sense that it can be used to express more than one meaning. The word “triangle” can refer either to a three-sided polygon or to a musical percussion instrument or to a social situation involving three parties. If you were to hear someone say, “It’s a good triangle,” you could not be sure which meaning of “triangle” the speaker had in mind. So the noun “triangle” is semantically ambiguous.
Of course, “triangle” is seldom ambiguous when it occurs as part of an on-going conversation. It has several possible meanings, but the intended meaning is almost always clear from the context in which the word is used. The fact that it has several meanings makes it potentially ambiguous. But there is a difference between multiplicity of meaning and ambiguity. To keep this distinction clear, I am going to use a technical term. I am going to say that “triangle” is polysemantic: poly–meaning “many”–and semantic–meaning “meaning.” A polysemantic word can have many meanings, yet not be ambiguous when used in an appropriate context.
The point, however, is that words, the basic building blocks of meaningful language, are extremely slippery items and must be handled with great care. Indeed, some people think that words can be indefinitely polysemantic–that a word can express different meanings every time it is used. They point to the noun “container.” If someone says he has a container of apples, you would probably understand “container” to mean “basket.” But if he says he has a container of water, you would probably understand “container” to mean “glass” or “bottle.” And if he says he has a container of groceries, you would probably understand “container” to mean “bag” or “box.” According to this argument, every time “container” occurs in a different environment, it expresses a different meaning. Hence, unlimited polysemy.
Things are bad, but they are not that bad. If, like Humpty Dumpty, our words could mean whatever we wanted them to mean, we would not have much luck using words to communicate. The trouble with this argument for unlimited polysemy is that it confuses meaning and reference. The word “container,” like the word “triangle,” is referentially vague–it can be used to refer to any one of a great variety of containers. But its meaning is, roughly, “an object capable of holding material for storage or transport,” and a great variety of objects, from spoons to boxcars, satisfy that definition.
Now, I can understand how people tolerate referential vagueness. It is a matter of common courtesy. A polite communicator gives the audience as much information as is needed, but not all that is available. Language evolved for social collaboration and once collaboration is achieved, language has done its job. Imagine telling someone to come here and then getting into an argument about precisely where “here” is–just there, or maybe an inch closer, or a tiny bit to the left? The adverb “here” is referentially vague, but that doesn’t cause trouble; it would only cause trouble if it were not vague. So I understand vagueness.
What I don’t understand is how we tolerate semantic ambiguity. Yet we seem to thrive on it. As a psychologist, I find it very interesting that most people are not even aware how ambiguous words can be. People are so skilled at resolving potential ambiguities that they don’t realize that they are doing it. The realization hits you, however, when you try to develop a theory of how people do it. People use the context, of course, but precisely what context is and how people use it need to be explained.
How Computers Resolve Ambiguity
There have been many attempts to enable computers to deal with ambiguity and I want to describe them briefly. If nothing else, it will help clarify what the problem is.
One of the benefits that modern computers provide for cognitive scientists is to give us a tool for sharpening and testing our theories. Many behavioral scientists believe that a computational theory is a first step toward a neuro-physiological theory. If we really understood how people cope with semantic ambiguity, we should be able to program a computer to do the same thing. But, so far, our attempts to devise such a theory and explain it to a computer have been only marginally successful.
The problem of ambiguity comes up almost everywhere that computers try to cope with human language. In information retrieval, the computer often retrieves information about alternative meanings of the search terms, meanings that we had no interest in. In machine translation, the different meanings of an English word may be expressed by very different words in the target language, so it is important to determine which meaning of the English word the author intended–and that is what a computer has trouble doing. Over and over, attempts to use computers to process human language have been frustrated by the computer’s limited ability to deal with polysemy.
I will illustrate what a computer faces with a well-known excerpt from Robert Frost’s poem, “Stopping by Woods on a Snowy Evening”:
“The woods are lovely, dark and deep,
But I have promises to keep
And miles to go before I sleep,
And miles to go before I sleep.”
To make my illustration as simple as possible, I will use only the couplet, “But I have promises to keep, and miles to go before I sleep.” Let’s see what a computer might make of these thirteen words.
Imagine that a computer has been given all of the information about the meanings of English words that can be found in a good collegiate dictionary. So the computer will begin by looking up the word “But” and will discover that the dictionary provides 11 different meanings. Next, the computer looks up “I” and finds three meanings. On the assumption that the meaning of word combinations depends on the meanings of the individual words, the computer concludes that the two initial words, “But I,” must have 3 x 11 = 33 possible compound meanings.
Proceeding in this manner, the computer finds that the word “have” can be used to express 16 different meanings, so the number of possible compound meanings of “But I have” is 3 x 11 x 16 = 528. And the 7 meanings of “promise” bring the number of possible meanings to 3,696.
By the time the computer finishes looking up all 13 of the words in this couplet, the product is 3,616,013,016,000 (three trillion six hundred sixteen billion thirteen million and sixteen thousand) possible compound meanings. This works out to a geometric mean of 9.247 meanings per word.
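The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the procedure the author ran. It uses just the four sense counts quoted in the text (11, 3, 16, 7); the counts for the remaining nine words are not given and are not guessed here. The names `sense_counts` and `running_products` are illustrative.

```python
# Sense counts quoted in the text for the first four words of the couplet.
# The counts for the other nine words are not given, so they are left out.
sense_counts = [("But", 11), ("I", 3), ("have", 16), ("promises", 7)]

def running_products(counts):
    """Return the cumulative product of sense counts, word by word."""
    products, total = [], 1
    for _, n in counts:
        total *= n
        products.append(total)
    return products

products = running_products(sense_counts)
for (word, n), total in zip(sense_counts, products):
    print(f"{word:<9} {n:>2} senses -> {total:>5} compound meanings")

# Total reported in the text for all 13 words.
TOTAL_MEANINGS = 3_616_013_016_000
WORDS = 13

# The per-word figure is the 13th root of the product (a geometric mean).
per_word = TOTAL_MEANINGS ** (1 / WORDS)
print(f"geometric mean: {per_word:.3f} meanings per word")
```

Because every word multiplies the total, the count grows exponentially with the length of the passage, which is why even a short couplet reaches the trillions.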
Put it this way: Imagine the computer is running a maze and that at each choice point there are 9 alternative ways to continue. In order to run the maze, the computer must make the correct choice every time–it must find the one correct path out of three trillion possibilities. Computers find this maze very difficult, but you and I sail through it without even noticing that there are any alternatives.
Of course, this couplet is short and the words are as plain and familiar as only Robert Frost could make them. And that is part of the trouble–the words are so plain and familiar. It is a perverse feature of human languages that the words used most frequently tend to be the most polysemantic. If we took a passage filled with obscure but unambiguous technical terms, the branching would be far less. But it would still not be zero.
So far I have assumed that the computer has only a dictionary. Let’s give the computer some capacity for syntactic analysis. Let’s assume–which is not unrealistic–that the little words (“but,” “I,” “to,” “and,” “before”–the so-called “closed-class” words) are there primarily as markers of grammatical structure, so a good syntactic analyzer will take care of them. The only thing tricky about the grammar here is that “have to” is a kind of modal auxiliary verb, synonymous with “must”–“have to keep promises” and “have to go miles.” The syntactic analyzer will also tell us that in this passage “promise” is a noun and “keep” is a transitive verb, and so on. Armed with this information, the computer can now make better use of its dictionary.
| have (t) | modal verb | 1 | 1 |
| Geometric Mean = 2.026 senses/word |
When the ambiguity calculation is repeated using only the meanings possible for the given syntactic structure, the product comes down to 9,660 possible meanings. Of course, this only looks like progress because three trillion was so absurd. But the computer still has to find the right meaning among a set of 9,660 possibilities.
The geometric mean per word is now 2.026 for this brief passage. If longer passages also average about two meanings per word, and if we were to guess at random which meaning was intended, we should be right about half the time. Not good enough.
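The “about half the time” estimate can be checked from the two figures in the text, 9,660 remaining meanings over 13 words. This is a rough sketch under one stated assumption: a word with k equally likely senses is guessed correctly with probability 1/k, and the geometric mean stands in for k. The variable names are illustrative.

```python
# Figures reported in the text after syntactic filtering.
REMAINING_MEANINGS = 9_660
WORDS = 13

# Geometric mean of senses per word: the 13th root of the product.
senses_per_word = REMAINING_MEANINGS ** (1 / WORDS)

# Assumption: a word with k equally likely senses is guessed correctly
# with probability 1/k; using the geometric mean for k is a rough estimate.
random_guess_accuracy = 1 / senses_per_word

print(f"senses per word:       {senses_per_word:.3f}")
print(f"random-guess accuracy: {random_guess_accuracy:.2f}")
```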
The problem is even worse in other languages. The polysemy of words in spoken Chinese is far greater than it is in spoken English. Even French is more polysemantic than English.
The truth is that polysemy just doesn’t bother people. While a computer is struggling with its 9,660 alternatives, you and I select the correct interpretation in the twinkle of an eye. And we don’t even realize that we have done something remarkable.
But maybe language isn’t as ambiguous as this example has made it seem. It is true that common words usually have several different meanings, but not all of those meanings are used equally often. Some meanings of polysemantic words are used much more frequently than others are. For example, the word “horse” can refer to an animal, or it can refer to a gymnastic apparatus, or it can refer to a sawhorse, or it can refer to heroin, but if you sample usage in books and newspapers and magazines, you will find that the noun “horse” refers to an animal 100 times as often as it refers to anything else.
So maybe the computer can use statistics to solve this problem. What would happen if the computer always chose the most frequent meaning at every choice point in the maze?
My colleagues and I at Princeton University actually explored this possibility a few years ago. It isn’t easy, because good statistics about the relative frequencies of different meanings of polysemantic words do not exist. But we determined the context-appropriate meaning of every noun, verb, adjective, and adverb in some 104 passages (over 200,000 running words) of the Brown Corpus, which is a collection of 1,000,000 running words said to be representative of American prose writing. That gave us data on the relative frequencies of the different meanings of the most common polysemantic words.
Then we went through this semantically disambiguated text and looked to see how often the context-appropriate meaning was the most frequent one. The results are shown in this graph (Figure 1).
Looking only at the polysemantic words, the most frequent meaning was correct just 56 percent of the time. Of course, many of the nouns, verbs, adjectives, and adverbs in the Brown Corpus are monosemantic (in which case the most frequent meaning is the only meaning). So if we look at all the words together, the most frequent meaning is the correct one just 67 percent of the time.
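The most-frequent-meaning strategy is simple enough to sketch. The code below is an illustration, not the Princeton group’s program. The tagged tokens are made-up toy data, the sense labels are hypothetical, and a word counts as polysemantic here only if more than one sense shows up in the sample (an assumption; the study presumably relied on a dictionary).

```python
from collections import Counter, defaultdict

def baseline_accuracy(tagged):
    """Score the 'always pick the most frequent sense' strategy.

    `tagged` is a list of (word, sense) pairs. Returns a tuple:
    (accuracy on polysemantic words, accuracy on all words).
    """
    counts = defaultdict(Counter)
    for word, sense in tagged:
        counts[word][sense] += 1
    # Most frequent sense of each word in the sample.
    best = {w: c.most_common(1)[0][0] for w, c in counts.items()}
    # Assumption: polysemantic = more than one sense seen in the sample.
    poly = [(w, s) for w, s in tagged if len(counts[w]) > 1]

    def hit_rate(pairs):
        return sum(best[w] == s for w, s in pairs) / len(pairs)

    return hit_rate(poly), hit_rate(tagged)

# Hypothetical toy corpus: two polysemantic words and one monosemantic word.
toy = (
    [("horse", "animal")] * 3 + [("horse", "sawhorse")]
    + [("line", "queue")] * 2 + [("line", "text")]
    + [("percent", "proportion")] * 3
)
poly_acc, all_acc = baseline_accuracy(toy)
print(f"polysemantic words: {poly_acc:.2f}, all words: {all_acc:.2f}")
```

As in the text, the score on all words is higher than the score on polysemantic words alone, because monosemantic words are always counted as correct. If that is how the 67 percent figure arises, then 0.67 = m + 0.56(1 - m) gives m = 0.25, which would mean that about a quarter of the tagged words were monosemantic. That is an inference from the two percentages, not a number the text reports.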
When we give a computer more information, it does a better job. But understanding the wrong meaning for a third of the words is still not good enough. So far we have given the computer information about the words’ possible meanings, about the words’ syntactic role, and about the words’ most frequent usages. What more could we give it?
I have already said that people use context to determine the appropriate meanings of individual words, but so far we have not given the computer any information about context. Context can be linguistic–the other words that occur before and after a polysemantic word–or it can be situational–the situation in which the linguistic interaction is occurring. The linguistic context is the easiest to deal with, so let’s start with that.
One way to explore linguistic context is to collect a large sample of excerpts that contain a particular target word and to classify those excerpts manually according to which meaning of the word was intended. This manually disambiguated collection of contexts can then serve as training material for a computer.
My colleagues Claudia Leacock and Martin Chodorow and I programmed a computer to look for certain features of the context, then exposed it to a large sample of manually disambiguated contexts, and finally tested how well the computer could distinguish among a new set of manually disambiguated contexts. One program looked to see what nouns, verbs, adjectives, and adverbs occurred within plus-or-minus 50 words of the target word; we called that topical context. Another program looked at the exact order of words plus-or-minus two words on either side of the target word; we called that local context. And finally, we combined the output of the two programs in the hope that what one program missed, the other might catch.
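The two kinds of context can be illustrated with a pair of feature extractors. This is a minimal sketch of the idea only; the text does not describe the classifier that was trained on these features, and none is shown here. The function names are hypothetical, and a stopword list stands in for the part-of-speech filter (nouns, verbs, adjectives, adverbs), which is a crude assumption.

```python
def topical_context(tokens, i, window=50, stopwords=frozenset()):
    """Unordered set of content words within +/- `window` of position i.

    Assumption: dropping `stopwords` approximates keeping only nouns,
    verbs, adjectives, and adverbs; a real system would use a tagger.
    """
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    return {
        tok.lower()
        for j, tok in enumerate(tokens[lo:hi], start=lo)
        if j != i and tok.lower() not in stopwords
    }

def local_context(tokens, i, window=2):
    """Words at each exact offset around position i (order preserved).

    Offsets that fall outside the passage map to None.
    """
    feats = {}
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        feats[off] = tokens[j].lower() if 0 <= j < len(tokens) else None
    return feats

tokens = "the waiter will serve the soup to the guests".split()
target = tokens.index("serve")
topical = topical_context(tokens, target, stopwords={"the", "will", "to"})
local = local_context(tokens, target)
print(sorted(topical))
print(local)
```

Topical features ignore word order and capture what the passage is about; local features keep exact positions and capture the grammatical frame around the target word.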
The results for three different target words are shown in the following slides, where the percent correct is plotted as a function of the number of training contexts provided. In all cases, the performance improved as the number of training contexts increased.
First (Figure 2), the program was trained to distinguish four different meanings of the verb “serve.” As you can see, topical context was not very useful for this verb; the best results were obtained with local context. Combining them was only a little better than local context alone. Second (Figure 3), the program was trained to distinguish three different meanings of the adjective “hard.” As in the case of the verb, local context was much more useful than topical context, and combining them was no help.
Finally, the program was trained to distinguish six different meanings of the noun “line.” For this noun, the topical context was more useful than the local, and there was some advantage to combining them (Figure 4).
It is possible, of course–indeed, I think it likely–that we did not choose the correct properties of the context to train on, but in an international competition between programs that try to do this kind of thing [see Senseval-1 at http://www.itri.brighton.ac.uk/events/senseval/], ours was as good as any other. And we are only 85 percent correct at best, and we know how to do that well for only a few of the thousands of polysemantic English words. It’s still not nearly good enough. If you misunderstood the meaning of every seventh important word, you would not find language very useful.
The reality is there does not exist today a large-scale, operational computer system for determining the intended meanings of words in discourse. But solving the polysemy problem is so important that we can be confident that efforts will continue and that future systems will continue to improve. If you were to ask me what more could be done, I would suggest that we still have a lot to learn about contexts in general and linguistic contexts in particular. If I were feeling reckless, I might even suggest that understanding contexts better is critical for the future of processing linguistic messages by computer.
Any Internet user knows that information technology can now provide large amounts of raw information at the touch of a button. Unfortunately, most of it is irrelevant, and searching through it to find what we really wanted requires great patience and peace of mind. My reckless claim would be that in addition to information technology, we need context technology. The future belongs to those who discover how to help users better understand the information that is provided. And the only way I know to do that is to provide contexts to make the information meaningful.
How People Resolve Ambiguity
Enough about computers. Since people recognize intended meanings so easily, maybe computational linguists are missing something. So, what do we know about how people deal with ambiguous words?
Psychologists have learned a little about how people do it. We know, for example, that when a polysemantic word occurs, more than one meaning can be activated initially, but the context-appropriate meaning can be chosen very rapidly, within half a second. We assume that during that half second or so a meaning is chosen that can be integrated into a mental representation of the on-going discourse.
The nature of that representation of the on-going discourse is still uncertain, but it seems to involve more than just the linguistic context. It involves situational context and general knowledge.
Some psychologists believe that the representation of discourse must be “propositional,” with many propositions being filled in inferentially from general knowledge, and all of the propositions related by first order logic. A propositional representation of discourse would, of course, be easiest for a computer to simulate.
However, other psychologists maintain that the representation is “imaginal,” a mental picture that provides many default values from general knowledge but in which many irrelevant details are missing. Probably both propositions and mental images are involved. In any case, we don’t understand the mental representation of discourse well enough to replicate it with computers that are available today.
What we know is that the mental representations that people need in order to understand discourse must be both coherent and plausible.
First, a mental representation must be coherent. That is to say, if you scramble the order of sentences, or take sentences randomly from different sources, the result is not going to be organized around a unifying topic. It will not be coherent. The demand for coherence places many linguistic constraints on discourse. For example, new objects must be introduced with the indefinite article and thereafter referred to with the definite article; pronouns must have some antecedent to refer to; tense, locale and voice must agree, and so on.
And the mental representation must be plausible. If someone says, “Bill won the race from Sam because he had a good coach,” it is not plausible to conclude that “he” and “Sam” are coreferential. If told not to play with those boys because they are too rough, only a child would go looking for a smooth one. And if you see a sign in a farmer’s field saying “The bull may charge,” it is not plausible to think that the bull might charge admission. The demand for plausibility implies that the discourse must conform to general knowledge. That is a strong demand, of course, for general knowledge is boundless. It has been said that there is no fact so small or obscure that it would not disambiguate some polysemantic word.
So we can argue with some confidence that when people encounter a polysemantic word, they quickly select a meaning that can be integrated into a coherent and plausible mental representation of the discourse in which the polysemantic word occurs. The word is truly ambiguous only if two or more meanings satisfy that criterion.
Unfortunately, there is no assurance that the mental representation of the speaker and the mental representation of the listener will coincide. When people misunderstand one another, it is usually because they are working with different mental representations of what is being said, not because they misinterpret polysemantic words. They disagree because they have different ideas about why the speaker uttered the words she did. They disagree about the speaker’s pragmatic intentions. When it comes to estimating a speaker’s pragmatic intentions, we really do approach something resembling unlimited polysemy, and there is no dictionary of sentence meanings to guide us. But the pragmatics of discourse is a much larger topic than we can pursue here. Suffice it to say that there are many issues that affect the speaker’s pragmatic intentions, among them context, personal history, culture and the dynamic of the interaction. It’s a vibrant area of research right now.
The first step is a plausible theory of linguistic context. Knowing the possible meanings of words and the grammatical structure of sentences is necessary, of course, but until we understand how people use context to construct a coherent and plausible mental representation of discourse, we will have no theory of language understanding, neither a computational theory nor a psychological theory.
One thing we can say with some assurance, however, is that people are extremely good at using context to resolve potential ambiguities. I believe that this skill in contextualizing is a general cognitive ability, not specific to language, but is involved in many higher cognitive processes. And I also believe that the best way to investigate our remarkable human ability to contextualize is to study ambiguous words.
Writing a Division and Classification Essay
Of the different styles of formal writing, the division and classification essay requires the sorting and grouping of specific items or ideas into various categories based on the information or characteristics provided. The well-written classification essay should provide a thorough and meaningful message about how these different categories relate to one another.
For this reason, classification essay topics should be clear and concise. The perfect topic for a division and classification essay allows for a discussion on how the whole relates to its parts, or vice versa. For example, if writing an article on parenting styles, these four major divisions and classifications could be explored, alone and as a whole: authoritative, neglectful, permissive and authoritarian.
Purpose of a Division and Classification Essay
The primary goal of the division and classification essay is to break down large subjects into smaller and more specific subtopics for the reader. By creating a more manageable system of classification and division, things can be organised and divided much more quickly and easily, often requiring considerably less thought. These small categories, like subtopics, help the reader understand the individual facets of most classification essay topics more fully.
Structure of a Division and Classification Essay
When writing a classification essay, the structure must be concise and the information must be grouped into specific and organised categories. This is accomplished using a five-paragraph division and classification essay format, consisting of the following parts: an introduction, three or more paragraphs which constitute the body, and a conclusion.
The introduction should immediately capture the reader’s attention, creating interest while developing a basic understanding of the chosen classification essay topic by explaining what the paper is going to be about. In the body of a classification essay, there should be at least three paragraphs, each one dedicated to a single topic within a specific category that relates to the introduction.
Much like a summary, the conclusion reiterates the information that was given in the introduction, and sums up the supporting information provided in the body of the division and classification essay. The reader’s questions should be fully answered by the time he has read the entire conclusion.
Connection Words and Phrases for Classification Essay Topics
Connection words are very much like signal phrases, as both let the reader know when a transition is occurring within the text of a division and classification essay. These very clear and concise signals warn the reader that there will be new information divided and categorised in the next statement. While there are plenty more, some of the most commonly used connection phrases for classification essay topics include the following: classified according to, in this category, this type of, several kinds of, can be divided into, is categorised by.
Additionally, these signal phrases lend both unity and flow to the abstract categories of a classification essay, allowing the writer to reveal and communicate something that is meaningful and has value.
10 Tips for a Successful Division and Classification Essay
Only a small portion of students actually enjoy writing essays. That is simply because writing essays is quite difficult for most people, requiring hours of research and hard work – and that’s before the writing process ever begins. The division and classification essay ranks as one of the most important assignments that most students will ever face.
Critical thinking skills and the ability to write proficiently are necessary requirements for future careers and employment opportunities. Here are some insightful writing tips for composing the division and classification essay, or to assist with other writing assignments.
#1 – Choosing Classification Essay Topics
The savvy essay writer chooses the assigned division and classification topics that are familiar as well as interesting, lending greater detail and understanding to the subtopics or categories.
#2 – Don’t Delay
Once the assignment is given, writers should immediately begin the process of choosing the best classification essay topics, budgeting time, and forming a clear plan of action.
#3 – Make an Outline
Just as one might need a list to stay focused at the grocery store, it is important to write an outline to help with inserting the most vital facts into the essay. When finished, the outline should reflect all of the points that need to be made.
#4 – Write the First Draft
It is not likely that the first draft will contain all of the information needed to hand in for a grade. There will be additional points to include later, and others that will need to be omitted. Multiple drafts may be needed before the essay is ready to be graded by an instructor.
#5 – Understand the Assignment
Writers should be sure to ask any questions about the assignment early on, long before the division and classification essay deadline. It is not wise to wait until the last minute.
#6 – Seek Out Examples
For clarification, there are numerous example classification essays available online for comparison. These are a great way for writers to see what is expected from the assignment.
#7 – Never Plagiarise
While quotes or citations are acceptable, under no circumstances should the writer ever ‘borrow’ someone else’s words. Essays are considered to be intellectual property, owned by the composer.
#8 – Format Correctly
Don’t fall victim to poor formatting, which can result in a considerable loss of grade points. Take the time to go over formatting requirements and refer to online sources if there are unanswered questions.
#9 – Cite Sources
Citing sources is a vital part of any formal essay unless the instructions suggest otherwise, and time spent on this aspect of the essay is necessary for a desirable grade. Incorrectly citing the sources used is nearly as bad as plagiarising someone else’s work.
#10 – Put the Essay Away
Once the final essay has been written with plenty of time to spare before the assignment’s due date, put it away and take some time to clear your mind. After a few days, read and review the essay again. If everything appears to be in order, the essay is ready to be handed in.
It is important to take essays seriously as these assignments weigh heavily in grades. Poorly written essays or late assignments can easily prevent admission into certain schools or academic programs. Approach all essay and other writing assignments as if your future depends upon the grade you will receive.