EPISTEMOLOGICAL PROBLEMS OF
Computer Science Department
Stanford, CA 94305
1 INTRODUCTION

In (McCarthy and Hayes 1969), we proposed dividing the artificial intelligence problem into two parts: an epistemological part and a heuristic part.
This lecture further explains this division, explains some of the epistemological problems, and presents some new results and approaches.
The epistemological part of AI studies what kinds of facts about the world are available to an observer with given opportunities to observe, how these facts can be represented in the memory of a computer, and what rules permit legitimate conclusions to be drawn from these facts. It leaves aside the heuristic problems of how to search spaces of possibilities and how to match patterns.
Considering epistemological problems separately has the following advantages:
1. The same problems of what information is available to an observer and what conclusions can be drawn from information arise in connection with a variety of problem solving tasks.
2. A single solution of the epistemological problems can support a wide variety of heuristic approaches to a problem.
3. AI is a very diﬃcult scientiﬁc problem, so there are great advantages in ﬁnding parts of the problem that can be separated out and separately attacked.
4. As the reader will see from the examples in the next section, it is quite diﬃcult to formalize the facts of common knowledge. Existing programs that manipulate facts in some of the domains are conﬁned to special cases and don’t face the diﬃculties that must be overcome to achieve very intelligent behavior.
We have found ﬁrst order logic to provide suitable languages for expressing facts about the world for epistemological research. Recently we have found that introducing concepts as individuals makes possible a ﬁrst order logic expression of facts usually expressed in modal logic but with important advantages over modal logic—and so far no disadvantages.
In AI literature, the term predicate calculus is usually extended to cover the whole of ﬁrst order logic. While predicate calculus includes just formulas built up from variables using predicate symbols, logical connectives, and quantiﬁers, ﬁrst order logic also allows the use of function symbols to form terms and in its semantics interprets the equality symbol as standing for identity. Our ﬁrst order systems further use conditional expressions (nonrecursive) to form terms and λ-expressions with individual variables to form new function symbols. All these extensions are logically inessential, because every formula that includes them can be replaced by a formula of pure predicate calculus whose validity is equivalent to it. The extensions are heuristically nontrivial, because the equivalent predicate calculus may be much longer and is usually much more diﬃcult to understand—for man or machine.
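As a small illustration of why such extensions are logically inessential, the following equivalences sketch how a conditional expression, a function symbol, and a λ-expression can each be eliminated. The symbols P, F, f, g, p, a and b are placeholders introduced here for illustration and do not come from the paper; the second line assumes an added axiom stating that F is total and single-valued, so that F(x, y) holds exactly when y = f(x).

```latex
\begin{align*}
P(\text{if } p \text{ then } a \text{ else } b)
  &\;\equiv\; (p \land P(a)) \lor (\lnot p \land P(b)) \\
P(f(x))
  &\;\equiv\; \exists y\,\bigl(F(x, y) \land P(y)\bigr)
  \qquad \text{given } \forall x\,\exists!\,y\; F(x, y) \\
P\bigl((\lambda x.\, g(x, x))(a)\bigr)
  &\;\equiv\; P(g(a, a))
\end{align*}
```

Even in these tiny cases the right-hand sides are longer than the left-hand sides, which suggests how quickly the pure predicate calculus form can become harder to read.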
The use of first order logic in epistemological research is a separate issue from whether first order sentences are appropriate data structures for representing information within a program. As to the latter, sentences in logic are at one end of a spectrum of representations; they are easy to communicate, have logical consequences and can be logical consequences, and they can be meaningful in a wide context. However, taking action on the basis of information stored as sentences is slow, and sentences are not the most compact representation of information. The opposite extreme is to build the information into hardware; next comes building it into a machine language program, then a language like LISP, then a language like MICROPLANNER, and then perhaps productions. Compiling or hardware building or “automatic programming” or just planning takes information from a more context independent form to a faster but more context dependent form. A clear expression of this is the transition from first order logic to MICROPLANNER, where much information is represented similarly but with a specification of how the information is to be used. A large AI system should represent some information as first order logic sentences, and other information should be compiled. In fact, it will often be necessary to represent the same information in several ways. Thus a ball-player’s habit of keeping his eye on the ball is built into his “program”, but it is also explicitly represented as a sentence so that the advice can be communicated.
Whether ﬁrst order logic makes a good programming language is yet another issue. So far it seems to have the qualities Samuel Johnson ascribed to a woman preaching or a dog walking on its hind legs—one is suﬃciently impressed by seeing it done at all that one doesn’t demand it be done well.
Suppose we have a theory of a certain class of phenomena axiomatized in (say) first order logic. We regard the theory as adequate for describing the epistemological aspects of a goal seeking process involving these phenomena provided the following criterion is satisfied:
Imagine a robot such that its inputs become sentences of the theory stored in the robot’s database, and such that whenever a sentence of the form “I should emit output X now” appears in its database, the robot emits output X. Suppose that new sentences appear in its database only as logical consequences of sentences already in the database. The deduction of these sentences also uses general sentences stored in the database at the beginning constituting the theory being tested. Usually a database of sentences permits many diﬀerent deductions to be made so that a deduction program would have to choose which deduction to make. If there was no program that could achieve the goal by making deductions allowed by the theory no matter how fast the program ran, we would have to say that the theory was epistemologically inadequate. A theory that was epistemologically adequate would be considered heuristically inadequate if no program running at a reasonable speed with any representation of the facts expressed by the data could do the job. We believe that most present AI formalisms are epistemologically inadequate for general intelligence; i.e. they wouldn’t achieve enough goals requiring general intelligence no matter how fast they were allowed to run. This is because the epistemological problems discussed in the following sections haven’t even been attacked yet.
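The thought experiment above can be made concrete with a toy program. The following is a minimal sketch, not the paper's formalism: it assumes sentences are plain strings, the theory is a list of Horn-style rules, and the names `run_robot`, `Rule` and `choose` are hypothetical. The `choose` parameter stands for the heuristic decision about which deduction to make; the rules stand for the epistemological content.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple

# A rule is (premises, conclusion): if every premise is in the database,
# the conclusion may be added. This is an illustrative simplification of
# "logical consequence", not full first order deduction.
Rule = Tuple[FrozenSet[str], str]

EMIT_PREFIX = "I should emit output "
EMIT_SUFFIX = " now"


def run_robot(
    inputs: Iterable[str],
    theory: List[Rule],
    choose: Callable[[List[str]], str] = lambda candidates: candidates[0],
    max_steps: int = 100,
) -> List[str]:
    """Deduce sentences one at a time and return the outputs emitted."""
    database = set(inputs)  # inputs become sentences in the database
    outputs: List[str] = []
    for _ in range(max_steps):
        # All conclusions that are currently deducible and not yet stored.
        candidates = [
            conclusion
            for premises, conclusion in theory
            if premises <= database and conclusion not in database
        ]
        if not candidates:
            break
        new_sentence = choose(candidates)  # the heuristic choice
        database.add(new_sentence)
        # Emit X whenever "I should emit output X now" appears.
        if new_sentence.startswith(EMIT_PREFIX) and new_sentence.endswith(EMIT_SUFFIX):
            outputs.append(new_sentence[len(EMIT_PREFIX):-len(EMIT_SUFFIX)])
    return outputs


theory: List[Rule] = [
    (frozenset({"the door is closed", "I want to be outside"}),
     "the door must be opened"),
    (frozenset({"the door must be opened"}),
     "I should emit output open-door now"),
]
emitted = run_robot({"the door is closed", "I want to be outside"}, theory)
print(emitted)
```

In this sketch, a theory lacking the second rule could never yield the output however `choose` is written or however fast the loop runs, which is the sense of epistemological inadequacy described above.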
The word “epistemology” is used in this paper substantially as many philosophers use it, but the problems considered have a diﬀerent emphasis.
Philosophers emphasize what is potentially knowable with maximal opportunities to observe and compute, whereas AI must take into account what is knowable with available observational and computational facilities. Even so, many of the same formalizations have both philosophical and AI interest.
The subsequent sections of this paper list some epistemological problems, discuss some first order formalizations, introduce concepts as objects and use them to express facts about knowledge, describe a new mode of reasoning called circumscription, and place the AI problem in a philosophical setting.
2 EPISTEMOLOGICAL PROBLEMS

We will discuss what facts a person or robot must take into account in order to achieve a goal by some strategy of action. We will ignore the question of how these facts are represented, e.g., whether they are represented by sentences from which deductions are made or whether they are built into the program. We start with great generality, so there are many difficulties. We obtain successively easier problems by assuming that the difficulties we have recognized don’t occur, until we get to a class of problems we think we can solve.
1. We begin by asking whether solving the problem requires the cooperation of other people or overcoming their opposition. If either is true, there are two subcases. In the ﬁrst subcase, the other people’s desires and goals must be taken into account, and the actions they will take in given circumstances predicted on the hypothesis that they will try to achieve their goals, which may have to be discovered. The problem is even more diﬃcult if bargaining is involved, because then the problems and indeterminacies of game theory are relevant. Even if bargaining is not involved, the robot still must “put himself in the place of the other people with whom he interacts”.
Facts like a person wanting a thing or a person disliking another must be described.
The second subcase makes the assumption that the other people can be regarded as machines with known input-output behavior. This is often a good assumption, e.g., one assumes that a clerk in a store will sell the goods in exchange for their price and that a professor will assign a grade in accordance with the quality of the work done. Neither the goals of the clerk nor those of the professor need be taken into account; either might well regard an attempt to use them to optimize the interaction as an invasion of privacy.
In such circumstances, man usually prefers to be regarded as a machine.
Let us now suppose that either other people are not involved in the problem or that the information available about their actions takes the form of input-output relations and does not involve understanding their goals.
2. The second question is whether the strategy involves the acquisition of knowledge. Even if we can treat other people as machines, we still may have to reason about what they know. Thus an airline clerk knows what airplanes ﬂy from here to there and when, although he will tell you when asked without your having to motivate him. One must also consider information in books and in tables. The latter information is described by other information.
A second distinction concerning knowledge is whether the information obtained can be simply plugged into a program or whether it enters in a more complex way. Thus if the robot must telephone someone, its program can simply dial the number obtained, but it might instead have to ask a question, “How can I get in touch with Mike?”, and reason about how to use the resulting information in conjunction with other information. The general distinction may be whether new sentences are generated or whether values are just assigned to variables.
An example worth considering is that a sophisticated air traveler rarely asks how he will get from the arriving ﬂight to the departing ﬂight at an airport where he must change planes. He is conﬁdent that the information will be available in a form he can understand at the time he will need it.
If the strategy is embodied in a program that branches on an environmental condition or reads a numerical parameter from the environment, we can regard it as obtaining knowledge, but this is obviously an easier case than those we have discussed.
3. A problem is more diﬃcult if it involves concurrent events and actions.
To me this seems to be the most difficult unsolved epistemological problem for AI: how to express rules that give the effects of actions and events when they occur concurrently. We may contrast this with the sequential case treated in (McCarthy and Hayes 1969). In the sequential case we can write

s' = result(e, s)    (1)

where s' is the situation that results when event e occurs in situation s. The effects of e can be described by sentences relating s', e and s. One can attempt a similar formalism giving a partial situation that results from an event in another partial situation, but it is difficult to see how to apply this to cases in which other events may interfere with the occurrence.
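For the sequential case, the relation s' = result(e, s) can be sketched in a few lines. This is an illustrative toy under loud assumptions: a situation is modeled as a frozen set of fluent strings, the only event is a hypothetical `move` with a single precondition, and none of these names or encodings come from (McCarthy and Hayes 1969).

```python
from typing import FrozenSet, Tuple

Situation = FrozenSet[str]          # the fluents that hold in a situation
Event = Tuple[str, str, str, str]   # (action, object, source, destination)


def result(e: Event, s: Situation) -> Situation:
    """Return the situation s' that results when event e occurs in s."""
    action, obj, src, dst = e
    if action == "move" and f"on({obj},{src})" in s:
        # Effect of the event: obj leaves src and comes to rest on dst.
        return (s - {f"on({obj},{src})"}) | {f"on({obj},{dst})"}
    # Assumed convention: an event whose precondition fails changes nothing.
    return s


s0: Situation = frozenset({"on(A,Table)", "on(B,Table)"})
s1 = result(("move", "A", "Table", "B"), s0)
s2 = result(("move", "A", "B", "Table"), s1)   # events compose sequentially
print(sorted(s1))
```

Because `result` takes exactly one event and one complete situation, there is no place in this sketch to say what happens when a second event overlaps the first, which is the gap the concurrent case exposes.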
When events are concurrent, it is usually necessary to regard time as continuous. We have events like raining until the reservoir overflows and questions like “Where was his train when we wanted to call him?”
Computer science has recently begun to formalize parallel processes so that it is sometimes possible to prove that a system of parallel processes will meet its speciﬁcations. However, the knowledge available to a robot of the other processes going on in the world will rarely take the form of a Petri net or any of the other formalisms used in engineering or computer science.
In fact, anyone who wishes to prove correct an airline reservation system or an air traﬃc control system must use information about the behavior of the external world that is less speciﬁc than a program. Nevertheless, the formalisms for expressing facts about parallel and indeterminate programs provide a start for axiomatizing concurrent action.
4. A robot must be able to express knowledge about space, and the locations, shapes and layouts of objects in space. Present programs treat only very special cases. Usually locations are discrete—block A may be on block B but the formalisms do not allow anything to be said about where on block B it is, and what shape space is left on block B for placing other blocks or whether block A could be moved to project out a bit in order to place another block. A few are more sophisticated, but the objects must have simple geometric shapes. A formalism capable of representing the geometric information people get from seeing and handling objects has not, to my knowledge, been approached.
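The limitation of discrete locations can be seen in a deliberately crude, one-dimensional sketch. Everything here is an assumption made for illustration (block tops as intervals on a line, the function names `free_gaps` and `fits`); it is not offered as the formalism the text calls for, only as a contrast with a bare fact such as on(A, B).

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (left edge, right edge) along the top of a block


def free_gaps(top_width: float, occupied: List[Interval]) -> List[Interval]:
    """Return the unoccupied intervals on a top surface of the given width."""
    gaps: List[Interval] = []
    cursor = 0.0
    for left, right in sorted(occupied):
        if left > cursor:
            gaps.append((cursor, left))
        cursor = max(cursor, right)
    if cursor < top_width:
        gaps.append((cursor, top_width))
    return gaps


def fits(width: float, gaps: List[Interval]) -> bool:
    """Can a block of this width be placed entirely inside some gap?"""
    return any(right - left >= width for left, right in gaps)


# The discrete fact "on(A,B)" says nothing about where A sits on B.
# Here A is assumed to occupy [2, 5] on B's top surface of width 10.
gaps_on_b = free_gaps(10.0, [(2.0, 5.0)])
print(gaps_on_b, fits(4.0, gaps_on_b), fits(6.0, gaps_on_b))
```

Even this toy needs information (A's position on B) that the discrete representation throws away, and it still cannot express overhang, irregular shapes, or anything in two or three dimensions.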
The diﬃculty in expressing such facts is indicated by the limitations of English in expressing human visual knowledge. We can describe regular geometric shapes precisely in English (fortiﬁed by mathematics), but the information we use for recognizing another person’s face cannot ordinarily be transmitted in words. We can answer many more questions in the presence of a scene than we can from memory.
5. The relation between three dimensional objects and their two dimensional retinal or camera images is mostly untreated. Contrary to some philosophical positions, the three dimensional object is treated by our minds as distinct from its appearances. People blind from birth can still communicate in the same language as sighted people about three dimensional objects. We need a formalism that treats three dimensional objects as instances of patterns and their two dimensional appearances as projections of these patterns modiﬁed by lighting and occlusion.