Automating Knowledge Acquisition For
Aerial Image Interpretation
David M. McKeown, Jr., Wilson A. Harvey
Invited Paper Presented At The DARPA Image Understanding Workshop Los Angeles, February 23-25, 1987.
Copyright © 1987 David M. McKeown, Jr. and Wilson A. Harvey
This research was primarily sponsored by the Defense Mapping Agency, under Contract DMA 800-85-C-0009, and partially supported by the Defense Advanced Research Projects Agency (DOD), ARPA Order No. 3597, monitored by the Air Force Avionics Laboratory under Contract F33615-81-K-1539. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Defense Mapping Agency, of the Defense Advanced Research Projects Agency, or the US Government.
Table of Contents
1. Introduction
1.1. Knowledge Acquisition For Vision
1.2. Layout of the Remainder of Paper
2. The SPAM Architecture
2.1. Knowledge Acquisition In SPAM
2.2. Schematization of SPAM
3. Tools For Knowledge Acquisition In SPAM
3.1. The ISCAN User Interface
3.2. The RULEGEN Compiler
3.3. SPATS: Automating Performance Analysis
4. A Schema-Based Knowledge Representation
4.1. A House Fragment Rule
4.2. A House-Road Consistency Rule
4.3. A House Functional-Area Definition Rule
4.4. A Suburban-Scene Model Rule
5. A New Task Domain For SPAM
5.1.
List of Figures
Figure 1-1: Overview of Knowledge Acquisition For SPAM
Figure 1-2: Types Of Knowledge Utilized In SPAM
Figure 2-1: Interpretation Phases In SPAM
Figure 2-2: Interpretation Phases In SPAM
Figure 2-3: Refinement, Consistency, and Prediction in SPAM
Figure 3-1: Knowledge Acquisition For SPAM
Figure 3-2: Ground Truth Segmentations For Dulles and Andrews AFB
Figure 3-3: Flight Information Charts With Airport Layouts For Dulles and Andrews AFB
Figure 5-1: Suburban House Scene Imagery
Figure 5-2: A Hand segmentation of a suburban scene
Figure 5-3: A Machine segmentation of a suburban scene
Figure 5-4: A House functional-area result from hand segmentation
Figure 5-5: A Road functional-area result from machine segmentation
Figure 5-6: Rules generated by Interpretation Phase
Figure 5-7: Class, Subclass, and Functional Area Definitions For Both Tasks
Figure 1: Example SPATS output for region-to-fragment phase.
Figure 2: Example SPATS output for functional-area phase.
Abstract
The interpretation of aerial photographs requires extensive knowledge about the scene under consideration. Knowledge about the type of scene (airport, suburban housing development, urban city) aids low-level and intermediate-level image analysis, and drives high-level interpretation by constraining the search for plausible, consistent scene models. Collecting and representing large knowledge bases requires specialized tools. In this paper we describe the organization of a set of tools for interactive knowledge acquisition of scene primitives and spatial constraints for interpretation of aerial imagery.
These tools include a user interface for interactive knowledge acquisition, the automated compilation of that knowledge from a schema-based representation into productions that are directly executable by our interpretation system, and a performance analysis tool that generates a critique of the final interpretation. Finally, the generality of these tools is demonstrated by the generation of rules for a new task, suburban house scenes, and the analysis of a set of imagery by our interpretation system.
1. Introduction
In this paper we describe a collection of software tools, ISCAN/RULEGEN/SPATS, for interactive acquisition of spatial knowledge, automated compilation of this knowledge into a rule-based scene interpretation system, and the production of performance analysis statistics to aid in incremental refinement of spatial knowledge. This work is focused on knowledge acquisition and performance analysis tools for SPAM, a knowledge-based system designed to interpret aerial photographs for mapping and photo interpretation. We have reported on SPAM research results in the context of airport scenes.
We address a broad set of topics within the overall framework of knowledge acquisition.
First and foremost, we are interested in automating the process by which an interpretation system, such as SPAM, can collect and represent new knowledge to improve performance on existing interpretation tasks, or to become proficient in new ones. For the airport task we relied primarily on spatial constraints found in books on airport design [3, 4, 5] and, to a lesser extent, on observations of relationships found in aerial imagery. Other task domains, such as suburban house scenes, do not appear to have codified spatial organizations, although they exhibit similar patterns across many examples. In lieu of such information, the ability to indicate and measure spatial relationships in representative imagery becomes more important. ISCAN is our first attempt to provide a graphical user interface, appropriate in an image-based domain, which has a model of the types of knowledge required by SPAM during the interpretation process. Such an interface may also provide individuals such as cartographers, remote sensing and photo interpreters, and other non-programmers with a mechanism for adding knowledge to SPAM without a detailed understanding of the underlying system.
Finally, SPATS was motivated by a need to automate the evaluation of the interpretations produced by SPAM within the context of idealized human photo interpretation. The goal was to measure the size of the interpretation space explored by SPAM, the number of competing hypotheses, and the correctness of those hypotheses during each interpretation phase. By varying the image segmentations presented to SPAM or by generating SPAM systems with different types of spatial knowledge we can now more rigorously evaluate and explore knowledge effects using SPATS. Figure 1-1 is an abstract overview of the relationship between these tools. While this particular focus on acquisition, compilation, and performance evaluation might appear to be somewhat parochial, we believe that these issues will be seen to be central to other researchers in computer vision working along similar lines.
1.1. Knowledge Acquisition For Vision
Previous efforts to investigate knowledge acquisition within the context of systems for image interpretation have primarily focused on spectral properties of objects in the image or on viewpoint-specific spatial relationships. Early work by Barrow and Popplestone addressed the problem of describing relations between picture elements with predicates like ADJACENT(x,y) or ABOVE(x,y). Using this methodology, "rules" could be formulated from these predicates and attached to individual elements of a picture. For example, in the context of face recognition, a nose would be defined by the rule: "ABOVE(x,mouth) and LEFT-OF(x,right-eye) and RIGHT-OF(x,left-eye)". These rules were to be embedded into a resolution theorem proving paradigm. This work was a basis for the ISIS system, which added the use of an interactive segmentation system. It allowed a user to interactively specify representative regions with a particular interpretation, and then invoked an intensity classification segmentation process to attempt to extract the remaining parts of the scene.
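The predicate-style rule quoted above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each picture element is reduced to a centroid in image coordinates (y grows downward) and that each predicate is a simple coordinate comparison. The `Element` class, the function names, and the sample coordinates are hypothetical, and the sketch does not attempt the resolution theorem proving mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A picture element reduced to its centroid (image coordinates, y grows downward)."""
    name: str
    x: float
    y: float


def above(a: Element, b: Element) -> bool:
    return a.y < b.y


def left_of(a: Element, b: Element) -> bool:
    return a.x < b.x


def right_of(a: Element, b: Element) -> bool:
    return a.x > b.x


def is_nose(x: Element, mouth: Element, left_eye: Element, right_eye: Element) -> bool:
    # Rule from the text: ABOVE(x,mouth) and LEFT-OF(x,right-eye) and RIGHT-OF(x,left-eye)
    return above(x, mouth) and left_of(x, right_eye) and right_of(x, left_eye)


# Hypothetical picture elements; "left" and "right" are image-left and image-right.
left_eye = Element("left-eye", 30, 40)
right_eye = Element("right-eye", 70, 40)
mouth = Element("mouth", 50, 90)
candidates = [
    Element("blob-a", 50, 65),   # between the eyes, above the mouth
    Element("blob-b", 50, 110),  # below the mouth
    Element("blob-c", 85, 60),   # right of the right eye
]

noses = [c.name for c in candidates if is_nose(c, mouth, left_eye, right_eye)]
print(noses)
```

Because every predicate compares image coordinates directly, the rule only holds for an upright, frontal view, which is one way to see why such relationships are viewpoint specific.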
Recently, the VISIONS system has reported similar attempts to make interpretations by propagating low-level process output, such as lines or regions, up to an intermediate level, which combines the low-level output with computed attributes such as color, texture, or orientation. Interpreted objects are defined in terms of these intermediate elements. Loosely speaking, these classification systems use "knowledge" such as: the sky has a pixel intensity greater than 50 but less than 125 in the blue band. In fact, one must resort to density weighting functions much as in statistical pattern recognition for remote sensing. This "knowledge" is highly sensor and scene dependent. Other measures such as height, size (in pixels), and relative spatial position (e.g., sky is above the house and grass is below the house) are also employed. Again, these viewpoint-dependent quantities will vary, not only from domain to domain, but from image to image. Ultimately, "sky is blue" and "grass is green" allow for a direct mapping between regions and the associated high-level interpretation. However, this mapping represents a rather shallow use of knowledge whose robustness is questionable. For example, consider the effect of averaging the RGB components of a color image into a monochromatic image. While the scene geometry remains unchanged, without the direct mapping of region spectral properties into a semantic interpretation (sky is blue) it is difficult to see how to operationalize much of the spatial knowledge. Thus, although there appears to be a spatial component, it is predicated on a strong mapping between color and interpretation.
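The fragility of such spectral "knowledge" can be seen in a small sketch. The thresholds follow the blue-band example in the text (with the lower bound read as 50); the pixel values, function names, and the plain averaging used for the monochrome conversion are assumptions chosen only to illustrate the argument.

```python
def looks_like_sky(rgb, lo=50, hi=125):
    """Spectral rule on the blue band only: lo < blue < hi (thresholds from the text's example)."""
    _, _, blue = rgb
    return lo < blue < hi


def to_monochrome(rgb):
    """Average the R, G, and B components into a single intensity."""
    return sum(rgb) / 3.0


# Hypothetical mean region colors (R, G, B).
sky = (40, 70, 110)
grass = (60, 120, 40)

# In color, the blue-band rule separates the two regions.
print(looks_like_sky(sky), looks_like_sky(grass))

# After averaging, both regions have the same intensity, so no rule on the
# single remaining band can tell them apart.
print(to_monochrome(sky) == to_monochrome(grass))
```

The scene geometry is identical in both versions of the image; only the spectral cue that anchored the interpretation has disappeared.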
In our work with SPAM we have attempted to identify sources of knowledge that did not suffer from these drawbacks, and utilize spatial relationships in such a way that a chain of reasoning exists, generated from the application of many constraints across multiple levels of interpretation. While spectral knowledge can play a role in certain domains we believe that there are many types of spatial knowledge that can be expected to be more effective in driving the knowledge-based interpretation of aerial imagery. In terms of acquisition and utilization, we believe that Figure 1-2 lists five types of knowledge that are available and appear to us to be effective in aerial image interpretation tasks.
Type 1: Knowledge for the determination and definition of appropriate scene domain primitives. This includes knowledge of the image segmentation process, the image analysis tools that can reliably extract these primitives, and the appearance of the primitives in the image.
Type 2: Knowledge of spatial relationships and constraints between the scene domain primitives.
Type 3: Knowledge of model decompositions that determine collections of primitives which form "natural" components of the scene. These components can be characterized as sub-models that accumulate support for local interpretations and provide a context within which global analysis can be performed.
Type 4: Knowledge of methods for combining these components into complete scene interpretations.
Type 5: Knowledge of how to recognize and evaluate conflicts between competing interpretations.
Figure 1-2: Types Of Knowledge Utilized In SPAM
1.2. Layout of the Remainder of Paper
In the following section we briefly describe the architecture of SPAM. We discuss the kinds of knowledge that SPAM utilizes and therefore needs to be acquired for an interpretation task. In Section 3 we describe the ISCAN/RULEGEN/SPATS tools and in Section 4 give an example of the schemata produced by ISCAN and used by RULEGEN to generate a SPAM interpretation system. Finally, in Section 5 we give an example of suburban house scene interpretation by a SPAM system generated using the ISCAN/RULEGEN/SPATS tools. We also compare the structure of the original hand-generated SPAM system with those generated using these knowledge acquisition tools.
2. The SPAM Architecture
SPAM represents four types of interpretation primitives: regions, fragments, functional areas, and models. SPAM performs scene interpretation by transforming image regions into scene fragment interpretations, aggregating these fragments into consistent and compatible collections called functional areas, and selecting sets of functional areas that form models of the scene. Loosely speaking, there are four phases of interpretation. Each of these four phases operationalizes one or more of the five types of domain knowledge. In order to build a SPAM system we must be able to acquire knowledge for each interpretation phase as described in Figure 2-1.
Phase 1: Region-to-fragment. Assigns the image region data a set of fragment interpretations based solely on local properties (2-D shape characteristics, texture, 3-D depth/height, etc.) and knowledge about the classes of objects found in the scene.
Phase 2: Local-consistency-check. Pair-wise tests are performed on the fragment interpretations that utilize spatial knowledge about the scene under consideration. The confidence of those interpretations supporting one another is incremented based on the quality of the test.
Phase 3: Functional-area. Sets of mutually consistent interpretations that share similar functions or are spatial decompositions of the scene are grouped into cliques called functional areas.
Phase 4: Model-generation. Sets of functional areas are grouped together into scene segments. The segments with the largest number of functional areas become distinct scene models. Any conflicts encountered when combining functional areas are resolved by a default strategy, using the accumulated support for each interpretation, or by specific knowledge added by the user.
Figure 2-1: Interpretation Phases In SPAM
As shown in Figure 2-2, each phase is executed in the order given above. SPAM proceeds from a local, low-level set of interpretations to a high-level, more global scene interpretation. There is a set of hard-wired productions for each phase that control the order of rule executions, the forking of processes, and other domain-independent tasks.
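The four phases can be pictured with a toy pipeline. This is only a sketch under stated assumptions, not SPAM's implementation: plain Python functions stand in for productions, and the region properties, thresholds, labels, the distance-based consistency test, the greedy clique grouping, and the per-region conflict resolution are all hypothetical simplifications. In particular, the last function collapses scene segments into a single model and shows only a default strategy based on accumulated support.

```python
from dataclasses import dataclass, field
from itertools import combinations
from math import dist


@dataclass
class Region:
    rid: int
    area: float         # hypothetical local properties
    elongation: float
    centroid: tuple


@dataclass
class Fragment:
    region: Region
    label: str
    confidence: float = 0.0
    supporters: set = field(default_factory=set)  # indices of supporting fragments


# Hypothetical pairwise spatial knowledge: maximum centroid gap per label pair.
MAX_GAP = {frozenset(["house", "road"]): 60.0, frozenset(["house"]): 50.0}


def region_to_fragment(regions):
    """Phase 1: hypothesize fragment labels from local properties only."""
    frags = []
    for r in regions:
        if r.elongation >= 4.0:
            frags.append(Fragment(r, "road"))
        if 80.0 <= r.area <= 400.0 and r.elongation <= 5.0:
            frags.append(Fragment(r, "house"))
    return frags


def local_consistency_check(frags):
    """Phase 2: pairwise tests; mutually supporting hypotheses gain confidence."""
    for (i, a), (j, b) in combinations(enumerate(frags), 2):
        limit = MAX_GAP.get(frozenset([a.label, b.label]))
        if a.region is b.region or limit is None:
            continue
        gap = dist(a.region.centroid, b.region.centroid)
        if gap <= limit:
            quality = 1.0 - gap / limit  # closer pairs score higher
            a.confidence += quality
            b.confidence += quality
            a.supporters.add(j)
            b.supporters.add(i)


def functional_areas(frags):
    """Phase 3: greedily grow cliques of mutually supporting fragments."""
    order = sorted(range(len(frags)), key=lambda i: -frags[i].confidence)
    used, areas = set(), []
    for seed in order:
        if seed in used or not frags[seed].supporters:
            continue
        clique = [seed]
        for cand in order:
            if cand in used or cand in clique:
                continue
            if all(cand in frags[m].supporters for m in clique):
                clique.append(cand)
        used.update(clique)
        areas.append(clique)
    return areas


def model_generation(frags, areas):
    """Phase 4 (default strategy only): per region, keep the best-supported label."""
    best = {}
    for clique in areas:
        for i in clique:
            f = frags[i]
            if f.region.rid not in best or f.confidence > best[f.region.rid].confidence:
                best[f.region.rid] = f
    return {rid: f.label for rid, f in sorted(best.items())}


regions = [
    Region(1, 2000.0, 12.0, (50, 10)),   # long and thin
    Region(2, 150.0, 1.5, (30, 40)),     # small and compact
    Region(3, 160.0, 1.4, (70, 40)),     # small and compact
    Region(4, 300.0, 4.5, (300, 300)),   # ambiguous and isolated
]
frags = region_to_fragment(regions)
local_consistency_check(frags)
areas = functional_areas(frags)
model = model_generation(frags, areas)
print(areas, model)
```

In this toy scene, region 4 receives two competing hypotheses in the first phase but gathers no support from its neighbors, so it never enters a functional area and is absent from the final model.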