A CONCEPTUAL, CASE-RELATION REPRESENTATION
OF TEXT FOR INTELLIGENT RETRIEVAL
Judith P. Dick
A Thesis submitted in conformity with the requirements
for the Degree of Doctor of Philosophy in the
University of Toronto
Copyright © 1991 Judith P. Dick
Many thanks to all the members of my committee for persisting to the close of this long project. I am most grateful to everyone involved, especially to co-supervisor Ann Schabas, who sustained her interest in the research through a number of difficulties. Graeme Hirst, also a co-supervisor, opened many doors to things that would otherwise not have been accessible.
Special thanks are due to the Department of Computer Science which has welcomed me warmly and has been generous with computing facilities, tutelage and friendship. Past Chairman, Derek Corneil, has my gratitude for letting me share the limited space. In addition, I must say “thank you” to John Mylopoulos who encouraged me to study AI and gave me a good start.
The funding I received from the Ontario Government and the University of Toronto made further education late in life possible. Support from The Natural Sciences and Engineering Research Council made the difference between good schooling and a fine education.
Thanks go also to Jim Dick for his unfailing support and for sharing his family with me when my own died. The last word is for my father who gave me life and helped me to survive it. He loved me and taught me through the days of his life and the days of his death. I Myr—
A conceptual, case-relation representation of text for information retrieval
Judith P. Dick
This research demonstrates that intelligent retrieval is possible using a conceptual representation. It is an attempt to move from contemporary IR toward retrieval of ideas through text analysis. Intelligent retrieval systems should help the user find information while allowing him or her to concentrate on the problem that occasioned the search. The user must be free to reason through his or her problem with additional, newly retrieved information. Search operations should be a secondary consideration.
In addition, a conceptual representation enables the user to find information about ideas that he or she cannot name but can outline. Such information can be found even when the stored text does not contain relevant nominals.
In order to accomplish intelligent retrieval, a semantic representation of the text had to be made. The strength of our semantic representation results from the use of Harold Somers’s grid of twenty-eight definitive deep cases. The grid is designed to answer the strongest criticisms of case and combines grammatical and semantic roles in each cell. The cases have been developed beyond their original capacity, but the theoretical framework and the grid itself were kept intact.
A knowledge base of contract law cases has been constructed. The principal argument of each case has been analyzed according to Stephen Toulmin’s “good reasons” argument model. John Sowa’s conceptual graphs have been used as a near-FOL notation. In addition to the semantic representations of each argument, the knowledge base contains a lexicon of legal concepts and rules for semantic selection.
The dissertation concludes with a retrieval demonstration using questions derived from cases following those represented in the knowledge base. LOG+, a frame-matching algorithm by Mara Miezitis, is used along with some proposed adaptations. The demonstration focuses on pattern-matching among conceptual definitions using spreading activation. Semantic constraints facilitate inference within a type hierarchy.
A case-law retrieval system would ideally provide the researcher with conceptual access to cases and free him or her to develop arguments. The use of deep cases for the representation of large texts makes conceptual retrieval possible. Employing inference to locate implicit information gives us desirable advantages over contemporary IR system designs.
1.1. Intelligent retrieval
1.2. What the lawyer wants
1.3. The limitations of traditional systems
1.3.1. Problems with keyword representations
    Distinguishing meanings for terms
    Syntactic structure
    High-frequency terms
    Unnamed ideas
    Inflexible Matching
1.3.2. Problems with Boolean logic
1.4. The promise of conceptual retrieval
1.5. Representing meaning
1.5.2. General and domain-specific knowledge
1.5.3. What a conceptual retrieval system does
1.5.4. Problems with conceptual retrieval
1.5.5. Quasi-intelligent IR
1.6. Document retrieval and conceptual retrieval
2. Literature review and technical background
2.1. IR systems
2.1.1. Evaluative research
2.1.2. Statistical analyses and automatic indexing
2.1.3. Vector retrieval
2.1.4. IR and natural language systems
2.2. Retrieval systems for legal information
2.2.1. The special requirements of law
2.2.2. Online retrieval systems
2.2.3. Knowledge-based systems
    Legal reasoning systems for legislative instruments
    Case-based legal reasoning systems
    Conceptual retrieval
2.3. AI and IR
2.3.1. What is a knowledge representation?
    KR for conceptual retrieval
2.3.2. Natural language processing
    Case grammars
    Case-slot organization
2.3.5. IR and AI
3. Contents of the knowledge base
3.2. Which cases?
3.3. The cases
3.3.1. Weeks v. Tybald. (1605) Noy 11; 74 E.R. 982
3.3.2. Stamper v. Temple. (1845) 6 Humph. 113 (Tennessee).
3.3.3. Upton-on-Severn v. Powell. England. Court of Appeal. 1 All E.R. 220
3.3.4. Hadley v. Baxendale. (1854) 9 Exch. 341, 156 E.R. 145
4. Representing knowledge using Sowa’s conceptual structures
4.1. What are Sowa’s conceptual structures?
4.1.1. Basic conceptual graphs
4.1.2. The linear form and its punctuation
4.1.3. Logic notation
4.1.4. Lambda expressions
4.1.5. Quantifiers and scoping
4.1.6. Co-reference links
4.1.7. Set notation
4.1.8. Mass nouns
4.1.9. Combining graphs
4.2. Why use Sowa’s cgs?
4.3. Adapting the notation to use
4.3.2. Temporal predicates and tenses
5. Somers’s case grid
5.2. Why use Somers’s cases?
5.3. Somers’s approach to case grammar
5.3.1. Source-goal directionality
5.3.2. Agent-patient co-referentiality
5.3.3. Agent and experiencer optionality
5.4. Somers’s proposed solution
5.5. The case grid
6. Representing arguments
6.1. Introduction
6.2. Knowledge base structure
6.3. Text analysis—the general approach
6.3.1. Indirect analysis
6.3.2. Direct analysis
6.4. Conceptual graphs
6.5. Somers’s case grid
6.6. Toulmin arguments
6.7. Lexicon of legal concepts (lconcs)
6.8. The representations
6.8.1. Case 1: Weeks v. Tybald
6.8.2. Case 2: Stamper v. Temple
6.8.3. Case 3: Upton-on-Severn Rural District Council v. Powell
6.8.4. Case 4: Hadley v. Baxendale
7. The retrieval mechanism
7.2. Objectives revisited
7.2.1. A realistic model of search behaviour
7.2.2. Retrieving concepts
7.3. An overview of the search process
7.3.1. Using the argument structure
7.3.3. Frame matching
7.3.4. Why use LOG?
7.3.5. Adapting LOG to use in IR
7.3.6. The LOG lexicon
7.3.7. The semantic selection
7.3.8. Matching in LOG
7.4. A detailed view of the search process
7.4.1. Introduction
7.4.2. Frame matching as conceptual retrieval
    The lexicon
    Semantic constraints
    The type hierarchy
    Generalized inference
7.4.3. Examples—the test patterns
    Search 1—legal concept named, followed by free-ranging search
    Search 2—legal concept by definition
    Search 3—legal concept by description
    Search 4—facts and legal concept
7.4.4. Examples—medium complexity
    Search 5—facts to facts
    Search 6—difficult legal concept by description
7.4.5. Examples—from reported cases
    Search 7—Carlill v. Carbolic Smoke Ball Co.
    Search 8—Cory v. Thames Ironworks Co.
    Search 9—Lilley v. Doubleday
    Search 10—Baxendale v. London, et al
8. Conclusions and afterword
8.1. Significance of the research
8.2. Incomplete tasks
8.3. Future research
8.3.1. Somers’s cases
8.3.2. Sowa’s conceptual graphs
8.4. The next step
8.5. Hope for the future?
A. Catalogue of conceptual relations (conrels)
B. Glossary of legal terms
C. Lexicon of legal concepts (lconcs)
D. Rules for semantic selection
1.1. Intelligent retrieval
Information retrieval systems are intended for people’s use. Artificial intelligence (AI) techniques are used in this application to assist people in developing their ideas.
Ideally, an information retrieval system will adapt itself to a user’s changing viewpoint. It ought to be designed to suit not a prototypical user, but an intelligent person whose ideas evolve. An intelligent retrieval system would free its user to explore ideas as he wished, unfettered by rigid system limitations.
Our present capability is a long way from the ideal. However, to be worthwhile, any attempted improvement must be set in a realistic framework. The problem of user modelling continues to perplex information scientists. In this work, it is assumed that the user is an individual with changing ideas and that supporting his cognitive activity takes precedence over improving system efficiency. Everyone constructs conceptual patterns as he accumulates experience. The process of learning while living is paralleled, in retrieval, by learning while searching. In order for the searcher to maximize his potential, the system should permit him to shift his perspective as readily as reality requires him to do so. The need for flexibility is perhaps more obvious in searching for a good legal argument than in other kinds of retrieval. However, it is a need we all experience.
Present-day AI can take us some distance toward the ideal, but not the whole way. Although it has improved search with generalized inference, the difficulty of handling natural language remains a major stumbling block. This dissertation describes work done toward cutting that block down to size. A knowledge representation of contract law cases has been constructed using John Sowa’s conceptual graphs (Sowa 1984) and Harold Somers’s linguistic cases (Somers 1987).
The law case representations have been organized according to a schema based on Stephen Toulmin’s “good reasons” argument model.
Retrieval capability has been demonstrated using a frame matcher to describe how queries can be answered. Realistic questions were derived from the facts of contract cases which followed those represented.
Large volumes of text are characteristic of modern retrieval systems. At present, large-volume applications are beyond the capability of our knowledge base technologies. However, there is no known absolute barrier to large-scale implementations, especially if the language problem is curtailed. The potential power and ﬂexibility of conceptual retrieval are undeniable.
1.2. What the lawyer wants
IR systems are used in many different subject domains. One of the domains that poses both difficult problems and interesting challenges is case law research.
The lawyer wants authority for his point of view. He wants a viable argument that will support his claim—from a binding case if he can get it, from a persuasive one if he cannot. Failing that, he will take any helpful argument he can find. He may even want some configuration of facts and legal concepts which, although it does not constitute an argument in itself, will help him to construct one.
The following description of a lawyer’s search shows the usual cognitive phenomenon.