Metaphor in Diagrams
Alan Frank Blackwell
Darwin College, Cambridge
Dissertation submitted for the degree of Doctor of Philosophy
University of Cambridge
Modern computer systems routinely present information to the user as a combination of text
and diagrammatic images, described as “graphical user interfaces”. Practitioners and
researchers in Human-Computer Interaction (HCI) generally believe that the value of these
diagrammatic representations is derived from metaphorical reasoning; they communicate
information by depicting a physical situation from which the abstractions can be inferred.
This assumption has been prevalent in HCI research for over 20 years, but has seldom been tested experimentally. This thesis analyses the reasons why diagrams are believed to assist with abstract reasoning. It then presents the results of a series of experiments testing the contribution of metaphor to comprehension, problem solving, explanation and memory tasks carried out using a range of different diagrams.
The results indicate that explicit metaphors provide surprisingly little benefit for cognitive tasks using diagrams as an external representation. The benefits are certainly small compared to the effects of general expertise in performing computational tasks. Furthermore, the benefit of metaphor in diagram use is largely restricted to mnemonic assistance. This mnemonic effect appears to be greatest when the user of the diagram constructs his or her own metaphor, rather than being presented with a systematic metaphor of the type recommended for use in HCI.
Acknowledgements
This work was supported by a Collaborative Studentship, awarded by the Medical Research Council and Hitachi Europe Limited. I am grateful to the staff of the Advanced Software Centre of Hitachi Europe for their support; to Martin Bennett, who initiated the project, and especially Dr. Chas Church, who has provided generous support and encouragement.
I have enjoyed friendly and stimulating surroundings for this project. The staff and students of the MRC Applied Psychology Unit (now the Cognition and Brain Sciences Unit) were welcoming and tolerant of a stranger in their midst, and I have been fortunate to inherit the distinguished legacy of research in Applied Cognitive Psychology and Human-Computer Interaction previously carried out at the APU.
The students and fellows of Darwin College have broadened my horizons, developed my confidence, and demonstrated the inestimable value of small, multi-disciplinary academic communities.
Thomas Green’s work of over 20 years was the inspiration for this project; I am very fortunate that he accepted me as a student. That he has also been patient with my errors and encouraging of my ambitions was far more than I expected. I am tremendously grateful for the hours that Thomas has given me, especially after his departure from the Applied Psychology Unit and from Cambridge.
I would never have aspired to academic research, and could certainly never have contemplated this project, without the encouragement and enthusiasm of my wife, Helen Arnold. Fifteen years of marriage already deserves more than a declaration of love and gratitude – after the last three years of study, I look forward to Helen’s acceptance of repayment in kind.
Table of Contents
SURVEY 1: METACOGNITIVE STATEMENTS IN THE COMPUTER SCIENCE LITERATURE
SURVEY 2: PROFESSIONAL USERS OF CONVENTIONAL PROGRAMMING LANGUAGES
SURVEY 3: USERS OF A VISUAL PROGRAMMING LANGUAGE
For 20 years, new computer software has presented information graphically as well as in textual form. The usual justification for this practice has been that the graphical form is easier to learn, understand and apply because it allows metaphorical reasoning. Consider these forthright statements from introductory textbooks on software user interface design, all published within the last two years:
“Designers of systems should, where possible, use metaphors that the user will be familiar with.” (Faulkner 1998, p. 89).
“Metaphors are the tools we use to link highly technical, complex software with the user’s everyday world.” (Weinschenk, Jamar & Yeo 1997, p. 60).
“Select a metaphor or analogy for the defined objects … real-world metaphors are most often the best choice.” (Galitz 1997, p. 84).
“Real world metaphors allow users to transfer knowledge about how things should look and work.” (Mandel 1997, p. 69).
“Metaphors make it easy to learn about unfamiliar objects.” (Hill 1995, p. 22).
“Metaphors help users think about the screen objects much as they would think about real world objects.” (Hackos & Redish 1998, p. 355).
“Very few will debate the value of a good metaphor for increasing the initial familiarity between user and computer application.” (Dix et al. 1998, p. 149).
The goal of this dissertation is to investigate the psychological evidence for these claims. This investigation is perhaps overdue. Not only are computer science students advised to use metaphor as the basis for their designs, but software companies routinely base their research efforts on this assumption (Blackwell 1996d), and the most influential personal computer
companies insist on the importance of metaphor in making computers available to everyone:
The conclusion of the research described in this dissertation will be that the case for the importance of metaphor is greatly over-stated. This should not be interpreted as a deprecation of graphical user interfaces. Graphical user interfaces provide many advantages – the problem is simply that those advantages are misattributed as arising from the application of metaphor. A more prosaic explanation of their success can be made in terms of the benefits of “direct manipulation”, which indicates potential actions via the spatial constraints of a two-dimensional image. The concept of direct manipulation has been thoroughly described and analysed (Shneiderman 1983, Lewis 1991). It will not be discussed in any detail here, but the implication of the current investigation is that, if the expected benefits of metaphor have been exaggerated, these low-level virtues and by-products of direct manipulation are even more important than is usually acknowledged.
Overview of the Thesis
Chapter 2 considers previous work in HCI, but it also reviews theories that have been proposed to describe diagrammatic graphical representations and to describe metaphor. It then considers the manner in which diagrams and metaphors can be used as cognitive tools, before returning to the question of HCI.
Chapter 3 presents the results of three contrasting surveys, investigating how computer scientists and professional programmers regard their use of visual programming languages.
Researchers developing these languages are greatly influenced by cognitive theories, including some theories of metaphor, but professional users appear to have little awareness of the potential cognitive implications of diagrammatic representations, instead emphasising more pragmatic benefits.
Chapter 4 describes two experiments which manipulated the degree of metaphor in diagrams.
The metaphor was used to teach elements of a visual programming language, then of more general diagrams, to people who had never programmed computers. Their performance was compared to that of experienced computer programmers, in order to judge the effect of the metaphor on learning. The use of metaphors provided little benefit relative to that of experience.
Chapter 5 investigates which properties of visual representations assist the formation of complex abstract concepts in visuo-spatial working memory. The value of mental imagery as a design strategy for abstract problems is an underlying assumption of much of the literature on visual metaphor. Four experiments were conducted to measure productivity when the appearance of the visual representation was manipulated. Metaphorical content appeared to have little influence, and there was also little consistent evidence for significant benefits from mental imagery use.
Chapter 6 returns to the type of explanatory diagram introduced in chapter 4, and presents the results of three further experiments which manipulated both the metaphorical and visual content of the notations. Diagrams were described with and without instructional metaphors, and both memory and problem solving performance were measured. Metaphor had little effect on problem solving, and memory was improved far more by pictorial content in the diagram than by explicit metaphorical instructions.
Chapter 7 concludes that the main potential advantage arising from metaphor in diagrams is a mnemonic one, rather than support for abstract problem solving or design with mental images. Furthermore, the mnemonic advantage is greater if diagram users construct their own metaphors from representational pictures, rather than receiving metaphorical explanations of abstract symbols. This finding has considerable importance for the future study of diagram use and human-computer interaction.
Chapter 2: Diagram and Metaphor as Tools
This chapter reviews previous research that has investigated the application of both diagram and metaphor as cognitive tools. Much research into the use of diagrams has not considered the possibility that metaphor might be involved. Likewise, much research into metaphor has explored metaphor in language rather than in diagrams. The chapter is divided accordingly.
After brief definitions of diagrams and of metaphor as subjects of psychological research, the bulk of the review considers how each can be studied as tools.
The section that discusses diagrams as tools considers general theories of external representation use in problem solving, then addresses two specific cases that have been studied in greater detail: graphs and visual programming languages. The section that discusses metaphor as a tool concentrates on the previous research in human-computer interaction that has motivated this study, as described in the introduction to chapter 1. It is this research that suggests a possible relationship between theories of metaphor and of diagram use, despite the fact that there is relatively little empirical evidence to support some of the main theories.
Although this project originated in the study of graphical user interfaces, the methods and conclusions are applicable to a broader class of cognitive artefact (Norman 1991, Payne 1992) – diagrams. Diagrams are familiarly associated with instruction manuals (Gombrich 1990), electronics (Newsham 1995, Petre & Green 1990), software design (Martin & McClure 1985), architecture (Porter 1979), geometry (Lindsay 1989, Netz in press), general mathematics education (Pimm 1995, Kaput 1995) and symbolic logic (Shin 1991, Sowa
1993) as well as informal problem-solving (Katona 1940). Insights from these various fields are slowly being integrated in the interdisciplinary study of Thinking with Diagrams (Glasgow, Narayanan & Chandrasekaran 1995, Blackwell Ed., 1997), with conclusions that are more widely applicable to other notations, including such examples as music notation (Bent 1980), board games (Ellington, Addinall & Percival 1982) or proposals for a pictographic Esperanto (Shalit & Boonzaier 1990).
Figure 2.1: Continuum of representational conventions in cognitive artefacts
Within this huge range of applicability, the common nature of diagrams is most appropriately defined by contradistinction. Diagrams form the middle part of a continuum between two other classes of cognitive artefact: text and pictures (see Figure 2.1). If we regard all three as markings (Ittelson 1996) on some surface (setting aside the tasks to which they might be applied), diagrams can be distinguished from text by the fact that some aspects of a diagram are not arbitrary but are homomorphic to the information they convey. They can be distinguished from pictures by the fact that some aspects must be interpreted by convention, and cannot be deduced from structural correspondences.
A simple distinction underestimates the complexity of text and pictures, however. The cognitive processing of text is closely related to auditory verbal comprehension, and therefore inherits homomorphic features of speech: onomatopoeia, for example (Werner & Kaplan 1963), as well as typographic conventions and the conjectured systematic origin of all abstract verbal concepts in spatial experience (Jackendoff 1983, Johnson 1987, Lakoff 1987). The construction and interpretation of pictures also relies on some arbitrary depictive conventions (Willats 1990), even though those conventions may simply reflect basic perceptual abilities (Kennedy 1975) and have been supplemented by the mechanical determinism of photography (Ivins 1953). For the purposes of the current argument, text and pictures can be regarded as ideals – extremes that are never observed in actual communication via markings.
Instead, all texts are to some extent diagrammatic, and all pictures are to some extent diagrammatic. Even a photograph, despite the implied objectivity of mechanical reproduction, conveys information diagrammatically through its composition, its context on a surface and other factors (Stroebel, Todd & Zakia 1980).
As diagrams share aspects of both text and pictures, they can be analysed using techniques and theories from either extreme of the continuum. Firstly, diagrams can be regarded as two-dimensional graphical languages, composed from a lexicon of geometric elements. The relationship between these elements can be described in terms of a syntax incorporating various subsets of proximity, ordering, spatial enclosure and topological connection.
Interpretation of a diagram is therefore a process of deriving semantic intention from the syntactic relationships that have been created between the lexical elements (Bertin 1981). This view of diagrams suggests that researchers should use the structural analysis of Saussure (Culler 1976), or the semiotic trichotomies of Peirce (1932).
Alternatively, diagrams might be regarded primarily as representations of physical situations.