In: EACL91, pp. 8-14.

Designing Illustrated Texts:

How Language Production Is Influenced by Graphics Generation


Wolfgang Wahlster, Elisabeth André, Winfried Graf, Thomas Rist

Authors' Abstract

Multimodal interfaces combining, e.g., natural language and graphics take advantage of both the individual strength of each communication mode and the fact that several modes can be employed in parallel, e.g., in the text-picture combinations of illustrated documents. It is an important goal of this research not simply to merge the verbalization results of a natural language generator and the visualization results of a knowledge-based graphics generator, but to carefully coordinate graphics and text in such a way that they complement each other. We describe the architecture of the knowledge-based presentation system WIP which guarantees a design process with a large degree of freedom that can be used to tailor the presentation to suit the specific context. In WIP, decisions of the language generator may influence graphics generation and graphical constraints may sometimes force decisions in the language production process. In this paper, we focus on the influence of graphical constraints on text generation. In particular, we describe the generation of cross-modal references, the revision of text due to graphical constraints and the clarification of graphics through text.

Table of Contents

1 Introduction
2 The Architecture of WIP
2.1 The Presentation Planner
2.2 The Layout Manager
2.3 The Text Generator
2.4 The Graphics Generator
3 The Generation of Cross-Modal References
4 The Revision of Text Due to Graphical Constraints
5 The Clarification of Graphics through Text
6 Conclusion
Acknowledgements
References


1 INTRODUCTION

With increases in the amount and sophistication of information that must be communicated to the users of complex technical systems comes a corresponding need to find new ways to present that information flexibly and efficiently. Intelligent presentation systems are important building blocks of the next generation of user interfaces, as they translate from the narrow output channels provided by most of the current application systems into high-bandwidth communications tailored to the individual user. Since in many situations information is only presented efficiently through a particular combination of communication modes, the automatic generation of multimodal presentations is one of the tasks of such presentation systems. The task of the knowledge-based presentation system WIP is the generation of a variety of multimodal documents from an input consisting of a formal description of the communicative intent of a planned presentation. The generation process is controlled by a set of generation parameters such as target audience, presentation objective, resource limitations, and target language.

One of the basic principles underlying the WIP project is that the various constituents of a multimodal presentation should be generated from a common representation. This raises the question of how to divide a given communicative goal into subgoals to be realized by the various mode-specific generators, so that they complement each other. To address this problem, we have to explore computational models of the cognitive decision processes coping with questions such as what should go into text, what should go into graphics, and which kinds of links between the verbal and non-verbal fragments are necessary.

In the project WIP, we try to generate illustrated texts on the fly that are customized for the intended target audience and situation, flexibly presenting information whose content, in contrast to hypermedia systems, cannot be fully anticipated. The current testbed for WIP is the generation of instructions for the use of an espresso-machine. It is a rare instruction manual that does not contain illustrations. WIP's 2D display of 3D graphics of machine parts helps the addressee of the synthesized multimodal presentation to develop a 3D mental model of the object that he can constantly match with his visual perceptions of the real machine in front of him. Fig. 1 shows a typical text-picture sequence which may be used to instruct a user in filling the water container of an espresso-machine.

Fig. 1: Example Instruction

Currently, the technical knowledge to be presented by WIP is encoded in a hybrid knowledge representation language of the KL-ONE family including a terminological and assertional component (see Nebel 90). In addition to this propositional representation, which includes the relevant information about the structure, function, behavior, and use of the espresso-machine, WIP has access to an analogical representation of the geometry of the machine in the form of a wireframe model.

The automatic design of multimodal presentations has only recently received significant attention in artificial intelligence research (cf. the projects SAGE (Roth et al. 89), COMET (Feiner & McKeown 89), FN/ANDD (Marks & Reiter 90) and WIP (Wahlster et al. 89)). The WIP and COMET projects share a strong research interest in the coordination of text and graphics. They differ from systems such as SAGE and FN/ANDD in that they deal with physical objects (espresso-machine, radio vs. charts, diagrams) that the user can access directly. For example, in the WIP project we assume that the user is looking at a real espresso-machine and uses the presentations generated by WIP to understand the operation of the machine. In spite of many similarities, there are major differences between COMET and WIP, e.g., in the systems' architecture. While during one of the final processing steps of COMET the layout component combines text and graphics fragments produced by mode-specific generators, in WIP a layout manager can interact with a presentation planner before text and graphics are generated, so that layout considerations may influence the early stages of the planning process and constrain the mode-specific generators.


2 THE ARCHITECTURE OF WIP

The architecture of the WIP system guarantees a design process with a large degree of freedom that can be used to tailor the presentation to suit the specific context. During the design process, a presentation planner and a layout manager orchestrate the mode-specific generators, and the document history handler (see Fig. 2) provides information about intermediate results of the presentation design that is exploited in order to prevent disconcerting or incoherent output. This means that decisions of the language generator may influence graphics generation and that graphical constraints may sometimes force decisions in the language production process. In this paper, we focus on the influence of graphical constraints on text generation (see Wahlster et al. 91 for a discussion of the inverse influence).

Fig. 2 shows a sketch of WIP's current architecture used for the generation of illustrated documents. Note that WIP includes two parallel processing cascades for the incremental generation of text and graphics. In WIP, the design of a multimodal document is viewed as a non-monotonic process that includes various revisions of preliminary results, massive replanning or plan repairs, and many negotiations between the corresponding design and realization components in order to achieve a fine-grained and optimal division of work between the selected presentation modes.

[Fig. 2: WIP's current architecture]

2.1 THE PRESENTATION PLANNER

The presentation planner is responsible for contents and mode selection. A basic assumption behind the presentation planner is that not only the generation of text, but also the generation of multimodal documents can be considered as a sequence of communicative acts which aim to achieve certain goals (cf. André & Rist 90a). For the synthesis of illustrated texts, we have designed presentation strategies that refer to both text and picture production.

To represent the strategies, we follow the approach proposed by Moore and colleagues (cf. Moore & Paris 89) to operationalize RST-theory (cf. Mann & Thompson 88) for text planning. The strategies are represented by a name, a header, an effect, a set of applicability conditions and a specification of main and subsidiary acts. Whereas the header of a strategy indicates which communicative function the corresponding document part is to fill, its effect refers to an intentional goal. The applicability conditions specify when a strategy may be used and put restrictions on the variables to be instantiated. The main and subsidiary acts form the kernel of the strategies. E.g., the strategy below can be used to enable the identification of an object shown in a picture (for further details see André & Rist 90b). Whereas graphics is to be used to carry out the main act, the mode for the subsidiary acts is open.





Header:

(Provide-Background P A ?x ?px ?pic GRAPHICS)

Effect:

(BMB P A (Identifiable A ?x ?px ?pic))

Applicability Conditions:

(AND (Bel P (Perceptually-Accessible A ?x)) (Bel P (Part-of ?x ?z)))

Main Acts:

(Depict P A (Background ?z) ?pz ?pic)

Subsidiary Acts:

(Achieve P (BMB P A (Identifiable A ?z ?pz ?pic)) ?mode)

For the automatic generation of illustrated documents, the presentation strategies are treated as operators of a planning system. During the planning process, presentation strategies are selected and instantiated according to the presentation task. After the selection of a strategy, the main and subsidiary acts are carried out unless the corresponding presentation goals are already satisfied. Elementary acts, such as Depict or Assert, are performed by the text and graphics generators.
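As a rough illustration of how such a strategy might be treated as a planning operator, the following Python sketch encodes the strategy above as a record and expands it into instantiated acts. The class `Strategy`, the functions `substitute` and `expand`, the tuple encoding of terms, the strategy name, and the object names `lid` and `machine` are all assumptions made for illustration; they are not taken from WIP's implementation. The sketch skips only `Achieve` acts whose goal is already satisfied, which simplifies the behaviour described in the text.

```python
from dataclasses import dataclass

# Terms are nested tuples; strings starting with "?" are variables.
# This encoding is an assumption of the sketch, not WIP's representation.

def substitute(term, bindings):
    """Replace variables in a term by their bound values."""
    if isinstance(term, tuple):
        return tuple(substitute(t, bindings) for t in term)
    return bindings.get(term, term)

@dataclass
class Strategy:
    name: str
    header: tuple
    effect: tuple
    conditions: list       # all must be among the planner's beliefs
    main_acts: list
    subsidiary_acts: list

def expand(strategy, bindings, beliefs, satisfied):
    """Return the instantiated acts to carry out, or None if not applicable."""
    conds = [substitute(c, bindings) for c in strategy.conditions]
    if not all(c in beliefs for c in conds):
        return None
    acts = []
    for act in strategy.main_acts + strategy.subsidiary_acts:
        inst = substitute(act, bindings)
        # An Achieve act is skipped when its goal is already satisfied.
        if inst[0] == "Achieve" and inst[2] in satisfied:
            continue
        acts.append(inst)
    return acts

provide_background = Strategy(
    name="provide-background",  # hypothetical name
    header=("Provide-Background", "P", "A", "?x", "?px", "?pic", "GRAPHICS"),
    effect=("BMB", "P", "A", ("Identifiable", "A", "?x", "?px", "?pic")),
    conditions=[
        ("Bel", "P", ("Perceptually-Accessible", "A", "?x")),
        ("Bel", "P", ("Part-of", "?x", "?z")),
    ],
    main_acts=[("Depict", "P", "A", ("Background", "?z"), "?pz", "?pic")],
    subsidiary_acts=[
        ("Achieve", "P",
         ("BMB", "P", "A", ("Identifiable", "A", "?z", "?pz", "?pic")),
         "?mode"),
    ],
)

# Hypothetical instantiation for a part of the espresso-machine.
bindings = {"?x": "lid", "?z": "machine",
            "?px": "px1", "?pz": "pz1", "?pic": "pic1"}
beliefs = {
    ("Bel", "P", ("Perceptually-Accessible", "A", "lid")),
    ("Bel", "P", ("Part-of", "lid", "machine")),
}
acts = expand(provide_background, bindings, beliefs, satisfied=set())
```

In this sketch the unbound variable `?mode` is left open in the instantiated subsidiary act, mirroring the remark that the mode for the subsidiary acts is not fixed by the strategy.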


2.2 THE LAYOUT MANAGER

The main task of the layout manager is to convey certain semantic and pragmatic relations specified by the planner through the arrangement of graphic and text fragments received from the mode-specific generators, i.e., to determine the size of the boxes and the exact coordinates for positioning them on the document page. We use a grid-based approach as an ordering system for efficiently designing functional (i.e., uniform, coherent and consistent) layouts (cf. Müller-Brockmann 81).
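To make the idea of a grid as an ordering system more concrete, here is a minimal Python sketch that maps a block of grid cells to page coordinates and a box size. The class `Grid`, its fields, and the method `cell_box` are hypothetical names introduced for this example, and the sketch ignores margins and gutters, which a real layout grid would include.

```python
from dataclasses import dataclass

@dataclass
class Grid:
    columns: int
    rows: int
    page_width: float
    page_height: float

    def cell_box(self, col, row, col_span=1, row_span=1):
        """Return (x, y, width, height) for a block of grid cells."""
        if col < 0 or row < 0 or col_span < 1 or row_span < 1:
            raise ValueError("cell indices and spans must be non-negative")
        if col + col_span > self.columns or row + row_span > self.rows:
            raise ValueError("box does not fit on the grid")
        cell_w = self.page_width / self.columns
        cell_h = self.page_height / self.rows
        return (col * cell_w, row * cell_h,
                col_span * cell_w, row_span * cell_h)

# A picture spanning two columns in the third row of a 4 x 6 grid.
grid = Grid(columns=4, rows=6, page_width=200.0, page_height=300.0)
picture_box = grid.cell_box(col=1, row=2, col_span=2)
```

Because every box is expressed in whole grid cells, fragments placed this way align with each other automatically, which is one reason a grid supports uniform layouts.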

A central problem for automatic layout is the representation of design-relevant knowledge. Constraint networks seem to be a natural formalism to declaratively incorporate aesthetic knowledge into the layout process, e.g., perceptual criteria concerning the organization of boxes such as sequential ordering, alignment, grouping, symmetry or similarity.

Layout constraints can be classified as semantic, geometric, topological, and temporal. Semantic constraints essentially correspond to coherence relations, such as sequence and contrast, and can be easily reflected through specific design constraints. A powerful way of expressing such knowledge is to organize the constraints hierarchically by assigning a preference scale to the constraint network (cf. Borning et al. 89). We distinguish obligatory, optional and default constraints. The latter state default values that remain fixed unless the corresponding constraint is removed by a stronger one. Since there are constraints that have [...]
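The preference scale over obligatory, optional and default constraints can be sketched roughly as follows. This is a deliberately simplified stand-in, not the constraint hierarchy solver of Borning et al. and not WIP's layout manager: each constraint merely requests a value for one layout variable, and the strongest request wins. The names `Constraint`, `resolve`, the numeric ranks in `STRENGTH`, and the variable names are assumptions of the sketch.

```python
from dataclasses import dataclass

# Strength levels named in the text; the numeric ranks are an assumption.
STRENGTH = {"default": 0, "optional": 1, "obligatory": 2}

@dataclass(frozen=True)
class Constraint:
    variable: str   # hypothetical layout variable, e.g. "caption.column"
    value: int      # value requested for that variable (grid units)
    strength: str   # "obligatory", "optional" or "default"

def resolve(constraints):
    """Pick one value per variable, letting stronger constraints win."""
    chosen = {}
    for c in constraints:
        current = chosen.get(c.variable)
        if current is None or STRENGTH[c.strength] > STRENGTH[current.strength]:
            chosen[c.variable] = c
        elif (c.strength == current.strength == "obligatory"
              and c.value != current.value):
            raise ValueError(
                f"conflicting obligatory constraints on {c.variable}")
    return {var: c.value for var, c in chosen.items()}

layout = resolve([
    Constraint("picture.column", 0, "default"),
    Constraint("caption.column", 0, "default"),
    Constraint("caption.column", 1, "optional"),    # overrides the default
    Constraint("picture.column", 0, "obligatory"),
])
```

Among constraints of equal non-obligatory strength, this sketch simply keeps the first one it sees; a real solver would weigh such conflicts more carefully.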

2.3 THE TEXT GENERATOR

WIP's text generator is based on the formalism of tree adjoining grammars (TAGs). In particular, lexicalized TAGs with unification are used for the incremental verbalization of logical forms produced by the presentation planner (cf. Harbusch 90 and Schauder 91). The grammar is divided into an LD (linear dominance) and an LP (linear precedence) part so that the piecewise construction of syntactic constituents is separated from their linearization according to word order rules (Finkler & Neumann 89).

The text generator uses a TAG parser in a local anticipation feedback loop (see Jameson & Wahlster 82). The generator and parser form a bidirectional system, i.e., both processes are based on the same TAG. By parsing a planned utterance, the generator makes sure that it does not contain unintended structural ambiguities.
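The anticipation feedback loop can be pictured with the following minimal Python sketch: each planned utterance is handed to a parser, and the generator keeps the first one that has exactly one reading. The function `choose_unambiguous`, the lookup-table parser `toy_parse`, and the example sentences with their bracketings are all hypothetical; a real system would call a TAG parser sharing the generator's grammar.

```python
def choose_unambiguous(candidates, parse):
    """Return the first candidate utterance with exactly one parse, else None."""
    for utterance in candidates:
        readings = parse(utterance)
        if len(readings) == 1:
            return utterance
    return None

# Toy stand-in for a TAG parser: maps utterances to bracketed readings.
TOY_READINGS = {
    "lift the lid with the handle": [
        "[lift [the lid] [with the handle]]",    # handle as instrument
        "[lift [the lid [with the handle]]]",    # lid that has a handle
    ],
    "use the handle to lift the lid": [
        "[use [the handle] [to lift [the lid]]]",
    ],
}

def toy_parse(utterance):
    """Look up the readings of an utterance; unknown input has none."""
    return TOY_READINGS.get(utterance, [])

best = choose_unambiguous(list(TOY_READINGS), toy_parse)
```

Here the first candidate is rejected because the parser finds two attachment sites for the prepositional phrase, so the second, structurally unambiguous paraphrase is selected.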

Since the TAG-based generator is used in designing illustrated documents, it has to generate not only complete sentences, but also sentence fragments such as NPs, PPs, or VPs, e.g., for figure captions, section headings, picture annotations, or itemized lists. Given that capability and the incrementality of the generation process, it becomes possible to interleave generation with parsing in order to check for ambiguities as soon as possible. Currently, we are exploring different domains of locality for such feedback loops and trying to relate them to resource limitations specified in WIP's generation parameters. One parameter of the generation process in the current implementation is the number of adjoinings allowed in a sentence. This parameter can be used by the presentation planner to control the syntactic complexity of the generated utterances and sentence length. If the number of allowed adjoinings is small, a logical form that can be verbalized as a single complex sentence may lead to a sequence of simple sentences. The leeway created by this parameter can be exploited for mode coordination. For example, constraints set up by the graphics generator or layout manager can force delimitation of sentences, since in a good design, picture breaks should correspond to sentence breaks, and vice versa (see McKeown & Feiner 90).
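The effect of limiting the number of adjoinings can be mimicked with a small string-level toy in Python. It does not perform TAG adjunction; it only shows how a budget on adjoinings might turn one complex sentence into a sequence of simple ones. The function `verbalize`, the pairing of each modifier with an adjoined and a standalone realization, and the example phrases are assumptions made for this illustration.

```python
def verbalize(kernel, modifiers, max_adjoinings):
    """Build sentences from a kernel clause and modifier realizations.

    Each modifier is a pair (adjoined_phrase, standalone_sentence).
    Modifiers within the budget are attached to the kernel; the rest
    are emitted as separate simple sentences.
    """
    budget = max(0, max_adjoinings)
    adjoined = [m[0] for m in modifiers[:budget]]
    standalone = [m[1] for m in modifiers[budget:]]
    first = " ".join([kernel] + adjoined) + "."
    return [first] + standalone

modifiers = [
    ("that covers the water container", "The lid covers the water container."),
    ("carefully", "Do this carefully."),
]

complex_version = verbalize("Lift the lid", modifiers, max_adjoinings=2)
simple_version = verbalize("Lift the lid", modifiers, max_adjoinings=0)
```

With a budget of two the content is packed into one sentence, while a budget of zero yields three short sentences, which gives a planner more possible sentence breaks to align with picture breaks.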


2.4 THE GRAPHICS GENERATOR

When generating illustrations of physical objects, WIP does not rely on previously authored picture fragments or predefined icons stored in the knowledge base. Rather, we start from a hybrid object representation which includes a wireframe model for each object.

Although these wireframe models, along with a specification of physical attributes such as surface color or transparency, form the basic input of the graphics generator, the design of illustrations is regarded as a knowledge-intensive process that exploits various knowledge sources to achieve a given presentation goal efficiently. E.g., when a picture of an object is requested, we have to determine an appropriate perspective in a context-sensitive way (cf.
