WWW.DISSERTATION.XLIBX.INFO
FREE ELECTRONIC LIBRARY - Dissertations, online materials
 
<< HOME
CONTACTS



Pages:   || 2 |

«Abstract Flexichain is an API for editable sequences. Its primary use is in end-user applications that edit sequences of objects such as text editors ...»

-- [ Page 1 ] --

Flexichain: An editable sequence and its gap-buffer

implementation

Robert Strandh (LaBRI∗), Matthieu Villeneuve, Timothy Moore (LaBRI)

2004-04-05

Abstract

Flexichain is an API for editable sequences. Its primary use is in end-user

applications that edit sequences of objects such as text editors (characters),

word processors (characters, paragraphs, sections, etc), score editors (notes,

clusters, measures, etc), though it can also be used as a stack and a double-

ended queue.

We also describe an efficient implementation of the API in the form of a cir- cular gap buffer. Circularity avoids a common worst case in most implemen- tations, makes queue operations efficient, and makes worst-case performance twice as good as that of ordinary implementations 1 Introduction Editable sequences are useful, in particular in interactive applications such as text editors, word processors, score editors, and more. In such applications, it is highly likely that an editing operation is close to the previous one, measured as the dif- ference in positions in the sequence. This statistical behavior makes it feasible to implement the editable sequence as a gap buffer[Fin91].

The basic idea is to store objects in a vector that is usually longer than the number of elements stored in it. For a sequence of N elements where editing is required at ∗ Laboratoire Bordelais de Recherche en Informatique, Bordeaux, France 1 index i, elements 0 through i are stored at the beginning of the vector, and elements i + 1 through N − 1 are stored at the end of the vector. When the vector is longer N, this storage leaves a gap. Editing operations always result in modifications at the beginning or at the end of the gap.

Occasionally, the gap has to be moved, or rather, some elements have to be moved so as to leave the gap where the next editing operation is desired. In the worst case, i.e., that of an alternating sequence of editing operations at the beginning and at the end of the sequence, every element needs to be moved. While it can be argued that this case does not happen very frequently, it unfortunately corresponds to op- erations that might be reasonable in some clients, namely rotation of the elements or the use of the sequence as a queue.

The kind of worst-case behavior described in the previous paragraph can be avoided by the use of a doubly-linked list rather than a gap buffer to implement the se- quence. Unfortunately, the doubly-linked list has unacceptable storage overhead for small objects such as characters (a factor 8-16 according to the word size of the machine and the implementation of the memory allocator) or bits (a factor 128- 256).

2 Previous work We are not the first ones to consider the problem of editable sequences for interactive applications[Fin91].

Multics Emacs [Gre96] [Gre80] used a doubly-linked list of lines of text. Each line was a vector of characters. A new vector type was added to Multics Maclisp for the purpose. The new type used complex instructions of the underlying architecture that allowed an arbitrary sequence of bytes to be moved with a single instruction.

While this implementation was fine for most text editing, it was painfully slow for editing files with few newlines, since a considerable number of bytes had to be moved for each editing operation. Today’s hardware is 1000 times faster than what Multics ran on, so this implementation might be acceptable today, even though most processors might not have the specialized instructions required so it would have to be implemented in software.

GNU Emacs [LLS02] stores the entire buffer as a big gap buffer as described in the introduction. This implementation avoids bad worst-case behavior for long lines of text. On the other hand, it introduces a different, potentially more serious, worst

–  –  –

Hemlock [CM89] (the Emacs-like editor distributed with CMUCL) uses a sequence of lines. Lines can be stored in different places as compact strings, and one of the lines (the open-line) is represented as a gap buffer. This implementation largely avoids the worst case of GNU Emacs, at least for ordinary text with lines of relatively modest length.

Goatee (the Emacs-like editor of McCLIM [SM02]) uses a doubly-linked list of lines, each line being a gap buffer.

Gsharp [Str02], the interactive editor for music scores, currently uses ordinary singly-linked Lisp lists, since the score is divided into relatively few smaller units corresponding to musical phrases. Still, this implementation introduces some serious worst-case behavior that we would like to avoid.

It is interesting to notice that the sequence of lines used in Goatee (and in Hemlock and probably elsewhere as well) is just a version of an editable sequence with the implementation exposed.

3 Flexichain: an API for editable sequences The Flexichain API grew out of the need for code factoring between different applications (especially Goatee and Gsharp, but hopefully Hemlock as well), and sometimes between different parts of one application.

To provide maximum flexibility for potential clients, we decided to divide the API in two different layers: Fexichain and Cursorchain. The Flexichain layer provides editing operations based on positions represented as integers, whereas the Cursorchain layer introduces the possibility of an arbitrary number of cursors into the editable sequence.





3.1 The basic layer: Flexichain The basic layer provides editing operations based on the concept of a position.

For an insert operation, a position is an integer between 0 and N inclusive, where N is the length of the sequence. In general, a value of i indicates the position 3 before element number i in the sequence, except of course when i = N and there is no element i. In this case, the position indicates the end of the sequence. For reasons that will be explained in the next section, we actually provide two insert operations insert* and insert* that are entirely equivalent when only the basic layer is used.

For a delete operation, a position is an integer between 0 and N − 1 and indicates the element number of the element to be deleted.

The basic layer also provides operations for accessing and replacing an element at an arbitrary position in the sequence, as well as operations that treats the sequence as a stack, a linear double-ended queue, or a circular queue.

Some relatively simple applications can use the basic layer directly. The main inconvenience of the basic layer is that the position of an element changes as a result of editing operations at lower positions in the sequence.

3.2 The second layer: Cursorchain Complex applications such as multi-window text editors need to manage several positions in the sequence such that these positions refer to the same element independently of any editing operations in other places in the sequence.

For that reason, the second layer introduces the concept of a cursor. A cursor is similar to the point or a mark of Emacs. It is positioned either at the beginning of the sequence, at the end of the sequence, or between two element of the sequence.

All the operations of the basic layer can be used on instances of cursorchain.

While it is straightforward to determine what happens to a cursor when an element is deleted, it is not clear what happens when an element is inserted at a position occupied by one or more cursors. There are actually two possibilities: either the element is inserted before the cursors (i.e., between the cursors and the element that precedes them) so that the cursors end up at a position after the newly inserted element, or the element is inserted after the cursors (i.e., between the cursors and the element that succeeds them) so that the cursors end up at a position before the newly inserted element.

Different applications might want different behavior with respect to insertion, and some applications (this is the case with Gsharp) might want one behavior in one part of the buffer representation and another behavior in a different part. For that 4 reason, we provide two different insert operations: insert for the first case and insert for the second case. As indicated in the previous section, there are two insert operations in the basic layer as well, simply because these operations might be used on a cursorchain and the behavior of potential cursors at the insertion position must be specified.

Since cursors are never conceptually positioned on a particular element, we provide two different delete operations to delete elements before and after the cursor, and two different operations for accessing and replacing an element with respect to the cursor (before it and after it).

We also provide operations to move the cursor forward and backward by an arbitrary number of positions, to translate between a cursor and its position, and to determine whether the cursor is at the beginning or at the end of the sequence.

4 Implementation of the API In order to avoid the worst-case behavior of a buffer implementation of the type used by GNU Emacs, we use a circular gap buffer. Thus, the first and the last element of the underlying vector are considered contiguous, and the first element of the sequence is not necessarily the first element of the underlying vector. We keep track of the first element by introducing another slot in the class that represents the Flexichain.

4.1 Implementing the basic layer There are two main considerations with regard to the implementation of the basic layer, namely when and how to change the size of the vector that holds the sequence, and how to move the gap.

4.2 Changing the size of the vector Whenever and insert operation is issued on a Flexichain that is full (i.e., the size of the vector holding the sequence has the same length as the sequence itself), its underlying vector must be extended. In order to maintain linear worst-case complexity of a sequence of editing operations, we must then multiply the size of 5 a vector by a constant (called the expand factor rather than adding a fixed number of elements.

Each resize operation requires all the elements to be moved. It is therefore desirable to avoid resize operations as much as possible. For that reason, it is advantageous to have a large expand factor. On the other hand, in order to avoid wasted space, it is desirable to have a small expand factor.

We use a default expand factor of 1.5 with the possibility for client code to alter it. Applications that manipulate sequences that vary little in length can use a small expand factor to minimize overhead, while applications that use relatively small sequences the length of which vary a lot can use a larger expand factor.

The vector has to be expanded as a result of an insert operation on a full Flexichain.

It is particularly easy to move the gap in this case (no elements need to be moved).

For that reason, in this case we first move the gap and then expand the vector.

To avoid too much overhead when the number of elements in the sequence decreases, we occasionally have to shrink the vector. In order to avoid having to immediately expand it again in case of more insert operations being issued, we only shrink the vector when the ratio between the length of the vector and the length of the sequence is greater than the square of the expand factor.

Shrinking the vector preserves the position of the gap.

4.3 Moving the gap Perhaps the most complex part of implementing the basic layer is moving the gap when an editing operation (insert or delete) is issued at a position other than that of the gap.

There are three different possible configurations of the gap with respect to the data.

Figure 1 shows the case where both the gap and the data are contiguous. Figure 2 shows the case where the data is not contiguous. Finally, figure 3 shows the case where the gap is not contiguous.

We make sure we always move the minimum number of elements required by moving the gap in either of the two directions possible.

It turns out that there are five different cases of combinations between the configurations of the gap and the position of the editing operation that need to be taken into account. Two of the five cases need a single call to the Lisp function replace,

–  –  –

two more require two calls, and one case requires three calls.

4.4 Implementing the second layer The main difficulty in implementing the second layer lies in the way cursors are managed. It is necessary for the implementation to access all cursors in order to be able to update their corresponding positions as the sequence is altered. To avoid memory leaks, we use weak references to store the cursors, so that when client code no longer refers to a cursor, we can detect that.

Perhaps the most natural implementation of cursors would be to store the corresponding position in the sequence, and update that position whenever an editing operation with a smaller position is issued. However, such an implementation would make the complexity of editing operations proportional to the number of cursors into the sequence which is not desirable.

Instead, we store indexes into the underlying vector. This way, we can split the cursors into three sets according to whether they are positioned before, at, or after

–  –  –

7 the gap (although we have not implemented this possibility yet). Only cursors that are at the gap potentially need to be altered after an editing operation. All other cursors remain unchanged. Instead, cursors are updated as a result of moving the gap.

Cursor updates are implemented as :before, :after, and :around, methods on the generic functions in the basic layer that handle moving the gap and changing the size of the vector. These operations constitute an internal protocol of the Flexichain library and is not part of external API.

5 Conclusions and future work We believe we have a good API and a high quality implementation of it. We would be interested in seeing our code used in a variety of existing projects, in particularly Goatee and Gsharp, but also in similar projects such as Portable Hemlock and others.

A typical text editor such as Goatee or Hemlock could use Flexichains (or rather Cursorchains) to implement both the sequence of lines of text and the sequence of characters within each line.



Pages:   || 2 |


Similar works:

«FEZA BASKAYA Simulating Search Sessions in Interactive Information Retrieval Evaluation ACADEMIC DISSERTATION To be presented, with the permission of the Board of the School of Information Sciences of the University of Tampere, for public discussion in the Auditorium Pinni B 3116, Kanslerinrinne 1, Tampere, on June 13th, 2014, at 12 o’clock. UNIVERSITY OF TAMPERE FEZA BASKAYA Simulating Search Sessions in Interactive Information Retrieval Evaluation Acta Universitatis Tamperensis 1949 Tampere...»

«Current Research in Machine Translation HAROLD L. SOMERS Centre for Computational Linguistics UMIST PO Box 88 Manchestel; U.K. ABSTRACT:This paper, accompanied by peer group commentary and author’s response, is a discussion paper concerning the state of the art in Machine Translation. The current orthodoxy is first summarized, then criticized. A number of researchprojects based on the standard architecture are discussed:they involve the use of Artificial Intelligence techniques, advanced...»

«ATTORNEY FOR APPELLANT ATTORNEYS FOR APPELLEE Kevin R. O’Reilly Steve Carter Lafayette, Indiana Attorney General of Indiana Stephen R. Creason Deputy Attorney General Indianapolis, Indiana In the FILED Indiana Supreme Court Apr 01 2008, 11:21 am CLERK No. 79S00-0702-CR-65 of the supreme court, court of appeals and tax court MICHELLE GAUVIN, Appellant (Defendant below), v. STATE OF INDIANA, Appellee (Plaintiff below). Appeal from the Tippecanoe Superior Court No. 2, No. 79D02-0503-MR-1 The...»

«Luton Dementia Guide 1 2015 Foreword When I first got involved with the co-production project, I said that what was needed in the town was a single booklet which people would be given when they are diagnosed, containing all the info they need. I think you have ticked all the boxes so to speak. I think there are too many leaflets and booklets around which carers like me don’t have time to read! I am therefore very pleased that one is being produced. Jeff Solomons Carer Luton Dementia Guide 2...»

«International Journal of Humanities and Social Science Vol. 2 No. 22 [Special Issue – November 2012] African Literature in a Structural and Linguistic Jail: Acknowledging, Apprehending and Advocating for Prison Break Dr. John Mugubi, PhD Chairman Department of Theatre Arts and Film Technology Kenyatta University P.O. Box 43844-00100, Nairobi, Kenya. “Matter is expressed in manner” The above words by Wordsworth succinctly capture the inextricable relationship between form and content. That...»

«CENTRE FOR GAMBLING RESEARCH REVIEW OF THE ACT GOVERNMENT’S HARM MINIMISATION MEASURES MARCH 2005 COMMISSIONED BY ACT GAMBLING AND RACING COMMISSION Review of the ACT Government’s Harm Minimisation Measures © J. McMillen & S. Pitt Centre for Gambling Research, RegNet, ANU 2005. ISBN 0-9757104-5-1 This work is copyright. Apart from any use as permitted under the Copyright Act 1968, no part may be reproduced by any process without permission from the authors. In all cases the authors and ANU...»

«Can Brisbane Remain a Subtropical City? Peter Spearritt As the seasons change, public and private gardens become a riot of colour. Winter shows the scarlet flags of poinsettia Brisbane's emblem, which, if really a Mexican beauty, has made itself very much at home. The lavender glow of jacaranda and the gold of laburnum, the green umbrella of poinciana crowned with gleaming scarlet, the massed magnificence of magenta bougainvillea, the creamy blossoms and heavy tropical scent of frangipani...»

«ChiloquinNews August 23, 2010 Volume 7, Issue 3 Editor’s Note: We still have no volunteer to take over the editorship of this newsletter. If no one comes forward, I expect to put out one more issue, then the ChiloquinNews will be no more. Artist of the Month at the Two Rivers Gallery I started woodturning about ten years ago while we were living in Fredericksburg, Texas. I joined the Hill Country Turners in Kerrville, TX, bought a Jet Mini lathe and started turning with the help of some very...»

«HYDROACOUSTICS: LAKES AND RESERVOIRS Hydroacoustics: Lakes and Reservoirs J. Christopher Taylor and Suzanne L. Maxwell Background Fisheries hydroacoustics uses transmitted sound to detect fish. Sound is transmitted as a pulse and travels quickly and efficiently through water. As the sound pulse travels through water it encounters objects that are of different density than the surrounding medium, such as fish, that reflect sound back toward the sound source. These echoes provide information...»

«VOLUME: E 02 ISSUE 17 10/26/2014 THE QUEEN’S OWN CAMERON HIGHLANDERS OF CANADA CAMERON ASSOCIATION IN CANADA THE QUEEN’S OWN CAMERON HIGHLANDERS OF CANADA MINTO ARMOURY, 969 ST. MATTHEWS AVE WPG, MB R3G 0J7 2014/2015 OFFICERS Steve MacMillan 351 Ainslie St R3J 2Z7 204-831-0542 PRESIDENT Wpg, MB sdmacmillan@shaw.ca Karen Tyler 255 Aldine St R3J 2A9 204-414-0973 VICE-PRES Wpg, MB Hugh O’Donnell 713 Cambridge St R3M 3G2 204-285-7222 SECRETARY Wpg, MB hodonnell@shaw.ca Dave Gibson 104 William...»

«THURS 2ND MON 6TH JUNE 2016 THECATLAUGHS.COM Supported by: Pouring Partner: BOOK NOW: THECATLAUGHS.COM / 056 776 3837 OUR SPONSORS 2016 FUNDING PARTNERS POURING PROGRAMME MEDIA PARTNER PARTNER PARTNER ACCOMODATION PARTNERS RESTAURANT PARTNERS PERSIAN PAL IT PARTNER WELCOME Welcome to Kilkenny City: home of Ireland’s Medieval Mile, Ireland’s Ancient East and the “best little comedy festival in the world”. It is with great honour and pride that I present my fourth and final programme for...»

«Discrete Applied Mathematics 98 (1999) 121–130 Pattern minimisation in cutting stock problems Colin McDiarmid ∗ Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK Received 21 October 1997; accepted 8 February 1999 Abstract In the cutting stock pattern minimisation problem, we wish to satisfy demand for various customer reels by cutting as few as possible jumbo reels, and further to minimise the number of distinct cutting patterns used. We focus on the...»





 
<<  HOME   |    CONTACTS
2016 www.dissertation.xlibx.info - Dissertations, online materials

Materials of this site are available for review, all rights belong to their respective owners.
If you do not agree with the fact that your material is placed on this site, please, email us, we will within 1-2 business days delete him.