Individual Document Management Techniques:
an Explorative Study
A dissertation submitted to the Department Of Computer
Science, Faculty of Science at the University Of Cape
Town in partial fulfilment of the requirements for the
degree of Master of Philosophy (in Information
By Mpho Sello
Supervised by Dr Hussein Suleman
© Copyright 2007
Individuals are generating, storing and accessing more information than ever before.
The information comes from a variety of sources such as the World Wide Web, email and books. Storage media are becoming larger and cheaper, which makes accumulation of information easy. When information is kept in large volumes, retrieving it becomes a problem unless there is a system in place for managing it.
This study examined the techniques that users have devised to make retrieval of their documents easy and timely. A survey of user document management techniques was done through interviews. The uncovered techniques were then used to build an expert system that provides assistance with document management decision-making. The system provides recommendations on file naming and organization, document backup and archiving as well as suitable storage media. The system poses a series of questions to the user and offers recommendations on the basis of the responses given.
The system was evaluated by two categories of users: those who had been interviewed during data collection and those who had not been interviewed. Both categories of users found the recommendations made by the system to be reasonable and indicated that the system was easy to use. Some users thought the system could be of great benefit to people new to computers.
Acknowledgements
My heartfelt gratitude goes to Dr Hussein Suleman who supervised this project and provided guidance and suggestions on how to undertake it. I appreciate all the assistance he provided. I would also like to thank my classmates who provided support throughout my studies in UCT, in particular Victor Katoma. Thanks, Victor, for your time and patience. I would also like to thank Michael Kwesi Nyarko and Paolo Pietro Pileggi who came to my rescue when Java was having its way with me.
Life should not come to a standstill when studying. Thanks to Oyapo Chimidza I came to realize that one can still have a life even if they are studying. I would like to thank all the people who touched my life in different ways while I was at UCT.
Last, but not in any way suggesting that they are the least, I would like to thank my family for their support throughout my studies and my government for footing the bill for my studies.
Now it is time to take a second look at the working world. Hopefully it will be better this time around.
Table of Contents
CHAPTER 1 – INTRODUCTION AND MOTIVATION
1.2 DESCRIPTION OF THE PROBLEM
1.3 METHODOLOGY AND EVALUATION
1.4 DISSERTATION OUTLINE
CHAPTER 2 – BACKGROUND ON DOCUMENT MANAGEMENT
2.2 DOCUMENT MANAGEMENT PRACTICES
2.3 DOCUMENT MANAGEMENT PROBLEMS
2.4 DOCUMENT MANAGEMENT RESEARCH
2.5 CONCLUDING REMARKS
CHAPTER 3 – EXPERT SYSTEMS
3.2 EXPERT SYSTEM STRUCTURE
3.3 EXPERT SYSTEM SHELLS
3.4 KNOWLEDGE REPRESENTATION
3.5 DEVELOPING AN EXPERT SYSTEM
3.6 EXAMPLES OF EXPERT SYSTEMS
CHAPTER 4 – DATA COLLECTION AND ANALYSIS
4.2 SAMPLING PROCEDURE AND DATA COLLECTION METHOD
4.3 SUMMARY OF COLLECTED DATA
4.4 DATA INTERPRETATION AND ANALYSIS
4.5 CONCLUDING REMARKS
CHAPTER 5 – DEVELOPMENT OF THE EXPERT SYSTEM
5.2 SYSTEM DEVELOPMENT
5.3 USING THE SYSTEM
5.4 SNAPSHOTS OF THE SYSTEM’S PAGES
CHAPTER 6 – EXPERT SYSTEM EVALUATION AND RESULTS
6.1 INTRODUCTION
6.2 PREPARATIONS FOR SYSTEM TESTING
6.3 SYSTEM TESTING
6.4 PRESENTATION OF RESULTS
6.5 DISCUSSION OF RESULTS
CHAPTER 7 – CONCLUSIONS AND RECOMMENDATIONS
7.2 FUTURE WORK
1.1 Introduction
Information plays a major role in the activities of organisations. People in organisations need to exchange information while carrying out their duties. In some organisations, information is the sole product, for example libraries and publishers, while in other organisations it serves as support to organisational products, for example user manuals. Regardless of the nature of organisation, information is used to facilitate the undertaking of all organisational activities. The information used is stored and retrieved as necessary.
Before the advent of information technology, information was contained mainly on paper. The paper was kept in indexed files to make it easy for users to access it. When the filed information was no longer used regularly, it was put into storage as records.
The records were either stored on the organisational premises or entrusted to an organisation that dealt specifically with storage of records.
Storing information on paper posed a number of problems to users of the information.
Files were sometimes inappropriately filed or misplaced by some users and this made it difficult or impossible to find them. Sometimes files were lost or stolen. Storage of old files also posed a problem because it required organisations to have physical storage space or pay for storage by other organisations.
With the advent of information technology, some of the information storage problems were solved. Large amounts of information could be stored electronically on small storage media that do not require large physical storage spaces. However, problems like inappropriate filing were not solved entirely and other new problems came about.
With the wide choice of storage media available, users tend to store more information.
Due to the increase in stored information, retrieving it is not always easy. Users forget where they have stored information or the names with which they have stored the information. Sometimes the medium on which information is stored fails, gets misplaced or becomes obsolete.
Information stored electronically is usually contained in documents. It is these documents that users often have problems accessing. These problems have inspired an area of research called document management. Document management research is concerned with devising ways of making information storage and retrieval easy for users. Document management is a relatively new research area, but it follows in the steps of earlier research efforts that focused on file organisation.
Doing a search on document management on the Internet does not yield many results.
Even books on document management are hard to find. From the little research that has been conducted in this area, it seems that researchers approach the problem from varying angles. Some researchers have focused on document storage and retrieval and others have focused on presentation of stored documents on retrieval.
1.2 Description of the problem
As storage media become larger and cheaper, users are able to store more information than ever before. This sometimes results in users keeping information that they would otherwise not keep, for instance, information that can be easily put together when required or information that is not likely to be needed again in future. Keeping this extra information then results in problems when trying to locate other, more important, information.
The most common problem encountered by users when trying to retrieve information is forgetting document names or the storage medium on which the information is stored. This is often the case when the document being sought has not been accessed for some time.
Specific document management problems encountered by users include:
Users lose stored information because they cannot retrieve it.
Users struggle to locate old documents.
Users forget the names with which they store documents.
Users are unable to retrieve information because they cannot remember where they stored it.
1.3 Methodology and evaluation
The aim of this project was to study expert users’ habits and techniques for managing documents and use this information to build an expert system that would help average users to manage their documents better. The study examined the techniques employed by users for storing and retrieving information and these were incorporated into the expert system.
The expert system can be used to make decisions about how to store documents, what documents to backup, what documents to archive and how to do so in a way that will make retrieval of the stored documents easy. Through the system, the users can also make decisions about the best media to use for document storage, backup and archiving.
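The question-and-recommendation behaviour described above can be illustrated with a minimal sketch. It is not the system built in this study: the question keys, the rules and the advice strings below are hypothetical, and Python is used purely for illustration. The sketch assumes a simple rule-based approach in which each rule pairs a condition on the user's answers with a piece of advice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Answers map a (hypothetical) question key to the user's response.
Answers = Dict[str, str]


@dataclass
class Rule:
    """A hypothetical if-then rule: when `condition` holds, offer `advice`."""
    condition: Callable[[Answers], bool]
    advice: str


# Illustrative rules only; they are assumptions, not findings of the study.
RULES: List[Rule] = [
    Rule(lambda a: a.get("accessed_regularly") == "no",
         "Consider archiving the document."),
    Rule(lambda a: a.get("hard_to_recreate") == "yes",
         "Keep a backup copy on a separate storage medium."),
    Rule(lambda a: a.get("shared_with_others") == "yes",
         "Use a descriptive file name that others can understand."),
]


def recommend(answers: Answers) -> List[str]:
    """Return the advice of every rule whose condition matches the answers."""
    return [rule.advice for rule in RULES if rule.condition(answers)]


if __name__ == "__main__":
    sample = {"accessed_regularly": "no", "hard_to_recreate": "yes"}
    for line in recommend(sample):
        print(line)
```

Keeping the rules as data, separate from the code that applies them, means that new advice can be added without changing the matching logic. This mirrors the usual separation between a knowledge base and an inference mechanism in expert systems.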
A study of users’ habits and techniques for managing documents was carried out. This was done using a questionnaire that attempted to discover users’ habits and techniques for managing documents. A sample of typical information workers was chosen randomly from the UCT community: lecturers, administrative assistants and postgraduates. These were people who were believed to deal with large volumes of documents in their day-to-day activities.
The questionnaire contained mainly open-ended questions as it was meant to collect information about users’ practices. Instead of distributing questionnaires for users to fill in during their free time, the researcher interviewed the respondents. This was done with the aim of attaining a high response rate. Due to the open-ended nature of the questions, it was feared that users might not respond well if left to complete the questionnaire on their own. When all the respondents had been interviewed, their responses were compiled and used to build the document management expert system.
Expert systems are usually evaluated at three levels. First, the builders/developers of the system evaluate its performance, that is, its reasoning abilities and the quality of the decisions it puts forward to the user. Second, the experts whose knowledge was used to build the system evaluate it to establish whether it gives the correct information. Lastly, the system is evaluated by users - the people for whom it was built. The users’ evaluation is meant to establish the system’s usability and efficiency.
On completion of its development, the expert system was tested to establish its ability to assist users with making document management decisions. The system was tested on users who had been interviewed during data collection and users who had not been interviewed. As providers of the information used to build the system, and the intended users of the completed system, the users who had been interviewed tested the system on two levels. They tested it for the correctness and utility of the information it contains and also for its usability and efficiency.
1.4 Dissertation Outline
Chapter 2: Background on Document Management
The chapter gives an introduction to document management. Practices, problems and developments in the area of document management are outlined.
Chapter 3: Expert Systems
The chapter introduces the reader to expert systems. It explains what expert systems are, their makeup and how they are developed. Examples of classical expert systems are given.
Chapter 4: Data Collection and Analysis
The chapter outlines the data collection process and gives an analysis of the collected data.
Chapter 6: Evaluation of expert system and results
The chapter outlines the process of evaluating the developed expert system. It also discusses the results of the evaluation.
Chapter 7: Conclusions and Recommendations
The chapter outlines conclusions drawn from the study and makes suggestions for future work.
2.1 Introduction
Computers provide an easy and convenient way of storing documents. However, retrieval of the stored documents is not always easy when users deal with vast amounts of information and have a variety of locations where they can store their documents. The volume of information and the variety of storage locations end up becoming hindrances when users want to retrieve stored documents. The problem has become so widespread that it inspired several areas of research in information management.
Information management research has focused on areas such as file organisation, information retrieval, personal information management and document management.
Despite their different approaches, these areas of research are all concerned with storage and retrieval of information. They look at the problems encountered by users when working with information and attempt to devise solutions to these problems.
The focus of this study is document management. Most of the information that users deal with is contained in documents. Users create and store documents for later retrieval. Users often have a problem storing documents in a way that will help them retrieve the documents later. This results in delays and frustrations when trying to retrieve the stored information. Users forget the names with which they stored their documents or the locations where the documents were stored. Researchers in the area of document management focus on devising ways of overcoming these challenges to make storage and retrieval of documents timely and easy.
This chapter focuses on document management and the user. It looks at users’ document management practices, the problems they encounter and the efforts undertaken by researchers to help users overcome these problems.