Online Catalogue and Repository Interoperability Study (OCRIS)

Appendix 3 to OCRIS Final Report

In-depth studies of a sample of IRs and Online Public Access Catalogues

Duncan Birrell, Gordon Dunsire and Kathleen Menzies

Centre for Digital Library Research, University of Strathclyde

September 2009

Project Acronym: OCRIS

Document: Appendix 3 to OCRIS Final Report

Version: 1

Contact: Kathleen Menzies or Gordon Dunsire

Date: September 2009

In-depth studies of a sample of Institutional Repositories and Online Public Access Catalogues

1 Introduction

As part of Workpackage 3, the OCRIS project team devised a workflow by which to investigate Online Public Access Catalogues (OPACs) and Institutional Repositories (IRs) within 10 Higher Education Institutions (HEIs) deemed (both individually and as a group) to be representative of a particular facet of the project and its context. This purposive sampling provided a snapshot of the current situation regarding duplication of both records and scope between bibliographic/publication systems, the way item types are described and curated, the implementation and customisation of the software being used and the extent to which links are being made between the two systems.

Having identified HEIs with both OPACs and IRs within a previous workpackage, the project team selected a purposive sample of these (10 in total). Once it had been established which item types were in scope for the OPACs and IRs, each sub-set of institutional systems was interrogated at User Interface (UI) level in order to identify and retrieve records for which examples of overlap or duplication might be found (for example, a record for the same thesis, held by both the IR and the Library Management System (LMS)). Using these records, many issues and problems stemming from duplication, arrangement and description were explored.

The 10 institutions which OCRIS selected for in-depth study were as follows:

• Cambridge and Glasgow acted as Case Study institutions within Workpackage 3 and were therefore selected for inclusion.

• Hull is unusual in using Fedora software as the basis of its IR (with a Muradora interface).

• Southampton is notable for having multiple Institutional Repositories (10 in total), many of which pertain to specific subjects or disciplines.

• The National Marine Biological Library was selected as an atypical example for various reasons - its IR is both multi-institutional and subject-specific while its catalogue is based on Open Source software. OCRIS was able to explore its aims and objectives beyond the boundaries of a single University.

• Strathclyde is the "home" library of the OCRIS team and was an early advocate of Open Access repositories – being able to talk directly to staff involved in its development was beneficial.

• The University College for the Creative Arts administers the Visual Arts Data Service (VADS), a format and discipline specific repository, using custom software.

• The 3 HEIs which contribute to the Welsh Repositories Network (WRN), the Universities of Aberystwyth, Cardiff and Glamorgan, were considered in relation to the metadata provided for their bi-lingual interfaces.

The following discussion is structured according to various aspects explored relating directly to the aims and objectives of OCRIS. These are: scoping issues (primarily, duplication of scope at both item and record level as well as information provided to users on scope); the description of item and format types; the arrangement and granularity of subject menus; the application of authority control and levels of ambiguity/clarity; the instantiation of links between IRs and OPACs.

These themes are not explored herein in relation to every institution; instead, those examples deemed most illustrative have been chosen. Similarly the discussions vary in depth and length due to factors such as the number of IRs or catalogues within an institution, the amount of duplication occurring or the availability of information concerning any given system.

All information taken from websites was correct at the time when these studies were carried out (July/August 2009). Inevitably some information will have changed since that time.

2 Scopes and item types

2.1 Cambridge University

At Cambridge University, the items searchable in the Newton catalogue are given as:

• Book

• Serial

• Electronic journal

• Electronic resource

• Disc (CD/DVD)

• Music Score

• Map

• Non-musical Recording

• Musical Recording

• Archive/Manuscript

• Kit

• Mixed Material/Collection

• Mixed Material

• Visual Material

But naturally the true scope extends beyond this list of 14 MARC-derived fields and is also more complex. Further investigation of the website reveals that the following holdings of Cambridge University Library are recorded in Newton:

• all printed books published from 1978 onwards, with the exception of Official Publications

• selected Official Publications published since 1999

• printed books published before 1978 considered to be of academic importance at the time of acquisition

• all print journals

• all electronic journals

• atlases published after 1977

• maps catalogued since August 2000

• sheet music and recorded music catalogued after 1990

• microfilms and microfiches published after 1977

• audio-visual material published after 1977

• music manuscripts

All printed books and journals in the 4 CUL dependent libraries are also recorded.

An additional note on scope states that "Coverage of books published prior to 1978 in the University Library is incomplete. Books published prior to this date, and considered at the time of acquisition to have been of secondary academic importance, are not included in Newton...[ ]...Coverage of music manuscripts in this catalogue is complete, and where such manuscripts contain two or more pieces, there are in most cases catalogue entries for each piece. This catalogue does not include theses or nonmusic manuscripts, which are in a separate Newton catalogue, University Library Manuscripts and Theses." (http://ul-newton.lib.cam.ac.uk/vwebv/ui/en_US/htdocs/help/about.htm [Accessed 24th August, 2009].) Yet it takes some digging for users to get down to this level of detail.

The item types for the CUL's Pilot Universal Catalogue are the same as those listed above, but it also allows users to set search limits by format (which it terms "medium"). The range of mediums recognised in the Universal Catalogue is:

• Digital

• Globe

• Map

• Microform

• Nonprojected Graphic

• Motion Picture

• Projected Graphic

• Sound Recording

• Text (Eye-Readable)

• Videorecording

DSpace@Cambridge webpages inform users that it was established in 2003, "to facilitate the deposit of digital content of a scholarly or heritage nature, allowing academics and their departments at the University to share and preserve this content in a managed environment". It provides a list of the item types contained within, derived from the standard Dublin Core "dc.item" terms instantiated within the repository. The following are listed in the "browse by type" menu:

• Article

• Audio

• Book or Book chapter

• BW Image

• Colour Image

• Dataset

• Drawn image

• Image

• Journal Article - Published Version

• Journal Article - Submitted Version

• Map

• Other

• Preprint

• Presentation

• Software

• Table

• Technical Report

• Thesis

As with the OPAC, the actual scope is broader than this, with all types of material being sought.

There is, then, more potential overlap between OPAC and IR than it may seem from looking at "browse by type" menus. Conversely, overlap may be less likely than it appears once a user unearths sufficient detail. Yet it can be concluded that books, journals, maps and theses are all within scope for both systems at Cambridge. The ways in which that duplication actually occurs are considered in later sections.
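The comparison of "browse by type" menus described above can be sketched in code. The following is only an illustration, not the method used by the project team: the label sets are abridged from the lists given earlier, and the BROAD mapping from local labels to broad categories is a hypothetical assumption introduced for this example.

```python
# Browse-menu labels abridged from the Newton and DSpace@Cambridge lists above.
opac_types = {"Book", "Serial", "Electronic journal", "Map", "Music Score"}
ir_types = {"Article", "Book or Book chapter", "Map", "Thesis", "Dataset"}

# Hypothetical mapping from local labels to broad categories.
# This mapping is an assumption for illustration; it is not taken from either system.
BROAD = {
    "Book": "book",
    "Book or Book chapter": "book",
    "Serial": "journal",
    "Electronic journal": "journal",
    "Article": "journal",
    "Map": "map",
    "Thesis": "thesis",
}


def broad_categories(labels):
    """Return the broad categories for the labels that the mapping covers."""
    return {BROAD[label] for label in labels if label in BROAD}


# Overlap when labels are compared exactly as displayed in each menu.
literal_overlap = opac_types & ir_types

# Overlap after both sets of labels are mapped to broad categories.
broad_overlap = broad_categories(opac_types) & broad_categories(ir_types)

print(sorted(literal_overlap))
print(sorted(broad_overlap))
```

Comparing the labels literally finds only "Map" in common, while the broad comparison also finds books and journals. Theses do not appear in the broad overlap here because the Newton menu carries no thesis label; that overlap only emerges from the more detailed scope notes quoted above.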

2.2 University College for the Creative Arts

Types and Locations listed in "advanced search" delimiters by the UCCA Library Catalogue are as follows. References to administrative aspects such as "reported missing", "in transit between campuses" and "CD-ROM 3 week loan" have been omitted for reasons of relevance.

• Academic research output
• Cameras and IT equipment
• CD-ROMs and computer files
• Music and sound recordings
• Slides

• AV equipment
• Academic research output
• Audio cassette
• BFI ticket - three day loan
• Box of slides
• British Library article
• British Library book
• CD
• CD-ROM
• Canterbury Fine Art Store
• Canterbury Media Store
• Computer equipment
• Copyright cleared photocopy
• Digital game
• Electronic book
• Electronic journal
• Epsom Media Store
• Information files
• Journal
• Laptop
• MA reading list
• Non BL document supply article
• Non BL document supply book
• Reference book
• Scanned material (shadow catalogue)
• Slide or group of slides
• Thesis


• Academic Reserve Collection
• Animated Films Video Collection
• Annuals Collection
• Archive
• Artists' Book Collection
• Audio Collection
• Berbiers Collection
• Canterbury Fine Art Store
• Canterbury Media Store
• Careers Collection
• DVD Collection
• Documentaries Video Collection

The VADS "Collections" search offers access to 44 separate image collections, where the material types are marked "images". VADS also provides guidance and consultancy services in relation to digital projects and preservation.

From the catalogue lists given above it appears that the only potential scope overlap between OPAC and IR would be:

• Maps

• Slides

• Photocopies

• Scanned material

Even then this would only apply if they were derived from the two UCCA collections recorded in VADS (the Crafts Study Centre: University for the Creative Arts at Farnham and the Textiles Collection: University for the Creative Arts at Farnham) or if they linked to external digital copies of images.

In the OPAC these would always remain at bibliographic level only while in VADS images would be stored alongside richer, VRA4 bibliographic records.

In any case, there appears to be no duplication as slides/maps/photocopies and scans are not associated with the 2 UCCA VADS collections (whose parent institutions are not listed in the "Locations" menu). These specialist Centres are not libraries and the 30,000 items of the Crafts Study Centre Archive (http://www.csc.ucreative.ac.uk/index.cfm?articleid=20211 [Accessed 25th August, 2009]) are described in the JISC-funded Archives Hub gateway service (http://www.archiveshub.ac.uk) rather than within the UCCA catalogues.

There is also potential overlap with some of the sets of "Resources" available via VADS (http://www.vads.ac.uk/resources/index.html) which include a database of selected film titles and viewing notes, transcripts of letters and multimedia collections. But generally these are clearly packaged to act as learning/teaching resources (a variety of "modules" are contained in the resources section) and are quite distinct in purpose from the bibliographic records held in the OPAC, effectively

VADS is a rich data source and merits further exploration in terms of item description/arrangement.

Its search by "Themes" facility allows users to locate items within and across five key subject areas in the arts:

• Applied Arts

• Architecture

• Design

• Fine Art

• Media

The image files within these 5 themes are sub-divided into 20 classifications, with 70 listed material types or techniques. This means the material types depicted in VADS image files can only be discerned by clicking through each of the themes in turn. Whilst effectively placing the onus for building an index of material types on the end user, this type of classification is clearly designed to reflect the ways in which students think about and make use of visual arts materials (and illustrative images of them) within their work. Similarly, items are arranged by Collection (44 in total, 2 of which are UCCA collections). Students can save records to their own "lightbox" area and add tags to items.

