Digitization with Millennium & CONTENTdm
Stuart Hunt
IUG17 Anaheim May 2009

Overview
  Background
  Digitisation
  Metadata
  Workflows
  Now

University of Warwick
  Royal Charter 1965
  Russell Group
  16,000 FTE students
  5000 staff

University Library
  Approx 1.1 million volumes
  170 staff (110 FTE)
  Millennium 2003
  Approx 100,000 issues/renewals per yr
  Approx 28,000 new books per yr
  RLUK member
  OCLC member

Content
  Marandet Collection
  4000+ French plays
  1720 to 1900
  Acquired 1970s
  Guide published 1979
  Bibliographic records in Millennium, RLUK, COPAC, & WorldCat
  No IPR issues

Projects
  Revolutionary Drama (1789-1800)
    339 plays
  Empire Period Drama (1801-1815)
    123 plays
  JISC Digitisation Programme: Enriching Digital Resources
    Exposing Marandet
    1500 plays/75,000 pages

Objectives
  Cross-searching
  Full-text searching
  Integration with existing & future systems
    Millennium
    Web
    Vertical search solution

Options
  Existing solutions
    Millennium
    In-house web publishing tool
  Separate product
    Digital collection management software
    CONTENTdm
  Solution would drive approach taken

Digital production
  Image files
    TIFF & JPEG derivative
    Full colour & greyscale
    Outsourced
  Text files/full-text transcripts
    OCR quality initially not acceptable
    Re-keying
    Outsourced

Media Management
  Tried & tested solution
  Quick & easy
  Link digital content
  D2D process simplified
  Existing bibs
  New bibs
  Use existing authentication if required

Media Management
  No full-text searching
  No cross-collection searching (unless in separate scope)
  Tied to MARC metadata
  Metadata enrichment difficult
  Image file format
  Not a total solution

CONTENTdm
  Full-text & cross-collection searching
  Not tied to MARC metadata
  Metadata enrichment simple
  Local Windows server
  Initial licence <50K images
  Upgraded to unlimited licence 2008

Local metadata context
  Separate bibs
  Print vs electronic
  Describes what is
  Supports better (future) FRBRisation
  Ease of maintenance
  Location & format based scoping
  793 for local added entry/uniform title
  Collection name

Metadata option 1
  Create metadata within CONTENTdm
  Play-by-play
  Metadata already present in Millennium

Metadata option 1
  Assumes that metadata is already available
  Not scalable
  Poor use of resources
  Does not allow data to work harder or smarter

Metadata option 2
  Create metadata outside of Millennium
  Metadata not already present in Millennium
  Play-by-play
  Harvest from CONTENTdm into Millennium via XML Harvester

XML Harvester
  Single configuration file
  Needs to be edited for each separate resource
  Uses XSLT not load table(s)
  Major changes (e.g. harvest different schema) may need to be done by III

Configuration file triggers
  @XML_TYPE=DC (or MARCXML)
  @OAI_FORMAT=oai_dc
  @DBNAME=[Repository name]
  @URL=[url for OAI-PMH]
  @USEOAI=true (or false)
  @OAISET=[Name of set]
  @RECID_MARCTAG=001

XML Harvester

Harvested metadata
  Loaded through Data Exchange
  Significant re-editing
    Tags & indicators
    Diacritics
    Creating attached items or holdings records
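
The OAI-related configuration triggers listed above (@URL, @OAI_FORMAT, @OAISET, @USEOAI) line up with the arguments of a standard OAI-PMH ListRecords request. The sketch below is an illustration only: it assumes the triggers sit one per line in the form @KEY=value and that the harvest is an ordinary ListRecords call. How the XML Harvester itself reads the file is not described in the slides. The repository URL, database name, and set name are hypothetical placeholders.

```python
from urllib.parse import urlencode


def parse_triggers(text):
    """Read '@KEY=value' lines into a dict (one trigger per line is an assumption)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("@") and "=" in line:
            key, value = line[1:].split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def list_records_url(settings):
    """Build a standard OAI-PMH ListRecords URL from the parsed triggers."""
    if settings.get("USEOAI", "").lower() != "true":
        raise ValueError("USEOAI is not true; no OAI-PMH request to build")
    params = {"verb": "ListRecords", "metadataPrefix": settings["OAI_FORMAT"]}
    if settings.get("OAISET"):
        params["set"] = settings["OAISET"]  # optional selective harvest
    return settings["URL"] + "?" + urlencode(params)


# Hypothetical values; only the trigger names come from the slide.
EXAMPLE = """\
@XML_TYPE=DC
@OAI_FORMAT=oai_dc
@DBNAME=Example repository
@URL=https://cdm.example.org/oai/oai.php
@USEOAI=true
@OAISET=example_set
@RECID_MARCTAG=001
"""

if __name__ == "__main__":
    print(list_records_url(parse_triggers(EXAMPLE)))
```

Opening the resulting URL in a browser is a quick way to check what a repository exposes for a given set before editing the configuration file for a new resource.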

Harvested metadata

Metadata option 3
  Batchload into CONTENTdm via delimited file from Create Lists
  Cross-walk MARC21 to DC
  Directory structure

MARC to Simple DC crosswalk
  260|c      dc:date
  Record#    dc:identifier
  008/07-10  300  dc:format  dc:language
  100        dc:creator
  700|t      dc:relation
  793        dc:source

MARC DC Crosswalk
  Additional DC elements
    dc:rights
    dc:type
  Transcript mapped to dc:description
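
A crosswalk of this kind can be sketched as a lookup table applied to each exported record, with the result written out as a tab-delimited file for batch loading. This is a minimal sketch, not the workflow actually used: it assumes each exported record is available as a dict keyed by MARC source, it uses only the pairings that read unambiguously on the slide (008/07-10 and dc:language are left out because their pairing is not clear), and the sample record and the dc:rights and dc:type values are invented placeholders.

```python
import csv
import io

# MARC source -> Simple DC element (only the unambiguous pairings from the slide).
CROSSWALK = [
    ("260|c", "dc:date"),
    ("Record#", "dc:identifier"),
    ("300", "dc:format"),
    ("100", "dc:creator"),
    ("700|t", "dc:relation"),
    ("793", "dc:source"),
]

# Additional DC elements added to every row (placeholder values, assumptions).
ADDITIONAL = {"dc:rights": "[rights statement]", "dc:type": "Text"}


def crosswalk(record):
    """Map one exported record (dict keyed by MARC source) to DC elements."""
    row = {dc: record.get(marc, "") for marc, dc in CROSSWALK}
    row.update(ADDITIONAL)
    return row


def to_delimited(records):
    """Return the crosswalked records as tab-delimited text with a header row."""
    columns = [dc for _, dc in CROSSWALK] + list(ADDITIONAL)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(crosswalk(record))
    return out.getvalue()


# Invented sample record for illustration only.
SAMPLE = [{
    "260|c": "1790",
    "Record#": "b1234567",
    "300": "48 p.",
    "100": "Example, Author",
    "793": "Example collection name",
}]

if __name__ == "__main__":
    print(to_delimited(SAMPLE))
```

Keeping the mapping in one table means a change to the crosswalk is a one-line edit rather than a change to the export logic.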

Metadata workflow
  Create separate bibs for e-versions
  Export print records via Data Exchange
  MarcEdit to remove extraneous tags (907, etc.)
  Insert 006, 007, 008/23, GMD, 533
  Re-import into Millennium as new bibs [856 CONTENTdm reference url added]

Metadata workflow
  Review file of newly loaded bibs exported from Create Lists
  Cross-walked from MARC to DC
  Additional DC elements added
  Item level metadata added
  Loaded to CDM as delimited files with directory structure
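
The first part of the workflow, deriving an e-version bib from a print record, can be illustrated with a very small model. This is a sketch under stated assumptions, not a substitute for MarcEdit: a record is modelled as a list of (tag, value) pairs, only tag 907 is stripped because it is the only tag the slides name, and the 006, 007, 008/23, GMD and 533 insertions are omitted because their values are not given. The reference URL and sample fields are hypothetical.

```python
# Tags to strip from the exported print record; the slides name 907 "etc.",
# so extend this set locally as needed.
LOCAL_TAGS = {"907"}


def make_e_version(print_fields, reference_url):
    """Copy a print record, drop local tags, and add an 856 for the digital object."""
    fields = [(tag, value) for tag, value in print_fields if tag not in LOCAL_TAGS]
    fields.append(("856", "|u" + reference_url))  # |u = URL subfield
    return sorted(fields, key=lambda field: field[0])  # keep tag order


if __name__ == "__main__":
    # Invented sample record for illustration only.
    source = [
        ("245", "Example title"),
        ("907", ".b12345678"),
        ("100", "Example, Author"),
    ]
    for tag, value in make_e_version(source, "https://cdm.example.org/item/1"):
        print(tag, value)
```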

Metadata in CONTENTdm
  Compound objects
    Document level
    Page level
      Less rich than document level
  Hospitable to multiple schemas
  Deliberate attempt to stay close to DC
  Administrative metadata
    Later feature

Document level
  AACR in DC wrapper
  All descriptive metadata from bib (except LDR, 006, 007, 008, GMD)
  Authority control (names, subjects, uniform titles)
  Rights (dc:rights)
  Identifier (.b number)
  Mapped to DC for OAI harvesting

Page level
  Basic descriptive metadata (creator, title, publisher, date)
  Rights (dc:rights)
  Identifier (.b number)
  Transcript (dc:description)
  No OAI harvesting at page level
  Local decision

Access & availability
  Availability across local global continuum
  Metadata contribution
  Collection level descriptions
  OAI
  Collapse D2D

Metadata in WorldCat
  Local CDM server not able to use Connexion Digital Import
  Bug between WorldCat and CDM for compound objects
  FRBRized display in worldcat.org potentially impedes discovery

Now
  Exposing Marandet completes 9/2009
  Established service
  4 collections
    Ancien Régime Drama
    Revolutionary Drama
    Empire Period Drama
    Restoration Drama