Overview

This wiki page documents the OAI-PMH configurations created for the Blanton Museum of Art, Harry Ransom Center, and UT Libraries Collections Portal as part of the Art and Cultural Heritage Collective (ACHC) grant funded by the Mellon Foundation. It covers Primo's OAI-PMH service, the Dublin Core mappings, and background on how those mappings were tested and developed. Information may change as the grant project progresses.


Tip

For more information about OAI-PMH in general, as well as its specific implementation in content management systems used across UT, see the resources linked below in this section.

These mappings can be adapted or used as a guide by repositories within UT Libraries that wish to contribute records to Primo. If interested in doing so, please contact Devon Murphy at devon.murphy@austin.utexas.edu.

  • OAI-PMH Documentation: https://www.openarchives.org/OAI/openarchivesprotocol.html
  • OAI-PMH Implementation Guidelines: https://www.openarchives.org/OAI/2.0/guidelines-repository.htm
  • Alma/Primo OAI-PMH Documentation: https://knowledge.exlibrisgroup.com/Alma/Product_Documentation/010Alma_Online_Help_(English)/090Integrations_with_External_Systems/030Resource_Management/060Setting_Up_OAI_Integration
  • Islandora 8 OAI-PMH Documentation: https://islandora.github.io/documentation/user-documentation/metadata_harvesting/#oai-pmh
  • eMuseum OAI-PMH Documentation: https://help.gallerysystems.com/emuseum/6.4/usage/configuration/data-services#id-.DataServicesv6.0-OpenArchivesInitiativeProtocolforMetadataHarvesting(OAI-PMH)



OAI-PMH Basics


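OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) defines six request types, or "verbs" (Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, and GetRecord), that are sent as HTTP requests, for example from a web browser, to obtain metadata records. Institutions acting as "service providers" can aggregate metadata from other sites, or "data providers," as long as those sites maintain an OAI-PMH service, as seen in the graphic below.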

Figure: OAI-PMH Overview
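
As a rough illustration of how the six verbs are invoked, the sketch below builds request URLs of the kind that can be pasted into a web browser. The base URL `https://example.org/oai` is a placeholder and `build_oai_request` is a hypothetical helper name; neither refers to an endpoint or tool used by the grant partners.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)


def build_oai_request(base_url, verb, **params):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"Unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    # Skip arguments that were not supplied.
    query.update({key: value for key, value in params.items() if value is not None})
    return f"{base_url}?{urlencode(query)}"


# Placeholder endpoint (an assumption, not a real partner feed).
example_url = build_oai_request(
    "https://example.org/oai", "ListRecords", metadataPrefix="oai_dc"
)
print(example_url)
```

A request for a single record would use `GetRecord` with `identifier` and `metadataPrefix` arguments in the same way.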


Primo has an OAI-PMH module which allows it to accept various XML formats. Basic information about the data provider (name, source data type, metadata schema) and a stable link to its OAI-PMH endpoint are necessary for a harvest to be created. More information about Primo's OAI-PMH service can be found in the presentation slides below.

Presentation slides: https://docs.google.com/presentation/d/1gEd6kADj0rGiFZNI05wkt4_CQ2SjvGA9lwevnIscDIM/edit?usp=sharing
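
To give a sense of what a harvester reads from an endpoint, the sketch below parses a small, made-up `ListRecords` response in the `oai_dc` format. It is a generic illustration using Python's standard library, not a description of how Primo's module works internally; the sample record and the function name `parse_list_records` are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH 2.0 responses carrying simple Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response for illustration only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample photograph</dc:title>
          <dc:type>Image</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall("oai:ListRecords/oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        fields = {}
        # Collect every Dublin Core element, keeping repeated fields as lists.
        for element in record.findall("oai:metadata/oai_dc:dc/*", NS):
            name = element.tag.split("}", 1)[-1]
            fields.setdefault(name, []).append((element.text or "").strip())
        records.append({"identifier": identifier, "dc": fields})
    token = root.findtext(
        "oai:ListRecords/oai:resumptionToken", default="", namespaces=NS
    )
    # An empty or missing token means there are no further pages.
    return records, (token.strip() or None)


records, token = parse_list_records(SAMPLE_RESPONSE)
```

When a resumption token is returned, a harvester repeats the `ListRecords` request with that token to fetch the next page of records.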










Current OAI-PMH Mappings


The spreadsheet below provides an overview of the main Dublin Core mapping and the three specific mappings for each grant partner. Use the tabs underneath the spreadsheet to change between the mapping views. 

OAI-PMH mappings for all three grant partners: https://docs.google.com/spreadsheets/d/e/2PACX-1vSJ3r2ziDIGsriHk7B2abfDl62fPROKQ31mt4-DlDMowXSR6L3taOLN7gy7lS2ZkWyeT6GYjCyUsOaN/pubhtml



As DPLA requires Title, Rights, and Identifiers (as well as the name of the repository), these fields are also required in our OAI-PMH service. Only one partner, the Harry Ransom Center, uses Qualified Dublin Core.
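
A data provider could check for the required fields before a harvest is set up. The sketch below is one possible check, assuming each record is held as a dictionary that maps Dublin Core element names to lists of values; that record shape and the name `missing_required_fields` are assumptions for illustration. The repository name is not checked here, on the assumption that it is supplied with the harvest configuration rather than in each record.

```python
# Dublin Core elements required per the text above.
REQUIRED_FIELDS = ("title", "rights", "identifier")


def missing_required_fields(record):
    """Return the required fields that are absent or contain only blank values."""
    missing = []
    for field in REQUIRED_FIELDS:
        values = record.get(field, [])
        if not any(value.strip() for value in values):
            missing.append(field)
    return missing


# Hypothetical record with no rights statement.
sample_record = {
    "title": ["Sample photograph"],
    "identifier": ["oai:example.org:1"],
    "type": ["Image"],
}
print(missing_required_fields(sample_record))
```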




Development


These mappings are partially based on DPLA (Digital Public Library of America) and TexHub harvesting requirements and on metadata fields shared among the three grant partners. Collections from other proposed institutions, such as the Visual Resources Center collection, were also analyzed. This method allowed for record parity among three diverse collections and prepared the collections for possible aggregation into DPLA in the future.

Mappings were tested within the Primo Sandbox environment, following the Sandbox refresh schedule. Challenges were presented by data modeling in CONTENTdm, reliance on vendors to implement OAI-PMH services and settings, and by Primo's resource types. In further detail:


  • CONTENTdm allows users to ingest multivalued fields into single cells. While CONTENTdm has a method to split these values for display, that method does not apply to OAI-PMH harvesting, which retrieves the stored value unsplit. This becomes an issue for fields with variable lengths, as Primo has limited regular expression support.
  • Primo can split fields, but it does not have extensive support for more complex regular expressions. Multivalued fields with variable length or formatting cannot be split into other fields successfully.
  • Some partners used in-house content management systems maintained by their IT departments, while others relied on vendor-supplied IT support. The latter could be challenging in some cases. For example, Gallery Systems provides only limited choices for users setting up their OAI-PMH feeds, and any change to fields or mappings has to be handled by the vendor. Without control over this process, wait times and communication were often difficult to navigate.
  • Primo can sort and facet records based on type (image, text, audio, etc.). It looks for these values in the dc:type field. If the values do not match the expected values in its Resource Type mapping, items will be mapped as "Other." This became an issue because some partners did not use DCMI terms or had multivalued fields that contained both DCMI and non-DCMI terms.
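
The multivalued-field problem described in the list above can be illustrated with a short remediation sketch that a data provider might run on exported metadata before harvest. The semicolon delimiter is an assumption for the example, and `split_multivalued` is a hypothetical helper; the actual delimiter depends on how each collection was modeled.

```python
def split_multivalued(value, delimiter=";"):
    """Split a combined cell into separate values, dropping blanks and padding."""
    parts = [part.strip() for part in value.split(delimiter)]
    return [part for part in parts if part]


# A combined cell as it might arrive in a harvested field (made-up data).
combined = "Image; Photographs;  Text ;"
print(split_multivalued(combined))
```

Splitting at the source in this way means each value arrives as its own field, so the harvest does not depend on regular expressions in Primo.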


Takeaways

  • Encourage data providers to conduct metadata remediation before harvest implementation. After review of the harvested records, metadata remediation or remapping is recommended if errors persist. Primo has limited ability to edit fields.
  • Encourage data providers to investigate how their metadata fields are modeled (data value type, format, etc.). This modeling could have unexpected effects on harvest and display.
  • Migrations or changes in metadata practice on the data provider’s end can affect efficacy or display of their harvest.
  • How often records are updated by the service provider varies. Primo updates frequently, but not all recent changes will be immediately present.
  • Combined fields can cause problems for record display and indexing.
  • Not all fields are displayed or harvested. Aggregated records are not meant to replace the original record, but instead serve as a portal to the original source. Focus on shared fields across contributing institutions.
  • Primo's type facet uses DCMI terms for its internal mapping; if data providers want their resources to appear in this facet, their dc:type field should use DCMI terms. If not, their records will be assigned the "Other" label.
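
The effect of the last takeaway can be approximated with the sketch below, which checks dc:type values against the twelve terms of the DCMI Type Vocabulary and falls back to "Other". It only mimics the behavior described on this page and is not Primo's actual Resource Type mapping; the case-insensitive matching and the name `resolve_resource_type` are assumptions for illustration.

```python
# The twelve terms of the DCMI Type Vocabulary.
DCMI_TYPES = (
    "Collection",
    "Dataset",
    "Event",
    "Image",
    "InteractiveResource",
    "MovingImage",
    "PhysicalObject",
    "Service",
    "Software",
    "Sound",
    "StillImage",
    "Text",
)

# Case-insensitive lookup table (an assumption made for this sketch).
_DCMI_LOOKUP = {term.lower(): term for term in DCMI_TYPES}


def resolve_resource_type(dc_type_values):
    """Return the first DCMI term found among the values, else "Other"."""
    for value in dc_type_values:
        term = _DCMI_LOOKUP.get(value.strip().lower())
        if term:
            return term
    return "Other"


print(resolve_resource_type(["photograph", "StillImage"]))
print(resolve_resource_type(["Photographs; Image"]))
```

The second call shows why combined fields are a problem: a single cell holding "Photographs; Image" does not match any DCMI term as a whole, so the record falls into "Other" even though one of its values is valid.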