Glossary of terms
Missing entry?
If you come across a term or concept in the DAMS user manual that is not listed here, please contact the DAMS managers (click here to submit a DAMS service request).
Advanced search
Search function that allows users to select specific metadata fields in the Solr index for a search and to combine several search queries. With the advanced search, complex search queries can be constructed to limit the search space. In comparison, a basic search looks for a given term across a predefined set of Solr fields.
Aggregator
In the context of digital content repositories like the DAMS, an aggregator is an organization that collects metadata (sometimes also thumbnail images or other derivatives) from different sources and combines them into a common data set. The aggregated data can be made available through search and discovery portals. The Digital Public Library of America (DPLA) is an example of a metadata aggregator that combines and showcases metadata for digitized cultural heritage objects from across the USA.
Aggregators like the DPLA are often said to 'harvest' metadata from participating institutions. It is common for cultural heritage institutions to make their metadata available through a public (web) interface and to notify the aggregator about how to access it. One of the de-facto standards for harvesting metadata through public interfaces is the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The protocol specifies how the software of an aggregator should 'talk' to the repository of a cultural heritage institution, in order to get new and updated metadata records. As a minimum requirement, OAI-PMH mandates that repositories send descriptive metadata about assets in Dublin Core format, although a repository owner may also provide metadata in additional formats.
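As a rough illustration of how a harvester 'talks' to a repository, the following Python sketch builds an OAI-PMH ListRecords request for Dublin Core records. The verb and the metadataPrefix, from and set parameters come from the OAI-PMH specification; the base URL and the helper name list_records_url are placeholders and do not refer to an actual DAMS endpoint.

```python
from urllib.parse import urlencode

# Hypothetical endpoint (assumption); replace with a repository's real OAI-PMH base URL.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    oai_dc is the Dublin Core format that every OAI-PMH repository must support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Selective harvesting: only records created or changed since this date.
        params["from"] = from_date
    if set_spec:
        # Optional restriction to one set, if the repository defines sets.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


url = list_records_url(BASE_URL, from_date="2019-12-01")
print(url)
```

An aggregator would fetch this URL, parse the XML response, and repeat the request with the returned resumption token until all records have been delivered.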
Asset
see Digital Asset
BagIt
BagIt, also known as the BagIt File Packaging Format, is a specification for organising files for storage and transfer. The LoC describes it as follows: "BagIt provides a directory structure and a specifies a set of files for transferring and storing files that includes clear delineations between the digital content itself (stored in a subdirectory called “data”) and the metadata quantifying it, including a manifest of filenames and checksum values (called a manifest). It also allows for optional basic descriptive elements that are stored within the bag (in a file called bag-info.txt) to provide recipients or custodians of the content with enough information to identify the provenance, contact information, and context for the file delivery or storage package." (https://blogs.loc.gov/thesignal/2019/04/bagit-at-the-library-of-congress/)
The format specification was updated in October 2018 (IETF RFC 8493), although BagIt has been in existence for much longer.
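A minimal sketch of the bag layout described above, using only the Python standard library. It writes the bag declaration (bagit.txt), a payload directory (data/) and a SHA-256 payload manifest, then re-checks the checksums. The payload file name and the helper names make_bag and verify_bag are illustrative assumptions; bags produced by the DAMS may contain further tag files such as bag-info.txt.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir, files):
    """Write a minimal bag: bagit.txt, a data/ payload and manifest-sha256.txt."""
    bag = Path(bag_dir)
    data = bag / "data"
    data.mkdir(parents=True, exist_ok=True)
    # Bag declaration as required by RFC 8493.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8"
    )
    lines = []
    for name, content in sorted(files.items()):
        (data / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        # One manifest line per payload file: checksum, whitespace, relative path.
        lines.append(f"{digest}  data/{name}")
    (bag / "manifest-sha256.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return bag


def verify_bag(bag_dir):
    """Return True if every payload file still matches its manifest checksum."""
    bag = Path(bag_dir)
    manifest = (bag / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        if hashlib.sha256((bag / rel_path).read_bytes()).hexdigest() != digest:
            return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    bag = make_bag(Path(tmp) / "example_bag", {"page1.txt": b"hello"})
    ok_before = verify_bag(bag)
    (bag / "data" / "page1.txt").write_bytes(b"tampered")  # simulate corruption
    ok_after = verify_bag(bag)
```

The manifest is what lets a recipient detect that a file was altered or damaged in transfer: after the payload file is overwritten, the checksum comparison fails.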
Basic search
Search function that allows users to search for a keyword in a predefined set of metadata fields available in the Solr index (title, subject, description, contributor, date, format, rights, full text). In the current DAMS implementation, a collection namespace can be selected to limit the search space. In comparison, the advanced search allows users to select individual metadata fields in the Solr index for a search.
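The difference between the two search modes can be sketched as two ways of assembling a Solr query string. This is only an illustration in standard Solr/Lucene query syntax (field:"term", combined with AND/OR); the field names below are placeholders derived from the labels above and are an assumption, since the actual field names in the DAMS Solr index are not documented here.

```python
# Placeholder field names (assumption); the real DAMS Solr fields are named differently.
BASIC_FIELDS = ["title", "subject", "description", "contributor",
                "date", "format", "rights", "fulltext"]


def quote(term):
    """Wrap a term in double quotes, escaping backslashes and embedded quotes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def basic_query(term):
    """Basic search: one term, matched against every predefined field."""
    return " OR ".join(f"{field}:{quote(term)}" for field in BASIC_FIELDS)


def advanced_query(criteria, operator="AND"):
    """Advanced search: each term is tied to a field chosen by the user."""
    return f" {operator} ".join(f"{field}:{quote(term)}" for field, term in criteria)


print(basic_query("Austin"))
print(advanced_query([("title", "Austin"), ("date", "1920")]))
```

The basic query casts a wide net over all predefined fields, while the advanced query narrows the search space by requiring each term to match in a specific field.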
Batch ingest
Variant of ingest.
Method for uploading and processing multiple assets in bulk, rather than one asset at a time. More time-efficient than individual ingest; suited particularly well for larger digital collections. Different variants exist, including batch ingest of digital image files and the accompanying metadata. Files are uploaded to an SFTP server and then a batch ingest job is scheduled from the DAMS user interface. The DAMS will check every ten minutes for new batch jobs to process. Upon scheduling a batch job via the DAMS user interface, the system will generate a unique batch ID, which is sent out by email to the user requesting the batch job.
Batch ID
see Batch ingest
Collection
Natural or artificial grouping of assets. In the context of the DAMS, collections are sets of digital assets created for curation or data management purposes, similar to folders in a file system. Digital collections in the DAMS need not coincide with natural or artificial collection contexts outside of the DAMS.
Collections can be nested to form sub-collections. The highest level of collections in the DAMS is called a top-level collection or namespace collection. Sub-collections and digital assets underneath a namespace collection inherit the collection namespace, e.g. utlblac, utlarch, utlmaps. The structure of sub-collections need not coincide with organizational hierarchies or the structure of physical collections.
The grouping of assets into DAMS Collections is NOT reflected on the Collections portal. An individual asset's MODS metadata should contain the name of a Source Collection, which is used to create a Source Collection facet in the Collections portal.
Collections and sub-collections in the DAMS have their own metadata, to provide a description of assets at the collection level. However, the collection-level metadata is not used for display on the Collections portal.
A digital asset in the DAMS can be a member of one single Collection. After ingest, digital assets cannot be moved between DAMS Collections. After digital assets have been ingested into a collection, the collection name must not be changed.
Collection contributor
also Collection user;
a DAMS user role. Collection users are assigned limited user permissions to edit metadata of individual assets in a sub-collection. See DAMS Policy for details.
Collection owner
A DAMS user role. Collection owners have curatorial responsibility, typically for sub-collections in the DAMS. See DAMS Policy for details.
Collection supervisor
A DAMS user role.
Collection supervisors have curatorial and organizational responsibility for top-level collections in the DAMS. They are organizationally responsible, for instance, for determining the structure and name of sub-collections inside a top-level collection and for granting editing access to collections and sub-collections in the DAMS. See DAMS Policy for details.
Collection user
See Collection contributor.
Complex asset
Digital asset in the DAMS that has constituent parts with separate PIDs, e.g. a serial which has issues, or a book containing pages; also referred to as paged content. In comparison, a simple asset does not contain constituent parts.
Content model
In order to appropriately store, manage and disseminate digitized physical assets, the DAMS uses content models for different classes of media, for instance image content, multi-page image content (books and serials issues), audio and video content. Depending on the content model, the DAMS may require different metadata and will show different web forms for entering and editing metadata through the DAMS user interface. Certain types of automatic processes upon ingest of a digital asset into the DAMS depend on the content model, e.g. the creation of derivative media files or full text.
DACS
see Describing Archives: A Content Standard
DAMS Administrator
A DAMS user role.
DAMS administrators are usually members of the Libraries' IT staff. They develop and maintain DAMS functionality and provide support for fixing software bugs. See DAMS Policy for details.
DAMS Manager
A DAMS user role.
DAMS managers manage user accounts, monitor operational processes, track projects and provide consultation for ingest and management of assets in the DAMS. See DAMS Policy for details.
Datastream
Part of a digital asset. In the DAMS, assets logically bundle different kinds of digital data, e.g. an image file, a metadata file, and derivative images like a thumbnail image version. Each part of the bundle is referred to as a datastream in Fedora Commons terminology. Digital assets in the DAMS have a number of default datastreams, for instance a MODS datastream, a Dublin Core datastream and a RELS-EXT datastream.
DC
see Dublin Core
Describing Archives: A Content Standard
A set of rules for describing archives, personal papers, and manuscript collections, maintained by the Society of American Archivists. See https://www2.archivists.org/standards/describing-archives-a-content-standard-dacs-second-edition.
Derivative
File generated (derived) from the primary digital file stored in an asset, usually for presentation purposes. Typically, derivatives are alternative media representations in a different quality and/or different file format (e.g. an MP3 file as a derivative representation of an audio file, or a small thumbnail image for display in search results).
Digital asset
A resource in the DAMS that logically bundles a data file (the OBJ datastream) with metadata datastreams and additional datastreams. Digital assets have a unique identifier (PID). A digital asset can also be the parent or the child in a parent-child relationship. Digital assets that have child assets are referred to as complex assets.
Digital object
In the context of the DAMS, a digital object is part of a digital asset, specifically the component file with the datastream ID "OBJ".
The DAMS interface is unfortunately not consistent in its use of the word "Object". In some contexts, the interface calls only the OBJ datastream "Object"; in others, "Object" refers to the OBJ datastream plus additional datastreams, including metadata, which is closer to the definition of Digital asset given in this documentation.
Distribution Bag
A Bag file containing a single asset or a collection of assets, generated for delivering assets to patrons. Distribution bags can be created on demand from the DAMS user interface.
Dublin Core
A set of metadata categories or vocabulary items for describing resources, based on the specifications developed by the Dublin Core Metadata Initiative (DCMI). The DAMS automatically creates Dublin Core metadata from the MODS metadata entered upon ingest of a digital asset.
Fedora Commons
Fedora Commons (short for Flexible Extensible Digital Object Repository Architecture) is an open source software for storing, managing and accessing digital content. It is part of the Islandora software stack used for the DAMS.
File Information Tool Set
also FITS;
a set of software tools to identify and validate file types and to extract Technical metadata about a file. The FITS software itself consists of a wrapper application that calls other tools to deal with specific file types and tasks in identifying files and extracting metadata. Upon ingest of a primary file, the DAMS processes the file with FITS and stores the result in the TECHMD datastream for an asset. This metadata is mainly used for quality control and digital preservation. See https://projects.iq.harvard.edu/fits/home for more information.
File Transfer Protocol
(also: FileZilla, FTP, JSCAPE, SFTP)
A network protocol for transferring files between computers in a network. In the context of the DAMS, a more secure variant of the File Transfer Protocol is used, called SFTP. Staged files for batch ingest into the DAMS can be transferred using an (S)FTP client like FileZilla, which connects to JSCAPE (a file storage server that permits connections via SFTP).
FileZilla
A client (software) to connect to a remote computer ("server") via the File Transfer Protocol.
FITS
see File Information Tool Set
FTP
see File Transfer Protocol
Ingest
The process of adding a new Digital asset and/or metadata and/or Derivatives to the DAMS. Ingest can happen manually, asset by asset, through the DAMS user interface, or as a Batch ingest.
Islandora
A combination of software components that together provide the functionality of the DAMS for storing, describing, and managing digital assets. Islandora is developed by an international consortium of institutions from the cultural heritage field. See the Islandora website for more information.
As of December 2019, the DAMS uses a version of Islandora from the 7.x version branch, supplemented by custom software components built by UT Libraries IT developers. A newer version of Islandora is currently under development (Islandora 8).
JSCAPE
A file transfer server to which files can be uploaded for Batch ingest into the DAMS. JSCAPE provides a web interface for uploading files through a web browser. Alternatively, files can be uploaded with (S)FTP client software like FileZilla, which can connect to JSCAPE using a secure variant of the File Transfer Protocol (SFTP). Uploading files via SFTP is recommended if the files staged for ingest are organised in folders, as the JSCAPE web interface does not support uploading entire folders.
Manifest file
A manifest file is a file containing metadata for a group of accompanying files that form part of a set or coherent unit (from Wikipedia). In the context of the DAMS, manifest files are required for Batch ingest of Digital assets. The manifest files tell the DAMS which files should be added to a Digital asset, and which Datastream should be created from a particular file.
Metadata Object Description Schema
Metadata Object Description Schema (MODS) is a schema for a bibliographic element set that may be used for a variety of purposes, and particularly for library applications. As an XML schema it is intended to be able to carry selected data from existing MARC 21 records as well as to enable the creation of original resource description records. It includes a subset of MARC fields and uses language-based tags rather than numeric ones, in some cases regrouping elements from the MARC 21 bibliographic format. See https://www.loc.gov/standards/mods/mods-overview.html for more information.
MODS
see Metadata Object Description Schema
Namespace
PIDs (persistent identifiers) assigned to digital assets in the DAMS are grouped into Namespaces by using a common namespace prefix for all digital assets in the same group. The namespaces used in the DAMS indicate the top-level collection a digital asset belongs to. PIDs in the DAMS have the form namespace-prefix:UUID, e.g. as in utlgs:cdf95dff-07dd-4def-b59e-bd8c61b02d32. The following namespace prefixes are currently available:
- utlarch (Architectural Digital Collections)
- utlfal (Fine Arts Collections)
- utlgeol (Geology Collections)
- utlblac (LLILAS Benson Digital Collections)
- utlmisc (Miscellaneous Collections)
- utlmaps (PCL Map Collection)
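The PID form namespace-prefix:UUID described above can be checked mechanically. The following Python sketch splits a PID at the colon and verifies that the second part is a canonical version 4 UUID; the function name parse_pid is illustrative and not part of the DAMS software, and the prefix is deliberately not validated against a fixed list.

```python
import uuid


def parse_pid(pid):
    """Split a PID of the form prefix:UUID and validate the UUID part.

    Returns a (prefix, uuid.UUID) tuple; raises ValueError for malformed input.
    """
    prefix, sep, suffix = pid.partition(":")
    if not sep or not prefix:
        raise ValueError(f"not of the form prefix:UUID: {pid!r}")
    parsed = uuid.UUID(suffix)  # raises ValueError if the suffix is not a UUID
    # uuid.UUID also accepts variants without hyphens or with braces,
    # so require the canonical hyphenated spelling explicitly.
    if str(parsed) != suffix.lower():
        raise ValueError(f"UUID is not in canonical form: {suffix!r}")
    if parsed.version != 4:
        raise ValueError(f"expected a version 4 UUID, got version {parsed.version}")
    return prefix, parsed


prefix, asset_uuid = parse_pid("utlgs:cdf95dff-07dd-4def-b59e-bd8c61b02d32")
print(prefix, asset_uuid.version)
```

A check like this can be useful when PIDs are copied by hand into spreadsheets or manifest files, where truncated or mistyped identifiers are easy to miss.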
Object
see Digital Object
OCLC Number
OCLC Source
OCLC number of the catalog record describing the original version (e.g. print/physical copy) of an information resource.
OCLC Surrogate
OCLC number of the catalog record describing the digitized (derivative) version of an information resource.
OCR
see Optical Character Recognition
Optical Character Recognition
(also: OCR)
The use of software or other technology to recognize text, handwritten or typed, in digital images. The DAMS uses the open-source software Tesseract to perform Optical Character Recognition on digital images ingested into the system. Another common software tool used in cultural heritage contexts is ABBYY FineReader.
The result of Optical Character Recognition is typically an XML file, for instance in hOCR or ALTO format, or a plain text file.
Paged content
In the context of the DAMS, Paged content refers to Digital assets which consist of multiple image files representing the individual pages of, for instance, a book. In a broader sense, Paged content also refers to the organisation of publication volumes/issues into, for instance, a series or a journal. Paged content is a typical use case for Complex assets: each page of a publication issue is stored as an individual Digital asset inside the DAMS, but also contains a parent-child link to a Digital asset representing the issue. Publication issues in turn can contain a parent-child link to a Digital asset representing a series or a journal.
Persistent Identifier
also PID; generally a long-lasting reference to a (typically digital) object. In the broader cultural heritage domain, persistent identifier usually refers to a long-lasting reference to a digital resource that is accessible over the internet, cf. https://en.wikipedia.org/wiki/Persistent_identifier. The term and especially its abbreviation PID is used in a more specific sense in the Fedora/Islandora community and also in the context of the DAMS.
In a more specific sense, used within the context of the DAMS, a PID is a string of characters that identifies a Digital asset or a Collection within the UT Libraries DAMS. PIDs are unique within the DAMS. Syntactically, they begin with a Namespace prefix (utlibraries, utlarch, utlfal, utlgeol, utlgs, utlblac, utlmisc, utlmaps), followed by a colon and a randomly generated Universally unique identifier (UUID v.4). In the DAMS, PIDs are automatically assigned at the ingest of an asset or during the creation of a collection. They cannot be changed and are lost when an asset or a collection is purged.
Persistent URL
A Persistent URL (PURL) is an address on the world wide web that causes a redirection to another web resource. If a web resource changes location (and hence its URL), a PURL pointing to it can be updated. A user of a PURL always uses the same web address, even though the resource in question may have moved. PURLs are a kind of Persistent Identifier.
Implementations of PURL systems include purl.org (OCLC/Zepheira, now Archive.org), Archival Resource Keys (ARK), Handles and Digital Object Identifiers (DOI). See https://en.wikipedia.org/wiki/Persistent_uniform_resource_locator for more information.
PID
see Persistent Identifier
PREMIS
Standard for preservation metadata for digital objects, maintained by an international editorial commission. See PREMIS website for further information.
Preservation
Combination of policies, strategies, and actions that ensure access to reformatted and born-digital assets over time. Includes Archival Information Package (AIP) specification, tape vaulting, and geo-replication. The DAMS aims to support preservation actions by allowing the creation of Preservation bags that can be fed into a preservation workflow.
Preservation Bag
A Bag file containing a single asset or a collection of assets, generated for feeding assets into a digital preservation workflow. Preservation bags are created by Digital Stewardship staff upon request by Content curators.
Published
Status of a Digital asset indicating availability via the Collections portal for public access.
Purging
Permanent removal of a Digital asset from the DAMS, including primary digital object, derivative files and metadata. Purging is non-reversible. Per DAMS Policy, purging of Digital assets should be avoided, especially if a Digital asset has been published to the Collections portal.
Purging of a Digital asset will also remove the assigned PID, making it in principle possible that a newly ingested asset will receive the same PID as a previously purged asset. In practice, there is only a small statistical chance that a previously used PID will be re-generated.
PURL
see Persistent URL
RDA
see Resource Description and Access
RELS-EXT
System-generated Datastream in Islandora/Fedora that contains administrative information, for instance about the Collection a Digital asset belongs to.
Resource
same as asset
Resource Description and Access
Also RDA; international content standard for creating bibliographical descriptions. Successor for instance to the Anglo-American Cataloguing Rules (AACR2). Maintained by the RDA Steering Committee. See RDA Toolkit website.
SFTP
see File Transfer Protocol
Simple asset
Digital asset in the DAMS that does not have constituent parts/child objects with separate PIDs. In comparison, a complex asset can contain constituent parts which have individual PIDs, e. g. the issues of a journal or the pages of a book.
Simple search
see Basic search
Solr
Apache Solr is an open-source search index and retrieval software. It is one of the components of the Islandora software stack and used to automatically combine the MODS metadata of all Digital assets in the DAMS into one common search index. The DAMS Solr index is also the source for metadata that is published to the Collections portal.
Staging
Organizing the files that constitute a Digital asset, in preparation for ingest into the DAMS. Typically refers to the preparation steps for Batch ingest, i.e. organizing files in subdirectories and creating Manifest files.
Tape vault number
Identifies the storage location of a digital file on tape storage.
TECHMD
Datastream in the DAMS containing FITS metadata.
Tiered ingest
A variant of Batch ingest, for adding Digital assets to the DAMS. With a Tiered ingest, users can ingest primary digital objects alongside derivative or secondary files that were generated outside of the DAMS, e.g. thumbnail images or OCR full text. With the exception of system-generated Datastreams, any file can be ingested together with a primary digital object into one of the Datastreams constituting a Digital asset.
Unpublished
Status of an asset indicating that it is not available for public access through the Collections portal.
URI
see Uniform Resource Identifier
Uniform Resource Identifier
also URI;
a string of characters that unambiguously identifies a particular resource. To guarantee uniformity, all URIs follow a predefined set of syntax rules (from Wikipedia: https://en.wikipedia.org/wiki/Uniform_Resource_Identifier). URIs are used widely on the internet, especially on the World Wide Web (WWW). On the WWW they identify, for instance, web pages or other files. According to the formal definition of the identifier scheme, URIs can identify abstract or physical resources (https://tools.ietf.org/html/rfc3986).
Two common sub-types of URIs are Uniform Resource Locators (URL; "web address") and Uniform Resource Names (URN).
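To illustrate the shared syntax rules, the following Python sketch splits one URL and one URN into their components with the standard library. The URL is the RFC link cited above; the URN is a generic ISBN example chosen for illustration and has no connection to the DAMS.

```python
from urllib.parse import urlsplit

# A URL: the scheme is followed by an authority (host) and a path.
url = urlsplit("https://tools.ietf.org/html/rfc3986")
print(url.scheme, url.netloc, url.path)

# A URN: same overall syntax (scheme, colon, rest), but no authority part.
urn = urlsplit("urn:isbn:0451450523")
print(urn.scheme, repr(urn.netloc), urn.path)
```

Both identifiers start with a scheme followed by a colon; what comes after the colon is interpreted according to the rules of that scheme.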
Universally unique identifier
also UUID;
an alphanumeric string that can be used to identify information in computer systems. In the DAMS, UUIDs are used to create the PID for an asset. UUIDs for DAMS PIDs are generated based on a random process (UUID version 4) and are unique within a Namespace in the DAMS, as long as the respective Digital asset is not Purged. There is a small statistical probability that UUIDs can be duplicated, which would result in the PID of a purged asset being used again, but the probability is close enough to zero to be neglected in practice. See https://en.wikipedia.org/wiki/Universally_unique_identifier#Version_4_(random) for further information.
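The size of that probability can be estimated with the usual birthday approximation. A version 4 UUID contains 122 random bits, so for n identifiers the chance of at least one duplicate is roughly 1 - exp(-n(n-1) / (2 * 2^122)). The sketch below computes this estimate and generates a PID in the prefix:UUID form; the function names are illustrative, and the estimate assumes a well-behaved random number generator.

```python
import math
import uuid

RANDOM_BITS = 122  # a version 4 UUID has 128 bits, 6 of which are fixed


def collision_probability(n):
    """Approximate probability of at least one duplicate among n random UUIDs."""
    space = 2 ** RANDOM_BITS
    # expm1 keeps precision for the extremely small values involved here.
    return -math.expm1(-n * (n - 1) / (2 * space))


def new_pid(prefix):
    """Build a PID of the form prefix:UUID from a freshly generated version 4 UUID."""
    return f"{prefix}:{uuid.uuid4()}"


print(new_pid("utlmisc"))
print(collision_probability(10**9))
```

Even for a billion identifiers, far more than a single repository is likely to hold, the estimated probability stays many orders of magnitude below anything that matters operationally.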
User roles
DAMS users are assigned to one of the user roles specified in the DAMS Operations policy. A user role defines a template for the set of permissions a user has for creating and managing Digital assets in the DAMS. Currently, the following user roles are used for the DAMS:
- View-only user
- Collection contributor
- Collection owner*
- Collection supervisor*
- DAMS manager
- Administrator
*) The two roles Collection owner and Collection supervisor are sometimes collectively referred to as Collection curators.
In the DAMS, user permissions are usually controlled on a Collection level. The permissions are stored in an XACML policy.
UUID
see Universally unique identifier
View-only user
A DAMS user role. The View-only user role is the default role a member of UT Libraries staff can receive upon registering as a DAMS user. The permissions of this user role are limited to browsing, searching and viewing Digital assets in the DAMS. See DAMS Policy for details.
W3CDTF
The Date and Time Formats specification of the W3 Consortium. A concise specification to represent date and time in a machine-readable form, geared towards application on the World Wide Web. See https://www.w3.org/TR/NOTE-datetime for the specification.
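The specification defines six levels of granularity, from a bare year to a full timestamp with fractional seconds; whenever a time is given, a time zone designator (Z or an offset such as +01:00) is mandatory. The following Python sketch checks the syntax of a value against these levels. The function name is_w3cdtf is illustrative, and the check is purely syntactic: it does not verify that the day exists in the given month.

```python
import re

# YYYY, YYYY-MM, YYYY-MM-DD, or a date followed by Thh:mm[:ss[.s]] and a
# mandatory time zone designator (Z, +hh:mm or -hh:mm).
W3CDTF = re.compile(
    r"\d{4}"
    r"(-(0[1-9]|1[0-2])"
    r"(-(0[1-9]|[12]\d|3[01])"
    r"(T([01]\d|2[0-3]):[0-5]\d(:[0-5]\d(\.\d+)?)?(Z|[+-]([01]\d|2[0-3]):[0-5]\d))?"
    r")?)?"
)


def is_w3cdtf(value):
    """Return True if value is syntactically valid W3CDTF."""
    return W3CDTF.fullmatch(value) is not None


for sample in ["1997", "1997-07", "1997-07-16", "1997-07-16T19:20+01:00",
               "1997-07-16T19:20:30.45+01:00", "07/16/1997", "1997-07-16T19:20"]:
    print(sample, is_w3cdtf(sample))
```

The last two samples are rejected: the first uses a slash-separated date, and the second gives a time without a time zone designator.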
XACML policy
A user permission policy encoded in the eXtensible Access Control Markup Language, originally an XML application. User access to Collections and Digital assets in the DAMS is controlled via collection-specific policies, expressed in XACML. Per collection, User roles and lists of users can be granted viewing and managing access to the Digital assets the collection contains.