Abstract
MODS Element name: <physicalDescription><internetMediaType>
Short definition: The electronic file format type/extension of the object.
Input guidelines: Select or enter the value that corresponds to the file type/extension of the object to be ingested, e.g. CR2, GIF, JPEG, PSD, TIFF. The file format in subelement <internetMediaType> is specified according to the template given in IANA's list of Media Types (formerly known as MIME types). Note that the expected values in the DAMS in some cases differ from the IANA specification. The expected values for JPEG and GIF images are "image/jpeg" and "image/gif" respectively, even though these values are not present in the IANA list.
Example: For images in TIF (Tagged Image File) format, the expected value for internetMediaType is "image/tiff".
Definition
The electronic file format type/extension of the object (adapted from MODS standard).
DAMS input form fields
File Format
DAMS form field name | form field type | required | MODS element | Collections Portal display | notes |
---|---|---|---|---|---|
File Format | dropdown selection. Values available for selection are contingent on the broad class of asset being ingested. Note that the file type values listed in the web form for a broad class of asset do not always align with the list of file types supported for a particular Content model in the DAMS. For instance, not all image file types listed on the metadata web form are supported by the LARGE or BASIC IMAGE Content model. Instead, they would have to be ingested using the BINARY Content model. | yes | | Type* | The values (labels) selected for file format are merged with metadata values for Genre and Type of Resource for display on the Collections portal. |
MODS Element description
Element <physicalDescription>
http://www.loc.gov/standards/mods/userguide/physicaldescription.html
Guidelines for use
<physicalDescription> is a container element that contains all subelements relating to physical description information of the resource. Data is input only within each subelement.
Attributes
Currently no attributes for <physicalDescription> are implemented in the DAMS.
Subelements
The following subelements of <physicalDescription> are used in the DAMS:
- internetMediaType
- (extent)
- (form)
Subelement <physicalDescription><internetMediaType>
Guidelines for use
The file format in subelement <internetMediaType> is specified according to the template given in IANA's list of Media Types (formerly known as MIME types): https://www.iana.org/assignments/media-types/media-types.xhtml. Note that the expected values in the DAMS in some cases differ from the IANA specification. The expected values for JPEG and GIF images are "image/jpeg" and "image/gif" respectively, even though these values are not present in the IANA list.
The DAMS software expects one of the following values for file format specification:
File format | internetMediaType value | DAMS manual ingest form label | Available for Content Model |
---|---|---|---|
Canon Raw V2 | image/x-raw | cr2 | Image |
Graphics Interchange Format | image/gif | gif | Image |
JPEG (ISO/IEC 10918-1) | image/jpeg | jpg/jpeg | Image |
Portable Network Graphics | image/png | png | Image |
Photoshop Document | application/photoshop | psd | Image |
Tagged Image File Format | image/tiff | tif/tiff | Image |
* | other | other | Image |
Audio Interchange File Format | audio/aiff | aif/aiff | Audio |
AU Sound file | audio/au | au | Audio |
MPEG-4 Audio file (typically AAC, ALAC) | audio/m4a | m4a | Audio |
MPEG-1/2 Audio Layer III | audio/mpeg | mp3 | Audio |
Waveform Audio File Format | audio/x-wav | wav | Audio |
Windows Media Audio | audio/wma | wma | Audio |
* | other | other | Audio |
Portable Document Format | application/pdf | | Paged Content |
Tagged Image File Format | image/tiff | tif/tiff | Paged Content |
* | other | other | Paged Content |
 | not-applicable | N/A | Paged Content |
Comma-separated values | text/csv | csv | Text |
Office Open XML (WordprocessingML) | application/msword | docx | Text |
EPUB | text/epub | epub | Text |
Portable Document Format | application/pdf | | Text |
Rich Text Format | text/rtf | rtf | Text |
Plain text | text/txt | txt | Text |
Text file with XML content | text/xml | xml | Text |
Office Open XML (SpreadsheetML) | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | xlsx | Text |
Microsoft Excel (BIFF) | application/vnd.ms-excel | xls | Text |
* | other | other | Text |
Audio Video Interleave | video/avi | avi | Video |
Camtasia Studio recording | video/camrec | camrec | Video |
ISO | binary/iso | iso | Video |
M4V File (Apple) | video/m4v | m4v | Video |
QuickTime File Format | video/quicktime | mov | Video |
MPEG-4 Part 14 | video/mp4 | mp4 | Video |
MPEG Transport Stream (container format for different kinds of video format) | video/mts | mts | Video |
Shockwave Flash | application/x-shockwave-flash | swf | Video |
Windows Media Video | video/x-ms-wmv | wmv | Video |
* | other | other | Video |
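As an illustration of how the table above might be applied before ingest, the following minimal sketch looks up the expected internetMediaType value from a file name and a content model. It is not part of the DAMS software: `EXPECTED_MEDIA_TYPES` and `expected_media_type` are hypothetical names, only a subset of the rows is transcribed, and the `pdf` extension is an assumption because the form label for PDF is not given in the table.

```python
from pathlib import PurePath

# (content model, file extension) -> expected internetMediaType value.
# Transcribed from a subset of the table; labels such as "tif/tiff"
# are split into one entry per extension.
EXPECTED_MEDIA_TYPES = {
    ("Image", "cr2"): "image/x-raw",
    ("Image", "gif"): "image/gif",
    ("Image", "jpg"): "image/jpeg",
    ("Image", "jpeg"): "image/jpeg",
    ("Image", "png"): "image/png",
    ("Image", "psd"): "application/photoshop",
    ("Image", "tif"): "image/tiff",
    ("Image", "tiff"): "image/tiff",
    ("Audio", "aif"): "audio/aiff",
    ("Audio", "aiff"): "audio/aiff",
    ("Audio", "mp3"): "audio/mpeg",
    ("Audio", "wav"): "audio/x-wav",
    ("Paged Content", "tif"): "image/tiff",
    ("Paged Content", "tiff"): "image/tiff",
    # Assumption: the extension "pdf" is not listed as a form label.
    ("Paged Content", "pdf"): "application/pdf",
    ("Text", "pdf"): "application/pdf",
    ("Text", "csv"): "text/csv",
    ("Text", "docx"): "application/msword",
    ("Text", "txt"): "text/txt",
    ("Text", "xml"): "text/xml",
    ("Video", "mov"): "video/quicktime",
    ("Video", "mp4"): "video/mp4",
    ("Video", "wmv"): "video/x-ms-wmv",
}


def expected_media_type(filename: str, content_model: str) -> str:
    """Return the expected value, or "other" for unlisted file types."""
    # Normalize ".TIF" to "tif" so the lookup is case-insensitive.
    extension = PurePath(filename).suffix.lower().lstrip(".")
    return EXPECTED_MEDIA_TYPES.get((content_model, extension), "other")


print(expected_media_type("scan_0001.TIF", "Image"))
print(expected_media_type("interview.wav", "Audio"))
```

Falling back to "other" mirrors the `*` rows of the table. A real check would need the complete list for each content model.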
Attributes
Element Parts | Details | XPath syntax examples |
---|---|---|
lang | values: Enter ISO-639-2 language code (3 letters). Default value is "eng" for English. Additional information: Preferably, use ISO-639-2 codes designated "T" (terminology use). | physicalDescription/internetMediaType[@lang="eng"] |
displayLabel | value: | physicalDescription/internetMediaType[@displayLabel="File Format"] |
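The XPath expressions in the table can be tried out with Python's standard library, which supports this attribute-predicate syntax. The sketch below is illustrative only: `SAMPLE` and `media_types` are hypothetical names, and the sample record omits the MODS namespace for brevity, so a namespaced record would need namespace-qualified paths.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free MODS fragment used only for this sketch.
SAMPLE = """
<mods>
  <physicalDescription>
    <internetMediaType lang="eng" displayLabel="File Format">image/tiff</internetMediaType>
  </physicalDescription>
</mods>
""".strip()


def media_types(mods_xml: str, lang: str = "eng") -> list[str]:
    """Return the text of every internetMediaType with the given lang."""
    root = ET.fromstring(mods_xml)
    # Same shape as the XPath example: filter on the lang attribute.
    path = f"physicalDescription/internetMediaType[@lang='{lang}']"
    return [el.text.strip() for el in root.findall(path) if el.text]


values = media_types(SAMPLE)
print(values)
```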
Subelements
No subelements for <internetMediaType>.
XML Examples
    <physicalDescription>
      <internetMediaType lang="eng" displayLabel="File format">image/jpeg</internetMediaType>
    </physicalDescription>

    <physicalDescription>
      <internetMediaType lang="eng" displayLabel="File format">application/pdf</internetMediaType>
    </physicalDescription>
Mappings
Dublin Core
Depending on the direction of mapping required, see:
- DC to MODS: https://www.loc.gov/standards/mods/dcsimple-mods.html
- MODS to DC: http://www.loc.gov/standards/mods/mods-dcsimple.html
The following specific guidelines apply for the DAMS:
Dublin Core field | Mapping condition | MODS element | Notes |
---|---|---|---|
dc:format | | internetMediaType | Approximate mapping. Depending on the exact content, dc:format might also map to MODS elements extent or form. |
MARC 21
See https://www.loc.gov/standards/mods/mods-mapping.html#physicaldescription. The following specific guidelines apply for the DAMS:
Solr
In general, all MODS metadata is imported into the DAMS Solr server upon ingest. The ingest process generates Solr fields typically named according to the following schema:
mods_value*_suffix
where
- value* can be one or more element, subelement or attribute names that distinguish Solr fields from one another
- suffix is s, t, ss, ms or mt, which refers to the type of data stored in a Solr field and how it is indexed. The Solr index usually contains multiple copies of each field with the same content, distinguished by their suffix.
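The naming schema above can be sketched as a pair of small helpers. This is an illustration under stated assumptions, not the actual ingest code: `solr_field_name` and `split_suffix` are hypothetical names, and because the schema is only described as typical, real field names may deviate from it.

```python
# Suffixes listed in the schema description above.
SUFFIXES = ("s", "t", "ss", "ms", "mt")


def solr_field_name(parts: list[str], suffix: str) -> str:
    """Compose mods_value*_suffix from name parts and a suffix."""
    if not parts:
        raise ValueError("at least one element or attribute name is required")
    if suffix not in SUFFIXES:
        raise ValueError(f"unknown suffix: {suffix!r}")
    return "_".join(["mods", *parts, suffix])


def split_suffix(field_name: str) -> tuple[str, str]:
    """Split a field name into its base name and its suffix."""
    base, _, suffix = field_name.rpartition("_")
    if suffix not in SUFFIXES:
        raise ValueError(f"no known suffix in: {field_name!r}")
    return base, suffix


name = solr_field_name(["type", "consolidated"], "ms")
print(name)
print(split_suffix(name))
```

Splitting off the suffix is useful when suffixes are to be ignored, since the copies of a field then share the same base name.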
The following table shows mappings between MODS elements and Solr fields for those fields that are currently used for display in the Collections portal, or where additional processing happens in Islandora or during the publishing process. Suffixes are ignored, unless relevant for the mapping.
MODS element | Mapping condition | Solr DAMS | Solr Collections Portal | Notes |
---|---|---|---|---|
physicalDescription/internetMediaType | | mods_type_consolidated_ms | display_type_ms | The value for file format is merged with metadata values for Genre and Type of Resource for display on the Collections portal (label "Type"). The Solr field display_type_ms is generated from mods_type_consolidated_ms upon publishing, to feed the "Type" facet on the Collections portal start page. The first letter of each word is capitalized. |
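The capitalization step described in the notes can be illustrated as follows. This is a sketch only, not the publishing code: `capitalize_words` and `display_type_values` are hypothetical names, the input values are invented examples, and it assumes that words are separated by whitespace and that the remaining letters of each word are left unchanged.

```python
def capitalize_words(value: str) -> str:
    """Upper-case the first letter of each whitespace-separated word."""
    return " ".join(word[:1].upper() + word[1:] for word in value.split())


def display_type_values(consolidated: list[str]) -> list[str]:
    """Derive display values from consolidated type values."""
    return [capitalize_words(value) for value in consolidated]


# Invented example values for a multi-valued consolidated type field.
display = display_type_values(["still image", "tif/tiff"])
print(display)
```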