What this does

Tiered Ingest allows you to group all of the files corresponding to a simple asset's datastreams (including archival files, publication files, other derivatives created outside of Islandora, with the exception of RELS-EXT) into a sub-directory.

...

  • All of the files you are ingesting as part of one asset will be staged in one directory per asset, as a sub-directory of a batch job folder.
  • Each sub-directory corresponds to one asset and must contain at least a manifest file for the key datastreams (datastreams.txt).
  • The batch job folder can contain just one asset folder, but the extra level of nesting is still required.


Sample folder structure
eid1234_example-batch-submission/ (batch job folder)
├── asset1/
│   ├── datastreams.txt
│   ├── modsfile.xml
│   ├── primaryfile.tif
│   ├── anyarbitraryderivativefile.ext
│   ├── anyarbitrarycomponentfile.ext
│   └── anymediaphotographfile.ext
├── asset2_audio_example/
│   ├── datastreams.txt
│   ├── modsfile.xml
│   ├── audiofile.wav
│   ├── derivative_audiofile_for_streaming.mp4 (e.g. for creating PROXY_MP4 datastream, which is required for streaming audio)
│   └── audio_transcript.txt
└── asset3_video_example/
    ├── datastreams.txt
    ├── modsfile.xml
    ├── videofile.mp4
    ├── video_captions.vtt
    ├── video_transcript.txt
    └── page02_custom_ocr.txt
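
The layout above can be checked locally before a batch is uploaded. The following is a minimal sketch, not part of the DAMS tooling: the function name find_assets_missing_manifest is hypothetical, and the check assumes only what is stated above, namely that each asset is a sub-directory of the batch job folder and must contain a datastreams.txt file.

```python
from pathlib import Path

MANIFEST_NAME = "datastreams.txt"  # manifest file name given in this documentation


def find_assets_missing_manifest(batch_dir):
    """Return sorted names of asset sub-directories that lack a manifest file.

    Hypothetical helper for illustration only; not part of the DAMS.
    """
    batch = Path(batch_dir)
    missing = []
    for entry in sorted(batch.iterdir()):
        # Each asset is staged in its own sub-directory of the batch job folder.
        if entry.is_dir() and not (entry / MANIFEST_NAME).is_file():
            missing.append(entry.name)
    return missing
```

An empty list means every asset folder in the batch job folder has a manifest; any names returned are asset folders that still need one.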

...

Use 2 (two) equal signs to separate arguments and values.
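
To illustrate the separator rule, here is a hedged sketch of how a manifest could be read into argument/value pairs. It is not the DAMS's own parser, and parse_manifest is a hypothetical name. It assumes that lines starting with # are comments, as in the sample manifests in this documentation, and that every other non-blank line has the form ARGUMENT==value.

```python
def parse_manifest(text):
    """Parse datastreams.txt content into a list of (datastream_id, filename) pairs.

    Illustrative sketch only; the DAMS's own parsing rules may differ.
    Assumption: lines starting with '#' are comments.
    """
    entries = []
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        # Split on the first occurrence of two equal signs.
        dsid, sep, filename = line.partition("==")
        if not sep or not dsid.strip() or not filename.strip():
            raise ValueError(
                f"line {line_number}: expected ARGUMENT==value, got {raw!r}"
            )
        entries.append((dsid.strip(), filename.strip()))
    return entries
```

For example, parse_manifest("OBJ==primaryfile.ext\nMODS==metadata.xml") returns [("OBJ", "primaryfile.ext"), ("MODS", "metadata.xml")], while a line written with a single equal sign raises a ValueError.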

Manifest Arguments

Refer to Anatomy of DAMS digital assets and Content models for a list of allowed/expected datastreams per content model. Consult with the DAMS Management Team for use cases not covered by the datastreams listed in this documentation.

Warning

DO NOT use any of the Restricted Datastream IDs.

Manifest generator script

See the "datastreams generator script" excerpt on the page "DAMS datastreams.txt generator".

Sample manifests

Sample generic datastreams.txt manifest file
OBJ==primaryfile.ext
MODS==metadata.xml
# optional, if no MODS file is included, minimal metadata is automatically generated during ingest
PDF==custom.pdf
# optional
ARCHIVAL_FILE==originalversionof_primaryfile.ext
# optional, use for archival file (e.g. uncropped scan)
COMPONENT1==componentfile1.ext
COMPONENT2==componentfile2.ext
# optional, can for instance be used in cases where a primary image is stitched from multiple component images; increment for additional files in same directory
# DO NOT use for complex objects that can be modeled as paged content or Islandora component assets!
MEDIAPHOTOGRAPH==anymediaphotographfile.ext
# optional, can be used for images documenting physical media, cases, covers, etc.; use MEDIAPHOTOGRAPH if there is one image only
MEDIAPHOTOGRAPH1==anymediaphotographfile.ext
MEDIAPHOTOGRAPH2==anymediaphotographfile.ext
# optional, can be used for images documenting physical media, cases, covers, etc.; increment for multiple images
DERIVATIVE1==anyarbitraryderivativefile.ext
# optional, use for derivative files with direct descendant relationship from file designated OBJ; increment for additional in same directory
# CAUTION, do not duplicate derivative files that are automatically generated by the DAMS
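
Before uploading, it can help to confirm that every file named in a manifest is actually present in the asset folder. The sketch below is an illustration under stated assumptions, not a DAMS feature: find_missing_files is a hypothetical name, lines starting with # are treated as comments, and file names are taken to be relative to the folder that contains datastreams.txt.

```python
from pathlib import Path


def find_missing_files(asset_dir):
    """Return file names listed in datastreams.txt that are absent from asset_dir.

    Hypothetical helper for illustration only; not part of the DAMS.
    """
    asset = Path(asset_dir)
    manifest = asset / "datastreams.txt"
    missing = []
    for raw in manifest.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything without the == separator.
        if not line or line.startswith("#") or "==" not in line:
            continue
        _dsid, _sep, filename = line.partition("==")
        filename = filename.strip()
        if not (asset / filename).is_file():
            missing.append(filename)
    return missing
```

An empty result means each manifest entry points at a file that exists in the same asset folder.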

Manifest Arguments

The manifest may contain instructions to create the following datastreams. Refer to Anatomy of DAMS digital assets and Content models for a list of allowed/expected datastreams per content model.


...

Can be used for publication/series-level assets, book and issue-level assets.

If no thumbnail is provided during batch ingest, the DAMS will copy the thumbnail image of the first page of the asset to the book/issue level asset.

...

Can be used for book/issue-level assets.

Note

Use only for assets where the primary source file is a PDF document and for full text produced with pdftotext. See page _Text extraction in DAMS for details on the different text extraction/recognition methods.

...

Can be used for book/issue-level assets.

Use to add an externally created PDF document to an asset.

Info

If no page images are specified in the manifest, the DAMS will render image files from the pages of the PDF document and use these images to create page-level assets.

For digitally reformatted (scanned) content, using a PDF as a source for creating page images is strongly discouraged, as the automatically created page images are almost invariably of lower quality than the original scan images. To arrange a consultation with the DAMS managers, submit a DAMS service request.

For born-digital content (for instance modern PDF ebooks or PDF documents directly exported from a word processor), other content models and ingest processes will be more appropriate. To arrange a consultation with the DAMS managers, submit a DAMS service request.

OBJ==primaryfile.ext [designation of primary file is at digital stewardship staff discretion, in consultation with requesting content holder]

DERIVATIVE1==anyarbitraryderivativefile.ext [use for derivative file with direct descendant relationship from file designated OBJ; increment for additional in same directory]

COMPONENT1==anyarbitrarycomponentfile.ext [use for cases such as a file comprising one piece of a stitched OBJ or one page image in a PDF OBJ; increment for additional in same directory]

MEDIAPHOTOGRAPH1==anymediaphotographfile.ext [use for images documenting physical media, cases, covers, etc.; increment for additional in same directory]

MODS==metadata.xml [use for an optional metadata file; if none is included, minimal MODS metadata is generated during ingest]

Notes:

  • Bracketed [text] is used above for explanatory purposes only and should not be included in the datastreams.txt file.
  • Additions beyond the standard datastream IDs shown above are allowed. Consult with the DAMS Management Team for recommendations.

Example Ingest:

User1 in Architecture has a collection and needs to ingest their media with extra datastreams.

They use FTP to upload their files to the server in a directory called batch1.

Fill out the form as follows:

>>> Architectural Collections

Enter identifier of sub-collection that will contain your batch of assets >>> utlarch:5a4f464a-b4d5-4dd7-b2c2-4562643ac1bd

...

physical carrier(s)

Step 3: Upload batch job to Jscape

See the "Batch ingest upload" excerpt on the page "Batch ingest simple assets".

Step 4: Set up collection and submit form in DAMS interface

See the "batch ingest queue" excerpt on the page "Batch ingest simple assets".