...
The tiered ingest batch module uses filenames to identify which files correspond to which datastreams. Put all of the files you are ingesting as one asset in a single directory, which must be a sub-directory of the path you identify in the queue form. Each sub-directory corresponds to one asset and must contain at least the "key datastreams" file (datastreams.txt). This file lists each datastream ID with its corresponding filename, for instance the MODS datastream (MODS.xml), the OBJ datastream (e.g. filename.tif for a large image), or other datastreams with derivatives.
...
Sample directory structure:
utlarch
  batch1
    set1
      datastreams.txt
      primaryfile.tif
      anyarbitraryderivativefile.ext
      anyarbitrarycomponentfile.ext
      anymediaphotographfile.ext
      metadata.xml
    set2
      datastreams.txt
      primaryfile.tif
      anyarbitraryderivativefile.ext
      anyarbitrarycomponentfile.ext
      anymediaphotographfile.ext
      metadata.xml
...
Code Block |
---|
eid1234_example-batch-submission/ (batch job folder)
├── asset1/
│ ├── datastreams.txt
│ ├── modsfile.xml
│ ├── primaryfile.tif
│ ├── anyarbitraryderivativefile.ext
│ ├── anyarbitrarycomponentfile.ext
│ └── anymediaphotographfile.ext
├── asset2_audio_example/
│ ├── datastreams.txt
│ ├── modsfile.xml
│ ├── audiofile.wav
│ ├── derivative_audiofile_for_streaming.mp4 (e.g. for creating PROXY_MP4 datastream, which is required for streaming audio)
│ └── audio_transcript.txt
└── asset3_video_example/
  ├── datastreams.txt
  ├── modsfile.xml
  ├── videofile.mp4
  ├── video_captions.vtt
  ├── video_transcript.txt
  └── page02_custom_ocr.txt |
Notes:
- set1 and set2, as shown above, sit directly under the batch directory; each set represents an individual asset together with its datastreams
- a batch can contain just one set, but that set must still be nested in its own sub-directory
- there is no upper limit on the number of sets/objects or on file size
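
The layout rules above can be checked before a batch is queued. The following is a minimal sketch in Python, not part of the ingest module: the function names `find_assets_missing_key_file` and `has_asset_nesting` are hypothetical, and the only rules it assumes are the ones stated on this page (each asset sub-directory must contain datastreams.txt, and a batch needs at least one asset sub-directory). It does not inspect the contents of datastreams.txt.

```python
from pathlib import Path

# Name of the required "key datastreams" file, as given on this page.
KEY_DATASTREAMS_FILE = "datastreams.txt"


def find_assets_missing_key_file(batch_dir):
    """Return the names of asset sub-directories that lack datastreams.txt.

    Hypothetical helper: every immediate sub-directory of batch_dir is
    treated as one asset. Loose files directly under batch_dir are ignored.
    """
    missing = []
    for entry in sorted(Path(batch_dir).iterdir()):
        if entry.is_dir() and not (entry / KEY_DATASTREAMS_FILE).is_file():
            missing.append(entry.name)
    return missing


def has_asset_nesting(batch_dir):
    """Return True if batch_dir contains at least one asset sub-directory.

    Hypothetical helper: catches the case where the files of a single
    asset were placed directly in the batch directory without nesting.
    """
    return any(entry.is_dir() for entry in Path(batch_dir).iterdir())
```

For example, calling `find_assets_missing_key_file("utlarch/batch1")` on the sample structure would return an empty list when both set1 and set2 contain datastreams.txt.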