Metadata records can contain varying levels of multilingual content, ranging in both number of languages represented and number of fields translated. Even if a record only has a single field in a language other than English, it still contains multilingual metadata, represented in the content and in the metadata describing that language.
Example of a record with non-roman script; only names in a second language
Example of a record fully translated, using roman script
For titles, proper nouns (publishers, organizations, etc.):
Generally do not translate; keep the original. For non-roman script, a transliterated version can be provided, but it should not be the first item in the record.
For other kinds of materials:
Diacritics are marks or symbols that show the phonetic value of a letter. (Ex. "á" or "ñ")
Character encoding is the process of making those letters machine-readable by assigning them numerical codes. The majority of UTL's metadata uses UTF-8 character encoding. When creating metadata spreadsheets or XML, ensure that your character encoding is UTF-8. This will allow your diacritics to be displayed correctly.
To check the character encoding in an Excel spreadsheet, navigate to the "Save As" option, then choose "Web Options" in the "Tools" dropdown menu.
UTF-8 encoding menu in Excel (Tools→Web Options)
For XML editors like Oxygen, navigating to Preferences→Encoding will allow you to check the encoding of the document (UTF-8 by default).
For Google Sheets, documents are in UTF-8 by default.
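Outside of these tools, the encoding of an exported file can also be checked with a short script. The following is a minimal Python sketch, not an official UTL tool: the function names `is_valid_utf8` and `rows_to_utf8_csv` and the sample rows are hypothetical and are used here only for illustration.

```python
import csv
import io


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def rows_to_utf8_csv(rows) -> bytes:
    """Serialize rows to CSV text and return it as UTF-8 encoded bytes."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode("utf-8")


# Hypothetical sample rows containing diacritics.
sample_rows = [
    ["title", "language"],
    ["Año nuevo", "spa"],
    ["Canción de cuna", "spa"],
]

csv_bytes = rows_to_utf8_csv(sample_rows)
print(is_valid_utf8(csv_bytes))                 # bytes produced as UTF-8
print(is_valid_utf8("ñ".encode("latin-1")))     # same letter in a legacy encoding
```

To check a file on disk, the same function can be applied to the bytes read with `open(path, "rb").read()`. A file that fails the check was most likely saved in a legacy encoding and should be re-saved as UTF-8 before ingest.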
Languages used in metadata fields are modeled in different ways, based on the metadata schema. Below are a few examples. For more guidance on where to source Language information for the UTL DAMS, see the "Assets" section of the wiki. Currently, ISO 639-2 codes are advised, but ISO 639-3 codes are also used when a language is unavailable in ISO 639-2.
<profiledesc>
<creation encodinganalog="500">Text converted and initial EAD tagging provided by Apex Data Services, <date era="ce" calendar="gregorian">July 2001.</date>
</creation>
<langusage>Finding aid written in <language langcode="eng" scriptcode="Latn">English.</language>
</langusage>
</profiledesc>
<langmaterial label="Language:" encodinganalog="546$a"><language encodinganalog="546" langcode="eng" scriptcode="Latn">English</language> </langmaterial>
In EAD, the language of the finding aid and the language of the material are both represented in separate fields.
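As a rough illustration of how these two fields can be read separately, the sketch below parses a small EAD fragment with Python's standard library. It assumes the EAD 2002 element names `langusage` (language of the finding aid) and `langmaterial` (language of the material) with no XML namespace. The sample document and the helper name `language_codes` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed EAD 2002 fragment (no namespace assumed).
EAD_SAMPLE = """<ead>
  <eadheader>
    <profiledesc>
      <langusage>Finding aid written in
        <language langcode="eng" scriptcode="Latn">English.</language>
      </langusage>
    </profiledesc>
  </eadheader>
  <archdesc level="collection">
    <did>
      <langmaterial label="Language:">
        <language langcode="spa" scriptcode="Latn">Spanish</language>
      </langmaterial>
    </did>
  </archdesc>
</ead>"""


def language_codes(root, container_tag):
    """Collect langcode values from <language> children of the given container."""
    return [
        lang.get("langcode")
        for lang in root.findall(f".//{container_tag}/language")
    ]


root = ET.fromstring(EAD_SAMPLE)
finding_aid_langs = language_codes(root, "langusage")
material_langs = language_codes(root, "langmaterial")

print("Finding aid:", finding_aid_langs)
print("Material:", material_langs)
```

Keeping the two lookups separate mirrors the schema: an English-language finding aid can describe Spanish-language material without the two values being conflated.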
<language>
<languageTerm type="text" lang="eng">Western Highland Purepecha</languageTerm>
<languageTerm type="code" authority="iso639-3" authorityURI="https://iso639-3.sil.org/code_tables/639/data">pua</languageTerm> </language>
<languageOfCataloging usage="primary"> <languageTerm type="code" authority="iso639-2b" authorityURI="http://id.loc.gov/vocabulary/iso639-2">eng</languageTerm>
</languageOfCataloging>
<subject lang="spa"> <topic>acuarelas (obra visual)</topic> </subject>
In MODS, the language of cataloging (the language of the record itself) and the language of the material are represented in separate fields. Language tags (the lang attribute) can also be assigned to individual fields to allow for greater flexibility.
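The choice between an ISO 639-2 and an ISO 639-3 authority, as in the Purepecha example, can be sketched in Python as follows. This is an illustrative sketch only: `KNOWN_ISO_639_2` is a hypothetical, deliberately tiny subset of the ISO 639-2 code list, `build_mods_language` is a hypothetical helper, and the MODS namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical, deliberately tiny subset of ISO 639-2 codes, for illustration only.
# A real workflow would consult the full ISO 639-2 code list.
KNOWN_ISO_639_2 = {"eng", "spa", "por", "nah"}


def build_mods_language(name, code):
    """Build a MODS-style <language> element with a text term and a code term.

    The authority is "iso639-2b" when the code is in the known ISO 639-2 set,
    otherwise it falls back to "iso639-3".
    """
    authority = "iso639-2b" if code in KNOWN_ISO_639_2 else "iso639-3"
    language = ET.Element("language")
    text_term = ET.SubElement(language, "languageTerm", {"type": "text"})
    text_term.text = name
    code_term = ET.SubElement(
        language, "languageTerm", {"type": "code", "authority": authority}
    )
    code_term.text = code
    return language


spanish = build_mods_language("Spanish", "spa")
purepecha = build_mods_language("Western Highland Purepecha", "pua")

print(ET.tostring(spanish, encoding="unicode"))
print(ET.tostring(purepecha, encoding="unicode"))
```

Treating any code missing from the set as ISO 639-3 is a simplification; in practice the code should also be verified against the ISO 639-3 tables before it is recorded.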
{"typeName":"language","multiple":true,"typeClass":"controlledVocabulary","value":["Spanish, Castilian"]}
In this JSON example record, the language of the material is represented in one field.
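A field like this can be read with a few lines of Python. The sketch below assumes the field structure shown in the example (`typeName`, `multiple`, `value`); the helper name `language_values` is hypothetical.

```python
import json

# The language field from the example record, as a JSON string.
FIELD_JSON = (
    '{"typeName":"language","multiple":true,'
    '"typeClass":"controlledVocabulary","value":["Spanish, Castilian"]}'
)


def language_values(field):
    """Return the language values of a field as a list (empty if not a language field)."""
    if field.get("typeName") != "language":
        return []
    value = field.get("value", [])
    # A field with "multiple": false may hold a single string rather than a list.
    return value if isinstance(value, list) else [value]


field = json.loads(FIELD_JSON)
languages = language_values(field)
print(languages)
```

Because the field is marked `"multiple": true`, a record describing material in several languages would simply carry more than one entry in the `value` list.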
SHORT TEXT ON ACCESSIBILITY AND IMPORTANCE FOR DIVERSE COLLECTIONS AND DIVERSE USERS; WHY IS TRANSLATION VALUABLE
White Paper on Best Practices for Multilingual Access to Digital Libraries
RESOURCES ON METADATA TRANSLATION
ISO-639 code tables (combines parts 1, 2, and 3); ISO-639-2 contains just part 2
Getty Vocabularies guides to contributing multilingual metadata (Guide 1, Guide 2)
List of Getty Vocabularies translation projects from the International Terminology Working Group
VIAF (Virtual International Authority File)
PANA (Pan-American Authorities project) from the University of Florida Libraries (contains large list of Spanish language thesauri/controlled vocabularies)
resource for translator pricing??
ISLANDORA 8 TRANSLATION MULTILINGUAL SUPPORT
Step-by-step guide to full multilingual site configuration
Drupal guide to multilingual content
Example of a record in Islandora 8 that has multilingual fields (in JSON-LD)
EXAMPLES OF TRANSLATED DIGITAL COLLECTIONS SITES AT UT LIBRARIES
Archive of the Indigenous Languages of Latin America (AILLA)
Human Rights Documentation Initiative (HRDI)
Latin American Digital Initiatives (LADI)
Primeros Libros de las Américas