This page was copied from the LLILAS Benson Digital Initiatives wiki space, and much of what follows was put together by Ryan Sullivant, Digital Content Curator at AILLA.

FFmpeg to manipulate a/v media

The open-source FFmpeg engine can be used for several purposes. These include converting between a/v containers and formats, re-encoding media with a different codec to produce smaller derivative access files, and rotating video. FFmpeg can be used as a command-line tool or integrated into Python scripts using the ffmpeg-python package (available on GitHub); the package only builds and runs FFmpeg commands, so FFmpeg itself must still be installed. The Python route is useful for batch processes, especially if you are more comfortable using Python tools to identify which files need to be fed into FFmpeg (literal lists, reading from a file, regex filters with glob, etc.) than doing the same things with the command line, terminal, or PowerShell. As FFmpeg has no native MP3 encoder, another library, like LAME, may need to be installed to produce MP3 files.
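As a rough illustration of the file-selection approaches mentioned above (reading from a file, glob with a regex filter), the following sketch uses only the Python standard library. The function names from_list_file and from_glob are hypothetical helpers invented for this example; they are not part of FFmpeg or ffmpeg-python.

```python
import re
from pathlib import Path


def from_list_file(list_path):
    """Read one input path per line from a text file, skipping blank lines."""
    lines = Path(list_path).read_text(encoding="utf-8").splitlines()
    return [Path(line.strip()) for line in lines if line.strip()]


def from_glob(folder, pattern="*.wav", name_regex=None):
    """Return the sorted files in `folder` that match a glob pattern.

    If `name_regex` is given, keep only files whose name matches it.
    """
    paths = sorted(p for p in Path(folder).glob(pattern) if p.is_file())
    if name_regex is not None:
        rx = re.compile(name_regex)
        paths = [p for p in paths if rx.search(p.name)]
    return paths
```

The paths returned by either helper can then be fed one at a time to FFmpeg in a loop, using whichever of the commands on this page applies.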

Creating smaller audio files

Some digitization labs produce uncompressed files (typically in WAV format) at a much greater resolution than is necessary or useful (for example, 24-bit/96 kHz). These files are often 2 to 4 times as big as files produced at lower resolutions. In these cases, we can re-encode the files at one of the lower resolutions described below. (NB: these resolutions are ideal for recordings of speech meant for listening and acoustic speech analysis; audio files with different contents and purposes may require higher resolutions.)
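To see where the size difference comes from, the data rate of uncompressed PCM audio can be worked out as sample rate × bytes per sample × channels. The short sketch below does that arithmetic; the function name is hypothetical, and the small fixed size of the WAV header is ignored.

```python
def pcm_bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Data rate of uncompressed PCM audio, ignoring container overhead."""
    return sample_rate_hz * (bit_depth // 8) * channels


hi_res = pcm_bytes_per_second(96000, 24, 2)       # 576000 bytes per second
access = pcm_bytes_per_second(48000, 16, 2)       # 192000 bytes per second
access_mono = pcm_bytes_per_second(48000, 16, 1)  # 96000 bytes per second

print(hi_res / access)       # 3.0
print(hi_res / access_mono)  # 6.0
```

Under these assumptions, a 24-bit/96 kHz stereo file is three times the size of a 16-bit/48 kHz stereo file of the same duration, and six times the size of the mono version.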

The two sampling-rate options for audio (ar) are '48000' and '44100' (the former is more common for audio associated with video files, the latter for standalone audio). Audio bitrate (ab) can be specified as well, but it applies only to compressed codecs; for uncompressed WAV output, the bit depth is set by the choice of PCM codec instead (acodec, for example pcm_s16le for 16-bit audio). Stereo files can be converted to mono by specifying only one audio channel (ac). This should be done in cases where the original recording was made with one microphone, which is likely the case for many field recordings originally made on tape. Some digitization workflows will produce two audio channels even from mono recordings. In these cases, slight differences in the digitization settings and condition of the tape heads may result in slightly different left and right channels despite originally having the same input.

Python: ffmpeg.input(input_file_path).output(output_file_path, acodec='pcm_s16le', ac=1, ar=48000).run()

Cmd: ffmpeg -i input_file_path -acodec pcm_s16le -ac 1 -ar 48000 output_file_path

Note that different options, such as the audio bitrate (ab), are used to alter the sizes of compressed files (like MP3).

Convert audio formats

Sometimes we receive digitized or born-digital audio in formats that are not supported by our Islandora repository. Circa 2019 at AILLA, these have included files in WMA, AIFF, and OPUS (a lossy format used by WhatsApp). Staff should convert these files into the supported WAV or MP3 formats, depending on the nature and quality of the unsupported digital files. Format conversion with FFmpeg can be as simple as specifying a different extension for the output file, but additional details, such as those discussed in Creating smaller audio files, may also be specified.

Cmd: ffmpeg -i input_file_path output_file_path

Python: ffmpeg.input(input_file_path).output(output_file_path).run()

This syntax can also be used to extract MP3 audio from an MP4 file, by specifying the MP4 file as the input and the MP3 file as the output.
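Because the output format follows from the output extension, choosing a target format can be reduced to a lookup on the input extension. The sketch below assumes, purely for illustration, that uncompressed AIFF sources go to WAV and lossy WMA and OPUS sources go to MP3; that mapping and the function name are assumptions and should be adjusted to local policy.

```python
from pathlib import Path

# Assumed mapping, for illustration only: uncompressed sources go to WAV,
# lossy sources go to MP3.
TARGET_SUFFIX = {".aiff": ".wav", ".aif": ".wav", ".wma": ".mp3", ".opus": ".mp3"}


def conversion_command(src):
    """Return the FFmpeg argument list for converting `src`.

    Returns None when the extension is not in the mapping, meaning the
    file is already in a supported format or is of an unknown type.
    """
    src = Path(src)
    target = TARGET_SUFFIX.get(src.suffix.lower())
    if target is None:
        return None
    return ["ffmpeg", "-i", str(src), str(src.with_suffix(target))]
```

A returned list can be executed with subprocess.run(cmd, check=True), provided ffmpeg is on the PATH.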

Rotating video files

Some video files specify their orientation in their technical metadata. Sometimes a file is incorrectly presented as portrait when it should be in landscape orientation, or vice versa. It is possible to transpose all pixels of the video, but a simpler solution is to edit the file's technical metadata (while copying all other metadata) to inform video players of the proper orientation. This is done with the -map_metadata, -metadata:s:v, and -codec copy flags as below. Note that rotate specifies the rotation in degrees: '90' and '270' are quarter turns in opposite directions, so check the result in a video player to confirm that the direction is the one you need. Option names that contain colons, such as -metadata:s:v, cannot be written as ordinary Python keyword arguments, which is why the command below is given in its command-line form.

Cmd: ffmpeg -i input_file_path -map_metadata 0 -metadata:s:v rotate="270" -codec copy output_file_path
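One possible way to issue the same command from Python is sketched below. The first helper builds the argument list for use with subprocess. The second builds a keyword dictionary that could be unpacked into ffmpeg-python's output() call, following the dictionary-unpacking approach that the ffmpeg-python README describes for option names containing colons. Both function names are hypothetical, and the ffmpeg-python usage in the trailing comment has not been tested here.

```python
def rotation_command(src, dst, degrees):
    """Build the metadata-only rotation command as an argument list."""
    if degrees not in (0, 90, 180, 270):
        raise ValueError("degrees must be 0, 90, 180, or 270")
    return [
        "ffmpeg", "-i", str(src),
        "-map_metadata", "0",
        "-metadata:s:v", f"rotate={degrees}",
        "-codec", "copy",
        str(dst),
    ]


def rotation_kwargs(degrees):
    """Keyword options for ffmpeg-python's output().

    'metadata:s:v' is not a valid Python identifier, so the options are
    kept in a dict and unpacked with ** at the call site.
    """
    return {"map_metadata": 0, "codec": "copy", "metadata:s:v": f"rotate={degrees}"}


# With ffmpeg-python installed (an assumption), usage might look like:
#   import ffmpeg
#   ffmpeg.input(src).output(dst, **rotation_kwargs(270)).run()
```

The argument list from rotation_command can be run with subprocess.run(cmd, check=True).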

Creating smaller video files

Some high-definition cameras use codecs, like XAVC S, that produce very large files, in part because each frame contains a great many pixels. Many of these files may be too large to ingest directly into a repository. One way to produce a smaller file that is still suitable for online viewing is to re-encode the file with a different codec. One very widely supported codec to try is H.264.

Python: ffmpeg.input(input_file_path).output(output_file_path,vcodec='h264').run()

Cmd: ffmpeg -i input_file_path -vcodec h264 output_file_path

If your video files have a high frame rate (that is, a multiple of 24, 25, or 30, such as 48, 50, or 60 frames per second), then a lower frame rate can be specified with an option like -r 24 (cmd) or r='24' (python).
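The frame-rate rule above can be sketched as a small helper that picks the base rate a high frame rate is a whole multiple of, and adds -r only when the rate actually drops. The function names are hypothetical, and the source frame rate is assumed to be known already (for example, from the file's technical metadata).

```python
def reduced_frame_rate(fps):
    """Return 30, 25, or 24 if `fps` is a higher whole multiple of it.

    Any other value, including fractional rates such as 59.94, is
    returned unchanged.
    """
    for base in (30, 25, 24):
        if fps > base and fps % base == 0:
            return base
    return fps


def reencode_command(src, dst, fps=None):
    """Build an H.264 re-encode command, adding -r only when the rate drops."""
    cmd = ["ffmpeg", "-i", str(src), "-vcodec", "h264"]
    if fps is not None:
        target = reduced_frame_rate(fps)
        if target != fps:
            cmd += ["-r", str(target)]
    return cmd + [str(dst)]
```

For a 60 fps source this yields a command with -r 30; for a 25 fps source the -r option is omitted.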

Commands exist for altering the size of video frames in the output, but these have not yet been explored.

Converting MP3s to AAC MP4s

To stream via the collections portal, audio must be ingested into the DAMS as an MP4 file with AAC audio. While it's normally used for video files, the MP4 extension is really just a "wrapper" to combine audio and video streams, and there's no requirement that there even be a video stream. That means you can convert from MP3 audio directly to MP4, and as long as you specify that the output will use AAC audio, it will work with the collections portal:

ffmpeg -i C:\directory\audio_file.mp3 -acodec aac C:\directory\output.mp4

If you have a directory full of MP3 files you would like to convert to AAC MP4, you can batch the process using a simple for loop on the Windows command line (inside a batch file, write %%a instead of %a):

for %a in (C:\directory\*.mp3) do ffmpeg -i "%a" -acodec aac "%~dpna.mp4"
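For those who prefer Python, or who are not working on Windows, the same batch conversion might be sketched as follows using only the standard library. The function names are hypothetical, and skipping MP3 files that already have an MP4 beside them is a design choice made for this example so that the batch can be re-run safely.

```python
import subprocess
from pathlib import Path


def mp3_to_mp4_commands(folder):
    """Build one `ffmpeg -i X.mp3 -acodec aac X.mp4` command per MP3 in `folder`.

    MP3 files that already have an MP4 alongside them are skipped.
    """
    commands = []
    for src in sorted(Path(folder).glob("*.mp3")):
        dst = src.with_suffix(".mp4")
        if dst.exists():
            continue
        commands.append(["ffmpeg", "-i", str(src), "-acodec", "aac", str(dst)])
    return commands


def convert_all(folder):
    """Run the commands in sequence; requires ffmpeg on the PATH."""
    for cmd in mp3_to_mp4_commands(folder):
        subprocess.run(cmd, check=True)
```

Calling mp3_to_mp4_commands first and inspecting the result is a cheap way to preview what convert_all would do.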

If no other options are specified, the AAC audio will match the sampling rate of the MP3 input audio.

Note that this will not delete the original MP3 file, but will create a new MP4 file alongside it. Because AAC generally compresses more efficiently than MP3, the output files can be much smaller than the input files (reductions of about 80% have been observed) at comparable audio quality. The exact savings depend on the bitrate of the input MP3 and on the bitrate used for the AAC output.

Compressed inputs like MP3s can be converted relatively quickly, but larger files like WAVs may take more time. You can verify that the output MP4 uses the AAC audio codec by using a tool like File Information Tool Set (FITS) to extract technical metadata. Information about the audio codec is found within the <track type="audio"...> tags.
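As an alternative to FITS (an assumption of this example, not the workflow described above), the ffprobe tool that ships with FFmpeg can report the codec of the first audio stream. The sketch below builds that ffprobe command and compares its output to 'aac'; the function names are hypothetical, and ffprobe is assumed to be on the PATH.

```python
import subprocess


def codec_probe_command(path):
    """Build an ffprobe command that prints the first audio stream's codec name."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a:0",
        "-show_entries", "stream=codec_name",
        "-of", "default=noprint_wrappers=1:nokey=1",
        str(path),
    ]


def is_aac(path):
    """Return True if ffprobe reports 'aac' for the first audio stream."""
    result = subprocess.run(
        codec_probe_command(path), capture_output=True, text=True, check=True
    )
    return result.stdout.strip() == "aac"
```

A check such as is_aac could be run over every output file after a batch conversion, before ingest.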

According to Wikipedia, versions of FFmpeg released before February 2016 used an AAC encoder that produced generally low-quality output, so it is a good idea to use more up to date releases: https://en.wikipedia.org/wiki/Advanced_Audio_Coding#FFmpeg_and_Libav
