Acoustic Derivation Guide

Contents
1. Access TACC Account
2. Enter Jupyter
3. Set up folder
4. Upload input files
5. Go to the terminal
6. Trim pauses at the start and end of audio files
7. Derive first acoustic parameters
8. Derive second acoustic parameters
9. Download output files
10. Log out and end job
11. Merge the files

1. Access TACC Account

Access the shared credentials to enter TACC Analysis Portal

Go to https://stache.utexas.edu/

Enter with your UT credentials

Click on “secret”

1.1

Access TACC Analysis Portal https://tap.tacc.utexas.edu/jobs/

Enter Dr. Grasso’s account information generated by Stache and log in.

1.2

Submit a job on TACC by clicking on the dropdowns and selecting:

Lonestar 6

Jupyter Notebook

DBS23006

vm-small

Nodes 1; Tasks 1

Job name (can be anything, we will use AcousticDer_Practice in this tutorial guide)

Time limit (1 hour should be more than enough. However, if it’s your first time following this tutorial, you may want to give yourself more time to get through it – 2 hours should be fine)

Click on “submit”

*Note. If you're struggling to get nodes on the vm-small queue, I'd recommend trying the development queue. This applies to everything except transcription (where you should try gpu-a100-small, followed by gpu-a100-dev, followed by gpu-a100).

2. Enter Jupyter

If there are available nodes (picture A), you will be able to enter Jupyter right away. In that case, follow these steps:

Click on “connect”

Click on “work”

Click on “acousticScripts”

If there are no available nodes (picture B), you will have to wait in the queue until one becomes available.

3. Set up folder

Click on “new”

Click on “folder”

Name folder (for this example, the name ACtrial will be used)

4. Upload input files

Go into the folder you just created

Click on “upload” and upload the input files

If you cannot see the file you uploaded, click on “last modified” a couple of times. Sometimes it doesn’t update immediately.

5. Go to the terminal

Once the file has been uploaded, click on the New menu dropdown

Click on terminal

6. Trim pauses at the start and end of audio files

(This script works for any type of audio file! It will output a mono-channel .wav file without pauses at the start and end.)

If your files are in a format OTHER than wav, convert them into wav files following the next steps:

Once in the terminal, type cdw, press enter

Type cd acousticScripts, press enter

Type conda activate racs, press enter

Type or copy the following command, then press enter:

python monoTrimAudioFiles.py ACtrial/ ACtrialTrimmed/

Keep in mind that the red and green sections change depending on the name you gave your folder (with input files) and the name you will give your folder (with output files). The commands in purple always remain the same (See picture A).

Wait a few seconds (or minutes). You will know it's done running when you see this at the bottom of the terminal (see picture B).

Then, type conda deactivate

Press enter (see picture C)

The parts circled in red change depending on the name of your folder and the output folder.

ACtrial/ is the name I gave my folder. Your command will change depending on the name of your folder. Remember that the spelling and capitalization must match exactly. If the name of the folder that you created is all in lowercase, the name in the terminal command must be all in lowercase.

ACtrialTrimmed/ is the name I gave to my output folder. You can change this part of the command depending on the name you want to give your output folder.
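The exact-match rule above can be checked before running the command. The following is a minimal sketch, not part of the lab's scripts: the helper names match_name and describe are hypothetical, and it only compares the folder name you plan to type against the names that actually exist.

```python
import os
from typing import Iterable, Optional


def match_name(existing: Iterable[str], wanted: str) -> Optional[str]:
    """Return the existing name that equals `wanted` ignoring case, or None.

    A trailing slash on `wanted` (as in "ACtrial/") is ignored.
    """
    target = wanted.rstrip("/")
    names = list(existing)
    if target in names:
        return target  # exact match, including capitalization
    for name in names:
        if name.lower() == target.lower():
            return name  # same letters, different capitalization
    return None


def describe(existing: Iterable[str], wanted: str) -> str:
    """Explain whether `wanted` can be typed as-is in the terminal command."""
    target = wanted.rstrip("/")
    found = match_name(existing, wanted)
    if found is None:
        return f"No folder named {target!r} here."
    if found != target:
        return f"Type {found!r}, not {target!r}: capitalization differs."
    return f"OK: {target!r} matches exactly."


# Usage from inside acousticScripts (the folder name is only an example):
# print(describe(os.listdir("."), "actrial/"))
```

If the check reports a capitalization difference, use the name it suggests in the trimming command.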

7. Derive first acoustic parameters

Then, type conda activate textGridPauseSyllable

Then, enter the command below, press enter:

python derivePraatFeatsAndTextGrids_CAC.py ACtrialTrimmed/ ACtrialOutputOne ACtrialTextGrids/

#Generalized command: python derivePraatFeatsAndTextGrids_CAC.py inputDirectory/ outputName textGridDirectory1/

Again, the sections in purple will remain the same while the parts in red, green, and blue may change. The part in red needs to match the command that you entered in step 6 (name of output folder). The 2 green sections in this step need to match as they are the same code. The part in blue can also change to whatever you decide. Just note that green and blue will be repeated in step 8, and they need to match in both steps.

*If your files were originally WAV files and you didn’t need to do step 6, then the section in red will be the name that you gave the folder where your input files are located.

**If you encounter an error in this step, it is possible that a “hidden” file in the input folder is causing it. The error will look like:

Sound not read from sound file “/work/09424/smgrasso1/ls6/acousticScripts/TRIMMED_FOLDER/.ipynb_checkpoints”.

If you get this error, you need to delete this file by typing in:

rm -rf TRIMMED_FOLDER/.ipynb_checkpoints
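To spot the hidden entry before the script fails, you can list everything in the input folder whose name starts with a dot. This is a small sketch under that assumption only; the function names are hypothetical and it does not delete anything.

```python
import os
from typing import Iterable, List


def hidden_entries(names: Iterable[str]) -> List[str]:
    """Return the names that start with a dot, sorted alphabetically."""
    return sorted(name for name in names if name.startswith("."))


def list_hidden(folder: str) -> List[str]:
    """List hidden files and folders directly inside `folder`."""
    return hidden_entries(os.listdir(folder))


# Usage from inside acousticScripts (the folder name is only an example):
# print(list_hidden("ACtrialTrimmed"))
```

An empty list means there is nothing to remove. If .ipynb_checkpoints appears, delete it with the rm command above and rerun the step.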

Note: ACtrialTextGrids can change (green section only). This command creates a new directory with that name. Whatever you type, make sure it matches the directory name you enter in the command in step 8.

You should see this when the command is done running:

8. Derive second acoustic parameters

Now, type the following command, then press enter:

python derivePauseDurationStatsFromTextGrid_CAC.py ACtrialTextGrids/ ACtrialOutputTwo

#Generalized command: python derivePauseDurationStatsFromTextGrid_CAC.py textGridDirectory1 outputName

The code in green and blue will change depending on what you typed in previous steps *they need to match what you typed previously*
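Because the folder names have to match across steps 6, 7, and 8, one way to avoid typos is to write each name once and generate the three commands from them. The sketch below only builds the command text for copying into the terminal. It does not run the scripts or activate the conda environments, and build_commands is a hypothetical helper, not part of acousticScripts.

```python
import shlex
from typing import List


def build_commands(
    input_dir: str,
    trimmed_dir: str,
    textgrid_dir: str,
    output_one: str,
    output_two: str,
) -> List[str]:
    """Return the step 6, 7, and 8 commands with consistent names."""

    def as_dir(name: str) -> str:
        # Directory arguments in this guide are written with a trailing slash.
        return name.rstrip("/") + "/"

    trimmed = as_dir(trimmed_dir)
    grids = as_dir(textgrid_dir)
    return [
        # Step 6: input folder, then output folder for the trimmed audio.
        shlex.join(["python", "monoTrimAudioFiles.py", as_dir(input_dir), trimmed]),
        # Step 7: trimmed folder, first output name, new TextGrid folder.
        shlex.join(
            ["python", "derivePraatFeatsAndTextGrids_CAC.py", trimmed, output_one, grids]
        ),
        # Step 8: the same TextGrid folder, second output name.
        shlex.join(
            ["python", "derivePauseDurationStatsFromTextGrid_CAC.py", grids, output_two]
        ),
    ]


# Example names taken from this tutorial:
for command in build_commands(
    "ACtrial", "ACtrialTrimmed", "ACtrialTextGrids", "ACtrialOutputOne", "ACtrialOutputTwo"
):
    print(command)
```

Each printed line can be pasted into the terminal at the matching step, after activating the environment that step calls for.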

You will see this when it is done running:

9. Download output files

Go back to the working directory. If the generated files do not appear, click on “last modified”. Sometimes it takes a minute to update.

You will see 2 output files. Check the boxes and click on download. You may need to download one by one.

10. Log out and end job

Go back to the terminal and log out by typing the command logout (see picture A)

Press enter

Go back to the original TACC page and click on “end job” (see picture B).

IT IS VERY IMPORTANT TO END THE JOB AS THE NODES ARE VERY LIMITED. THE JOB WILL KEEP RUNNING UNLESS YOU COMPLETE THIS STEP.

11. Merge the files

Once the files have been downloaded, merge the columns to create a single file.

On file A: you can see 7 features: # of syllables, speech rate, average syllable duration, articulation rate, speech-to-pause ratio, time, # of pauses (Picture A).

On file B, you can see two features: Mean Pause Duration and Variability of Pause Duration (Picture B).

Manually copy and paste these two features into file A to form one spreadsheet with 9 features (see picture).

This is the final file that contains the acoustic derivations of the samples.
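The manual copy and paste can also be done with a short script. The sketch below rests on assumptions this guide does not confirm: that both output files are CSV files and that they share one identifying column per sample. The column name "filename" and the file names in the usage comment are hypothetical, so check your actual files and adjust them.

```python
import csv
from typing import Dict, List


def merge_rows(
    rows_a: List[Dict[str, str]], rows_b: List[Dict[str, str]], key: str
) -> List[Dict[str, str]]:
    """Append file B's columns to the matching rows of file A."""
    by_key = {row[key]: row for row in rows_b}
    merged = []
    for row in rows_a:
        if row[key] not in by_key:
            raise KeyError(f"{row[key]!r} is in file A but not in file B")
        combined = dict(row)
        for column, value in by_key[row[key]].items():
            if column != key:  # do not duplicate the shared column
                combined[column] = value
        merged.append(combined)
    return merged


def merge_files(path_a: str, path_b: str, path_out: str, key: str = "filename") -> None:
    """Read both CSV files, merge them on `key`, and write the result."""
    with open(path_a, newline="") as file_a, open(path_b, newline="") as file_b:
        rows_a = list(csv.DictReader(file_a))
        rows_b = list(csv.DictReader(file_b))
    merged = merge_rows(rows_a, rows_b, key)
    if not merged:
        raise ValueError("file A has no data rows")
    with open(path_out, "w", newline="") as file_out:
        writer = csv.DictWriter(file_out, fieldnames=list(merged[0].keys()))
        writer.writeheader()
        writer.writerows(merged)


# Usage (hypothetical file names):
# merge_files("ACtrialOutputOne.csv", "ACtrialOutputTwo.csv", "ACtrialMerged.csv")
```

Matching on a shared column instead of row position means a sample that is missing from one file raises an error and is not silently paired with the wrong row.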

MADR Lab Wiki
