Acoustic Derivation Guide

1. Access TACC Account

Access the shared credentials to enter TACC Analysis Portal

Go to https://stache.utexas.edu/

Log in with your UT credentials

Click on “secret”

1.1 Access the TACC Analysis Portal at https://tap.tacc.utexas.edu/jobs/

Enter Dr. Grasso’s account information generated by Stache and log in.

1.2 Submit a job on TACC by clicking on the dropdowns and selecting:

Lonestar 6

Jupyter Notebook

DBS23006

vm-small

Nodes 1; Tasks 1

Job name (can be anything; we will use AcousticDer_Practice in this tutorial guide)

Time limit (1 hour is usually more than enough. If this is your first time following this tutorial, give yourself 2 hours)

Click on “submit”

*Note: If you're struggling to get nodes on the vm-small queue, I'd recommend trying the development queue. This applies to everything except transcription, where you should try gpu-a100-small, then gpu-a100-dev, then gpu-a100.

image-20240603-192512.png

2. Enter Jupyter

If there are available nodes (picture A), you will be able to enter Jupyter right away. In that case, follow these steps:

Click on “connect”

Click on “work”

Click on “acousticScripts”

If there are no available nodes (picture B), you will have to wait in the queue until one becomes available.

image-20240603-192617.png

image-20240603-192812.png

3. Set up folder

Click on “new”

Click on “folder”

Name the folder (for this example, the name ACtrial will be used)

image-20240603-192948.png

4. Upload input files

Go into the folder you just created

Click on “upload” and upload the input files

If you cannot see the file you uploaded, click on “last modified” a couple of times. Sometimes it doesn’t update immediately.

5. Go to the terminal

Once the file has been uploaded, click on the New menu dropdown

Click on terminal

image-20240603-193051.png

6. Trim pauses at the start and end of audio files

(This script works for any type of audio file. It outputs a mono-channel .wav file without pauses at the start and end.)

If your files are in a format OTHER than wav, convert them into wav files following the next steps:

Once in the terminal, type cdw, press enter

Type cd acousticScripts, press enter

Type conda activate racs, press enter

Type or copy the following command, then press enter:

python monoTrimAudioFiles.py ACtrial/ ACtrialTrimmed/

Keep in mind that the red and green sections change depending on the name you gave your folder (with input files) and the name you will give your folder (with output files). The commands in purple always remain the same (See picture A).

Wait a few seconds (or minutes). You will know it is done running when you see this at the bottom of the terminal (see picture B).

Then, type conda deactivate

Press enter (see picture C)

image-20240603-193200.png

The parts circled in red change depending on the name of your folder and the output folder.

ACtrial/ is the name I gave my folder. Your command will change depending on the name of your folder. Remember that the spelling and capitalization must match exactly. If the name of the folder that you created is all in lowercase, the name in the terminal command must be all in lowercase.

ACtrialTrimmed/ is the name I gave to my output folder. You can change this part of the command depending on the name you want to give your output folder.
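Since a mismatch in capitalization is an easy mistake to make, the exact-match rule can be checked before running the command. The following is only a minimal sketch, not part of the lab's scripts: the function name check_folder_name is hypothetical, and it assumes you run it from the directory that contains your input folder.

```python
import os


def check_folder_name(typed_name, parent="."):
    """Check whether a folder name typed in a command exists exactly as typed.

    Returns (exists, suggestions), where suggestions lists folders whose
    names differ from the typed name only in capitalization.
    """
    name = typed_name.rstrip("/")  # commands in this guide end names with "/"
    folders = [
        entry for entry in os.listdir(parent)
        if os.path.isdir(os.path.join(parent, entry))
    ]
    if name in folders:
        return True, []
    suggestions = [f for f in folders if f.lower() == name.lower()]
    return False, suggestions


if __name__ == "__main__":
    # "ACtrial/" is the example folder name used in this guide.
    exists, suggestions = check_folder_name("ACtrial/")
    if exists:
        print("Folder found; the name matches exactly.")
    elif suggestions:
        print("Not found. Did you mean:", ", ".join(suggestions))
    else:
        print("Not found in the current directory.")
```

If the check reports a suggestion, retype the command using the suggested spelling.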

image-20240603-193228.png

7. Derive first acoustic parameters

Then, type conda activate textGridPauseSyllable

Press enter

Type mkdir ACtrialTextGrids, press enter

Then, enter the command below, press enter:

python derivePraatFeatsAndTextGrids_CAC.py ACtrialTrimmed/ ACtrialOutputOne ACtrialTextGrids/

#Generalized command: python derivePraatFeatsAndTextGrids_CAC.py inputDirectory/ outputName textGridDirectory/

Again, the sections in purple will remain the same while the parts in red, green, and blue may change. The part in red needs to match the name of the output folder that you entered in step 6. The 2 green sections in this step need to match because they refer to the same directory. The part in blue can also change to whatever you decide. Just note that green and blue will be repeated in step 8, and they need to match in both steps.

*If your files were originally WAV files and you didn’t need to do step 6, then the section in red will be the name of the folder where your input files are located.

**If you encounter an error in this step, it is possible that a “hidden” file is causing it. This error will look like:

Sound not read from sound file “/work/09424/smgrasso1/ls6/acousticScripts/TRIMMED_FOLDER/.ipynb_checkpoints”.

If you get this error, you need to delete this file by typing in:

rm -rf TRIMMED_FOLDER/.ipynb_checkpoints
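Hidden entries such as .ipynb_checkpoints do not show up in a plain directory listing, so it can help to look for them before running the command. The sketch below is an optional illustration, not part of the lab's scripts: the function name find_hidden_entries is hypothetical, and it treats any name starting with a dot as hidden. It only lists entries and does not delete anything.

```python
import os


def find_hidden_entries(folder):
    """Return the names in `folder` that start with a dot, sorted."""
    return sorted(
        entry for entry in os.listdir(folder) if entry.startswith(".")
    )


if __name__ == "__main__":
    # "ACtrialTrimmed" is the example output folder name from step 6.
    folder = "ACtrialTrimmed"
    if os.path.isdir(folder):
        for entry in find_hidden_entries(folder):
            print("hidden entry found:", os.path.join(folder, entry))
    else:
        print("Folder not found:", folder)
```

Any entry it reports can then be removed with the rm -rf command shown above.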

image-20240603-193345.png

Note: The directory name in the command mkdir ACtrialTextGrids can change (green section only). This command creates a new directory. Whatever name you type, make sure it matches the name in the final command (the long one at the end of step 7).

You should see this when the command is done running:

image-20240603-193443.png

8. Derive second acoustic parameters

Now, type the following command, then press enter:

python derivePauseDurationStatsFromTextGrid_CAC.py ACtrialTextGrids/ ACtrialOutputTwo

#Generalized command: python derivePauseDurationStatsFromTextGrid_CAC.py textGridDirectory/ outputName

The parts in green and blue will change depending on what you typed in previous steps. They need to match what you typed previously.
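Because the folder names typed in steps 6 through 8 have to agree with each other, one way to avoid mismatches is to generate all the commands from a single set of names and then copy them into the terminal. This is a rough sketch only: build_commands is a hypothetical helper, and it assumes the script names and argument order of the example commands in this guide. It does not include the conda activate and conda deactivate lines, and it assumes names without spaces.

```python
def build_commands(input_dir, trimmed_dir, textgrid_dir, output_one, output_two):
    """Build the step 6-8 terminal commands from one set of names."""

    def as_dir(name):
        # Directory arguments in this guide are written with a trailing "/".
        return name.rstrip("/") + "/"

    return [
        # Step 6: trim pauses and write mono .wav files.
        f"python monoTrimAudioFiles.py {as_dir(input_dir)} {as_dir(trimmed_dir)}",
        # Step 7: create the TextGrid directory, then derive the first features.
        f"mkdir {textgrid_dir.rstrip('/')}",
        "python derivePraatFeatsAndTextGrids_CAC.py "
        f"{as_dir(trimmed_dir)} {output_one} {as_dir(textgrid_dir)}",
        # Step 8: derive the pause-duration statistics from the TextGrids.
        "python derivePauseDurationStatsFromTextGrid_CAC.py "
        f"{as_dir(textgrid_dir)} {output_two}",
    ]


if __name__ == "__main__":
    # Example names used in this guide.
    for command in build_commands(
        "ACtrial", "ACtrialTrimmed", "ACtrialTextGrids",
        "ACtrialOutputOne", "ACtrialOutputTwo",
    ):
        print(command)
```

Each name is typed once, so the trimmed folder and the TextGrid directory are guaranteed to be spelled the same way in every command that uses them.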

image-20240603-193615.png

You will see this when it is done running:

image-20240603-193552.png

9. Download output files

Go back to the working directory. If the generated files did not pop up, click on “last modified”. Sometimes it takes a minute to update.

You will see 2 output files. Check the boxes and click on download. You may need to download them one by one.

image-20240603-193649.png

10. Log out and end job

Go back to the terminal and log out by typing the command logout (see picture A)

Press enter

Go back to the original TACC page and click on “end job” (see picture B).

IT IS VERY IMPORTANT TO END THE JOB AS THE NODES ARE VERY LIMITED. THE JOB WILL KEEP RUNNING UNLESS YOU COMPLETE THIS STEP.

image-20240603-193720.png

image-20240603-193749.png

11. Merge the files

Once the files have been downloaded, merge the columns to create a single file.

In file A, you can see 7 features: # of syllables, speech rate, average syllable duration, articulation rate, speech-to-pause ratio, time, and # of pauses (Picture A).

In file B, you can see two features: Mean Pause Duration and Variability of Pause Duration (Picture B).

Manually copy and paste these two features from file B into file A to form one spreadsheet with 9 features (see picture).

This is the final file that contains the acoustic derivations of the samples.
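If you have many samples, the copy-and-paste step could also be scripted. The sketch below is an optional alternative under stated assumptions, not the lab's procedure: it assumes both output files are saved as CSV, that the rows of file A and file B list the same samples in the same order, and that the column headers in file B match the names you pass in. The function names and any file names are hypothetical, so check the real headers in your files first.

```python
import csv


def merge_feature_rows(rows_a, rows_b, columns_from_b):
    """Copy the given columns from each row of B into the matching row of A.

    Rows are matched by position, so both lists must be in the same order.
    """
    if len(rows_a) != len(rows_b):
        raise ValueError("files have different numbers of rows")
    merged = []
    for row_a, row_b in zip(rows_a, rows_b):
        combined = dict(row_a)
        for column in columns_from_b:
            combined[column] = row_b[column]
        merged.append(combined)
    return merged


def merge_csv_files(path_a, path_b, path_out, columns_from_b):
    """Read two CSV files, merge them row by row, and write the result."""
    with open(path_a, newline="") as file_a, open(path_b, newline="") as file_b:
        rows_a = list(csv.DictReader(file_a))
        rows_b = list(csv.DictReader(file_b))
    merged = merge_feature_rows(rows_a, rows_b, columns_from_b)
    if not merged:
        raise ValueError("no rows to merge")
    with open(path_out, "w", newline="") as file_out:
        writer = csv.DictWriter(file_out, fieldnames=list(merged[0].keys()))
        writer.writeheader()
        writer.writerows(merged)
```

A call might look like merge_csv_files("fileA.csv", "fileB.csv", "merged.csv", ["Mean Pause Duration", "Variability of Pause Duration"]), with the file names and headers replaced by the ones in your downloads. Spot-check a few rows of the merged file against the originals to confirm the samples line up.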

image-20240603-194127.png

image-20240603-194141.png

image-20240603-194201.png