Acoustic Derivation Guide
...
| Access the shared credentials needed to enter the TACC Analysis Portal: go to https://stache.utexas.edu/, log in with your UT credentials, and click on “secret”. | |||||||
1.1 | Access the TACC Analysis Portal at https://tap.tacc.utexas.edu/jobs/. Enter Dr. Grasso’s account information generated by Stache and log in. | |||||||
1.2 | Submit a job on TACC by clicking on the dropdowns and selecting: Lonestar 6; Jupyter Notebook; DBS23006; vm-small; Nodes: 1; Tasks: 1; Job name (can be anything; we will use AcousticDer_Practice in this tutorial guide); Time limit (1 hour should be more than enough, but if it’s your first time following this tutorial, you may want to give yourself more time to get through it; 2 hours should be fine). Then click on “submit”. *Note: If you're struggling to get nodes on the vm-small queue, I'd recommend trying the development queue. This applies to everything except transcription, where you should try gpu-a100-small, followed by gpu-a100-dev, followed by gpu-a100. | |||||||
| If there are available nodes (picture A), you will be able to enter Jupyter right away. In that case, click on “connect”, then “work”, then “acousticScripts”. If there are no available nodes (picture B), you will have to wait in the queue until one becomes available. | |||||||
| Click on “new”, then click on “folder”. Name the folder (for this example, the name PostTxSamples_NFV will be used). | |||||||
| Go into the folder you just created. Click on “upload” and upload the input files. If you cannot see the file you uploaded, click on “last modified” a couple of times; sometimes the list doesn’t update immediately. | |||||||
| Once the file has been uploaded, click on the “New” dropdown menu, then click on “Terminal”. | |||||||
Trim pauses at the start and end of audio files (This script works for any type of audio file! It will output a mono-channel .wav file without pauses at the start and end) | If your files are in a format OTHER than wav, convert them into wav files with the following steps. Once in the terminal, type cdw and press enter. Type cd acousticScripts and press enter. Type conda activate racs and press enter. Type or copy the following command, then press enter: ./monoAudioFiles.sh PostTxSamples_NFV/ PostTxSamples_NFV_Wav1 #Generalized command: ./monoAudioFiles.sh inputDirectory/ outputWavDirectory Keep in mind that the red and green sections change depending on the name you gave your folder (with input files) and the name you will give your folder (with output files). The commands in purple always remain the same (see picture A). Wait a few seconds (or minutes). You will know it is done running when you see this at the bottom of the terminal (see picture B). Then type conda deactivate and press enter (see picture C). | The parts circled in red change depending on the name of your folder and the output folder. PostTxSamples_NFV/ is the name I gave my folder. Your command will change depending on the name of your folder. Remember that the spelling and capitalization must match exactly: if the name you uploaded is all in lowercase, the command in the terminal must be all in lowercase too. PostTxSamples_NFV_Wav1 is the name I gave to my output folder. You can change this part of the command depending on the name you want to give your output folder. | ||||||
| Then type conda activate textGridPauseSyllable and press enter. Type mkdir PostTxSamples_NFV_TextGrids1 and press enter. Then enter the command below and press enter: python derivePraatFeatsAndTextGrids_CAC.py PostTxSamples_NFV_Wav1/ PostTxSamples_NFV PostTxSamples_NFV_TextGrids1/ #Generalized command: python derivePraatFeatsAndTextGrids_CAC.py wavDirectory/ outputName textGridDirectory1/ Again, the sections in purple remain the same, while the parts in red, green, and blue may change. The part in red (the wav directory) needs to match the name of the output folder that you entered in step 6. The 2 green sections in this step (the directory created with mkdir and the TextGrid directory in the command) need to match, as they are the same name. The part in blue (the output name) can also change to whatever you decide. Just note that green and blue will be repeated in step 8, and they need to match in both steps. *If your files were originally WAV files and you didn’t need to do step 6, then the section in red will be the name of the folder where your input files are located. | Note: The command mkdir PostTxSamples_NFV_TextGrids1 can change (green section only). This command creates a new directory. Whatever you type, make sure it matches the final command (the long one at the end of step 7). You should see this when the command is done running: | ||||||
| Now type the following command, then press enter: python derivePauseDurationStatsFromTextGrid_CAC.py PostTxSamples_NFV_TextGrids1/ PostTxSamples_NFV #Generalized command: python derivePauseDurationStatsFromTextGrid_CAC.py textGridDirectory1/ outputName The parts in green and blue will change depending on what you typed in previous steps *they need to match what you typed previously* |
You will see this when it is done running: | ||||||
| Go back to the working directory. If the generated files did not pop up, click on “last modified”; sometimes it takes a minute to update. You will see 2 output files. Check the boxes and click on download. You may need to download them one by one. | |||||||
| Go back to the terminal and log out by typing the command logout and pressing enter (see picture A). Go back to the original TACC page and click on “end job” (see picture B). IT IS VERY IMPORTANT TO END THE JOB AS THE NODES ARE VERY LIMITED. THE JOB WILL KEEP RUNNING UNLESS YOU COMPLETE THIS STEP. | |||||||
11. Merge the files | Once the files have been downloaded, merge the columns to create a single file. In file A, you can see 7 features: # of syllables, speech rate, average syllable duration, articulation rate, speech-to-pause ratio, time, and # of pauses (Picture A). In file B, you can see two features: Mean Pause Duration and Variability of Pause Duration (Picture B). Manually copy and paste these two features into file A to form one spreadsheet with 9 features (see picture). This is the final file that contains the acoustic derivations of the samples. |
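
If you would rather not copy and paste by hand, the merge in step 11 can be sketched in Python. This is only an illustration, not part of the acousticScripts folder. It assumes that both downloaded files are CSV files with a header row and that they list the same samples in the same order; the guide does not confirm either point, so open the files and check first. The function name, the file names, and the column names in the usage note are hypothetical placeholders.

```python
import csv


def merge_feature_files(path_a, path_b, out_path, pause_columns):
    """Append the columns named in pause_columns from file B to the rows of file A.

    Assumption: both files are CSVs with a header row, and row i in file A
    describes the same sample as row i in file B (the same assumption you make
    when pasting columns side by side in a spreadsheet).
    """
    with open(path_a, newline="") as file_a:
        reader_a = csv.DictReader(file_a)
        fields_a = list(reader_a.fieldnames or [])
        rows_a = list(reader_a)

    with open(path_b, newline="") as file_b:
        rows_b = list(csv.DictReader(file_b))

    # Refuse to merge if the files do not have the same number of samples.
    if len(rows_a) != len(rows_b):
        raise ValueError(
            f"Row counts differ: {len(rows_a)} in file A, {len(rows_b)} in file B"
        )

    merged = []
    for row_a, row_b in zip(rows_a, rows_b):
        row = dict(row_a)
        for column in pause_columns:
            row[column] = row_b[column]  # raises KeyError if the column name is wrong
        merged.append(row)

    with open(out_path, "w", newline="") as out_file:
        writer = csv.DictWriter(out_file, fieldnames=fields_a + list(pause_columns))
        writer.writeheader()
        writer.writerows(merged)

    return merged


# Example call (hypothetical file and column names; replace with your own):
# merge_feature_files(
#     "fileA.csv",
#     "fileB.csv",
#     "merged.csv",
#     ["Mean Pause Duration", "Variability of Pause Duration"],
# )
```

The column names passed in the last argument must match the header of file B exactly, including capitalization, just like the folder names in the terminal commands. After merging, open the output file and confirm that it has the 9 features described above.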