3. Whisper transcription process

 

 

How to process audio files through Whisper

1. Access TACC Account

Access the shared credentials to enter the TACC Analysis Portal.

 

 

 

1.1

If you are accessing TACC for the first time, please go here: https://tap.tacc.utexas.edu/

 

If you have already logged into TACC before, please access the Analysis Portal here: https://tap.tacc.utexas.edu/jobs/

Enter Dr. Grasso's account information generated on Stache and log in.

 

 

 

 

1.2

Submit a job on TACC by clicking on the dropdowns and selecting:

  • Lonestar 6

  • Jupyter Notebook

  • DBS23006

  • gpu-a100-small

  • Nodes 1; Tasks 1

  • Job name (can be anything)

  • Time limit (2 hours is recommended when downloading several or lengthy audio files)

  • Click on "submit"

    *Note: If you're struggling to get nodes on the gpu-a100-small queue, try gpu-a100-dev first, and then gpu-a100.


 

 

 

2. Enter Jupyter

If there are available nodes (picture A), you will be able to enter Jupyter right away. In that case, follow these steps:

  • Click on "connect"

  • Click on "work"

  • Click on whisperRuns


    If there are no available nodes (picture B), you will have to wait in the queue until one becomes available.


 

BA

 

 

3. Set up folder

  • Click on "new"

  • Click on "folder"

  • Name folder (for this example, the name Feb09Spanish_WAB will be used)


    This step is only needed when running several files at the same time.


 

 

 

4. Upload audio files

  • Go to the folder you created (Feb09Spanish_WAB)

  • Create another folder inside it with the name "input"

  • Go into the "input" folder and click on "upload" to upload the audio files. Do not leave spaces in the audio file names (e.g., use BISE016_post_1_Cat instead of BISE016_post 1_Cat); otherwise the program will not find the file.

  • Remember that you must have downloaded and stored the audio files prior to this step

    If running only 1 file, upload your file directly in the folder whisperRuns
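
    If some files were already uploaded with spaces in their names, they can be renamed from the terminal instead of being uploaded again. The following is a minimal sketch, not part of the lab's setup: fix_spaces is a hypothetical helper that replaces spaces with underscores in every file name directly inside the folder it is given.

```shell
# Hypothetical helper: replace spaces with underscores in the names of all
# files directly inside the given folder. Existing files are never overwritten.
fix_spaces() {
    dir=$1
    for path in "$dir"/*; do
        [ -e "$path" ] || continue                 # empty folder: nothing to do
        name=$(basename "$path")
        fixed=$(printf '%s' "$name" | tr ' ' '_')  # "post 1" becomes "post_1"
        [ "$name" != "$fixed" ] || continue        # no spaces in this name
        if [ -e "$dir/$fixed" ]; then
            echo "skipped (target already exists): $name"
        else
            mv -- "$path" "$dir/$fixed"
            echo "renamed: $name -> $fixed"
        fi
    done
}
```

    After pasting the function into the terminal, it could be run from inside Feb09Spanish_WAB as: fix_spaces input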



 



-------------------------------------------------------------------------------------------------





 

 

5. Go to the terminal

  • Once your files have been uploaded, click on the new menu dropdown

  • Click on terminal




 

 

 

6. Type the command (if running several files at once)

Once you are in the terminal, follow these steps (if running several audio files):

  • Type cdw, press enter

  • Type cd whisperRuns, press enter

  • Type cd Feb09Spanish_WAB, press enter

  • Type conda activate runWhisper, press enter

  • Type command below, then press enter:



    whisper --model large-v3 --language Spanish --output_format txt --device cuda --hallucination_silence_threshold 8 --output_dir Feb09Output input/*

 




The parts of the command in red change depending on:

  • Language of the audio sample, in this case Spanish.

  • Feb09Output is the name that we are giving to the folder where the generated files will be stored. You can change this part of the command depending on the name you want to give the folder.

  • input is the name of the folder that we previously created (where our input audio files are saved). Remember that the spelling and capitalization must match exactly. If the name of the folder is all in lowercase, it must be typed in all lowercase in the terminal.
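
One way to keep the three changeable parts straight is to put them in variables and print the full command before running it. This is only a sketch: the variable names LANGUAGE, OUTPUT_DIR, and INPUT_DIR and the helper whisper_cmd are made up for illustration, and the flags are copied unchanged from the command above.

```shell
# Hypothetical variables: edit these three values for each run.
LANGUAGE="Spanish"          # language spoken in the audio
OUTPUT_DIR="Feb09Output"    # folder where Whisper stores the transcripts
INPUT_DIR="input"           # folder with the audio; must match exactly, including case

# Print the full command so it can be checked before it is run.
whisper_cmd() {
    echo "whisper --model large-v3 --language $LANGUAGE --output_format txt" \
         "--device cuda --hallucination_silence_threshold 8" \
         "--output_dir $OUTPUT_DIR $INPUT_DIR/*"
}

whisper_cmd
```

The printed line can then be copied into the terminal after conda activate runWhisper. One point that may be worth confirming with whisper --help: the Whisper command-line help describes --hallucination_silence_threshold as requiring --word_timestamps True, so the threshold may have no effect in the command as written.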

 

 

Type the command (if running single file)

If running 1 single file, follow these steps:

  • Type cdw, press enter

  • Type cd whisperRuns, press enter

  • Type conda activate runWhisper

  • Press enter

Type command below, then press enter:

whisper --model large-v3 --language Spanish --output_format txt --device cuda --hallucination_silence_threshold 8 --output_dir BISE004_Output BISE004_CatRescue.wav

The parts of the command in bold change depending on:

  • Language of the audio sample, in this case Spanish.

  • BISE004_Output is the name that we are giving to the output folder. You can change this part of the command depending on the name you want to give the folder.

  • BISE004_CatRescue.wav is the name of the file that we previously uploaded. The command will change depending on the file being transcribed and its name.

    Remember that the spelling and capitalization must match exactly. If the name of the file is all in lowercase, it must be typed in all lowercase in the terminal.
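
    Because file names on the TACC system are case-sensitive, it can help to confirm the file name before starting Whisper. A minimal sketch follows; check_audio is a hypothetical helper, not an existing command.

```shell
# Hypothetical pre-flight check: confirm that the audio file exists under
# exactly the name that will be typed in the whisper command.
check_audio() {
    if [ -f "$1" ]; then
        echo "found: $1"
    else
        echo "not found: $1 (check spelling, capitalization, and spaces)" >&2
        return 1
    fi
}
```

    For the example above, this would be run from the whisperRuns folder as: check_audio BISE004_CatRescue.wav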

 

 

 

 

 

7. Whisper Running

Depending on the number of files, the run may take some time. You will see the transcription appear in real time, and you will know the run is finished when you see this at the bottom of the terminal (see picture).

 

 

 

8. Find output files

Once the files are finished running, follow the next steps:

  • Go back to the notebook

  • Click on the refresh button (if needed)

    If the generated files do not appear, click on "last modified".
    For a single file, you will find the output within the whisperRuns folder (see picture A).
    For multiple files, you will find the output files within the Feb09Spanish_WAB folder (see picture B). You should see a subfolder called Feb09Output where your transcriptions are stored.
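
    When many files were run at once, it is easy to miss one that did not get transcribed. The sketch below lists every audio file that has no matching transcript. It assumes that Whisper names each transcript after the audio file, with the extension replaced by .txt; missing_transcripts is a hypothetical helper.

```shell
# Hypothetical helper: print each audio file in the input folder that has no
# transcript with the same base name (plus .txt) in the output folder.
missing_transcripts() {
    in_dir=$1
    out_dir=$2
    for audio in "$in_dir"/*; do
        [ -f "$audio" ] || continue
        base=$(basename "$audio")
        stem=${base%.*}                      # drop the last extension, e.g. ".wav"
        [ -f "$out_dir/$stem.txt" ] || echo "$base"
    done
}
```

    From inside Feb09Spanish_WAB it could be run as: missing_transcripts input Feb09Output. No output means every audio file has a transcript.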

 

A



B

 

 

9. Download output files

  • Download your files

  • You may need to download one by one (if several files)
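
As an alternative to downloading the files one by one, the output folder can be packed into a single archive from the terminal and downloaded as one file. This is a sketch under the assumption that tar is available on the node; bundle_output is a hypothetical helper.

```shell
# Hypothetical helper: pack an output folder into one compressed archive
# saved next to it (e.g. Feb09Output becomes Feb09Output.tar.gz).
bundle_output() {
    out_dir=$1
    tar -czf "$out_dir.tar.gz" "$out_dir" && echo "created: $out_dir.tar.gz"
}
```

From inside Feb09Spanish_WAB it could be run as: bundle_output Feb09Output. The archive counts as one more file on TACC, so it should be deleted together with the other files in step 13.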



 

 

 

10. Clear Cache from TACC

Run the command below to clear the cache. This step is important because TACC will not allow you to start a new session in the future if the home directory exceeds 9 GB (the cache directory is inside the home directory).

  • Type rm -r ~/.cache/ on the terminal

  • Press enter
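
To see how close the home directory is to the 9 GB limit, before or after clearing the cache, the folder sizes can be checked from the terminal. The sketch below uses the standard du command; dir_size_mb is a hypothetical helper.

```shell
# Hypothetical helper: print the size of a folder in whole megabytes.
# du -sk reports the total in kilobytes; a missing folder counts as 0.
dir_size_mb() {
    kb=$(du -sk "$1" 2>/dev/null | cut -f1)
    echo $(( ${kb:-0} / 1024 ))
}

# Example usage (checking the whole home directory can take a while):
#   dir_size_mb "$HOME"
#   dir_size_mb "$HOME/.cache"
```

A result close to 9000 for the home directory means the limit is nearly reached.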

 

 

 

 

11. Log out

  • Log out of the terminal by typing the command "logout" (see picture A).






  • Go back to the original TACC page and click on "end job"


    IT IS VERY IMPORTANT TO END THE JOB AS THE NODES ARE VERY LIMITED. IT WILL KEEP RUNNING UNLESS YOU DO THIS STEP.

 

A
B

 

 

12. Upload files to Box

Go to this link: https://utexas.box.com/s/ghd8ho1ciko1n2le94u796cnhcf57rgw
The link will show you the hierarchy (task steps and Box folders).
As the hierarchy shows, you need to be in the Connected Speech Data folder: https://utexas.box.com/s/uz6206lel0544c3auou0nsw57egcv5q6
Once in this folder, go through the following sub-folders in order:

  1. Therapy trial

  2. Spanish (choose language depending on your transcription)

  3. Clipped audio of tasks

  4. S1_CatRescue_PictureStoryDescription

  5. Finally, upload the txt files (Whisper output) in this folder: 1. CatRescue_whisper_output

    Keep in mind that the folders highlighted in purple will change depending on your transcription task. Cat Rescue Picture Description was chosen for the example. However, you will need to choose the appropriate folder name depending on the type of file you are running through Whisper. For example, if you are running a Cat Rescue Recall task, then the folders would be:
     

    1. Therapy trial

    2. Spanish

    3. S1_CatRescue_Recall

    4. CatRescueRecall_whisper_output




 




 

 

13. Delete files from TACC

Once the files have been uploaded to Box, delete them from TACC.

 

 

 

 

Update the status of Whisper transcriptions in Smartsheet


What should I prioritize? This is the hierarchy:

1. Run Whisper R01 samples or non-Barcelona samples (CONNECTED SPEECH)

Picnic Scene Spa, Cat

2. Run Whisper R01 samples or non-Barcelona samples (VISTA PROBES) 

3. Run Whisper R01 samples or non-Barcelona samples (CONNECTED SPEECH), other samples, in order of priority: the further left a column is in the Smartsheet reports, the higher its priority

4. Run non-R01, eval básica, and screening samples (CONNECTED SPEECH)
