GSoC Red Hen Lab — Week 2
Pipeline for Multimodal Television Show Segmentation
Hi, welcome back! If you haven't already read the introductory article from the community bonding period, I strongly recommend doing so before continuing.
It's currently week two of Google Summer of Code and so far I've been having a blast setting up my project. During the first week I familiarized myself with the CWRU HPC clusters and developed the baseline code for music classification.
To accomplish this task I set out to use the inaSpeechSegmenter library. At an early stage, Frankie helped point out that the library was not well suited to run on the Rosenthal collection as-is. One of the major targets for this week is to adapt the library so it can asynchronously process batches of 8-hour-long mp4 files.
Goals for the week
- Modify the Slurm job to copy the source code and a batch of mp4 files to the temporary directory on the allocated GPU node.
- Modify the featGenerator method of inaSpeechSegmenter to process multiple files, split them into 45-minute segments, and classify each segment as music/noise/speech.
- If 1 & 2 are accomplished, proceed to save the outputs back to the gallina home directory.
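To make the 45-minute chunking concrete, here is a minimal sketch of how an 8-hour file breaks into fixed-length segments. This is not the actual inaSpeechSegmenter code; `segment_bounds` is a hypothetical helper working in seconds:

```python
SEGMENT_LEN = 45 * 60  # 45 minutes, in seconds

def segment_bounds(duration_s, segment_len=SEGMENT_LEN):
    """Return (start, end) pairs covering [0, duration_s) in fixed chunks."""
    bounds = []
    start = 0
    while start < duration_s:
        end = min(start + segment_len, duration_s)
        bounds.append((start, end))
        start = end
    return bounds

# An 8-hour recording yields ten full 45-minute chunks plus a 30-minute remainder.
print(segment_bounds(8 * 3600))
```

Each pair can then be handed to the feature-extraction step independently, which is what makes batching and parallelism possible later.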
Difficulties along the way
One of the most troublesome problems this week was setting up the Singularity image on the HPC login node and then using the Singularity container on the GPU nodes. Since the home directory on the HPC is quite limited, I set up a symbolic link pointing to my gallina home directory. However, this caused a load of issues when trying to use Singularity within a GPU node: the GPU nodes don't have access to the gallina directory, so the .singularity folder was nowhere to be found. Ultimately, this prevented me from using the GPU nodes.
Conclusion
I was able to successfully write a bash script that dynamically loads the mp4 files based on the batch index value.
n=$SLURM_ARRAY_TASK_ID
i=0
allFiles=()
while IFS= read -r line; do
  if [ "$i" -eq "$n" ]; then
    echo "i equal to n"
    # Word-split the matching line into its individual file paths
    allFiles+=($line)
    for f in "${allFiles[@]}"; do
      echo "$f"
      rsync -az "hpc3:${f}" /tmp/$USER/mtvss/data/tmp/video_files
    done
  fi
  i=$((i+1))
done < /tmp/$USER/mtvss/data/tmp/batch_cat1.txt
As shown above, the script reads batch_cat1.txt line by line. When a line's index matches the batch index (the array task id), it loops through all the file paths on that line and uses rsync to copy them to /tmp.
In the end, I was able to make significant changes to the media2feats module to process files in 45-minute segments. However, the way I was starting and joining the threads was still suboptimal. Frankie suggested using a queue data structure to store the processed 45-minute segments; the queue can then be used to yield the features (mfcc, loge) for segmentation. This multithreaded approach would substantially increase the rate at which the files are processed.
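The queue-based design can be sketched roughly as follows. This is a simplified producer/consumer illustration, not the actual media2feats code; `extract` stands in for the real feature-extraction step that would return (mfcc, loge) for one 45-minute segment:

```python
import queue
import threading

def feature_producer(segments, extract, out_q):
    # Featurize each 45-minute segment and hand it to the consumer.
    for seg in segments:
        out_q.put(extract(seg))
    out_q.put(None)  # sentinel marks the end of the stream

def stream_features(segments, extract, maxsize=2):
    # A bounded queue keeps at most `maxsize` featurized segments in memory,
    # so decoding the next segment overlaps with segmenting the current one.
    out_q = queue.Queue(maxsize=maxsize)
    t = threading.Thread(target=feature_producer, args=(segments, extract, out_q))
    t.start()
    while True:
        feats = out_q.get()
        if feats is None:
            break
        yield feats
    t.join()

# Toy usage: `extract` here just doubles a number; in media2feats it would
# return the feature arrays for one chunk.
print(list(stream_features([1, 2, 3], lambda s: s * 2)))  # [2, 4, 6]
```

The bounded queue is the key design choice: it decouples the decode/featurize step from the segmentation step without letting an 8-hour file's worth of features pile up in memory at once.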