media presentation and may include receiving an audio component and a subtitle component
associated with a media presentation, the audio component including an audio
sequence, the audio sequence divided into a plurality of audio segments; evaluating the
plurality of audio segments using a combination of a recurrent neural network and a
convolutional neural network to identify refined speech segments associated with the audio …