Authors
Tobias Raphael Spiller, Finn Rabe, Ziv Ben-Zion, Nachshon Korem, Achim Burrer, Philipp Homan, Ilan Harpaz-Rotem, Or Duek
Publication date
2023
Publisher
OSF
Description
Background
Transcription of audio files in mental health research has historically been labor-intensive and prone to error. The advent of advanced language models, such as Whisper AI, presents an opportunity to optimize the transcription process while addressing privacy and Institutional Review Board (IRB) concerns.
Methods
We provide a comprehensive tutorial on implementing a transcription pipeline using Whisper AI for psychology, psychiatry, and neuroscience research. The pipeline includes setting up the system, recording, preprocessing, transcribing, and post-processing audio data. A detailed example demonstrates the application of Whisper AI in a Python environment, guiding users through the necessary steps to initialize the model, transcribe audio files, and save the results.
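The initialize, transcribe, and save steps named above can be sketched roughly as follows. This is a minimal illustration, not the tutorial's own code: it assumes the open-source `openai-whisper` Python package (which also needs `ffmpeg` installed), the `base` model size is an arbitrary choice, and the function names `transcribe_file` and `save_transcript` are hypothetical helpers introduced here.

```python
from pathlib import Path


def transcribe_file(audio_path, model_name="base"):
    """Transcribe one audio file locally and return the transcript text.

    Assumes the `openai-whisper` package; the model size is an assumption.
    """
    # Imported lazily so save_transcript works without the package installed.
    import whisper

    model = whisper.load_model(model_name)       # initialize the model
    result = model.transcribe(str(audio_path))   # transcribe the audio file
    return result["text"].strip()


def save_transcript(text, audio_path, out_dir):
    """Save the transcript as <audio file stem>.txt inside out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / (Path(audio_path).stem + ".txt")
    out_path.write_text(text + "\n", encoding="utf-8")
    return out_path
```

A call such as `save_transcript(transcribe_file("interview_01.wav"), "interview_01.wav", "transcripts")` would then write `transcripts/interview_01.txt`; the audio file name is a placeholder. Because the model runs on the local machine in this sketch, the audio never leaves the researcher's computer, which is relevant to the privacy and IRB concerns raised in the Background.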
Results
The provided example demonstrates the effectiveness of Whisper AI in transcribing a 1-minute audio file with only minor inconsistencies.
Conclusions
Despite its limitations, Whisper AI can dramatically reduce the time-intensive work of transcription in mental health research and facilitate the analysis of audio data. This tutorial empowers researchers to make informed decisions about incorporating AI-driven transcription into their research methodologies and to harness the full potential of audio data in their studies.