Paper Key: IRJ************217
Author: Prathamesh Lal Bahadur Yadav
Date Published: 17 Oct 2023
Abstract
Convolutional Neural Networks (CNNs) have become a powerful tool in computer vision, transforming image classification, object detection, and other visual recognition tasks. In this paper, we apply a CNN model to speech classification. Speech Emotion Recognition (SER) is a crucial task in human-computer interaction, with applications in mental health monitoring, customer service, and human-robot interaction. SER systems use a variety of methods to process speech signals and classify the emotions embedded in them. We use the Crema dataset, which comprises audio recordings in .wav format. Our results demonstrate the potential of CNNs for SER and highlight how they can simplify feature engineering and improve accuracy, thereby contributing to more emotionally intelligent human-computer interaction.
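
To make the idea of applying a CNN to speech concrete, the following is a minimal sketch, not the model used in this paper. It assumes a common pipeline in which a waveform is turned into a log-magnitude spectrogram and then treated as a single-channel image by a convolutional layer. Everything in it is illustrative: the function names (`log_spectrogram`, `conv2d_valid`, `forward`), the frame and hop sizes, the number of filters, the random untrained weights, the synthetic sine wave standing in for a .wav clip, and the class count `n_classes = 6` are all assumptions rather than details taken from the paper.

```python
import numpy as np

def log_spectrogram(wave, frame_len=256, hop=128):
    """Log-magnitude short-time Fourier transform, shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(wave) - frame_len) // hop
    frames = np.stack([wave[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))

def conv2d_valid(image, kernel):
    """Single-channel 2D cross-correlation with no padding ('valid')."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def forward(spec, kernels, dense_w, dense_b):
    """Conv -> ReLU -> global average pool -> dense layer -> softmax."""
    feats = np.array([np.maximum(conv2d_valid(spec, k), 0.0).mean()
                      for k in kernels])
    logits = feats @ dense_w + dense_b
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    return exp / exp.sum()

rng = np.random.default_rng(0)

# Synthetic one-second tone; a real pipeline would load a .wav file instead.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
wave = np.sin(2 * np.pi * 220.0 * t)

spec = log_spectrogram(wave)

n_classes = 6                         # placeholder class count (assumption)
kernels = rng.normal(size=(8, 3, 3))  # 8 untrained 3x3 filters (assumption)
dense_w = rng.normal(size=(8, n_classes))
dense_b = np.zeros(n_classes)

probs = forward(spec, kernels, dense_w, dense_b)
print(spec.shape, probs.shape)
```

Because the filters operate directly on the time-frequency representation, no hand-crafted acoustic features are computed in this sketch; in a trained model the filter weights would be learned from labelled recordings rather than drawn at random.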