EMOTIONECHO: AI-Powered Speech Emotion Detection For Human-Machine Interaction

Authors

  • Dr. Md Nayer, Associate Professor, Dept. of CSE-AIML, Lords Institute of Engineering and Technology
  • Md Ishaq Ahmed, Ahmed Khan, Mohd Rayaan Uddin Haqqani, B.E. Students, Dept. of CSE-AIML, Lords Institute of Engineering and Technology

Keywords:

Speech Emotion Recognition, LSTM, MFCC, Deep Learning, Human-Computer Interaction, Sentiment Analysis

Abstract

This research presents a speech emotion recognition (SER) system that uses a Long Short-Term Memory (LSTM) network to classify emotions from audio signals. The system extracts Mel-Frequency Cepstral Coefficients (MFCC) together with their delta and delta-delta features to capture temporal dynamics. Two widely used emotional speech datasets, TESS and RAVDESS, were combined to improve model generalization across diverse voices and expressions. The audio was preprocessed to standardize sampling rates and durations, after which MFCC features were extracted and mean-pooled over time. The LSTM model, trained on the combined dataset, classifies seven emotions: angry, calm, disgust, fear, happy, sad, and surprise. The proposed system achieved high accuracy, demonstrating the effectiveness of temporal feature modeling in capturing emotional cues from speech. The study highlights the value of deep learning for voice-based sentiment analysis, with potential applications in human-computer interaction, virtual assistants, and mental health monitoring.
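
The feature step described in the abstract (MFCC plus delta and delta-delta, then mean pooling over time) can be illustrated with a minimal sketch. It assumes the MFCC matrix has already been computed and is passed in as a list of frames; the function names `delta` and `pooled_features`, the regression window `width=2`, and the toy input are all hypothetical, and the regression-style delta formula is one common choice rather than the authors' confirmed method.

```python
from statistics import fmean


def delta(frames, width=2):
    """First-order regression delta along time; edge frames are repeated as padding.

    frames: list of per-frame coefficient lists, shape (n_frames, n_coeffs).
    width:  number of neighbouring frames on each side (an assumed value).
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(n_coeffs):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def pooled_features(mfcc):
    """Stack MFCC, delta, and delta-delta per frame, then average over time."""
    d1 = delta(mfcc)   # delta
    d2 = delta(d1)     # delta-delta (delta of the delta)
    stacked = [m + a + b for m, a, b in zip(mfcc, d1, d2)]
    # zip(*stacked) iterates over coefficient columns, so each mean runs over time.
    return [fmean(column) for column in zip(*stacked)]


# Toy input: 6 frames, 2 coefficients (a ramp and a constant), not real MFCCs.
toy_mfcc = [[float(t), 1.0] for t in range(6)]
features = pooled_features(toy_mfcc)
print(len(features))  # 3 * n_coeffs values per clip
```

Mean pooling yields one fixed-length vector per clip (three times the number of MFCC coefficients), regardless of clip duration. How that vector is shaped for the LSTM input is not stated in the abstract.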

Published

2026-04-06

Section

Articles

How to Cite

EMOTIONECHO: AI-Powered Speech Emotion Detection For Human-Machine Interaction. (2026). International Journal of Engineering and Science Research, 16(2), 34-38. https://www.ijesr.org/index.php/ijesr/article/view/1579
