DeepSense: An Explainable AI Multi-Modal Platform for Deepfake Detection Across Image, Audio, and Video
Keywords:
AI

Abstract
The rapid proliferation of generative AI has given rise to highly realistic synthetic media, commonly known as
deepfakes, posing severe threats to personal identity, democratic processes, and digital trust. Existing detection
systems are predominantly uni-modal and opaque, offering little forensic evidence to support their binary
classifications. This paper presents DeepSense, a comprehensive, explainable AI-powered multi-modal deepfake
detection platform capable of concurrently analyzing static images, digital audio recordings, and video files. The
system integrates XceptionNet for image analysis, a hybrid XceptionNet+LSTM for video, and a CNN-BiLSTM
architecture for audio, achieving detection accuracies of 90.83%, 95.25%, and 98.32% respectively. Explainable
AI (XAI) techniques -- specifically Gradient-weighted Class Activation Mapping (Grad-CAM) for visual media and
high-resolution spectral feature visualization for audio -- are deeply integrated into the inference pipeline. The
Google Gemini 3.1 Flash LLM is employed to translate raw algorithmic outputs into natural-language forensic
narratives. DeepSense is deployed via an interactive Streamlit web interface, democratizing access to digital
forensics for non-technical users, journalists, and legal professionals.
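
To make the explainability claim concrete, the sketch below shows the standard Keras Grad-CAM pattern applied to an Xception backbone, which is one way the heatmaps described in the abstract could be produced. The ImageNet weights and the layer name are stand-ins for the paper's fine-tuned deepfake classifier, whose exact pipeline is not reproduced here.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import Xception
from tensorflow.keras.applications.xception import preprocess_input

# Stand-in backbone: the paper's fine-tuned Xception deepfake classifier is not public here.
model = Xception(weights="imagenet")
last_conv = model.get_layer("block14_sepconv2_act")  # final convolutional activation in Xception

# Maps an input image to (conv feature maps, class predictions)
grad_model = tf.keras.Model(model.input, [last_conv.output, model.output])

def grad_cam(image_299x299x3, class_index=None):
    """Return a Grad-CAM heatmap highlighting regions that drive the prediction."""
    x = preprocess_input(image_299x299x3[np.newaxis].astype("float32"))
    with tf.GradientTape() as tape:
        conv_maps, preds = grad_model(x)
        if class_index is None:
            class_index = int(tf.argmax(preds[0]))
        score = preds[:, class_index]
    grads = tape.gradient(score, conv_maps)            # d(score)/d(feature maps)
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))    # global-average-pool the gradients
    cam = tf.reduce_sum(conv_maps[0] * weights, axis=-1)
    cam = tf.nn.relu(cam) / (tf.reduce_max(cam) + 1e-8)  # keep positive evidence, normalize to [0, 1]
    return cam.numpy()  # upscale and overlay on the input image for display
```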
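Similarly, the following is a minimal, hypothetical Keras sketch of a CNN-BiLSTM audio classifier operating on log-mel spectrograms, illustrating the general architecture named in the abstract. The input shape, layer widths, and training configuration are illustrative assumptions, not the paper's reported design.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn_bilstm(time_frames=300, n_mels=64):
    # Assumed input: log-mel spectrogram of shape (time_frames, n_mels, 1)
    inp = layers.Input(shape=(time_frames, n_mels, 1))
    # CNN front end: capture local time-frequency artifacts (e.g., vocoder traces)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(inp)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D((2, 2))(x)
    # Collapse frequency and channel axes so each time step becomes one feature vector
    x = layers.Reshape((time_frames // 4, (n_mels // 4) * 64))(x)
    # BiLSTM: model long-range temporal consistency of the recording
    x = layers.Bidirectional(layers.LSTM(128))(x)
    out = layers.Dense(1, activation="sigmoid")(x)  # probability the clip is synthetic
    return models.Model(inp, out)

model = build_cnn_bilstm()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```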
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.