Social Media Forensics: Cyberbullying & Hate Speech Analysis Using Machine Learning And NLP

Authors

  • Syed S. Quadri · Syed Muzammil Hussaini · Omer Farooq · Mustafa Wasif Hussain, B.Tech Students, Department of Computer Science and Engineering, Lords Institute of Engineering and Technology, Hyderabad, India
  • Ms. Saniya, Assistant Professor, Department of Computer Science and Engineering, Lords Institute of Engineering and Technology, Hyderabad, India

Keywords:

Cyberbullying Detection, Hate Speech Classification, Natural Language Processing, TF-IDF, Support Vector Machine, Social Media Forensics, Text Classification, Flask, Affective Computing

Abstract

The rapid growth of user-generated content on social media platforms has created an urgent need for
automated systems that can identify cyberbullying, hate speech, and offensive language at scale. This paper
presents Social Media Forensics (SMF), a machine learning-based web application that classifies
social media text into three categories: Hate Speech, Offensive Language, and Clean Content. The system combines
a Natural Language Processing (NLP) preprocessing pipeline (tokenization, stop-word removal, lemmatization)
with TF-IDF (Term Frequency–Inverse Document Frequency) vectorization for feature extraction. Six
supervised machine learning classifiers (Logistic Regression, Naïve Bayes, Support Vector Machine (SVM),
K-Nearest Neighbors (KNN), Random Forest, and Gradient Boosting) are trained, evaluated, and
compared. The best-performing model achieves approximately 94% classification accuracy. The full-stack web
application is built with Python Flask, SQLite, and Bootstrap 5, and is containerized with Docker. It provides user
authentication, real-time prediction with confidence scoring, analysis history tracking, and a multi-chart analytics
dashboard. The paper derives the mathematical formulations of TF-IDF, Bayes' theorem, SVM hyperplane optimization,
and information-gain-based ensemble methods. It also presents the system architecture, algorithmic pseudocode,
UML diagrams, and a performance comparison across all six classifiers.
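
To make the TF-IDF feature-extraction step concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: it assumes the textbook weighting tf(t, d) · log(N / df(t)) without smoothing, uses whitespace tokenization in place of the paper's full preprocessing pipeline, and the function names and toy documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for fuller NLP preprocessing)."""
    return text.lower().split()


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, for illustration only.
docs = ["you are awful", "you are kind", "have a nice day"]
vectors = tf_idf(docs)
```

In this toy corpus, "awful" appears in only one document and so receives a higher weight in the first document than "you", which appears in two. A term that occurs in every document gets weight zero under this unsmoothed variant; production libraries typically apply smoothing and normalization, so their values will differ.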

Published

2026-04-22

Section

Articles

How to Cite

Social Media Forensics: Cyberbullying & Hate Speech Analysis Using Machine Learning And NLP. (2026). International Journal of Engineering and Science Research, 16(2), 509-516. https://www.ijesr.org/index.php/ijesr/article/view/1652