Car Insurance Claim Prediction Using Machine Learning: A Comparative Study of Ensemble and Classical Classifiers

Authors

  • Shadman Ahmad, Shaik Sufyaan Ahmed, Syed Affan Hussain Syed Barey, Syed Jawad; BTech Students, Department of Computer Science and Engineering, Lords Institute of Engineering and Technology, Hyderabad, India
  • Md. Dilwar Alam; Assistant Professor, Department of Computer Science and Engineering, Lords Institute of Engineering and Technology, Hyderabad, India

Keywords:

Car Insurance Claim Prediction, Gradient Boosting, Random Forest, SVM, Logistic Regression, Flask, scikit-learn, SQLite, Chart.js, Bootstrap 5, Docker, Class Imbalance, Feature Engineering

Abstract

This paper presents a comprehensive web-based machine learning platform for predicting car insurance claim
outcomes using four classification algorithms: Gradient Boosting, Random Forest, Support Vector Machine (SVM),
and Logistic Regression. Insurance companies face the dual challenge of class imbalance (≈74% no-claim, ≈26%
claim) and non-linear interactions between policyholder features that defeat traditional actuarial generalized
linear models (GLMs). The system processes a synthetic dataset of 10,000 policy records across 17 features,
including driving experience, credit score, annual mileage, speeding violations, DUIs, and past accidents. A
two-stage preprocessing pipeline applies label encoding to 9 categorical variables and StandardScaler
normalization to 8 numeric features, and is followed by a stratified 80/20 train-test split. Gradient Boosting
achieved the highest performance (Accuracy =
91.95%, Precision = 89.45%, Recall = 78.27%, F1 = 83.49%), surpassing SVM (91.10%), Logistic Regression
(90.25%), and Random Forest (90.10%). The full-stack Flask application integrates scikit-learn inference,
SQLite-backed authentication (Werkzeug PBKDF2-SHA256 hashing), prediction history, Chart.js analytics
dashboards, and Docker containerization, all within a Bootstrap 5 dark-themed UI. This article details the mathematical foundations
of all four algorithms, the end-to-end system architecture, the ML pipeline, algorithmic pseudocode, and rigorous
results analysis with comparative tables and performance graphs.
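
The reported scores can be related to one another with a short calculation. The sketch below is illustrative only and is not taken from the paper's code: the function names are hypothetical, and the baseline counts assume a 2,000-record test split (20% of 10,000) with exactly 26% claims, which the abstract gives only approximately. It checks that the reported F1 follows from the reported precision and recall, and shows why accuracy alone is a weak yardstick under this class imbalance.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


# Gradient Boosting precision and recall as reported in the abstract.
reported_f1 = f1_from_precision_recall(0.8945, 0.7827)
print(f"F1 implied by reported precision and recall: {reported_f1:.4f}")

# Hypothetical baseline that always predicts "no claim".
# Assumed counts: 2,000 test records, 520 claims (26%), 1,480 no-claims (74%).
baseline = metrics_from_counts(tp=0, fp=0, fn=520, tn=1480)
print(f"Baseline accuracy: {baseline['accuracy']:.2%}, recall: {baseline['recall']:.2%}")
```

Under these assumptions the always-"no claim" baseline already reaches about 74% accuracy while detecting no claims at all, which is why precision, recall, and F1 are reported alongside accuracy.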

Published

2026-04-22

Section

Articles

How to Cite

Car Insurance Claim Prediction Using Machine Learning: A Comparative Study of Ensemble and Classical Classifiers. (2026). International Journal of Engineering and Science Research, 16(2), 536-543. https://www.ijesr.org/index.php/ijesr/article/view/1655
