Celebrity Insight Generation Using Word Embedding

Authors

  • M Naresh, Newton’s Institute of Engineering, Guntur, Andhra Pradesh

Keywords:

Text Analysis, Word Embeddings, Word2Vec, FastText, Naïve Bayes Multinomial, Support Vector Machine (SVM), Random Forest

Abstract

Celebrity profiling is a specialized branch of author profiling that infers attributes such as gender, birth
year, fame, and occupation from a celebrity's written text. Celebrities use social media to share their
interests and engage with fans, which has also made them targets of impersonation. To address this, researchers
are developing methods that verify whether a text was genuinely written by a celebrity and that determine the
author's profile attributes. In 2019, the PAN competition introduced a celebrity profiling task that challenged
participants to predict celebrity attributes from written texts. Researchers applied a range of stylistic
features and machine learning techniques to this task. Our approach uses the word embedding techniques Word2Vec
and FastText to represent words as vectors that capture semantic relationships. These word vectors were
aggregated into document-level representations, which were then classified using Naïve Bayes Multinomial,
Support Vector Machine (SVM), and Random Forest algorithms. Combining Word2Vec with Random Forest achieved the
highest accuracy for predicting fame and occupation, indicating that embedding-based document representations
paired with a Random Forest classifier are effective for celebrity profiling.
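
The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which aggregation function was used, so averaging the word vectors is an assumption here. The embedding table `TOY_EMBEDDINGS`, its values, and the helper `document_vector` are hypothetical and stand in for vectors from a trained Word2Vec or FastText model.

```python
from typing import Dict, List

# Hypothetical toy embedding table (assumption): in practice these vectors
# would be looked up in a trained Word2Vec or FastText model.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "film": [1.0, 0.0, 0.0],
    "goal": [0.0, 1.0, 0.0],
    "match": [0.0, 1.0, 0.0],
    "album": [0.0, 0.0, 1.0],
}


def document_vector(
    tokens: List[str], embeddings: Dict[str, List[float]], dim: int
) -> List[float]:
    """Average the vectors of all tokens found in the embedding table.

    Averaging is an assumed aggregation; tokens without a vector are skipped.
    Returns a zero vector when no token is known.
    """
    total = [0.0] * dim
    known = 0
    for token in tokens:
        vector = embeddings.get(token.lower())
        if vector is None:
            continue  # out-of-vocabulary token, ignored in the average
        for i, value in enumerate(vector):
            total[i] += value
        known += 1
    if known == 0:
        return total
    return [value / known for value in total]


# One document-level vector per text; "in" and "the" are out of vocabulary.
vec = document_vector("Goal in the match".split(), TOY_EMBEDDINGS, dim=3)
```

Each document vector produced this way has a fixed length regardless of the text length, so it could serve as the feature row for a classifier such as a Random Forest.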

Published

2025-01-21

How to Cite

Celebrity Insight Generation Using Word Embedding. (2025). International Journal of Engineering and Science Research, 15(1), 174-180. https://www.ijesr.org/index.php/ijesr/article/view/577
