ENHANCED SENTIMENT ANALYSIS WITH BIDIRECTIONAL ENCODER REPRESENTATIONS FROM TRANSFORMERS
Keywords:
Sentiment Classification, Data Analytics, BERT Framework, Neural Language Models, Fine-tuning, Natural Language Processing (NLP), TopicBERT, Aspect Topic Prediction (ATP), SemEval 2014 Task 4, Direct Augmentation.

Abstract
Sentiment classification is a form of data analytics in which text is mined to extract people's sentiments and opinions about a target. With the recent development of the BERT framework and its pretrained neural language models, sentiment classification has seen newfound success. These pretrained models perform adequately on certain natural language processing tasks out of the box; most, however, are fine-tuned on domain-specific data to improve accuracy and usefulness. Motivated by the idea that further fine-tuning would improve performance on downstream sentiment classification tasks, we developed TopicBERT, a BERT model fine-tuned to recognize topics at the corpus level in addition to the word and sentence levels. TopicBERT comprises
two variants: TopicBERT-ATP (aspect topic prediction), which captures topic information via an auxiliary training
task, and TopicBERT-TA, where topic representation is directly injected into a topic augmentation layer for
sentiment classification. With TopicBERT-ATP, the topics are predetermined by an LDA mechanism with collapsed Gibbs sampling; with TopicBERT-TA, the topics can change dynamically during training. Experimental results show that both approaches deliver state-of-the-art performance on two domains of SemEval 2014 Task 4. In a direct comparison of the two methods, however, direct augmentation outperforms further training. Comprehensive analyses in the form of ablation, parameter, and complexity studies accompany the results.
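To make the TopicBERT-ATP preprocessing step concrete, the following is a minimal, self-contained sketch of collapsed Gibbs sampling for LDA, the mechanism the abstract says predetermines the topics. The toy corpus, hyperparameters (`n_topics`, `alpha`, `beta`, `iters`), and function name are illustrative assumptions, not the paper's actual configuration.

```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    docs: list of token lists. Returns per-document topic counts,
    from which document-topic distributions can be normalized.
    """
    rng = random.Random(seed)
    V = len({w for d in docs for w in d})           # vocabulary size
    ndk = [[0] * n_topics for _ in docs]            # doc-topic counts
    nkw = [defaultdict(int) for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                             # tokens per topic
    z = []                                          # topic of each token
    # random initialization of topic assignments
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
        z.append(zs)
    # Gibbs sweeps: resample each token's topic from its full conditional
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                # p(k | rest) up to a constant (collapsed conditional)
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta)
                           / (nk[t] + V * beta) for t in range(n_topics)]
                r = rng.random() * sum(weights)
                acc = 0.0
                for t, wgt in enumerate(weights):
                    acc += wgt
                    if r <= acc:
                        k = t
                        break
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return ndk

docs = [["food", "tasty", "food"], ["service", "slow", "service"]]
topic_counts = lda_gibbs(docs)
```

In the ATP setting these sampled topic assignments are fixed before fine-tuning, which is why the abstract describes the topics as "predetermined" rather than learned jointly with the classifier.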
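For TopicBERT-TA, the abstract describes injecting topic representations directly into an augmentation layer. One common way to realize such an injection, shown here as a purely hypothetical sketch, is to concatenate a topic distribution onto the encoder's sentence embedding before a linear classification layer. All vectors, dimensions, and weights below are illustrative stand-ins, not the paper's architecture.

```python
# Hypothetical "topic augmentation": concatenate a topic distribution
# onto a sentence embedding, then score sentiment classes linearly.

def augment(sentence_emb, topic_dist):
    """Concatenate the topic distribution onto the encoder embedding."""
    return sentence_emb + topic_dist  # list concatenation

def linear_classify(features, weights, bias):
    """Single linear layer: one score per sentiment class."""
    return [sum(f * w for f, w in zip(features, row)) + b
            for row, b in zip(weights, bias)]

emb = [0.2, -0.1, 0.4]        # stand-in for a BERT [CLS] vector
topics = [0.7, 0.3]           # stand-in topic distribution
feats = augment(emb, topics)  # 5-dimensional augmented feature
W = [[0.1] * 5, [0.2] * 5]    # toy weights for 2 sentiment classes
b = [0.0, 0.0]
scores = linear_classify(feats, W, b)
```

Because the topic distribution sits inside the classifier's input rather than in a separate auxiliary loss, gradients can reshape how topics influence the prediction during training, matching the abstract's claim that TA topics "change dynamically."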