Image Captioning Using CNN and LSTM
Keywords:
Image captioning, Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM)
Abstract
The Image Caption Generator presents an approach to automatically describing image content by integrating computer vision and natural language processing (NLP). Leveraging recent advances in neural networks, NLP, and computer vision, the model combines Convolutional Neural Networks (CNNs), specifically the pre-trained Xception model, for image feature extraction with Recurrent Neural Networks (RNNs) built on Long Short-Term Memory (LSTM) cells for coherent sentence generation. A Beam Search algorithm and an Attention mechanism further improve the accuracy and relevance of the generated captions by exploring multiple candidate caption sequences and dynamically focusing on different parts of the image. Trained over multiple epochs on the 8,000 images of the Flickr8K dataset, each paired with human-annotated descriptions, the model achieves a substantial reduction in training loss. It also incorporates a text-to-speech module based on the pyttsx3 library that speaks the generated captions aloud, improving accessibility for visually impaired users and for those who prefer audio output. Evaluation with the BLEU and METEOR metrics confirms the model's ability to produce coherent and contextually accurate image captions, marking a notable advance in image captioning technology.
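As a rough illustration of the pipeline described above, the following sketch assumes a Keras/TensorFlow implementation. The layer sizes, vocabulary size, and maximum caption length are illustrative placeholders rather than values reported in this work, and the attention and beam-search components are omitted for brevity.

```python
# Minimal sketch of the described CNN-LSTM captioning pipeline.
# Assumes TensorFlow/Keras; hyperparameters below are illustrative only.
import numpy as np
from tensorflow.keras.applications.xception import Xception, preprocess_input
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, Dropout, add
from tensorflow.keras.models import Model

# 1) Image encoder: pre-trained Xception without its classification head,
#    globally pooled to a single 2048-d feature vector per image.
encoder = Xception(include_top=False, pooling="avg", weights="imagenet")

def extract_features(image_path):
    img = img_to_array(load_img(image_path, target_size=(299, 299)))
    img = preprocess_input(np.expand_dims(img, axis=0))
    return encoder.predict(img)              # shape: (1, 2048)

# 2) Caption decoder: the image feature vector and the partial caption are
#    merged, and an LSTM predicts the next word of the caption.
vocab_size, max_len = 8000, 34               # placeholders, dataset-dependent
img_in = Input(shape=(2048,))
img_feat = Dense(256, activation="relu")(Dropout(0.5)(img_in))
txt_in = Input(shape=(max_len,))
txt_emb = Embedding(vocab_size, 256, mask_zero=True)(txt_in)
txt_feat = LSTM(256)(Dropout(0.5)(txt_emb))
merged = Dense(256, activation="relu")(add([img_feat, txt_feat]))
out = Dense(vocab_size, activation="softmax")(merged)
decoder = Model([img_in, txt_in], out)
decoder.compile(loss="categorical_crossentropy", optimizer="adam")

# 3) Text-to-speech: speak a generated caption aloud with pyttsx3.
import pyttsx3
engine = pyttsx3.init()
engine.say("a dog runs across the grass")    # example caption text
engine.runAndWait()
```

In this merge-style design, the image is encoded once and reused at every decoding step; during inference, the decoder would be called repeatedly (greedily or with beam search) to extend the caption one word at a time until an end token is produced.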