ISSN:2582-5208

www.irjmets.com

Paper Key: IRJ************283
Authors: Sahadev Bhaganagare, U. Mukesh Gopinandh, P. Nithin Kumar, Dr. K. Suresh, K. Chandusha
Date Published: 03 Apr 2024
Abstract
Image captioning is the task of automatically generating a textual description for an image. It remains an active and growing research problem: although new solutions are introduced continually, considerable attention is still required to achieve more accurate and precise results. We therefore developed an image captioning model that combines Convolutional Neural Network (CNN) architectures with Long Short-Term Memory (LSTM) networks to improve results. This project addresses image captioning with machine learning, employing advanced neural network architectures to unite the semantics of visual content with natural language descriptions. "Visual Semantic Extraction for Textual Description using CNN and LSTM" is a deep learning project that leverages the power of CNNs for image feature extraction and an LSTM for sequential language generation; the model endeavours to independently generate descriptive captions for diverse images. The model has multiple use cases and applications, including accessibility, social media content sharing, human-computer interaction, and robotics. In conclusion, image captioning, driven by advances in machine learning, has emerged as a powerful tool for bridging the gap between visual content and natural language. Continuous improvements in accuracy and versatility signal its growing significance across diverse applications, promising a more accessible and enriched user experience in many domains.
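The encoder-decoder pairing described above can be sketched as follows. This is a minimal illustration (assuming PyTorch), not the paper's actual implementation: a small CNN stands in for the feature extractor (a real system would typically use a pretrained backbone such as ResNet or VGG), and the image feature vector is fed to the LSTM as the first step of the caption sequence. All layer sizes and names here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CaptionModel(nn.Module):
    """Toy CNN encoder + LSTM decoder for image captioning."""
    def __init__(self, vocab_size=1000, embed_dim=256, hidden_dim=512):
        super().__init__()
        # Tiny stand-in CNN encoder; a real model would use a
        # pretrained backbone for image feature extraction.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, embed_dim),
        )
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        # The image feature vector acts as the first "token" fed to
        # the LSTM; ground-truth word embeddings follow (teacher forcing).
        feats = self.encoder(images).unsqueeze(1)      # (B, 1, E)
        words = self.embed(captions)                   # (B, T, E)
        inputs = torch.cat([feats, words], dim=1)      # (B, T+1, E)
        hidden, _ = self.lstm(inputs)                  # (B, T+1, H)
        return self.fc(hidden)                         # (B, T+1, V) word logits

model = CaptionModel()
images = torch.randn(2, 3, 64, 64)          # batch of 2 RGB images
captions = torch.randint(0, 1000, (2, 7))   # 7 ground-truth word ids each
logits = model(images, captions)
print(logits.shape)  # torch.Size([2, 8, 1000])
```

At inference time, captions are generated one word at a time: the decoder is seeded with the image features and its own previous prediction is fed back in until an end-of-sentence token is produced.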
DOI Requested