Evaluating and Enhancing Image Caption Generation: A Comparative Study of LSTM and GRU Models with Object and Feature Recognition Strategies
POSTER
Abstract
Combining Convolutional Neural Networks (CNNs) with Recurrent Neural Networks (RNNs) has transformed image caption generation in recent years. This study assesses how effectively different combinations of CNNs and recurrent language models, namely Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, produce detailed and accurate captions for images. Our results show that LSTM-based models produce higher-quality captions, whereas GRU-based models require less computing time. To improve caption quality, we propose a three-pronged approach. First, the present framework could be enhanced by integrating object detection techniques to recognise and isolate the distinct objects within an image. Second, feature extraction could be extended to capture attributes such as colour, position, and other prominent qualities of each detected object. Third, more sophisticated language models could be used to connect these objects and attributes fluently, generating captions that are not just descriptive but also contextually relevant. This research aims to lay a foundation for image captioning systems that are more sophisticated and effective. We emphasise the trade-offs between LSTM and GRU models and outline a strategy for future research on caption quality, with implications for both computer vision and natural language processing.
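One plausible reason for the lower computing time of GRU models is structural: a GRU layer has three gate blocks where an LSTM layer has four, so it carries roughly three quarters of the recurrent parameters at the same sizes. The sketch below counts parameters for a single layer using the textbook formulation with one bias vector per gate block. The embedding and hidden sizes are hypothetical and are not taken from the study, and some frameworks keep two bias vectors per gate block, which changes the totals but not the 3:4 ratio. Parameter count is only a rough proxy for run time and does not reproduce the timings measured in this work.

```python
def lstm_params(input_size: int, hidden_size: int) -> int:
    """Parameters in one LSTM layer.

    Four blocks (input gate, forget gate, output gate, cell candidate),
    each with an input weight matrix, a recurrent weight matrix and one bias.
    """
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return 4 * per_block


def gru_params(input_size: int, hidden_size: int) -> int:
    """Parameters in one GRU layer.

    Three blocks (update gate, reset gate, candidate state),
    each with an input weight matrix, a recurrent weight matrix and one bias.
    """
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return 3 * per_block


# Hypothetical sizes chosen for illustration only.
EMBED_DIM = 256   # size of each word embedding fed to the decoder
HIDDEN_DIM = 512  # size of the recurrent hidden state

lstm_total = lstm_params(EMBED_DIM, HIDDEN_DIM)
gru_total = gru_params(EMBED_DIM, HIDDEN_DIM)

print(f"LSTM parameters: {lstm_total}")
print(f"GRU parameters:  {gru_total}")
print(f"GRU / LSTM ratio: {gru_total / lstm_total:.2f}")
```

With these sizes the GRU layer has 1,181,184 parameters against 1,574,912 for the LSTM layer, which is consistent with the trade-off described above: fewer computations per time step in exchange for a simpler gating mechanism.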
Presenters
- Md Ragib Shaharear, Johns Hopkins University
Authors
- Md Ragib Shaharear, Johns Hopkins University
- Md Shah Imran Shovon, Shahjalal University of Science and Technology
- Jannatul Mowa Arzu, Shahjalal University of Science and Technology