A Quantitative Based Research on the Production of Image Captioning


AJIBADE S. M., Zaidi A., Maidin S. S., Ishak W. H. W., Adetunla A.

International Journal of Intelligent Systems and Applications in Engineering, vol.11, no.4, pp.816-830, 2023 (Scopus)

  • Publication Type: Article / Full Article
  • Volume: 11 Issue: 4
  • Publication Date: 2023
  • Journal Name: International Journal of Intelligent Systems and Applications in Engineering
  • Journal Indexes: Scopus
  • Pages: pp.816-830
  • Keywords: Attention Model, Image Caption, Multimodal Model, Region Level Captions, Semantic Content
  • Istanbul Ticaret University Affiliated: Yes

Abstract

It is widely recognized that modern systems can discern the context of an image and annotate it with relevant captions by combining computer vision and natural language processing, a technique referred to as 'image caption production.' This article surveys and analyzes image captioning techniques that have evolved over the past few decades, including the Attention Model, Region-Level Caption Detection, Semantic Content-Based Models, and Multimodal Models, among others. These techniques are evaluated using criteria such as Precision Rate, Recall Rate, F1 Score, and Accuracy Rate, and are compared across various datasets. The article offers a comprehensive structural examination of contemporary image captioning methods. Researchers can use the insights from this analysis to develop improved approaches that avoid the shortcomings of older methods while retaining their beneficial aspects.