Please use this identifier to cite or link to this item: https://dspace.ctu.edu.vn/jspui/handle/123456789/78035
Title: IMAGE CAPTIONING USING RESNET AND BI-TRANSFORMER MODELS
Authors: Lâm, Nhựt Khang; Ngô, Đình Trường
Keywords: Information Technology (High-Quality Program)
Issue Date: 2022
Publisher: Can Tho University (Trường Đại Học Cần Thơ)
Abstract: Image captioning, a topic in the field of machine learning, uses image recognition techniques and natural language processing models to generate captions for photos. In this thesis, we experiment with several combinations of ResNet and Transformer models to automatically create descriptions for images. In particular, the ResNet-50, ResNet-101, and ResNet-152 models are used to extract image features, which are then fed into the Transformer or BiTransformer models to generate image captions. We perform experiments on the Flickr8k dataset in English and Vietnamese and evaluate the results using the BLEU metric. The experimental results show that the combination of ResNet-152 and BiTransformer achieves better BLEU scores than the other model combinations.
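Note on the BLEU metric: the abstract does not say which BLEU implementation or settings the thesis uses. The following is only a minimal sketch of the standard unsmoothed, sentence-level BLEU score (clipped n-gram precision combined with a brevity penalty). It assumes uniform weights over 1- to 4-grams and whitespace tokenization, and the function names `ngrams`, `modified_precision`, and `bleu` are illustrative, not taken from the thesis.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in any single reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights (assumed)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, the geometric mean collapses to zero
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # The brevity penalty uses the reference length closest to the candidate's.
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_mean)


if __name__ == "__main__":
    generated = "a dog runs on the grass".split()
    human = ["a dog is running on the grass".split(),
             "a dog runs across the green grass".split()]
    print(bleu(generated, human))
```

Scores reported for a whole test set are normally corpus-level BLEU, which pools n-gram counts across all captions before taking the ratio, so averaging this per-sentence score would not reproduce such figures exactly.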
Description: 39 pages
URI: https://dspace.ctu.edu.vn/jspui/handle/123456789/78035
Appears in Collections: College of Information and Communication Technology (Trường Công nghệ Thông tin & Truyền thông)

Files in This Item: _file_ (Restricted Access), 1.25 MB, Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.