IMAGE CAPTIONING USING RESNET AND BI-TRANSFORMER MODELS

Please use this identifier to cite or link to this item: https://dspace.ctu.edu.vn/jspui/handle/123456789/78035

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Lâm, Nhựt Khang	-
dc.contributor.author	Ngô, Đình Trường	-
dc.date.accessioned	2022-07-04T02:17:14Z	-
dc.date.available	2022-07-04T02:17:14Z	-
dc.date.issued	2022	-
dc.identifier.other	B1607136	-
dc.identifier.uri	https://dspace.ctu.edu.vn/jspui/handle/123456789/78035	-
dc.description	39 pp.	vi_VN
dc.description.abstract	Image captioning, a topic in the field of machine learning, combines image recognition techniques with natural language processing models to generate captions for photos. In this thesis, we experiment with several models that automatically create image descriptions using ResNet and Transformer architectures. In particular, the ResNet-50, ResNet-101, and ResNet-152 models are used to extract image features, which are then fed into the Transformer or BiTransformer model to generate captions. We run experiments on the Flickr8k dataset in English and Vietnamese and evaluate the results using the BLEU metric. The experimental results show that the combination of ResNet-152 and BiTransformer achieves better BLEU scores than the other combinations.	vi_VN
dc.language.iso	en	vi_VN
dc.publisher	Trường Đại Học Cần Thơ	vi_VN
dc.subject	INFORMATION TECHNOLOGY - HIGH QUALITY PROGRAM	vi_VN
dc.title	IMAGE CAPTIONING USING RESNET AND BI-TRANSFORMER MODELS	vi_VN
dc.type	Thesis	vi_VN
Appears in Collections:	Trường Công nghệ Thông tin & Truyền thông

Files in This Item:

File	Description	Size	Format
_file_ Restricted Access		1.25 MB	Adobe PDF
