MESHED-MEMORY TRANSFORMER FOR IMAGE CAPTIONING IN VIETNAMESE

Please use this identifier to cite or link to this item: https://dspace.ctu.edu.vn/jspui/handle/123456789/84782

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Lâm, Nhựt Khang	-
dc.contributor.author	Mai, Phước Vinh	-
dc.date.accessioned	2023-01-05T03:12:58Z	-
dc.date.available	2023-01-05T03:12:58Z	-
dc.date.issued	2022	-
dc.identifier.other	B1805835	-
dc.identifier.uri	https://dspace.ctu.edu.vn/jspui/handle/123456789/84782	-
dc.description	40 pages	vi_VN
dc.description.abstract	Image captioning has attracted interest because of its many applications, such as assistance for visually impaired people, recommendation systems, virtual assistants, image indexing, and social media. Studies using architectures such as CNNs, RNNs, and LSTMs have yielded promising results that have steadily improved over time. This thesis uses the Meshed-Memory Transformer to generate descriptive sentences for images. The experimental results show that the Meshed-Memory Transformer model effectively generates captions for images in Vietnamese. In particular, the model achieves BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores of 0.703, 0.589, 0.489, and 0.397, respectively, on the Vietnamese Flickr8k dataset.	vi_VN
dc.language.iso	en	vi_VN
dc.publisher	Trường Đại Học Cần Thơ	vi_VN
dc.subject	INFORMATION TECHNOLOGY - HIGH QUALITY	vi_VN
dc.title	MESHED-MEMORY TRANSFORMER FOR IMAGE CAPTIONING IN VIETNAMESE	vi_VN
dc.title.alternative	XÂY DỰNG CÂU MÔ TẢ CHO HÌNH ẢNH SỬ DỤNG MÔ HÌNH MESHED-MEMORY TRANSFORMER	vi_VN
dc.type	Thesis	vi_VN
Appears in Collections:	Trường Công nghệ Thông tin & Truyền thông

Files in This Item:

File	Description	Size	Format
Restricted Access		2.65 MB	Adobe PDF
