IMAGE CAPTIONING USING INCEPTION-V4, LSTM MODELS AND BERT EMBEDDINGS

Huỳnh, Quang Nhật Hào

Please use this identifier to cite or link to this item: https://dspace.ctu.edu.vn/jspui/handle/123456789/84781

Title:	IMAGE CAPTIONING USING INCEPTION-V4, LSTM MODELS AND BERT EMBEDDINGS
Other title:	XÂY DỰNG CÂU MÔ TẢ CHO HÌNH ẢNH SỬ DỤNG MÔ HÌNH INCEPTION-V4, LSTM VÀ BERT EMBEDDINGS
Authors:	Lâm, Nhựt Khang; Huỳnh, Quang Nhật Hào
Keywords:	INFORMATION TECHNOLOGY - HIGH QUALITY
Year of publication:	2022
Publisher:	Can Tho University (Trường Đại Học Cần Thơ)
Abstract:	Image captioning combines image recognition techniques with natural language processing models to generate captions for photos. In this thesis, we experiment with several models that automatically create image descriptions using Inception-v4, LSTM models, and BERT embeddings. In particular, the Inception-v4 model extracts image features, which are then fed into the LSTM and BERT embeddings model to generate captions. We run experiments on the Flickr8k dataset in English and Vietnamese and evaluate the models using the BLEU metric. The experimental results show that combining Inception-v4, LSTM models, and BERT embeddings achieves better BLEU scores than the other models tested. The BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores of this combination are 0.689, 0.479, 0.3649, and 0.267 on the English Flickr8k dataset and 0.647, 0.501, 0.332, and 0.271 on the Vietnamese Flickr8k dataset.
Description:	42 pages
Identifier:	https://dspace.ctu.edu.vn/jspui/handle/123456789/84781
Collection:	College of Information and Communication Technology (Trường Công nghệ Thông tin & Truyền thông)
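
The abstract reports BLEU-1 through BLEU-4 scores but this record does not say how they were computed. As a rough illustration only, the sketch below computes a sentence-level cumulative BLEU-n with uniform weights, clipped n-gram precision, and a brevity penalty, without smoothing. The function names, the whitespace tokenization, and the example captions are hypothetical and are not taken from the thesis, which may have used a corpus-level variant or a library implementation instead.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in any single reference."""
    cand = ngram_counts(candidate, n)
    if not cand:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngram_counts(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped / sum(cand.values())


def bleu(candidate, references, max_n):
    """Cumulative BLEU up to order max_n with uniform weights (no smoothing)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Effective reference length: the closest to the candidate length,
    # with ties broken toward the shorter reference.
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Hypothetical captions, for illustration only.
candidate = "a dog runs on the grass".split()
references = [
    "a dog is running on the grass".split(),
    "a dog runs across the grass".split(),
]
for n in range(1, 5):
    print(f"BLEU-{n}: {bleu(candidate, references, n):.3f}")
```

Scores from a sketch like this are per sentence; figures such as those in the abstract are normally aggregated over a whole test set, so the two are not directly comparable.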

Files in this item:

File	Description	Size	Format
_file_ Restricted access		1.56 MB	Adobe PDF

Use of materials in the Digital Library must comply with copyright law.