Please use this identifier to cite or link to this item:
https://dspace.ctu.edu.vn/jspui/handle/123456789/119563
Title: | OD-VR-Cap: Image captioning based on detecting and predicting relationships between objects |
Authors: | Nguyen, Van Thinh; Tran, Van Lang; Van, The Thanh |
Keywords: | Image captioning; Object detection; Visual relationship; Attention mechanism; Deep neural network |
Issue Date: | 2024 |
Series/Report no.: | Journal of Computer Science and Cybernetics; Vol. 40, No. 04, pp. 327-346 |
Abstract: | Recent image captioning methods often rely on global features or individual object regions within the image without exploiting the relational information between them, which limits accuracy. This paper proposes an image captioning model that leverages the relationships between objects in the image to better understand its content and improve accuracy. The approach proceeds in four steps. First, objects in the image are detected using an object detection model combined with a graph convolutional network (GCN). Second, a relationship prediction model based on relational context information and knowledge classifies the relationships between the detected objects, producing a relationship graph that represents the image. Third, a dual attention mechanism is built so that the model can focus on the relevant parts of both the object regions and the vertices of the relationship graph when generating captions. Finally, an LSTM network with dual attention is trained to generate captions from the image representation and the given captions. Experiments on the MS COCO and Visual Genome datasets show that the proposed model achieves higher accuracy than baseline methods and some recently published works. |
Identifier: | https://dspace.ctu.edu.vn/jspui/handle/123456789/119563 |
ISSN: | 1813-9663 |
Appears in Collections: | Tin học và Điều khiển học (Journal of Computer Science and Cybernetics) |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
_file_ Restricted access | 953.32 kB | Adobe PDF | ||
Use of materials in the Digital Library must comply with copyright law.