Please use this identifier to cite or link to this item:
https://dspace.ctu.edu.vn/jspui/handle/123456789/94523
Title: | HARMONIZING LEXICAL AND SEMANTIC ASPECTS IN ABSTRACTIVE TEXT SUMMARIZATION USING BART-BRIO |
Other Titles: | VOCABULARY AND SEMANTICS IN ABSTRACTIVE TEXT SUMMARIZATION USING THE BART-BRIO MODEL |
Authors: | Lâm, Nhựt Khang; Dương, Huỳnh Nhân |
Keywords: | INFORMATION TECHNOLOGY - HIGH QUALITY |
Issue Date: | 2023 |
Publisher: | Trường Đại Học Cần Thơ |
Abstract: | Abstractive text summarization is of interest to many researchers. Two things matter most in a summary: word choice and semantic accuracy. An abstractive model does not simply extract important pieces of text; it also creates new sentences or paragraphs, often in natural language. This requires the model to understand the context, grammar, and meaning of the text in order to produce quality summaries. This thesis proposes a text summarization model for English and Vietnamese, based on BART and trained with the BRIO technique, called BART-BRIO. BRIO assumes a non-deterministic probability distribution to reduce the model's dependence on the reference summary and improve model performance during inference. We tested the model on the Vietnamese-News-Data dataset. The results show that it outperforms all existing Vietnamese summarization models: its ROUGE-1, ROUGE-2 and ROUGE-LSum scores are 60.20, 32.25 and 43.12, respectively. For English (using the CNNDM dataset), the scores were 45.75, 22.36, and 45.46, respectively. |
Description: | 38 pages |
URI: | https://dspace.ctu.edu.vn/jspui/handle/123456789/94523 |
Appears in Collections: | Trường Công nghệ Thông tin & Truyền thông |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
_file_ (Restricted Access) | | 1.34 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.