Please use this identifier to cite or link to this item: https://dspace.ctu.edu.vn/jspui/handle/123456789/94523
Title: HARMONIZING LEXICAL AND SEMANTIC ASPECTS IN ABSTRACTIVE TEXT SUMMARIZATION USING BART-BRIO
Other Titles: LEXICAL AND SEMANTIC ASPECTS IN ABSTRACTIVE TEXT SUMMARIZATION USING THE BART-BRIO MODEL
Authors: Lâm, Nhựt Khang
Dương, Huỳnh Nhân
Keywords: INFORMATION TECHNOLOGY - HIGH QUALITY
Issue Date: 2023
Publisher: Trường Đại Học Cần Thơ
Abstract: Abstractive text summarization is of interest to many researchers. Two things are especially important in a summary: word choice and semantic accuracy. An abstractive summarizer does not simply extract important pieces of text; it also creates new sentences or paragraphs, often in natural language. This requires the model to understand the context, grammar, and meaning of the text in order to produce quality summaries. This thesis proposes a text summarization model for English and Vietnamese based on BART and trained with the BRIO technique, called BART-BRIO. BRIO assumes a non-deterministic probability distribution to reduce the model's dependence on the reference summary and improve performance during inference. We tested the model on the Vietnamese-News-Data dataset. The results show that it outperforms all existing Vietnamese summarization models: its ROUGE-1, ROUGE-2, and ROUGE-LSum scores are 60.20, 32.25, and 43.12, respectively. For English (using the CNNDM dataset), the scores were 45.75, 22.36, and 45.46, respectively.
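Note on the metrics: ROUGE-1 and ROUGE-2 in the abstract are unigram and bigram overlap scores between a generated summary and a reference summary. The sketch below shows one common way such a score can be computed. It is an illustration only: the function names `ngrams` and `rouge_n_f1` are hypothetical, and lowercased whitespace tokenization and F1 reporting are assumptions, since this record does not state the thesis's scoring settings. Scores like those in the abstract are usually reported multiplied by 100.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1 between two strings.

    Assumption: lowercased whitespace tokenization, no stemming.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(round(100 * rouge_n_f1(generated, gold, n=1), 2))
    print(round(100 * rouge_n_f1(generated, gold, n=2), 2))
```

ROUGE-LSum, the third figure quoted, is based on longest common subsequences rather than fixed-length n-grams and is not covered by this sketch.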
Description: 38 pages
URI: https://dspace.ctu.edu.vn/jspui/handle/123456789/94523
Appears in Collections: Trường Công nghệ Thông tin & Truyền thông

Files in This Item:
File: _file_ (Restricted Access)
Size: 1.34 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.