Please use this identifier to cite or link to this item:
https://dspace.ctu.edu.vn/jspui/handle/123456789/124166

| Title: | DETECTING SCAM CONTENT USING PHOBERT AND FINE-TUNED GEMINI |
| Other Titles: | PHÁT TRIỂN MÔ HÌNH NHẬN BIẾT CÁC NỘI DUNG CÓ TÍNH LỪA ĐẢO (Developing a Model to Recognize Fraudulent Content) |
| Authors: | Thái, Minh Tuấn; Phan, Thị Hồng Nguyên |
| Keywords: | INFORMATION TECHNOLOGY - HIGH QUALITY |
| Issue Date: | 2025 |
| Publisher: | Trường Đại Học Cần Thơ |
| Abstract: | In the era of rapid digital transformation, online communication platforms have become integral to daily life in Vietnam, accompanied by a significant surge in sophisticated cyber fraud. Scammers continuously adapt their linguistic patterns to bypass traditional keyword-based filters, rendering reactive security measures ineffective. Consequently, there is a critical need for advanced Natural Language Processing (NLP) methodologies capable of deeply understanding the semantic nuances of the Vietnamese language to identify these evolving threats. Addressing this challenge, this thesis conducts a comparative analysis of two distinct state-of-the-art NLP architectures: PhoBERT and a fine-tuned Gemini model. PhoBERT, a pre-trained language model optimized for Vietnamese, is utilized to assess the efficacy of discriminative modeling in extracting contextual features for robust text classification. Complementing this, the research investigates the advanced reasoning capabilities of Large Language Models (LLMs) by fine-tuning Google’s Gemini to analyze complex narrative structures and psychological triggers, such as urgency and authority impersonation, that traditional models often miss. The models are evaluated on a curated dataset of real-world scam messages designed to distinguish malicious intent from legitimate communications. The experimental results provide empirical evidence on the trade-offs between the syntactic precision of discriminative models and the semantic reasoning of generative models. Ultimately, this study offers critical insights into the applicability of deep learning approaches for proactive scam detection, highlighting the potential of semantic analysis in mitigating 'zero-day' vulnerabilities inherent in conventional defense mechanisms. |
| Description: | 53 pp. |
| URI: | https://dspace.ctu.edu.vn/jspui/handle/123456789/124166 |
| Appears in Collections: | Trường Công nghệ Thông tin & Truyền thông |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| _file_ | Restricted Access | 1.01 MB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.