Please use this identifier to cite or link to this item: https://dspace.ctu.edu.vn/jspui/handle/123456789/111671
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Thái, Minh Tuấn | -
dc.contributor.author | Lê, Trung Nhật | -
dc.date.accessioned | 2025-02-14T06:29:48Z | -
dc.date.available | 2025-02-14T06:29:48Z | -
dc.date.issued | 2024 | -
dc.identifier.other | B2005888 | -
dc.identifier.uri | https://dspace.ctu.edu.vn/jspui/handle/123456789/111671 | -
dc.description | 86 pages | vi_VN
dc.description.abstract | File classification is crucial in information security, system management, and digital forensics. Traditional methods, such as classification by file extension, header extraction, or basic machine learning, suffer from low accuracy and poor scalability. This thesis proposes a file classification approach that uses byte histograms combined with Decision Tree and Random Forest models, enhanced by supplemental features such as entropy and file size. The dataset covers 12 common file types, with 80% used for training and 20% for testing. The Random Forest model with the additional features achieved the highest accuracy, 92.5%, and outperformed the Decision Tree in Precision, Recall, and F1-Score, especially for types such as pdf, exe, and json. The proposed method offers high accuracy, scalability, and practical applicability for file classification tasks. | vi_VN
dc.language.iso | en | vi_VN
dc.publisher | Trường Đại Học Cần Thơ | vi_VN
dc.subject | INFORMATION TECHNOLOGY - HIGH QUALITY | vi_VN
dc.title | FILE TYPE CLASSIFICATION USING MACHINE LEARNING | vi_VN
dc.title.alternative | PHÂN LOẠI TẬP TIN SỬ DỤNG MÁY HỌC | vi_VN
dc.type | Thesis | vi_VN
Appears in Collections: Trường Công nghệ Thông tin & Truyền thông
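
The abstract names three kinds of features: a byte histogram, entropy, and file size. The record does not say how the thesis computes them, so the following is only a minimal sketch under stated assumptions: the histogram is normalized to frequencies, entropy is Shannon entropy in bits per byte, and the function names (`byte_histogram`, `shannon_entropy`, `extract_features`) are hypothetical, not taken from the thesis.

```python
import math
from collections import Counter


def byte_histogram(data: bytes) -> list[float]:
    """Return the relative frequency of each of the 256 possible byte values.

    Assumption: the histogram is normalized by file length; the thesis
    may use raw counts instead.
    """
    if not data:
        return [0.0] * 256
    counts = Counter(data)  # iterating over bytes yields ints 0..255
    total = len(data)
    return [counts.get(value, 0) / total for value in range(256)]


def shannon_entropy(histogram: list[float]) -> float:
    """Shannon entropy in bits per byte (between 0 and 8) of a normalized histogram."""
    return -sum(p * math.log2(p) for p in histogram if p > 0)


def extract_features(data: bytes) -> list[float]:
    """Concatenate histogram (256 values), entropy, and size into one vector."""
    histogram = byte_histogram(data)
    return histogram + [shannon_entropy(histogram), float(len(data))]


if __name__ == "__main__":
    # Hypothetical sample bytes, not data from the thesis.
    sample = b"%PDF-1.7 hypothetical sample bytes"
    features = extract_features(sample)
    print(len(features))  # 256 histogram bins + entropy + size
```

A vector like this, computed once per file, could then be passed to a tree-based classifier (for example scikit-learn's `DecisionTreeClassifier` or `RandomForestClassifier`) after an 80/20 train/test split; whether the thesis used that library is not stated in this record.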

Files in This Item:
File | Description | Size | Format
_file_ (Restricted Access) | - | 2.35 MB | Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.