Please use this identifier to cite or link to this item:
https://dspace.ctu.edu.vn/jspui/handle/123456789/110485
Title: | BUILDING A MUSICAL INSTRUMENT RECOGNITION SYSTEM USING DEEP LEARNING MODELS. |
Other Titles: | XÂY DỰNG HỆ THỐNG NHẬN DIỆN NHẠC CỤ BẰNG MÔ HÌNH HỌC SÂU |
Authors: | Trần, Công Án; Trương, Khả Thi |
Keywords: | INFORMATION TECHNOLOGY - HIGH QUALITY |
Issue Date: | 2024 |
Publisher: | Trường Đại Học Cần Thơ |
Abstract: | This thesis develops a musical instrument recognition system that uses deep learning models for two tasks: detection and classification. The system uses YOLOv10 and YOLO11 for detection and EfficientNetB3 for classification, and aims to identify musical instruments in images accurately and efficiently in real-time scenarios. A diverse dataset was collected and prepared, consisting of 5,378 images for detection and 5,107 images for classification, and was supplemented with augmentation techniques to improve robustness. The detection models were trained on an augmented dataset of 9,279 images. YOLOv10m achieved a precision of 96%, a recall of 92.6%, and an mAP50-95 of 89.9%, while YOLO11s offered faster inference with an mAP50-95 of 87.3%. For classification, EfficientNetB3 achieved a top-1 accuracy of 96.93%, outperforming YOLO11m-cls, which attained 94.1%. The system was deployed behind a web interface built with Streamlit, which lets users upload images or use a webcam for real-time recognition. The output includes instrument labels, bounding boxes, and confidence scores. The system also offers optional database integration using MySQL to store and retrieve instrument details. This research contributes an image-based instrument recognition system that addresses challenges such as occlusion, lighting variation, and visually similar instruments. Future directions include expanding the dataset with rare and culturally diverse instruments, optimizing processing speed, and exploring cross-platform deployment for broader applicability. |
Description: | 59 pp. |
URI: | https://dspace.ctu.edu.vn/jspui/handle/123456789/110485 |
Appears in Collections: | Trường Công nghệ Thông tin & Truyền thông |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
_file_ Restricted Access | | 3.72 MB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.