Please use this identifier to cite or link to this item:
https://dspace.ctu.edu.vn/jspui/handle/123456789/74475
Title: | BUILDING A PLAGIARISM DETECTION SYSTEM MODULE: INDEXING AND VECTOR DATABASE |
Authors: | Trần, Công Án; Lý, Hiểu Sang |
Keywords: | INFORMATION TECHNOLOGY - HIGH QUALITY |
Issue Date: | 2021 |
Publisher: | Trường Đại Học Cần Thơ |
Abstract: | With the rapid growth of the Internet, thousands of scientific articles and technical guides are now published online, and plagiarism has become more common in academia as a result. A plagiarism detection system is therefore necessary. The main challenge in detecting plagiarism, however, is the very long search time. Recent advances in machine learning have produced technologies that can address this problem. This study examines two of them: Milvus, a vector database, and Faiss, a library with indexing support. Both play an important role in solving the similarity search problem. This report presents the rationale for using the two technologies in a plagiarism detection system, together with an overview of the Milvus architecture, Faiss indexing, and its performance. To this end, related studies were reviewed and summarized, the theory of vector databases and indexes was examined, and the results were tested and evaluated. In summary, the report presents the application of Milvus and Faiss to plagiarism detection systems, with the aim of reducing similarity search time and providing a theoretical basis on Milvus and Faiss for other relevant studies. |
Description: | 35 pages |
URI: | https://dspace.ctu.edu.vn/jspui/handle/123456789/74475 |
Appears in Collections: | Trường Công nghệ Thông tin & Truyền thông |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
_file_ Restricted Access | | 796.66 kB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.