Intelligent System for Detection of Cybercrime Vocabulary on Websites

I Castillo-Zúñiga, JI López-Veyna… - DYNA New …, 2020 - revista-dyna.com
DYNA New Technologies Journal, 2020
This article presents an intelligent system that detects Cybercrime vocabulary on Web sites, with the goal of extracting knowledge from large amounts of information on the Internet within an acceptable response time. The proposed architecture uses a Web Scraper to locate and download information from the Internet. To obtain the linguistic corpus of Cybercrime, a parallel genetic strategy is executed, which distributes the Web page cleaning processes and the Natural Language Processing techniques (tokenization, stop-word removal, term frequency, and term frequency with inverse document frequency), together with lemmatization and synonym methods. To obtain knowledge, a dataset was generated using a semantic ontology that captures the general characteristics of Cybercrime. To evaluate the efficiency of the model, three supervised learning algorithms were run in parallel: Boosting, Neural Network and Random Forests. The results show 97.64% accuracy in the detection of Cybercrime vocabulary, verified with the leave-one-out cross-validation (LOOCV) technique. In addition, parallel processing yielded time savings of 292% in data recovery and 1220% in knowledge search.
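
The Natural Language Processing steps named in the abstract (tokenization, stop-word removal, term frequency, and term frequency with inverse document frequency) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the function names, and the sample documents are assumptions made for illustration, and the weighting shown is the common tf × ln(N/df) variant, which may differ from the one used in the article.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"the", "a", "an", "to", "of", "and", "on", "in", "is", "for", "your"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequency(tokens):
    """Relative frequency of each term within a single document."""
    total = len(tokens)
    if total == 0:
        return {}
    return {term: n / total for term, n in Counter(tokens).items()}


def inverse_document_frequency(docs_tokens):
    """idf(t) = ln(N / df(t)), where df(t) counts documents containing t."""
    n_docs = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {term: math.log(n_docs / d) for term, d in df.items()}


def tf_idf(documents):
    """Return one {term: tf-idf score} dictionary per input document."""
    docs_tokens = [remove_stop_words(tokenize(d)) for d in documents]
    idf = inverse_document_frequency(docs_tokens)
    return [
        {term: tf * idf[term] for term, tf in term_frequency(tokens).items()}
        for tokens in docs_tokens
    ]


# Hypothetical sample pages, not taken from the article's corpus.
documents = [
    "Phishing emails steal your password",
    "Reset your password in the settings page",
    "The weather is sunny",
]
scores = tf_idf(documents)
```

In this sketch, a term such as "phishing" that occurs in only one document receives a higher weight than "password", which occurs in two, so rarer and more distinctive vocabulary rises to the top of each page's ranking.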