ERIC Number: ED579547
Record Type: Non-Journal
Publication Date: 2017
Pages: 115
Abstractor: As Provided
ISBN: 978-0-3552-6206-3
ISSN: N/A
Detecting and Analyzing Cybercrime in Text-Based Communication of Cybercriminal Networks through Computational Linguistic and Psycholinguistic Feature Modeling
Mbaziira, Alex Vincent
ProQuest LLC, Ph.D. Dissertation, George Mason University
Cybercriminals are increasingly using Internet-based text messaging applications to exploit their victims. Incidents of deceptive cybercrime in text-based communication are increasing and include fraud, scams, and both favorable and unfavorable fake reviews. In this work, I use a text-based deception detection approach to train models for detecting text-based deceptive cybercrime in native and non-native English-speaking cybercriminal networks. I use both computational linguistic (CL) and psycholinguistic (PL) features in my models to study four types of deceptive text-based cybercrime: fraud, scams, favorable fake reviews, and unfavorable fake reviews. The data is obtained from three web genres: email, websites, and social media. I build 1-dataset non-hybrid models as well as two types of hybrid models, 2-dataset and 3-dataset, for native and non-native English-speaking cybercriminal networks. I use Naive Bayes, Support Vector Machines, and k-Nearest Neighbors to train and test all the models. All the 1-dataset non-hybrid models are trained on data from one web genre and then used to detect and analyze other types of cybercrime in other web genres that are not part of the training set. The 2-dataset hybrid models are trained on data combined from two web genres and then used to detect cybercrime in other web genres that are not part of the training set. The 3-dataset hybrid models are trained on every triplet of datasets in the three web genres and used to detect and analyze cybercrime in the web genre that was not part of the training set. Performance of the models on test datasets ranges from 60% to 80% accuracy, with the best performance on detection of fraud and unfavorable reviews. There were notable differences among the models in detecting and analyzing scams in native and non-native English-speaking cybercriminal networks.
This work can be applied in provider- or user-based filtering tools to identify cybercriminal actors and block or label messages before they reach their intended audience. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone at 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
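The train-on-some, test-on-the-rest scheme described in the abstract can be sketched as a simple enumeration of held-out splits. This is only an illustration, not the dissertation's code: the dataset labels below are hypothetical (the abstract does not name its datasets), and the sketch assumes one dataset per cybercrime type, so that a 3-dataset model leaves exactly one dataset for testing. The function name `held_out_splits` is likewise invented for this example.

```python
from itertools import combinations

# Hypothetical dataset labels (an assumption): one dataset per cybercrime
# type named in the abstract. The abstract itself does not list its datasets.
DATASETS = ["fraud", "scams", "favorable_reviews", "unfavorable_reviews"]


def held_out_splits(datasets, k):
    """Yield (train, test) pairs: train on k datasets, test on all the others."""
    for train in combinations(datasets, k):
        # Everything not used for training is held out for testing.
        test = tuple(d for d in datasets if d not in train)
        yield train, test


# k = 1 corresponds to the 1-dataset non-hybrid models,
# k = 2 and k = 3 to the 2-dataset and 3-dataset hybrid models.
for k in (1, 2, 3):
    for train, test in held_out_splits(DATASETS, k):
        print(f"{k}-dataset model: train on {train}, test on {test}")
```

Each split would then be passed to a classifier such as Naive Bayes, a Support Vector Machine, or k-Nearest Neighbors; the feature extraction and training steps are omitted here because the abstract does not describe them in detail.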
ProQuest LLC. 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106. Tel: 800-521-0600; Web site: http://www.proquest.com/en-US/products/dissertations/individuals.shtml
Publication Type: Dissertations/Theses - Doctoral Dissertations
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A