Md Rayhanul Masud, Ben Treves, Michalis Faloutsos
How can we identify malicious hackers participating in different online platforms using only their usernames? Establishing the identity of a user across online platforms (e.g., security forums, GitHub, YouTube) is an essential capability for tracing malicious hackers. Although a hacker could pick arbitrary names, they often use the same or similar usernames, as this helps them establish an online "brand". We propose GeekMAN, a systematic human-inspired approach to identify similar usernames across online platforms, focusing on technogeek platforms. The key novelty consists of the development and integration of three capabilities: (a) decomposing usernames into meaningful chunks, (b) de-obfuscating technical and slang conventions, and (c) considering all the different outcomes of the two previous functions exhaustively when calculating the similarity. We conduct a study using 1.8M usernames from three different types of platforms: (a) security forums, (b) malware authors from GitHub, and (c) mainstream social media platforms, which we use as reference. First, our method outperforms previous methods with a Precision of 81-86% on technogeek datasets. Second, we find 6327 forum users that match malware authors on GitHub with a high similarity score (≥ 0.7). Finally, we provide a translation dictionary for slang terms with 5.8K entries, and create the GeekMAN platform to facilitate further studies: https://geekman.streamlit.app.
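To make the three capabilities above concrete, the following is a minimal illustrative sketch, not GeekMAN's actual implementation. The leetspeak map `LEET`, the toy vocabulary `VOCAB`, and the helper names `deobfuscations`, `segmentations`, `candidates`, and `similarity` are all hypothetical, and the standard-library `difflib.SequenceMatcher` ratio is used as a stand-in for the paper's similarity score, which is not specified here.

```python
import re
from difflib import SequenceMatcher
from itertools import product

# Hypothetical leetspeak substitutions (assumption, not the paper's dictionary).
LEET = {"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}

# Toy vocabulary used to split usernames into chunks (assumption).
VOCAB = {"dark", "lord", "ghost", "net", "hack", "hacker"}


def deobfuscations(name):
    """Return the lowercased name with and without leetspeak decoding."""
    lowered = name.lower()
    decoded = "".join(LEET.get(ch, ch) for ch in lowered)
    return {lowered, decoded}


def segmentations(text):
    """Return every way to split text into consecutive vocabulary words."""
    if not text:
        return [[]]
    results = []
    for i in range(1, len(text) + 1):
        head = text[:i]
        if head in VOCAB:
            for rest in segmentations(text[i:]):
                results.append([head] + rest)
    return results


def candidates(name):
    """Enumerate all chunked readings of a username."""
    readings = set()
    for form in deobfuscations(name):
        compact = re.sub(r"[^a-z0-9]", "", form)  # drop separators such as "_" or "-"
        splits = segmentations(compact) or [[compact]]  # fall back to one chunk
        readings.update(tuple(chunks) for chunks in splits)
    return readings


def similarity(name_a, name_b):
    """Best string-similarity ratio over all pairs of candidate readings."""
    return max(
        SequenceMatcher(None, " ".join(a), " ".join(b)).ratio()
        for a, b in product(candidates(name_a), candidates(name_b))
    )


if __name__ == "__main__":
    print(similarity("d4rk_l0rd", "DarkLord"))
    print(similarity("d4rk_l0rd", "ghostnet"))
```

In this sketch, `d4rk_l0rd` and `DarkLord` both admit the reading `("dark", "lord")`, so taking the maximum over all pairs of readings recovers the match even though the raw strings differ. Enumerating readings exhaustively is cheap for short usernames but grows with vocabulary size, so a real system would likely need pruning.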