Researchers from Palo Alto Networks created a machine-learning model that feeds on “crumbs of information” that malicious actors leave and detects tens of thousands of malicious domains weekly before they’re used for illicit activities.
Malicious actors create vast numbers of domains for redundancy and to ensure uptime for phishing or scam campaigns, malware distribution, adversarial Search Engine Optimization (SEO), or other illicit content.
Stockpiled domains are left dormant until they’re needed for specific campaigns. However, the scripts cybercriminals use for automation can now be used against them.
Cybersecurity researchers from Unit 42, a security arm of Palo Alto Networks, noticed that attackers leave traces of information about their true intentions and that malicious domains can be detected early.
Leveraging “crumbs of information,” such as certificate transparency logs and passive DNS (pDNS) data, which give insight into infrastructure reuse and characteristics, researchers built a detector based on a Random Forest machine-learning algorithm, which yielded remarkable results.
“As of July 2023, our detection pipeline has found 1,114,499 unique stockpiled root domain names and identifies tens of thousands of malicious domains weekly. Our model, on average, found stockpiled domains 34.4 days earlier compared to vendors on VirusTotal. The success of our approach emphasizes the need to combine multiple large datasets, such as passive DNS and certificate logs, to detect malicious campaigns,” Unit 42’s blog post reads.
For the model to provide “patient-zero” detections, Unit 42 engineered over 300 features, processing many terabytes of data and billions of pDNS and certificate records. Millions of malicious and benign domains were used in training the model.
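To illustrate what an infrastructure-reuse feature might look like, here is a minimal sketch in Python. It is not Unit 42's feature set: the function name `reuse_features`, the record layouts, and the two "sibling" counts are assumptions made purely for illustration. The idea is that domains sharing a certificate or an IP address with many other domains may have been set up in bulk.

```python
from collections import defaultdict

def reuse_features(cert_records, pdns_records):
    """Count, per domain, how many other domains share its certificate or IP.

    cert_records: iterable of (cert_id, domain) pairs (hypothetical layout).
    pdns_records: iterable of (domain, ip) pairs (hypothetical layout).
    """
    # Group domains by the certificate and by the IP address they appear with.
    domains_per_cert = defaultdict(set)
    for cert_id, domain in cert_records:
        domains_per_cert[cert_id].add(domain)

    domains_per_ip = defaultdict(set)
    for domain, ip in pdns_records:
        domains_per_ip[ip].add(domain)

    features = defaultdict(lambda: {"max_cert_siblings": 0, "max_ip_siblings": 0})

    # A "sibling" is another domain seen on the same certificate or IP.
    for domains in domains_per_cert.values():
        for d in domains:
            f = features[d]
            f["max_cert_siblings"] = max(f["max_cert_siblings"], len(domains) - 1)

    for domains in domains_per_ip.values():
        for d in domains:
            f = features[d]
            f["max_ip_siblings"] = max(f["max_ip_siblings"], len(domains) - 1)

    return dict(features)

# Made-up records using reserved example names and documentation IP ranges.
certs = [("c1", "a.example"), ("c1", "b.example"),
         ("c1", "c.example"), ("c2", "d.example")]
pdns = [("a.example", "192.0.2.1"), ("b.example", "192.0.2.1"),
        ("d.example", "192.0.2.9")]
features = reuse_features(certs, pdns)
```

In a real pipeline, counts like these would be only a few of the hundreds of inputs passed to a classifier, alongside labels for known malicious and benign domains.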
“Cybercriminals started to automate their infrastructure setup. However, bulk domain registration and infrastructure automation can leave crumbs of information that allow us to detect stockpiled domains. The success of our approach emphasizes the need for security defenders looking to improve their detection to combine multiple large datasets, such as pDNS and certificate logs, to uncover malicious campaigns,” researchers write.
The classifier was immediately included in multiple security solutions.
Security researchers have long monitored newly registered domains and applied extra scrutiny to them. Cybercriminals, in turn, have adapted to evade that scrutiny by stockpiling and aging domains for at least a month. Other tactics include cloaking (showing benign content to suspected crawling bots) and user targeting (showing malicious content only to specific users).
The new model can find malicious domains by exploiting the traces left by automation and by infrastructure that the same criminal group owns.
“Our classifier can achieve 99% precision with 48% recall, even though many of the malicious domains might not be stockpiled or cybercriminals might not leave traces of information in certificate logs and passive DNS data,” researchers noted.
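For readers unfamiliar with the two metrics, precision is the share of flagged domains that really are malicious, and recall is the share of all malicious domains that get flagged. The short Python sketch below works through the arithmetic with hypothetical counts chosen only to land near the quoted figures; they are not Unit 42's data.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from raw detection counts."""
    precision = true_pos / (true_pos + false_pos)  # flagged domains that are truly malicious
    recall = true_pos / (true_pos + false_neg)     # malicious domains that were flagged
    return precision, recall

# Hypothetical example: 1,000 malicious domains exist, 480 are caught,
# 520 are missed, and 5 benign domains are flagged by mistake.
precision, recall = precision_recall(true_pos=480, false_pos=5, false_neg=520)
print(f"precision={precision:.2f}, recall={recall:.2f}")
# prints "precision=0.99, recall=0.48"
```

In other words, a classifier with these numbers rarely raises a false alarm, but misses roughly half of the malicious domains, which matches the researchers' caveat that many such domains are not stockpiled or leave no traces.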
Researchers shared in detail a variety of campaigns, including scams, phishing, malware distribution, and command and control.