Malicious systems of a feather flock together

Researchers analyze blacklist data to help identify bad actors on the Internet

The Internet can be a dangerous place. But like any large community, some neighborhoods are more dangerous than others. Researchers from the Oak Ridge National Laboratory and Indiana University have developed a technique they say could help identify where the bad actors are hanging out.

“Malicious activity is not necessarily evenly distributed across the Internet,” they write in a paper describing their initial work. “This analysis shows that there are dense clusters of malicious activity in the Internet.”

The researchers performed statistical analysis of IP addresses contained in blacklists commonly used for filtering and blocking malicious activity to see whether particular Internet service providers, hosting services or other autonomous systems account for a disproportionate share of those addresses. That could help ISPs and other organizations evaluate their own networks and those of others, and then make decisions about prioritizing traffic.

“We wanted to be able to say if a particular network is doing a good job of cleaning up its machines,” said Craig Shue, cybersecurity research scientist at the Oak Ridge National Laboratory’s Computational Sciences and Engineering Division.

They found that not only were some doing a poor job of cyber hygiene but also a few appeared to be overtly malicious. “We found four spectacularly bad ISPs that were big blips on the radar,” he said.

Shue, along with Andrew Kalafut and Minaxi Gupta of Indiana University’s School of Informatics and Computing, is presenting the results of the research at the IEEE Infocom conference in San Diego.

In a few cases, autonomous systems responsible for malicious activity have been cut off or shut down, such as Atrivo, McColo and Pricewert Networks. But generally, “ISPs have never had any motivation to clean up their acts,” Shue said.

He and his collaborators used data from 12 common blacklist services on millions of IP addresses associated with spam, phishing, malware and botnet activities. When possible, host names were resolved to IP addresses, and the addresses were then mapped to particular autonomous systems. The researchers then evaluated the data to determine two things: the percentage of a system’s addresses that were blacklisted and the percentage of a blacklist that a system hosted.
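The two percentages described above can be illustrated with a short sketch. This is not the researchers' code or methodology; it is a minimal Python example under stated assumptions. The function name as_blacklist_metrics, the system names AS-A and AS-B, and the prefixes and addresses (taken from ranges reserved for documentation) are all hypothetical, and the sketch assumes each system's address space is given as a list of non-overlapping CIDR prefixes.

```python
import ipaddress
from collections import Counter


def as_blacklist_metrics(as_prefixes, blacklist):
    """Compute two percentages per autonomous system (hypothetical helper).

    as_prefixes: dict mapping a system name to a list of CIDR prefix strings
                 (assumed not to overlap).
    blacklist:   iterable of blacklisted IP address strings.
    """
    networks = {
        name: [ipaddress.ip_network(prefix) for prefix in prefixes]
        for name, prefixes in as_prefixes.items()
    }
    # Deduplicate so an address listed twice is only counted once.
    unique_ips = {ipaddress.ip_address(ip) for ip in blacklist}

    # Count blacklisted addresses that fall inside each system's prefixes.
    hits = Counter()
    for ip in unique_ips:
        for name, nets in networks.items():
            if any(ip in net for net in nets):
                hits[name] += 1
                break  # prefixes are assumed not to overlap

    metrics = {}
    for name, nets in networks.items():
        size = sum(net.num_addresses for net in nets)
        metrics[name] = {
            # Share of the system's own address space that is blacklisted.
            "pct_of_system_blacklisted": 100.0 * hits[name] / size if size else 0.0,
            # Share of the whole blacklist that sits inside this system.
            "pct_of_blacklist_hosted": (
                100.0 * hits[name] / len(unique_ips) if unique_ips else 0.0
            ),
        }
    return metrics


# Made-up example data: a tiny system and a larger one.
example_prefixes = {
    "AS-A": ["192.0.2.0/30"],      # 4 addresses
    "AS-B": ["198.51.100.0/24"],   # 256 addresses
}
example_blacklist = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "198.51.100.7"]

for name, values in as_blacklist_metrics(example_prefixes, example_blacklist).items():
    print(name, values)
```

In this made-up data, the small system has three of its four addresses listed, while the larger one has a single listed address out of 256. The first measure flags systems that are heavily blacklisted relative to their own size; the second shows how much of a given blacklist one system accounts for.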

“Very few had more than 0.5 percent bad addresses,” Shue said. “The ones that have more than that jump to the top.” Some autonomous systems have more than 80 percent of their routable IP address space blacklisted, and others have between 50 and 80 percent of their addresses blacklisted.

Three U.S.-based hosting providers accounted for more than 6 percent of at least one of the blacklists, a disproportionately large percentage for the size of the systems.

“This indicates that some [autonomous systems] have either too lax a security policy or may be intentionally harboring cyber crime,” the researchers conclude in their paper.

Despite the results, traffic cannot simply be declared malicious solely because it originated from one of the systems with a high degree of maliciousness, and it is too early to identify the bad actors, Shue said.

“We have a little difficulty with naming names,” he said, because of liability and the preliminary nature of the work. He said the quality of the blacklist data the work was based on is a concern because there are few industry standards for compiling and maintaining the lists. There are often no provisions for removing addresses from a blacklist once they appear, so the largest lists might contain data that is no longer accurate.

One of the next steps for the researchers is to evaluate the quality of blacklist data, Shue said.

About the Author

William Jackson is a Maryland-based freelance writer.
