Thoughtful Design in the Age of Cybersecurity AI

AI and machine learning offer tremendous promise for humanity in terms of helping us make sense of Big Data. But, while the processing power of these tools is integral for understanding trends and predicting threats, it’s not sufficient on its own.

Thoughtful design of threat intelligence—design that accounts for the ultimate needs of its consumers—is essential too. There are three areas where thoughtful design of AI for cybersecurity increases overall utility for its end users.

Designing where your data comes from

To set the process of machine learning in motion, data scientists rely on robust data sets they can use to train models that deduce patterns. If your data is siloed, it relies on a single community of endpoints or is made up only of data gathered from sensors like honeypots and crawlers. There are bound to be gaps in the resultant threat intelligence.

A diverse set of real-world endpoints is essential to achieve actionable threat intelligence. For one thing, machine learning models can be prone to picking up biases if exposed to either too much of a particular threat or too narrow of a user base. That may make the model adept at discovering one type of threat, but not so great at noticing others. Well-rounded, globally-sourced data provides the most accurate picture of threat trends.

Another significant reason real-world endpoints are essential is that some malware excels at evading traditional crawling mechanisms. This is especially common for phishing sites targeting specific geos or user environments, as well as for malware executables. Phishing sites can hide their malicious content from crawlers, and malware can appear benign or sit on a user’s endpoint for extended periods of time without taking an action.

Designing how to illustrate data’s context

Historical trends help to gauge future measurements, so designing threat intelligence that accounts for context is essential. Take a major website like www.google.com for example. Historical threat intelligence signals it’s been benign for years, leading to the conclusion that its owners have put solid security practices in place and are committed to not letting it become a vector for bad actors. On the other hand, if we look at a domain that was only very recently registered or has a long history of presenting a threat, there’s a greater chance it will behave negatively in the future.

Illustrating this type of information in a useful way can take the form of a reputation score. Since predictions about a data object’s future actions—whether it be a URL, file, or mobile app—are based on probability, reputation scores can help determine the probability that an object may become a future threat, helping organizations determine the level of risk they are comfortable with and set their policies accordingly.

For more information on why context is critical to actionable threat intelligence, click here.

Designing how you classify and apply the data

Finally, how a threat intelligence provider classifies data and the options they offer partners and users in terms of how to apply it can greatly increase its utility. Protecting networks, homes, and devices from internet threats is one thing, and certainly desirable for any threat intelligence feed, but that’s far from all it can do.

Technology vendors designing a parental control product, for instance, need threat intelligence capable of classifying content based on its appropriateness for children. And any parent knows malware isn’t the only thing children should be shielded from. Categories like adult content, gambling sites, or hubs for pirating legitimate media may also be worthy of avoiding. This flexibility extends to the workplace, too, where peer-to-peer streaming and social media sites can affect worker productivity and slow network speeds, not to mention introduce regulatory compliance concerns. Being able to classify internet object with such scalpel-like precision makes thoughtfully designed threat intelligence that is much more useful for the partners leveraging it.

Finally, the speed at which new threat intelligence findings are applied to all endpoints on a device is critical. It’s well-known that static threat lists can’t keep up with the pace of today’s malware, but updating those lists on a daily basis isn’t cutting it anymore either. The time from initial detection to global protection must be a matter of minutes.

This brings us back to where we started: the need for a robust, geographically diverse data set from which to draw our threat intelligence. For more information on how the Webroot Platform draws its data to protect customers and vendor partners around the globe, visit our threat intelligence page.

About the Author

Cathy Yang

Product Manager, Threat Intelligence

As product manager for Webroot’s threat intelligence solutions, Cathy Yang drives excellence in the quality of data and services for technology partners.