Beyond Machine Learning: Using Models in AI for Security

By Bryan Ware, October 20, 2017

Some impressive people have said bearish things recently about the use of artificial intelligence (AI) in cybersecurity.

A recent example is Heather Adkins, who for 15 years has been director of information security at Google (itself no slouch in the AI department). Adkins said at a September TechCrunch conference that AI techniques like machine learning are insufficient to protect companies from cyber attacks.

“The machine is really good at picking out anomalous behavior — but it’ll also pick out 99 events that are really good, and then a human has to go through and sort them,” she said. “That’s quite tedious and it doesn’t scale.”

Her point will be familiar to organizations deploying SIEM and UBA systems. Detection tools that rely on lists of rules, or even on machine learning, to correlate raw data inputs and find anomalies typically generate far too many ‘false positive’ alerts to be truly useful.
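A rough back-of-the-envelope sketch shows why this happens even with an accurate detector. The rates below are made-up illustrative assumptions, not figures from Adkins or from any real deployment, and the function name is hypothetical; the arithmetic is simply Bayes' rule applied to alert volumes.

```python
def alert_precision(prevalence: float,
                    detection_rate: float,
                    false_alarm_rate: float) -> float:
    """Return the fraction of alerts that correspond to real threats.

    prevalence:       share of all events that are actually malicious
    detection_rate:   P(alert | malicious event)
    false_alarm_rate: P(alert | benign event)
    """
    true_alerts = prevalence * detection_rate
    false_alerts = (1.0 - prevalence) * false_alarm_rate
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical numbers: 1 event in 10,000 is malicious, and the detector
# catches 99% of those while misfiring on only 1% of benign events.
precision = alert_precision(prevalence=0.0001,
                            detection_rate=0.99,
                            false_alarm_rate=0.01)
print(f"Share of alerts that are real threats: {precision:.2%}")
```

Under these assumed rates, only about one alert in a hundred is a real threat, which is the same order of magnitude as the "99 events" in Adkins' remark. Because real threats are so rare, even a small false alarm rate on benign traffic swamps the true detections.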

The main problem, Adkins noted, is that machines “just don’t have a sense of what is good and bad from a security perspective,” because “there’s not enough data for the machines to consume.”

I agree with Adkins’ assessment about both the inherent limitations of detecting cyber threats using machine learning alone and the excessive false alarms such systems generate, which collectively make life difficult for any organization trying to protect itself from external and internal cyber threats.

That being said, there is another AI approach that eliminates many of the false positives generated by ML-based systems, easing the alert overload that afflicts security analysts and providing decision-makers with prioritized alerts of their biggest risks.

At Haystax Technology we first encode indicators of risky behavior into probabilistic models, and only then do we identify a diverse array of network and non-IT data sets to run through the models to prioritize risks. This AI technique, which we call ‘model first,’ has already been operationally proven to drastically reduce false positives while prioritizing real risks to an organization. It allows analysts to look at the source of the highest risk first and then drill into all the alerts or events that drove that risk, rather than manually sifting and correlating indicators of compromise, which is time-consuming and inefficient. Our process eases the strain on overloaded SOC analysts and lets them focus on the threats that really matter.

To be sure, Haystax uses machine learning and other AI techniques in our Constellation Analytics Platform™ to find everything from insider threats and compromised accounts to indications of terrorist and criminal activity. As Adkins pointed out, many of these techniques excel at identifying anomalies — and we use them for exactly that.

Instead of feeding anomalies to an analyst, however, Constellation feeds them to a model of the particular problem domain to do the heavy lifting. And because that model has already encoded the inherent knowledge and judgment of the very analysts Adkins refers to, it frees them to focus on what’s important, and to operate at a scale and speed that would be unachievable were there no model performing the analysis and prioritization for them.
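As a minimal sketch of the general idea, and not a description of how Constellation is actually built, the snippet below treats each anomaly as evidence for a small probabilistic model and ranks entities by the resulting risk. Everything in it is an assumption made for illustration: the indicator names, the likelihoods, the prior, and the naive-Bayes simplification that indicators are conditionally independent.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    name: str
    p_given_risky: float   # P(indicator seen | entity is risky), analyst-supplied
    p_given_benign: float  # P(indicator seen | entity is benign), analyst-supplied


# Hypothetical model: analyst judgment encoded before any data is examined.
MODEL = {
    "off_hours_login": Indicator("off_hours_login", 0.60, 0.20),
    "bulk_download": Indicator("bulk_download", 0.50, 0.02),
    "resignation_notice": Indicator("resignation_notice", 0.30, 0.03),
}
PRIOR_RISKY = 0.01  # assumed base rate of risky entities


def risk_score(observed: set[str], prior: float = PRIOR_RISKY) -> float:
    """Posterior P(risky | evidence), assuming independent indicators."""
    log_odds = math.log(prior / (1.0 - prior))
    for ind in MODEL.values():
        if ind.name in observed:
            log_odds += math.log(ind.p_given_risky / ind.p_given_benign)
        else:
            # Absence of an indicator is also evidence, pointing toward benign.
            log_odds += math.log((1.0 - ind.p_given_risky) / (1.0 - ind.p_given_benign))
    return 1.0 / (1.0 + math.exp(-log_odds))


def prioritize(entities: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Rank entities from highest to lowest posterior risk."""
    scored = [(name, risk_score(seen)) for name, seen in entities.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Anomalies flagged upstream (for example by an ML detector), grouped per user.
anomalies = {
    "alice": {"off_hours_login"},
    "bob": {"off_hours_login", "bulk_download", "resignation_notice"},
    "carol": set(),
}
ranking = prioritize(anomalies)
for user, score in ranking:
    print(f"{user}: {score:.3f}")
```

In this toy example a single off-hours login barely moves a user's score, while the same anomaly combined with two corroborating indicators pushes another user to the top of the list. The analyst starts from the ranked entity and drills into the underlying events, instead of triaging each anomaly on its own.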

#  #  #

Watch the entire TechCrunch interview with Heather Adkins on YouTube; her discussion of applying AI and machine learning to security starts at 11:05.