The real-world applications of artificial intelligence have grown so quickly and become so widespread that it’s hard to get through everyday routines, such as driving or messaging friends, without witnessing their impact. The same holds true in the world of cybersecurity, with both attackers and defenders looking to AI to gain an upper hand. Its rise coincides with the proliferation of data itself, and as we increasingly lean on AI to make sense out of this new data-centric world, we also need to understand its security implications.

For decades, defenders have fought off attacks by detecting signatures, or specific patterns indicating malicious activity. This bottom-up approach was reactive. New attacks required the deployment of new signatures, so attackers were always one step ahead in the digital dogfight. Next-gen, AI-based solutions addressed this by taking a top-down approach and feeding large datasets of activity into statistical models. This transition from signatures to statistics meant that defenses could be proactive and generalize better to new attacks.

AI-fueled defense has flourished and is now commonly applied to classic problems like spam filtering and malicious file or URL detection. These models typically rely on supervised machine learning algorithms, which learn a function mapping inputs (e.g., domain names) to an output (e.g., labels like “benign” or “malicious”). While supervised learning maps cleanly onto the defender’s need to distinguish benign from malicious, it’s also expensive and time-consuming to implement because it depends on preexisting labels. Data labeling requires upfront effort, demands domain expertise, and can’t always be repurposed elsewhere, meaning there’s a fundamental bottleneck to building an effective AI-based defense.
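
To make that input-to-label mapping concrete, here is a toy sketch of a supervised detector: a tiny logistic-regression classifier trained on a handful of invented domain names. The features (length, digit ratio, character entropy), the domains, and the labels are all hypothetical choices for illustration, not taken from any real product.

```python
import math

def features(domain):
    """Toy feature vector for a domain: bias term, scaled length,
    digit ratio, and scaled character entropy (all illustrative)."""
    name = domain.split(".")[0]
    length = len(name)
    digits = sum(c.isdigit() for c in name) / length
    counts = {c: name.count(c) for c in set(name)}
    entropy = -sum((n / length) * math.log2(n / length) for n in counts.values())
    return [1.0, length / 20.0, digits, entropy / 4.0]

def train(data, labels, lr=0.5, epochs=500):
    """Fit logistic-regression weights with per-sample gradient descent."""
    w = [0.0] * len(data[0])
    for _ in range(epochs):
        for x, y in zip(data, labels):
            p = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]
    return w

def predict(w, domain):
    z = sum(wi * xi for wi, xi in zip(w, features(domain)))
    return "malicious" if 1.0 / (1.0 + math.exp(-z)) > 0.5 else "benign"

# Hypothetical labeled training set (1 = malicious, 0 = benign).
train_domains = ["example.com", "mail.org", "x9k2qv7z1b.net", "q8j3w0r5t2.biz"]
train_labels = [0, 0, 1, 1]
w = train([features(d) for d in train_domains], train_labels)
```

The labeled examples are the expensive part: every row of `train_labels` stands in for analyst time, which is exactly the bottleneck described above.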

The structural dominance of offensive AI

AI-based defense suffers from other exploitable weaknesses. Because a model’s accuracy is governed by the fidelity of its labels, attackers can poison a model by injecting purposefully corrupted labels into the dataset it’s trained on, allowing them to construct specific samples that bypass detection. Other models are systematically vulnerable to slightly perturbed inputs that cause them to produce errors with embarrassingly high confidence. These so-called adversarial examples are best illustrated by physical attacks, like placing stickers on stop signs to fool the object recognizers used in self-driving cars, or implanting hidden voice commands that fool the speech recognizers used in smart speakers into calling the police.
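
A minimal sketch of the evasion idea, using an invented linear detector (the weights, features, and step size below are assumptions for illustration): for a linear model, the gradient of the score with respect to the input is simply the weight vector, so a small step against it lowers the score and can flip the decision.

```python
import numpy as np

# Hypothetical linear "detector": score > 0 means the sample is flagged.
w = np.array([1.5, -0.8, 2.0])   # assumed trained weights
b = -1.0

def flagged(x):
    return float(np.dot(w, x) + b) > 0

# A malicious sample the detector currently catches.
x = np.array([1.0, 0.2, 0.5])
assert flagged(x)

# FGSM-style perturbation: nudge each feature a small step against the
# sign of the score's gradient (which, for a linear model, is just w).
eps = 0.4
x_adv = x - eps * np.sign(w)     # now scores below zero and evades
```

Each feature moves by at most `eps`, so the perturbed sample stays close to the original while crossing the decision boundary, which is the essence of an adversarial example.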

While these examples might hit close to home for the average citizen, for cybersecurity professionals, similar errors could mean the difference between a breach and a promotion. Attackers are increasingly turning to automation, and they’ll soon turn to AI themselves to exploit such weaknesses. In short, “red team” attackers can benefit from data just as well as “blue team” defenders.

There’s a growing list of theoretical, AI-based red team workflows around spear phishing, password cracking, Captcha subversion, steganography, Tor de-anonymization, and antivirus evasion. In each simulation, attackers leverage readily accessible data, demonstrating that the data-labeling bottleneck makes AI-based attacks easier to pull off than their defensive counterparts.

At first glance, this may seem like history repeating itself. Attackers have always enjoyed an advantage simply because of what’s at stake. Blue teams only truly win when detection approaches 100 percent success, whereas red teams win even when they succeed only one time out of 100.

What’s different this time is a broader industry trend that, unfortunately, benefits the red team. One reason we’ve made so much progress on problems like image recognition is that researchers in those fields are rewarded for collaboration. Cybersecurity researchers, on the other hand, are often constrained because their data is too sensitive or even illegal to share, or because it’s viewed as intellectual property, the secret sauce that gives vendors a leg up in the fiercely competitive cybersecurity market. Attackers can exploit this fragmented landscape and lack of data sharing to outpace defense.

Exacerbating this asymmetry, it’s only a matter of time before the barrier to entry for applied AI retreats from Ph.D. dissertations to high school classrooms. Free educational resources, available datasets and pre-trained models, access to powerful cloud-based computing resources like GPUs, and open source software libraries all lower the bar for AI newcomers, and therefore would-be attackers. Deep learning itself is actually more user-friendly than older paradigms, and in many cases, it produces state-of-the-art accuracy without the expert hand-engineering previously required.

The calm before the storm, so get your raincoat

Given these realities, the phrase “a dollar of offense beats a dollar of defense” certainly appears to hold true for the malicious use of AI. For now, good old-fashioned manual attacks still reign, and there’s been no credible evidence documenting an AI-based attack in the wild. But it is at precisely this moment that we should work to ease the data-labeling bottleneck and blunt the impact of the AI-based attacks to come.

While the odds may be stacked against them, defenders do have tools available to help them reduce the cost and time of labeling data. Crowdsourced labeling services provide a cheap, on-demand workforce whose consensus can approach the accuracy of experts. Other tricks of the trade can likewise speed up the deployment of AI-based defense:

  • Active learning, where comparatively slow human experts label only the most informative data.
  • Semi-supervised learning, where models trained on limited labeled data learn the problem structure from available unlabeled data.
  • Transfer learning, where models previously trained for a problem with copious available labeled data are tailored for a new problem with limited labeled data.
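
The first of these tactics can be sketched in a few lines. Below is a toy pool-based active-learning loop with uncertainty sampling on synthetic 1-D data (the data, the model, and the five-query budget are all invented for illustration): the “expert” is asked to label only the pool point the current model is least certain about, so labeling effort goes where it helps most.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic unlabeled pool; the hidden "true" rule is simply x > 0.
pool = rng.uniform(-1, 1, size=200)
labeled_x = np.array([-0.9, 0.9])    # tiny seed set labeled up front
labeled_y = np.array([0.0, 1.0])

def fit(x, y, lr=1.0, steps=200):
    """1-D logistic regression fit by batch gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(w * x + b)))
        w += lr * float(np.mean((y - p) * x))
        b += lr * float(np.mean(y - p))
    return w, b

queried = []
for _ in range(5):
    w, b = fit(labeled_x, labeled_y)
    probs = 1 / (1 + np.exp(-(w * pool + b)))
    i = int(np.argmin(np.abs(probs - 0.5)))   # most uncertain pool point
    queried.append(float(pool[i]))
    labeled_x = np.append(labeled_x, pool[i])
    labeled_y = np.append(labeled_y, float(pool[i] > 0))  # simulated expert
    pool = np.delete(pool, i)
```

In this sketch the queried points all land near the decision boundary, which is the payoff of active learning: the slow human expert labels only the handful of samples that actually move the model.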

Lastly, the best defense is a good offense. Produced with care, adversarial samples can harden AI-based defense: defenders can preemptively wage attacks against their own models and use the results to plug up any holes before real attackers find them.

While the data-labeling bottleneck gives AI-based attacks a tactical advantage, defenders can and should take steps now to level the playing field before attackers unleash these threats.

Philip Tully is the principal data scientist at ZeroFOX, a company that detects and remediates threats to businesses and their employees on social, mobile, digital, and collaboration platforms.