The era of AI hacking has arrived

This summer, Russia’s hackers put a new twist on the barrage of phishing emails sent to Ukrainians.
The hackers included an attachment containing an artificial intelligence program. If installed, it would automatically search the victims’ computers for sensitive files to send back to Moscow.
That campaign, detailed in July in technical reports from the Ukrainian government and several cybersecurity companies, is the first known instance of Russian intelligence being caught building malicious code with large language models (LLMs), the technology behind the AI chatbots that have become ubiquitous in corporate culture.
Those Russian spies are not alone. In recent months, hackers of seemingly every stripe — cybercriminals, spies, researchers and corporate defenders alike — have started incorporating AI tools into their work.
LLMs, like ChatGPT, are still error-prone. But they have become remarkably adept at processing language instructions, translating plain language into computer code, and identifying and summarizing documents.
The technology has so far not revolutionized hacking by turning complete novices into experts, nor has it allowed would-be cyberterrorists to shut down the electric grid. But it’s making skilled hackers better and faster. Cybersecurity firms and researchers are using AI now, too — feeding into an escalating cat-and-mouse game between offensive hackers who find and exploit software flaws and the defenders who try to fix them first.
“It’s the beginning of the beginning. Maybe moving towards the middle of the beginning,” said Heather Adkins, Google’s vice president of security engineering.
In 2024, Adkins’ team started a project to use Google’s LLM, Gemini, to hunt for important software vulnerabilities, or bugs, before criminal hackers could find them. Earlier this month, Adkins announced that her team had so far discovered at least 20 important overlooked bugs in commonly used software and alerted the affected companies so they could fix them. That process is ongoing.
None of the vulnerabilities have been shocking or anything only a machine could have discovered, she said. But the process is simply faster with AI. “I haven’t seen anybody find something novel,” she said. “It’s just kind of doing what we already know how to do. But that will advance.”
Adam Meyers, a senior vice president at the cybersecurity company CrowdStrike, said that not only is his company using AI to help people who think they’ve been hacked, but it is also seeing increasing evidence of AI use by the Chinese, Russian, Iranian and criminal hackers it tracks.
“The more advanced adversaries are using it to their advantage,” he told NBC News. “We’re seeing more and more of it every single day.”
The shift is only starting to catch up with the hype that has permeated the cybersecurity and AI industries for years, especially since ChatGPT was introduced to the public in 2022. AI tools haven’t always proved effective, and some cybersecurity researchers have complained about would-be hackers falling for fake vulnerability findings generated with AI.
Scammers and social engineers — the people in hacking operations who pretend to be someone else, or who write convincing phishing emails — have been using LLMs to seem more convincing since at least 2024.
But using AI to directly hack targets is only now starting to take off, said Will Pearce, the CEO of DreadNode, one of a handful of new security companies that specialize in hacking with LLMs.
The reason, he said, is simple: The technology has finally started to catch up to expectations.
“The technology and the models are all really good at this point,” he said.
Less than two years ago, automated AI hacking tools needed significant tinkering to do their job properly, but they are now far more adept, Pearce told NBC News.
Another startup built to hack using AI, Xbow, made history in June by becoming the first AI to reach the top of the HackerOne U.S. leaderboard, a live scoreboard that since 2016 has tracked which hackers around the world identify the most important vulnerabilities, giving them bragging rights. Last week, HackerOne added a new category for groups that automate AI hacking tools, distinguishing them from individual human researchers. Xbow still leads that category.
Hackers and cybersecurity professionals have not settled whether AI will ultimately help attackers or defenders more. But at the moment, defense appears to be winning.
Alexei Bulazel, the senior cyber director at the White House National Security Council, said at a panel at the Def Con hacker conference in Las Vegas last week that the trend will hold, at least as long as the U.S. is home to most of the world’s most advanced tech companies.
“I very strongly believe that AI will be more advantageous for defenders than offense,” Bulazel said.
He noted that it is rare for hackers to find extremely disruptive flaws in a major U.S. tech company, and that criminals often break into computers by finding small, overlooked flaws at smaller companies that don’t have elite cybersecurity teams. AI is particularly helpful in discovering those bugs before criminals do, he said.
“The types of things that AI is better at — identifying vulnerabilities in a low cost, easy way — really democratizes access to vulnerability information,” Bulazel said.
That trend may not hold as the technology evolves, however. One reason is that there is so far no free-to-use automated hacking tool, or penetration tester, that incorporates AI. Penetration testers without AI are already widely available online, nominally as programs that test for security flaws, though criminal hackers use them as well.
If one incorporates an advanced LLM and becomes freely available, it will likely mean open season on smaller companies’ programs, Google’s Adkins said.
“I think it’s also reasonable to assume that at some point someone will release [such a tool],” she said. “That’s the point at which I think it becomes a little dangerous.”
Meyers, of CrowdStrike, said that the rise of agentic AI — tools that carry out more complex tasks, like both writing and sending emails or writing and executing code — could prove a major cybersecurity risk.
“Agentic AI is really AI that can take action on your behalf, right? That will become the next insider threat, because, as organizations have these agentic AI deployed, they don’t have built-in guardrails to stop somebody from abusing it,” he said.