Prompt injection: New malware targets AI cybersecurity tools

Correspondent

Nation Media Group

Cyber safety researchers have unearthed a new malware strain that brings to light the first ever documented attempt to weaponise attacks against Artificial Intelligence (AI)-powered security safeguards and analysis tools, in what sector pundits have christened prompt injection.

Researchers at software vendor Check Point reported that the incident was first noticed early this month via a sample that was anonymously uploaded by a user in the Netherlands, with planted malware designed to evade detection from AI-powered protections.

“Malware authors have long evolved their tactics to avoid detection. They leverage obfuscation, packing, sandbox evasions, and other tricks to stay out of sight,” stated Check Point.

“As defenders increasingly rely on AI to accelerate and improve threat detection, a subtle but alarming new contest has emerged between attackers and defenders.”

A short malware refers to any intrusive software developed by hackers to steal data and damage or destroy computers and their systems. Common examples include viruses, worms, spyware and ransomware, among others.

Check Point’s discovery of the malware, dubbed ‘Skynet’ by its creators, marks a significant evolution in adversarial tactics targeting AI systems deployed to help in attacks detection and related risk analysis.

The emergence of the new threat mode coincides with the rapidly growing adoption of AI large language models (LLMs) in cybersecurity workflows, particularly in automated malware analysis and reverse engineering tasks.

Cyber safety practitioners are increasingly relying on AI models like OpenAI’s GPT-4 and Google’s Gemini to analyse and process suspicious code samples, creating a new attack surface that malicious actors are now attempting to exploit.

According to Check Point, Skynet’s novel evasion mechanism was embedded within its code structure, with the researchers describing it as an ‘experimental proof-of-concept’ that demonstrates how threat actors are adapting to the AI-driven security landscape.

How the evasion tactic works

The software vendor reports that after the malware sample was anonymously uploaded, it looked incomplete at the first glance as some parts of the code weren’t fully functional, adding that it printed system information that would usually be exfiltrated to an external server.

“What stood out, however, was a string embedded in the code that appeared to be written for an AI, not a human. It was crafted with the intention of influencing automated, AI-driven analysis, not to deceive a human looking at the code,” says Check Point.

Read: Alarm as attackers turn to fake CAPTCHA to plant malware

The researchers note that by placing a language that mimics the authoritative voice of the legitimate user instructing the LLM, the attacker was attempting to hijack the AI’s stream of consciousness and manipulate it into outputting a fabricated verdict, and even into running malicious code.

“This technique is known as prompt injection,” they said.

The team says they tested the malware sample against their analysis system, noting that the prompt injection did not succeed in an indication that the underlying model flagged the file as malicious.

“While the technique was ineffective in this case, it is likely a sign of things to come. Attacks like this are only going to get better and more polished. This marks the early stages of a new class of evasion strategies,” wrote Check Point.

“These techniques will likely grow more sophisticated as attackers learn to exploit the nuances of LLM-based detection.”

The emergence of prompt injection comes at a time a separate report has sounded alarm over a rising trend of social engineering campaigns that rely on fake authentication systems known as Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA).

The drive has been to infect web users with malware using CAPTCHA clicks, with potential victims being directed to websites controlled by attackers where they are prompted to complete a series of verification steps which, if followed, lead to users running malicious commands on their computers.

Cyber fraudsters are also deploying AI capabilities to scale up attacks, including creating fake reviews in the app stores that go as far as allotting five-star ratings to specific targeted apps to artificially inflate their credibility, leading people to download potentially harmful or deceptive content.

Reviews show that users who download these apps often find themselves bombarded with an overwhelming number of out-of-context adverts, akin to websites created solely to display ads.

→ [email protected]

🪫 Your Subscription Ends Soon!

Prompt injection: New malware targets AI cybersecurity tools

How the evasion tactic works

PAYE Tax Calculator

Latest

Borrowing in T-bills hits Sh1trn on State cash crunch

StanChart blocks Sh7 billion payout to 629 retirees

National Treasury cuts funding for MPs’ kitty by Sh12bn

Popular musician gets swift justice for 'stolen' gospel hit

In the headlines

Borrowing in T-bills hits Sh1trn on State cash crunch

StanChart blocks Sh7 billion payout to 629 retirees

National Treasury cuts funding for MPs’ kitty by Sh12bn

Weaker shilling, stockpile lift foreign currency deposits

🪫 Your Subscription Ends Soon!

PRIME The rise of dark AI: Hackers weaponise language models to scale cyberattacks

PRIME Cyber attacks in Kenya triple to 2.5bn as criminals target key sectors

How the evasion tactic works

PAYE Tax Calculator

Get it First!

PRIME Borrowing in T-bills hits Sh1trn on State cash crunch

PRIME StanChart blocks Sh7 billion payout to 629 retirees

PRIME National Treasury cuts funding for MPs’ kitty by Sh12bn

PRIME Popular musician gets swift justice for 'stolen' gospel hit

PRIME Borrowing in T-bills hits Sh1trn on State cash crunch

PRIME StanChart blocks Sh7 billion payout to 629 retirees

PRIME National Treasury cuts funding for MPs’ kitty by Sh12bn

PRIME Weaker shilling, stockpile lift foreign currency deposits

The rise of dark AI: Hackers weaponise language models to scale cyberattacks

Cyber attacks in Kenya triple to 2.5bn as criminals target key sectors

Borrowing in T-bills hits Sh1trn on State cash crunch

StanChart blocks Sh7 billion payout to 629 retirees

National Treasury cuts funding for MPs’ kitty by Sh12bn

Popular musician gets swift justice for 'stolen' gospel hit

Borrowing in T-bills hits Sh1trn on State cash crunch

StanChart blocks Sh7 billion payout to 629 retirees

National Treasury cuts funding for MPs’ kitty by Sh12bn

Weaker shilling, stockpile lift foreign currency deposits