AI tools immature and prone to cyber-sabotage

Large language models (LLMs) may be soaring in popularity, but their meteoric rise means that many of the programs built on them are far from secure against cyberattacks and manipulation, an analyst warns.

Automated cybersecurity service provider Rezilion cautions that LLM training data, essentially the food for ‘thought’ that powers AI generative models, could be manipulated by threat actors, introducing vulnerabilities, backdoors, or biases that undermine the security and ethical behavior of the machine.

“This malicious act aims to compromise the integrity and reliability of the LLM by injecting misleading or harmful information during the training process,” said Rezilion.

Moreover, even without the involvement of a malicious actor, a simple error in an LLM could also have negative consequences. For instance, a data leak could lead it to inadvertently reveal sensitive data and proprietary algorithms that developers would rather keep secret.

“This inadvertent disclosure can lead to unauthorized access to valuable data or intellectual property, compromising privacy and giving rise to various security breaches,” said Rezilion, adding that personally identifying information could also be exposed, putting people at risk too.

“An additional concern related to the disclosure of private data is the potential for ChatGPT to reveal personal information, leading to the dissemination of speculative or harmful content,” it said.

Rezilion also added its voice to the growing chorus expressing concerns about data “hallucinations,” which occur when an LLM like ChatGPT fabricates a response to a question and passes it off as real.

Citing the recent instance of two New York lawyers who damaged their careers after using OpenAI’s LLM to research a case and inadvertently citing fabricated legal history, Rezilion urged caution for all LLM users.

“It is important to remember that ChatGPT and any language-based generative model have limitations,” it said. “One of the main limitations of such models is the inherent tendency to produce false or fabricated information.”

Rezilion added that a malicious actor could also deliberately exploit this defect to publish faulty information that might harm selected targets, if they are taken in by the ruse.

“In a targeted attack scenario, an adversary uses ChatGPT’s response generation capabilities to manipulate its recommendations,” it said. “The attacker initiates the process by posing a question to ChatGPT, seeking a solution to a coding problem and asking for package recommendations. ChatGPT responds with a list of packages, which may include non-existent ones. Exploiting this vulnerability, the attacker identifies a recommendation for an unpublished package and proceeds to publish their malicious package as a substitute.”

This can then have a ripple effect: when subsequent users pose similar questions, ChatGPT may recommend the same package name again, which now points to the attacker’s malicious code rather than to nothing at all.

“Consequently, unsuspecting users may unknowingly adopt the recommended package, putting their systems and data at risk,” said Rezilion.
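One way a cautious developer might guard against the scenario Rezilion describes is to check every AI-suggested package against the registry before installing it, treating names that do not exist or that were published only recently as suspect. The Python sketch below illustrates the idea under stated assumptions: the function name, the 90-day threshold, the package names, and the stand-in registry are all hypothetical and do not come from Rezilion's report. A real check would query an actual package index instead of a dictionary.

```python
from datetime import date, timedelta
from typing import Callable, Dict, Iterable, Optional


def vet_recommendations(
    names: Iterable[str],
    first_published: Callable[[str], Optional[date]],
    today: date,
    min_age_days: int = 90,  # assumed threshold, chosen for illustration
) -> Dict[str, str]:
    """Label each suggested package as 'missing', 'recent' or 'established'.

    first_published is a hypothetical lookup that returns the date a package
    first appeared in the registry, or None if no such package exists.
    """
    verdicts: Dict[str, str] = {}
    for name in names:
        published = first_published(name)
        if published is None:
            # The name is unclaimed: exactly what an attacker could register.
            verdicts[name] = "missing"
        elif today - published < timedelta(days=min_age_days):
            # A very new package may have been planted to match a suggestion.
            verdicts[name] = "recent"
        else:
            verdicts[name] = "established"
    return verdicts


# Stand-in registry data (made up); a real check would ask the package index.
STAND_IN_REGISTRY = {
    "well-known-lib": date(2015, 3, 1),
    "shiny-new-lib": date(2023, 6, 1),
}

suggested = ["well-known-lib", "shiny-new-lib", "made-up-lib"]
report = vet_recommendations(
    suggested, STAND_IN_REGISTRY.get, today=date(2023, 6, 30)
)
for name, verdict in report.items():
    print(f"{name}: {verdict}")
```

A package's age is only a rough signal, since a patient attacker can simply wait, so a check like this would complement rather than replace reviewing a package's source and maintainers.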

Happy hunting ground

Moreover, ChatGPT’s stellar progress since entering the mainstream in November 2022 means threat actors have myriad targets to choose from, with more than 30,000 GPT-related open-source projects created on GitHub since then.

“The pace at which GPT-related open-source projects are being created and gaining popularity is astounding,” said Rezilion. “And let’s not forget these projects are in their infancy — only two to six months old.”

It singled out AutoGPT, an experiment to try and make GPT-4 autonomous, as a case in point: at less than three months old, it has more than 140,000 stars on GitHub but also suffers from more than 500 data security issues.

“The open-source community is something we all rely on every day,” said Rezilion. “But we also know that open-source projects have their security risks, and it’s imperative to be aware of these risks when using them. In the case of newborn GPT-related projects, it seems that everyone has forgotten how critical this is.”

Commenting on the study’s findings, Rezilion vulnerability research director Yotam Perkal urged any organization using an LLM to regularly audit it using tools such as Google’s Secure AI Framework, NeMo Guardrails, or Mitre’s ATLAS™.

He said: “Generative AI is increasingly everywhere, but it’s immature, and extremely prone to risk. On top of their inherent security issues, individuals and organizations provide these AI models with excessive access and authorization without proper security guardrails.”

Perkal added that LLM interactions should be regularly monitored to detect any potential security and privacy issues, and to allow for appropriate updating and fine-tuning. He stressed that this must be a joint effort between users and developers.

“Responsibility for preparing and mitigating LLM risks lies with both the organizations integrating the technology and the developers involved in building and maintaining these systems,” he said.

Perkal further warned that the next year and a half would see the problem spiral if due diligence is not conducted, as more and more people adopt LLMs.

“Without significant improvements in the security standards and practices surrounding LLMs, the likelihood of targeted attacks and the discovery of vulnerabilities in these systems will increase,” he said. “Organizations must stay vigilant and prioritize security measures to mitigate evolving risks and ensure the responsible and secure use of LLM technology.”
