Software supply chain risks for AI and ML models


As organizations become more dependent on third-party libraries, frameworks, and services to develop and deploy their AI applications, software supply chain risks are growing rapidly. These risks can take many forms and can lead to data breaches and other security weaknesses in affected systems.

Two recent cyber incidents illustrate how risky third-party dependencies can be when developing AI solutions. In late 2023, several security research teams discovered five security vulnerabilities in Ray, a popular open-source unified framework for scaling AI and Python applications.

One of these vulnerabilities (CVE-2023-48022) received less attention than it deserved because static scanning tools do not flag it. However, according to researchers at Oligo Security, threat actors have exploited this vulnerability to execute arbitrary code remotely via the job submission API, and it affected thousands of Ray servers worldwide.
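
To make the risk concrete, here is a minimal sketch, assuming the Ray package is installed and using a hypothetical host address, of why an unauthenticated, internet-reachable Ray dashboard amounts to remote code execution: anyone who can reach the Jobs API can run arbitrary commands on the cluster.

```python
# A minimal sketch of why an exposed, unauthenticated Ray Jobs API is
# effectively remote code execution. The host address below is hypothetical;
# the Ray dashboard listens on port 8265 by default.
from ray.job_submission import JobSubmissionClient

# Anyone who can reach the dashboard can create a client for it...
client = JobSubmissionClient("http://203.0.113.10:8265")  # hypothetical host

# ...and submit a job whose entrypoint is an arbitrary shell command.
# A harmless command is used here; an attacker could run anything.
job_id = client.submit_job(
    entrypoint="echo 'arbitrary command executed on the cluster'"
)
print(job_id)
```

The practical takeaway is to keep the Ray dashboard off public interfaces and behind network-level access controls.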

Hugging Face is a popular platform that facilitates collaboration within the machine learning (ML) community on models, datasets, and applications. The company offers developers a range of components; some are paid, while others are open source.

In late February 2024, the JFrog Security Research team discovered several ways threat actors could abuse open-source ML models hosted on Hugging Face to distribute malicious components. The researchers found a malicious ML model that executes attacker code as soon as it is loaded, giving attackers a backdoor into target devices. The payload was concealed within a seemingly innocent pickle file, enabling a silent infiltration that could lead to compromised systems, catastrophic data breaches, or corporate espionage.
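
To illustrate the mechanism, here is a minimal, self-contained sketch of how a pickle payload executes code on load; the "payload" below is a harmless echo command standing in for attacker code.

```python
import os
import pickle

# A minimal sketch of how a pickle-based payload works. Any object whose
# __reduce__ returns a callable and its arguments will have that callable
# executed during unpickling -- before the caller ever inspects the "model".
class MaliciousPayload:
    def __reduce__(self):
        # A harmless command stands in for attacker code here.
        return (os.system, ("echo 'code executed while loading the model file'",))

poisoned_bytes = pickle.dumps(MaliciousPayload())

# Simply loading the data triggers execution -- no method call needed.
pickle.loads(poisoned_bytes)
```

This is why loading untrusted pickle-backed model files is dangerous, and why tensor-only formats such as safetensors, or loaders that refuse arbitrary object deserialization, are generally preferred when sharing models.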

As both incidents show, threat actors actively exploit software dependencies in AI and ML projects, planting backdoors or other vulnerabilities that compromise every system that uses the infected components during development, deployment, or training.

Software supply chain risks for AI and ML models

Threat actors use a few different attack vectors to exploit AI solutions. Here are the main ones:

Insecure dependencies

Insecure dependencies can significantly weaken the security of AI and ML solutions in the following ways:

  • Dependency vulnerabilities: As noted, AI/ML models rely on many software components to create the finished product or solution. Dependencies for an AI solution can include software libraries and frameworks, software development kits, APIs, drivers and connectors, middleware, and plugins. Threat actors may compromise any of these components and implant a backdoor that gives them access to every solution built on the infected component.
  • Data poisoning: ML models are trained on data gathered from various sources, such as the internet (social media, news websites, blogs, forums, source code repositories, and anything else publicly available) and public databases, such as government databases. Threat actors may inject malicious data into these training data sets so that the model produces inaccurate results after deployment (an integrity-checking sketch follows this list).
  • Risks to other AI systems: AI solutions are commonly connected to other systems across the internet to obtain some of their functionality. This amplifies risk: a single vulnerable system can propagate an infection to the AI systems connected to it.
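
As one mitigation for the dependency and data-poisoning risks above, here is a minimal sketch, assuming a hypothetical manifest file of known-good SHA-256 hashes, that verifies downloaded training data or vendored dependency archives before they are used.

```python
import hashlib
import json
from pathlib import Path

# Verify downloaded training data (or vendored dependency archives) against
# a manifest of known-good SHA-256 hashes recorded when the artifacts were
# first vetted. The manifest path and layout here are hypothetical.
MANIFEST = Path("trusted_artifacts.json")  # e.g. {"data/train.csv": "ab12..."}

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_artifacts() -> None:
    expected = json.loads(MANIFEST.read_text())
    for relative_path, known_hash in expected.items():
        if sha256_of(Path(relative_path)) != known_hash:
            raise RuntimeError(f"Integrity check failed for {relative_path}")
    print("All artifacts match the trusted manifest.")

if __name__ == "__main__":
    verify_artifacts()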

Compromised open-source repositories

As we saw in the Hugging Face incident, online platforms housing ML tools and training datasets have become prime targets for hackers. Threat actors may abuse open-source ML components in several ways (a mitigation sketch follows the list):

  • Backdoors: Plant backdoors that give attackers access to any AI solution built on the tampered component.
  • Trojan horses: Integrate malicious code into the libraries, frameworks, or packages used in ML/AI development, creating a trojan horse that activates once certain conditions are met.
  • Data exfiltration: Introduce malicious code into public AI/ML repositories that reverse engineers the ML model or steals its training data.
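
One way to reduce exposure to a tampered upstream repository is to pin model artifacts to an exact, previously reviewed revision rather than pulling whatever the default branch currently points to. Here is a minimal sketch using the huggingface_hub client; the repository name and commit hash are hypothetical placeholders.

```python
from huggingface_hub import hf_hub_download

# Pin a model artifact to an exact commit on the Hugging Face Hub instead of
# pulling whatever "main" currently points to. The repository name and
# revision below are hypothetical placeholders.
MODEL_REPO = "example-org/example-model"
PINNED_REVISION = "0123456789abcdef0123456789abcdef01234567"  # full commit hash

local_path = hf_hub_download(
    repo_id=MODEL_REPO,
    filename="model.safetensors",
    revision=PINNED_REVISION,
)
print(f"Downloaded pinned artifact to {local_path}")
```

Combined with a hash check like the one shown earlier, this makes a silent upstream change to the model far more likely to be caught before it reaches your pipeline.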

Typosquatting is another common tactic cybercriminals use to mislead unsuspecting users of ML repositories. In a typosquatting attack, attackers publish software projects with names that closely resemble legitimate packages. When unaware developers download and use the attacker's package in their ML/AI solution, they unknowingly introduce security vulnerabilities into their program.

For example, the legitimate package "pyDML" could be typosquatted with confusingly similar names such as the following (a simple name-similarity check is sketched after the list):

  • PyDM
  • pyDMl
  • PyDMLib
  • Pydml
  • pydML
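
As a simple defense, here is a minimal sketch, assuming a hypothetical internal allowlist of vetted packages, that flags requested package names which closely resemble, but do not match, an approved name.

```python
import difflib

# Catch likely typosquats before installation by comparing a requested
# package against an internal allowlist of packages the team has vetted.
# The allowlist below is hypothetical.
APPROVED_PACKAGES = {"pyDML", "numpy", "scikit-learn", "torch"}

def check_package(requested: str) -> None:
    if requested in APPROVED_PACKAGES:
        print(f"'{requested}' is on the allowlist.")
        return
    near_misses = difflib.get_close_matches(requested, APPROVED_PACKAGES, n=3, cutoff=0.75)
    if near_misses:
        print(f"'{requested}' is NOT approved but closely resembles {near_misses} -- possible typosquat.")
    else:
        print(f"'{requested}' is not on the allowlist; review it before use.")

check_package("pyDMl")   # flagged as a likely typosquat of "pyDML"
check_package("pyDML")   # approved
```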

Vulnerable third-party components

Threat actors may exploit vulnerabilities in the APIs or other external services used to develop the ML/AI model, leading to data breaches and unauthorized access. For example, an attacker who gains unauthorized access to the cloud platform hosting the AI/ML system can compromise training datasets and expose sensitive ML model parameters.

Attacking the development environment

Like traditional software projects, ML/AI solutions are developed using tools and frameworks. Threat actors may infiltrate these tools, or the development environment itself, to manipulate ML models or training datasets.

Deployment platforms

ML/AI solutions are commonly deployed using third-party platforms such as Amazon SageMaker, Google Vertex AI, and Microsoft Azure Machine Learning, which provide a complete suite of tools for developing, training, and deploying AI models alongside other deployment options. Threat actors target these platforms to gain unauthorized access to the deployment pipeline, which lets them modify ML models before deployment, for example by installing backdoors or tampering with model data.

Malicious updates

Even when you use legitimate software components to power your AI solution, you still need to update them over time. When these components come from third-party providers, you typically download and install updates from the vendor's official channels. Threat actors may abuse the update mechanism to insert malicious code into update files before they are pushed to users' devices. The cyberattack against AnyDesk, a provider of remote access software, is a good example of the update function being used to spread malware to clients in place of regular software updates.

Connected devices

The software supply chain is not limited to software components. Third-party providers' connected endpoint devices, such as laptops, mobile devices, and Internet of Things (IoT) devices, are also a source of risk: a vulnerable supplier endpoint with access to your IT environment introduces security risks into it. Hardening the operating systems of endpoint devices, and of any device connected to the digital ecosystem where AI/ML development occurs, is therefore also critical.