Framework to host AI locally on Windows and macOS leaks massive amount of data


Shadow AI is spreading across the internet, and no one is really watching. Security researchers have discovered about 175,000 publicly accessible and unmonitored AI systems.

New research shows that a growing network of artificial intelligence (AI) infrastructure is running outside the safety rails that big tech has spent years building.

According to security researchers from SentinelLABS and Censys, more than 175,000 publicly accessible AI systems are exposed online.

The exposed instances are built on Ollama, an open-source framework that runs on Windows, macOS, and Linux and enables users to run large language models (LLMs) locally on their own hardware.

Top 10 Countries by share of unique hosts. Source: SentinelLABS

Unlike AI systems run by companies like OpenAI, Google, or Anthropic, Ollama models operate without centralized monitoring and access controls, putting users globally at risk.

The research shows that China accounts for the largest share of exposed systems, making up just over 30% of the total. The rest of the top ten consists of the United States, Germany, France, South Korea, India, Russia, Singapore, Brazil, and the United Kingdom.

“For defenders, the key takeaway is that LLMs are increasingly deployed to the edge to translate instructions into actions. As such, they must be treated with the same authentication, monitoring, and network controls as other externally accessible infrastructure,” the researchers warn.

175,000 machines analyzed globally

Over 293 days, the researchers scanned the internet for exposed AI systems and found about 175,000 distinct machines running Ollama across 130 countries. Over that period, those machines appeared more than 7 million times in the scans, indicating this isn’t a small or isolated issue.

The analyzed AI ecosystem is bimodal. Some of the hosts pop up once or twice, then disappear. Over a third of all systems were observed only once, and together they contributed little to overall activity.

However, roughly 13% of hosts persisted across scans. Despite the low number, they accounted for three-quarters of all observed activity.
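
To put those proportions in perspective, here is a rough back-of-the-envelope sketch. It assumes the rounded figures quoted above (about 175,000 hosts, 7 million observations, 13% of hosts, 75% of activity); the function name is made up for illustration, and the results are only as precise as those rounded inputs.

```python
def activity_split(total_hosts, total_obs, host_pct, activity_pct):
    """Average observations per host for the persistent core vs. everyone else."""
    # Integer percentages keep the arithmetic exact for these round inputs.
    core_hosts = total_hosts * host_pct // 100
    core_obs = total_obs * activity_pct // 100
    rest_hosts = total_hosts - core_hosts
    rest_obs = total_obs - core_obs
    return {
        "overall": total_obs / total_hosts,
        "persistent": core_obs / core_hosts,
        "transient": rest_obs / rest_hosts,
    }

# Rounded figures from the article, treated as assumptions.
split = activity_split(175_000, 7_000_000, host_pct=13, activity_pct=75)
for group, avg in split.items():
    print(f"{group}: about {avg:.0f} observations per host")
# overall: about 40, persistent: about 231, transient: about 11
```

On these assumptions, a persistent host shows up roughly twenty times as often as one outside that core.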

“This is where capability, exposure, and operational value converge. These are systems that provide ongoing utility to their operators and, by extension, represent the most attractive and accessible targets for adversaries,” the researchers highlight.

Top 20 model families by share of unique hosts. Source: SentinelLABS

Powerful AI models open for exploitation

By default, Ollama binds its service to the loopback address 127.0.0.1, restricting access to the local host only.

However, a small configuration change can make the instance reachable from the internet. Setting the bind address to 0.0.0.0, for example, makes the service listen on all network interfaces and exposes it to external network traffic.
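
The difference between the two bind addresses can be checked mechanically. Below is a minimal sketch that classifies a "host" or "host:port" string, such as the value Ollama reads from its OLLAMA_HOST environment variable. The function name and return labels are hypothetical, and the parsing is deliberately simple.

```python
import ipaddress

def classify_bind_address(value: str) -> str:
    """Classify a 'host' or 'host:port' string by how widely it listens."""
    host = value.strip()
    # Strip an optional URL scheme such as "http://".
    if "://" in host:
        host = host.split("://", 1)[1]
    if host.startswith("["):
        # Bracketed IPv6 literal such as "[::1]:11434".
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        # Exactly one colon: an IPv4 address or name followed by a port.
        host = host.split(":", 1)[0]
    if host == "localhost":
        return "loopback"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return "unknown"
    if ip.is_loopback:
        return "loopback"          # reachable from the local machine only
    if ip.is_unspecified:
        return "all-interfaces"    # 0.0.0.0 or ::, reachable from any network
    return "specific-interface"

print(classify_bind_address("127.0.0.1:11434"))  # loopback
print(classify_bind_address("0.0.0.0:11434"))    # all-interfaces
```

A result of "all-interfaces" does not prove that the machine is reachable from the internet, since a firewall or router may still block the port. It does mean the service itself no longer restricts access to the local host.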

At scale, such a misconfiguration creates a massive new attack surface. Many of the exposed systems aren’t just chatbots.

“Nearly half of observed hosts are configured with tool-calling capabilities that enable them to execute code, access APIs, and interact with external systems,” the researchers wrote in the report.

Ollama framework leak. Source: SentinelLABS

In concrete terms, almost 48% of the observed hosts advertised tool-calling capabilities.

Around 22% supported vision, allowing them to process images. A quarter ran “thinking” models optimized for multi-step reasoning.

Decentralized systems cause governance problems

One of the study's findings is that AI infrastructure is mainly located on residential and telecom networks.

Because exposed Ollama instances are scattered across both cloud servers and home networks, responsibility is fragmented, creating clear gaps in oversight and governance.

“Over the past year, as open-weight models have proliferated and local deployment frameworks have matured, we observed growing discussion in security communities about the implications of this trend,” the researchers said.

“Unlike platform-hosted LLM services with centralized monitoring, access controls, and abuse prevention mechanisms, self-hosted instances operate outside emerging AI governance boundaries.”

Why Ollama exposure matters and why it’s risky

Exposed Ollama servers might look like a niche configuration mistake, but the researchers warn that they form a loosely connected, global layer of AI compute that anyone can tap into without oversight.

Unlike commercial AI platforms, which are protected with authentication and abuse-detection systems, many publicly accessible models are effectively open for exploitation.
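
As an illustration of the most basic missing control, the sketch below shows a bearer-token check of the kind a reverse proxy or wrapper in front of a self-hosted model might perform. It is an assumption-laden example, not something described in the report: the function name, header handling, and token are all hypothetical.

```python
import hmac

def is_authorized(headers: dict, expected_token: str) -> bool:
    """Return True only if the request carries the expected bearer token."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.encode(), expected_token.encode())

# Hypothetical token for demonstration only.
print(is_authorized({"Authorization": "Bearer s3cret"}, "s3cret"))  # True
print(is_authorized({}, "s3cret"))                                  # False
```

A check like this covers authentication only. The monitoring and abuse detection that commercial platforms provide would still need to be added separately.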

Malicious actors can route spam generation, phishing and disinformation campaigns, or other large-scale automated activity through someone else’s hardware, making malicious behaviour harder to trace.

Also, if attackers target the AI models that can trigger functions, call APIs, or interact with external systems, the potential risks escalate sharply.

Prompt injection is another powerful attack vector. An attacker might abuse an AI model by asking questions that could prompt it to retrieve internal documents, summarize private data, or expose configuration details.

The location of these systems makes things even murkier. A large share runs on residential or telecom networks, not enterprise clouds. Traffic from these IPs often appears to come from a normal person, not a bot or server. That opens the door to identity laundering, where attackers hide malicious activity behind infrastructure that appears trustworthy to other services.

There is also a broader, structural risk in how these systems are being rolled out. Even though exposed Ollama servers are scattered across thousands of networks worldwide, they tend to run the same few model families, often packaged in identical compression formats. That homogeneity simplifies deployment, but it comes at a cost.

A flaw in a popular model or a commonly used quantization format would not affect a single server or organization in isolation. Instead, it could cascade across a large share of publicly exposed systems at the same time, dramatically increasing the potential blast radius of a single vulnerability.

“Software monocultures have historically amplified the impact of vulnerabilities. When a single implementation error affects a large percentage of deployed systems, the blast radius expands accordingly,” the researchers explained.
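
The monoculture argument can be made concrete with a small sketch. Given an inventory of hosts, it reports which model family or quantization format is most common and what fraction of hosts a flaw in it would reach. The inventory below is entirely hypothetical and is not data from the report; the function name is likewise made up.

```python
from collections import Counter

def blast_radius(hosts, key):
    """Return (most common value, fraction of hosts sharing it) for `key`."""
    counts = Counter(host[key] for host in hosts)
    value, count = counts.most_common(1)[0]
    return value, count / len(hosts)

# Hypothetical inventory with placeholder names, for illustration only.
hosts = [
    {"family": "family-A", "quant": "format-X"},
    {"family": "family-A", "quant": "format-X"},
    {"family": "family-B", "quant": "format-X"},
    {"family": "family-C", "quant": "format-Y"},
]

print(blast_radius(hosts, "quant"))   # ('format-X', 0.75)
print(blast_radius(hosts, "family"))  # ('family-A', 0.5)
```

In this toy inventory, a single flaw in the dominant format would reach three of the four hosts, even though they run three different model families.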

