Largest ever data leak exposes over 4 billion user records


In what’s likely the biggest data leak to ever hit China, billions of documents with financial data, WeChat and Alipay details, as well as other sensitive personal data, were exposed to the public. Worryingly, there’s little that impacted users can do to protect themselves.

The supermassive data breach likely exposed hundreds of millions of users, primarily from China, the Cybernews research team’s latest findings reveal. A humongous 631-gigabyte database was left without a password, exposing a mind-boggling 4 billion records.

Bob Dyachenko, cybersecurity researcher and owner at SecurityDiscovery.com, together with the Cybernews team, discovered billions upon billions of exposed records on an open instance.

Largest Chinese data leak ever
Image by Cybernews.

The database consisted of numerous collections, containing from half a million to over 800 million records from various sources. The Cybernews research team believes the dataset was meticulously gathered and maintained for building comprehensive behavioral, economic, and social profiles of nearly any Chinese citizen.

“The sheer volume and diversity of data types in this leak suggests that this was likely a centralized aggregation point, potentially maintained for surveillance, profiling, or data enrichment purposes,” the team observed.

There’s no shortage of ways threat actors or nation states could exploit the data. With a data set of that magnitude, everything from large-scale phishing, blackmail, and fraud to state-sponsored intelligence gathering and disinformation campaigns is on the table.

What data was included in the largest Chinese data breach?

Despite the team’s best efforts, Cybernews only got a peek at the database because the exposed instance was quickly taken down. This also prevented the team from revealing the identity of the database’s owners. However, collecting and maintaining this sort of database requires time and effort, and such work is often linked to threat actors, governments, or very motivated researchers.

The team managed to see sixteen data collections, likely named after the type of data they included.

The largest collection, with over 805 million records, was named “wechatid_db,” which most likely points to the data coming from the Tencent-owned super-app WeChat.

Largest leak of Chinese data
Image by Cybernews.

The second largest collection, “address_db,” had over 780 million records containing residential data with geographic identifiers. The third largest collection, simply named “bank,” had over 630 million records of financial data, including payment card numbers, dates of birth, names, and phone numbers.

Possessing only these three collections would enable skilled attackers to correlate different data points to find out where certain users live and what their spending habits, debts, and savings are.

Another major collection in the dataset had a Mandarin name that roughly translates to “three-factor checks.” With over 610 million records, the collection most likely contained IDs, phone numbers, and usernames.

Meanwhile, a collection named “wechatinfo” contained nearly 577 million records. Since WeChat user IDs were stored in a separate collection, wechatinfo most likely had metadata, communication logs, or even user conversations.

Another 300 million records were stored in a collection named “zfbkt_db,” containing Alipay card and token information. Attackers could use it to attempt unauthorized payments, take over accounts, and steal users’ identities. Coupled with a smaller collection of 20 million records of Alipay-related financial data, this could spell disaster for users whose data was leaked.

More than 353 million records were unevenly distributed among nine more collections covering a very wide array of topics. Whoever owns the dataset has information on gambling, vehicle registration, employment, pension funds, and insurance. Researchers believe that one collection, named “tw_db,” contains Taiwan-related details.

Chinese data leak problem

Despite our best efforts, the team could not attribute the data to any identifiable organization. No attribution or headers that indicate ownership were present, and the infrastructure was removed from public access shortly after discovery.

“Individuals who may be affected by this leak have no direct recourse due to the anonymity of the owner and lack of notification channels,” the team noted.

China-based data leaks are hardly new. We have previously written about a data leak that exposed 1.5 billion records from Weibo, DiDi, the Shanghai Communist Party, and others, and about a mysterious actor spilling over 1.2 billion records on Chinese users. More recently, attackers leaked 62 million iPhone users’ records online.

However, we could not identify any data leak that surpasses four billion records. That would make this data leak the largest single-source leak of Chinese personal data ever identified.


  • Leak discovered: May 19th, 2025
  • Leak closed: May 20th, 2025