Researchers trace massive data leak to US data broker: why you should care


Sensitive data exposing a staggering number of individuals continues to leak online, most likely originating from datasets belonging to People Data Labs.

On June 25th, the Cybernews research team discovered a dataset containing over 170 million sensitive data records that were exposed to anyone on the internet. The leaked data included:

  • Full names
  • Phone numbers
  • Emails
  • Location details
  • Skills
  • Professional summaries
  • Education history
  • Employment history

The data leak’s trail points to People Data Labs (PDL), a San Francisco-based data broker, since the leaked dataset was labeled "PDL."

The company's website claims to have profiles of 1.5 billion individuals that various businesses can use for marketing, sales, recruiting, and data enrichment purposes. PDL boasts “unparalleled coverage across over 150 data points.”

The unprotected Elasticsearch server responsible for the leak was not directly connected to the company, suggesting that an unidentified third party may have mishandled the company's data.

Although the party ultimately responsible for the data leak remains unknown, it’s crucial to emphasize that leaving the Elasticsearch server without a password is highly dangerous. Threat actors can discover this kind of exposed data in mere seconds, putting individuals at risk of identity theft and fraud and increasing their chances of being targeted by phishing attacks.
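For administrators wondering whether their own server is exposed in this way, the check can be sketched in a few lines of Python. This is a minimal illustration and not the method Cybernews used: the function names and the `localhost:9200` address are assumptions made for the example, and it should only be pointed at a host you are authorized to test. It sends a single request without credentials and reports whether the server answered anyway.

```python
import urllib.error
import urllib.request


def classify_status(status_code: int) -> str:
    """Map the HTTP status of an unauthenticated request to a rough verdict."""
    if status_code == 200:
        return "open"        # the server answered without any credentials
    if status_code in (401, 403):
        return "protected"   # credentials or permissions were required
    return "unknown"         # anything else needs a manual look


def check_endpoint(base_url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated GET request and classify the outcome."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is on the error.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"


if __name__ == "__main__":
    # Hypothetical address: replace with a server you administer.
    print(check_endpoint("http://localhost:9200/"))
```

A result of "open" means anyone who can reach the address can read from the server without a password, which is the situation described above.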

“The existence of data brokers is already a controversial issue, as they often have insufficient checks and controls to ensure that data doesn’t get sold to the wrong parties. Leaking large segments of their datasets makes it easier and more convenient for threat actors to abuse the data for large-scale attacks,” said the Cybernews research team.

PDL data leak
Source: Cybernews

This is not the first time PDL has been linked to a massive data leak: in 2019, more than a billion records were exposed online. That leak stemmed from the same issue: an exposed and unprotected Elasticsearch server. At the time, PDL denied responsibility for leaking data.

The newly leaked dataset was marked as “Version 26.2”, suggesting it could be related to the previous data leak. Whether they come directly from PDL or not, such data leaks cause substantial reputational damage to data brokers, undermining the trust that clients and partners place in them.

“If this is a new leak, and not processed and enriched data from the 2019 leak by a third party, such an incident would show a high level of ignorance from the company regarding personal data security,” added our researchers.

Cybernews has contacted PDL for a comment and is awaiting a reply.

If you believe you may have been impacted by the data leak, there are several steps you can take to mitigate potential harm.