Online-to-offline shopping giant leaks sensitive user data

Locally, one of the largest online-to-offline shopping sites, leaked 48 million records exposing customer data. Threat actors could exploit the datasets for social engineering, BEC, and spreading malware.

Recently, I discovered a non-password-protected database that contained 47,814,836 error logs dating back to October 2019. Upon further research, I recorded many references to Locally as the “host domain.” I immediately sent a responsible disclosure notice of my findings, and public access was restricted within hours.

The Cybernews research team informed me that they had also discovered an exposed dataset with references to Locally. We analyzed the databases and concluded there were two separate data exposures of logging and error records on different IP addresses. These records gave a first-hand look at what data is collected, how it is collected, and where that information is stored.

According to its website, Locally is one of the world’s biggest online-to-offline shopping networks and has created the world’s largest decentralized shopping network. The data contained information about system repositories and link structures, and included paths to multiple .csv (comma-separated values) documents or spreadsheet files, and other potentially sensitive records. A limited sampling of the data revealed emails from partners, affiliates, and customers.

It is unclear how long the database was exposed for or who else may have gained access to these records. Only a forensic audit would identify the IP addresses and locations of those who may have accessed this dataset. At the time of publication, we have not received any reply or comment from Locally regarding our discovery or disclosure notice.

What the databases contained

  • Two separate exposures of logging data on different IP addresses.
  • Total records on the first exposure: 47,814,836.
  • Total records on the second exposure: "count": 2,520,715,307 and "deleted": 14,253,850.
  • Internal error logging records that include a large number of files. The environment was marked as “Production”.
  • Customer and partner data, such as email, phone, physical addresses, and other details.
  • What appeared to be order and invoice information. Customers and users could be potentially targeted for social engineering scams or phishing attacks or to validate credit card information.
  • Order confirmation page exposure with customer personally identifying information (PII) and order information.
  • The files also show where data is stored and how the logging network operates from the back end, including third-party cloud storage repositories.
  • The database was at risk of a ransomware attack that could have encrypted the data. The exposed environment could have potentially allowed cyber criminals to insert malicious code or identify vulnerabilities for a future cyberattack.

The database exposed information of over 700,000 “dealers.” Although stores, vendors, and dealers often offer publicly accessible information, there were internal account numbers and other information indicating their affiliate status.

One simple example of how cybercriminals could use these records as a working list for fraud would be to contact the businesses and say they were from or affiliated with Locally, provide their affiliate ID number, and state that they need the business to update their payment or banking information. Once they establish trust and credibility, criminals could use this information for fraudulent purposes.

Screenshot showing 708,298 “dealer” accounts.

Order confirmation page exposure

In a separate potential vulnerability, I noticed that the database contained links to what appeared to be an order summary or confirmation page with a direct URL that was not password protected. These online records also contained references to name, email, billing, and payment method. Although these links were configured to not be indexed by search engines such as Google, they were publicly accessible to anyone with an internet connection and provided a very detailed summary of orders. The page also shows the retailer disbursement amount, merchant fees, and Locally’s fee.

In a sampling of 10,000 records, the order URL appeared 1,492 times. Some of the older invoices returned a 404 error and seemed to have been deleted. I was able to match information such as names, addresses, or phone numbers that appeared on the order confirmation pages to real individuals with the same names and addresses. It is important to remember that this was not every customer and all of their personal data, but an error logging dataset that recorded when something went wrong or was not functioning properly. Error logs contain fragments of real data that are captured as part of the reporting process: they serve as a piece of the puzzle in creating a bigger picture of how the network operates, and could expose data that should not be publicly available.

Example of an order summary page showing billing, customer, and product information.

According to multiple IoT search engines, the exposed IP address had nearly 152 open ports.

Increased risk

Open ports can be dangerous because they allow attackers to probe legitimate services for security vulnerabilities. They become a very big security risk if the services operating through them are misconfigured, vulnerable, or unpatched. The IP address identifies the host on the network that data packets are sent to, and port numbers identify the specific application or service running on that host.

Once these security vulnerabilities have been identified, criminals could launch a range of malicious attacks including trying to inject malware into the system or using open ports to gain unauthorized access to sensitive data. I am not implying that Locally’s system was breached in this way, but only highlighting how bad actors could use this information. In my personal experience, this is an unusually large number of open ports for one location or a single IP address.
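For administrators who want to check which ports a host of their own is exposing, the idea can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `open_tcp_ports` and the timeout value are assumptions made for the example, and it should only be pointed at machines you own or are authorized to test.

```python
import socket


def open_tcp_ports(host, ports, timeout=0.5):
    """Return the ports from `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 when the connection succeeds and an
            # error code otherwise, instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found
```

Calling `open_tcp_ports("127.0.0.1", range(1, 1025))` would list the well-known ports that are listening on the local machine. Any port on that list that does not map to a service the organization knowingly runs is worth investigating.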

The danger of exposed log data

Logging records will always have a level of risk associated with them, because they collect so much information needed to understand how an organization's systems are functioning and detect any problems. All data is valuable, and even the smallest mistake can lead to potentially sensitive data being publicly exposed in plain unencrypted text.

Sensitive data can include any identification material, also known as PII. Improper handling of errors can have a broad effect, and even the smallest amount of data could be exploited or used to identify additional weaknesses in the host network. When detailed internal error messages are unencrypted or available in plain text, they can reveal stack traces, database dumps, error codes, and much more. Hackers and cybercriminals could theoretically see multiple vulnerabilities and entry points into a network or other content storage systems, such as a customer relationship management (CRM) system or content management system (CMS).

Example of an error log that potentially exposed sensitive information.

Social engineering is the most common form of fraud, and these records could have theoretically contained enough information for a criminal to launch a targeted attack. The screenshot above is an example of what appeared to be a credit card transaction that contained the card holder’s name, billing address, card type, expiration date, and last four digits of the card number. The record also shows what the individual was trying to purchase and the merchant charge ID number. It would theoretically be possible to launch a targeted phishing attack using internal privileged information: criminals could then obtain payment information from a position of trust. For example, by knowing the shopping history, order amount, and the company that sold the products, a cybercriminal could hypothetically verify information that only the merchant would know. The customer would have no reason to suspect that they were dealing with anyone but the legitimate seller. This is how social engineering fraud works.
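One common way to limit what ends up in error logs is to redact PII-like patterns before a record is written. The following is a minimal sketch using Python's standard `logging` module. The class name `RedactPII` and both regular expressions are assumptions made for illustration; they are deliberately coarse and are not taken from any system described in this article.

```python
import logging
import re

# Illustrative patterns only: real deployments need patterns tuned to their data.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")  # 13 to 16 digits, optional separators


class RedactPII(logging.Filter):
    """Replace email addresses and card-like numbers in log messages."""

    def filter(self, record):
        message = record.getMessage()  # message with its arguments applied
        message = EMAIL_RE.sub("[email redacted]", message)
        message = CARD_RE.sub("[card redacted]", message)
        record.msg = message
        record.args = None  # arguments are already merged into the message
        return True  # keep the record, now redacted
```

Attaching the filter to a handler with `handler.addFilter(RedactPII())` applies it to every record that handler writes. Pattern-based redaction can miss data or over-match (for example, long order numbers), so it complements rather than replaces a decision about which fields should be logged at all.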

What are error logs?

Web applications must identify and log when something goes wrong. Logged errors can range from memory issues, search and system failures, or an unavailable database to hundreds of other problems. These errors provide a snapshot and diagnostic information to an organization's administrators. The danger is that this information is also highly valuable to hackers and cybercriminals. Even “access or permission denied” errors can reveal whether privileged files exist and expose the site’s file and directory structure.
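A widely used pattern for keeping diagnostic detail away from outsiders is to log the full error internally and return only a generic message with a reference ID. The sketch below illustrates that idea with Python's standard library; the function name `safe_error_response`, the logger name, and the response fields are all hypothetical.

```python
import logging
import uuid

logger = logging.getLogger("app.errors")  # hypothetical internal logger name


def safe_error_response(exc):
    """Log full details internally; return only a generic message and an ID."""
    reference = uuid.uuid4().hex
    # The stack trace and exception text go to the internal log only.
    logger.error("reference=%s unhandled error", reference, exc_info=exc)
    return {
        "error": "Something went wrong. Please try again later.",
        "reference": reference,
    }
```

Support staff can use the reference ID to find the matching internal record, while the response itself says nothing about file paths, queries, or permissions.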

Virtually all web servers, applications, and environments are potentially susceptible to error logging and handling problems. Only through rigorous testing can an organization understand how errors can affect its systems. The massive number of records produced during the error logging process makes it extremely hard to review them manually, so these logs are often ignored or considered not valuable if everything appears to be functional.

I would recommend that organizations adopt a specific policy on how to handle the documentation of errors, including their types. Error logs should have a lifespan that expires, or be stored offline after they are no longer in use, to prevent unwanted exposures.
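As one possible way to give error logs an expiring lifespan, Python's standard `TimedRotatingFileHandler` can rotate a log file daily and delete the oldest files automatically. This is a minimal sketch; the function name `build_error_logger`, the logger name, and the 30-day default are assumptions chosen for the example, not a recommended retention period.

```python
import logging
from logging.handlers import TimedRotatingFileHandler


def build_error_logger(path, keep_days=30):
    """Create a logger whose file rotates at midnight and expires old files."""
    handler = TimedRotatingFileHandler(
        path,
        when="midnight",        # start a new file each day
        backupCount=keep_days,  # rotated files beyond this count are deleted
        encoding="utf-8",
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("app.error_log")  # hypothetical logger name
    logger.setLevel(logging.ERROR)
    logger.addHandler(handler)
    return logger
```

The function is meant to be called once at startup, since each call attaches another handler. It only covers expiry on the local disk: moving older logs to offline storage, as suggested above, would be a separate step.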

Small business vs big business

The concept behind Locally is actually very respectable to me personally. By helping small businesses get back some of what they have lost to giant online retailers, they take a Robin Hood approach to the David vs Goliath reality of sales in this market. During the 1980s and ’90s, big offline retailers would come into a community in the US and wipe out local businesses that couldn’t compete with their buying power, low prices, and employee salaries. Then online services like Google, Amazon, and eBay dominated the market and created a near-monopoly.

According to an article posted by RetailWire in 2012: “Frustrated by losing sales to the Amazons of the world, a handful of outdoor retailers in 2014 launched to open up online inventory visibility for local retailers. A unique aspect is how it helps brands sell through local dealers”.

The internet was the final frontier where the little guy could compete against the brick-and-mortar chains and their big-tech successors. But between 2011 and 2016, Google changed its algorithm, with the Panda and Penguin search updates, to give more value to brands over non-brand sites that used search engine optimization (SEO) to gain ranking and traffic. In the process, Google effectively erased much of the gains of traditional SEO and punished or devalued sites with questionable link building or content strategies.

After these changes, the only way that small businesses could compete was to buy ads. The algorithm change reportedly affected the rankings of almost 12 percent of all search results and shifted an estimated $1 billion in annual revenue to larger web properties. This is one of many reasons why online businesses can feel the game is rigged and that Google’s algorithm changes have stacked the deck against small businesses. But it also created a perfect scenario for a business model like Locally to help connect customers and local businesses.

All data is valuable

This discovery is a wakeup call for any company or organization that collects and stores data online. As customers, we expect a reasonable level of data privacy protection but it is never guaranteed.

I am not implying any wrongdoing by Locally, Local Gear, or their affiliates, nor am I claiming that any customer or user data was ever at risk. I am only highlighting my discovery to raise awareness for cybersecurity and data protection purposes. As a security researcher, I never circumvent passwords or extract the publicly exposed data I find. The primary goal is to notify an organization of our findings and help protect this information before it is accessed, stolen, or encrypted by ransomware. With 48 million records, we can only assume there could have been additional information that was potentially sensitive.

Philip Riley
1 year ago
I retract my earlier comment. Actually this was a well written article and your organization should be commended for your work. If this article helps make companies aware of vulnerabilities in their data then you have done a great service to all customers.
Philip Riley
1 year ago
VERY MISLEADING AND DAMAGING HEADLINE...“Online-to-offline shopping giant leaks sensitive user data” and then you say “nor am I claiming that any customer or user data was ever at risk.”