Duolingo, the popular language learning app, has had some of its users' data exposed online. The scraped data of 2.6 million people, which was on sale in January, is now available on the cybercrime marketplace BreachForums. An open API allows even more data to be scraped.
The shared sample of data contains email addresses, usernames, names, and phone numbers (if provided by the user), information about social networks, and other general details such as language studies, experience, progress, and achievements.
The structure of the data in the provided sample looks like this:
The full list, containing over 2.6 million unique entries, has been on sale on a hacking forum since January with a starting price of $1,500, open to negotiation. The company has since acknowledged the issue, telling The Record that the data was scraped from public profile information and that no data breach or hack had occurred.
Now, the data can be downloaded from BreachForums for 8 credits of the forum's in-house currency, worth $2.13.
“Today I have uploaded the Duolingo Scrape for you to download, thanks for reading and enjoy!” the forum user writes.
Vx-underground researchers noticed that a threat actor had identified a bug in the Duolingo API: sending a valid email address to the API returns generic account information on the user. They warn that the leaked data will be used for doxxing, a type of cyberattack aimed at discovering a person's real identity and publishing their private information online. It may also lead to targeted phishing attacks.
The Cybernews research team discovered that user data on Duolingo is still available for scraping, extending beyond the publicly available list. This means that other data, such as location, public avatar, or photo, could also be obtained.
The vulnerability lies in Duolingo's exposed application programming interface (API), which hackers can easily exploit by submitting a username or email to gather public profile details.
Twitter (X) users have been pointing out the availability of this API since at least March.
“In the past I was able to answer difficult OSINT related questions thanks to this API,” Ivano Somaini shared.
Private information such as email addresses should not be obtainable from public sources. In one of the most notorious data scraping cases, Facebook was fined $276m over a data leak in the EU.
Duolingo investigating if further action is needed
A spokesperson for Duolingo told Cybernews that they're aware of this report.
"These records were obtained by data scraping public profile information. We have no indication that our systems were compromised. We take data privacy and security seriously and are continuing to investigate this matter to determine if any further action is needed to protect our learners," he said.
Duolingo also said that the email addresses in this incident were obtained from other websites, not from Duolingo.
“The API used in this incident is intentionally made public to help our learners find friends who are also using Duolingo. Duolingo learners have the option to make their profiles private if they would prefer not to have their profiles publicly searchable,” the spokesperson commented.
Update: September 1, 2023 – Duolingo provided additional comment:
“Our investigation confirmed that this was not a breach or a hack; it was a scrape of data from public Duolingo profiles. No Duolingo systems or private user data were compromised. Regardless, as a precautionary measure we have taken some steps to limit this from happening again. We have put in place rate limits on the specific API endpoint to make it more difficult for attackers to abuse. We take data privacy and security seriously and will continue to constantly evaluate our security measures to ensure learner safety.”
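Duolingo has not said how its rate limits work. As a rough illustration of the general technique only, a per-client sliding-window limiter could be sketched in Python as follows. The class name, the thresholds, and the choice to key requests on a client identifier are assumptions made for this example, not details from Duolingo.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Hypothetical limiter: at most `max_requests` per client per window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so behaviour can be checked without sleeping
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; a server would typically answer HTTP 429
        hits.append(now)
        return True
```

A limit like this does not make public profile data private. It only slows bulk lookups, which fits the statement's wording that the change makes the endpoint "more difficult for attackers to abuse."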
The language learning platform Duolingo has over 500 million registered users and over 60 million monthly active users. The platform was founded in 2011 by Luis von Ahn and Severin Hacker.