“Not ideal” from a privacy standpoint: Clubhouse API lets “anyone” scrape public user data


According to Clubhouse, anyone can access its users' profile information via the invite-only app, while its API appears to allow unlimited scraping of public profile data.

On Saturday, an SQL database containing data of 1.3 million Clubhouse users was posted on a hacker forum for anyone to access and do with as they please. The data included names, user IDs, social media profile names, and other details about Clubhouse users.

While not highly sensitive, the data could still be used by malicious actors to stage social engineering attacks, which may be why the threat actor posted the database on the hacker forum.


The response from Clubhouse in the wake of the incident was swift, if somewhat perplexing. Clubhouse stated that the company “has not been breached or hacked,” and that the data of 1.3 million people posted on the hacker forum “is all public profile information from [the Clubhouse] app, which anyone can access via the app or [their] API.”

“Not ideal” from a privacy standpoint

While Clubhouse’s statement makes it clear that the platform did not suffer a breach, it also raises the question of consumer privacy.

Dave Hatter, a cybersecurity expert at the Cincinnati-based cybersecurity company IntrustIT, adds that bulk access to user information via an app's API is not something social media companies should generally be happy about.

“It doesn’t look like anything super sensitive was leaked, but at the end of the day, I don’t think it’s good for Clubhouse users to have that information out there. […] But if you can scrape every Clubhouse user’s account, you could potentially use social engineering techniques against Clubhouse users because of the information we’ve got there,” Hatter explained to CyberNews.

He argues that the default approach should be to safeguard privacy. If a bad actor had to collect information by going from one account to the next, rather than in bulk, that alone might deter most attempts to steal users’ data.

“From a privacy standpoint, it’s certainly not ideal. I would assume that they would want to take the same kind of stance towards scraping that other large market players take, which is ‘we’re not going to allow that’,” Hatter explained.


William Malik, information security expert and the vice president of infrastructure systems at Trend Micro, also disapproved of how Clubhouse reacted to the news about the data being posted on a hacker forum. He told CyberNews that the reaction is akin to saying 'the crooks didn't breach the safe, we accidentally left it open.'

CyberNews senior information security researcher Mantas Sasnauskas adds that the user data dump reveals a potential privacy issue within the social media platform itself: “The way the Clubhouse app is built lets anyone with a token, or via an API, to query the entire body of public Clubhouse user profile information, and it seems that the token does not expire.”
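To illustrate what an expiring token looks like in practice, here is a minimal sketch in Python. It is not based on Clubhouse's actual implementation, which is not public. The names SECRET_KEY, TOKEN_TTL_SECONDS, issue_token and verify_token are hypothetical, and a production service would more likely rely on an established token format and library than on hand-rolled signing.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical values chosen for illustration only.
SECRET_KEY = b"replace-with-a-real-secret"
TOKEN_TTL_SECONDS = 3600  # tokens stop working one hour after issue


def _sign(payload: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the payload."""
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def issue_token(user_id: str, now: Optional[float] = None) -> str:
    """Create a signed token that embeds the time it was issued."""
    issued_at = int(time.time() if now is None else now)
    payload = f"{user_id}:{issued_at}".encode()
    return base64.urlsafe_b64encode(payload).decode() + "." + _sign(payload)


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user ID if the token is authentic and not expired, else None."""
    try:
        encoded, signature = token.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode())
        user_id, issued_at_text = payload.decode().rsplit(":", 1)
        issued_at = int(issued_at_text)
    except (ValueError, UnicodeError):
        return None  # malformed token
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        return None  # signature mismatch: token was forged or altered
    current = time.time() if now is None else now
    if current - issued_at > TOKEN_TTL_SECONDS:
        return None  # token is too old
    return user_id
```

With a scheme along these lines, a token captured once cannot be reused indefinitely: the holder has to re-authenticate after the lifetime runs out, which gives the service a regular opportunity to spot and cut off abusive clients.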

Sasnauskas argues that even though the Clubhouse privacy policy does not allow unauthorized data mining and data scraping, the platform should go beyond simply stating it in the rules. “This should not only be reflected in the ToS, but also in the technical implementation of the app, making it harder for anyone to scrape user data. Having no anti-scraping measures in place can be seen as a privacy issue.”

We reached out to Clubhouse regarding their API policy, but received no response at the time of writing this report.

Unaddressed gaps in the system?

Clubhouse has, on occasion, missed or ignored reports from users about gaps in its system. Roman Mittermayr, managing director at TwentyPeople, told CyberNews he reached out to Clubhouse back in February.

Mittermayr noticed that it was possible to leave a room in the app but remain connected to it on a second connection. This meant that even though the account no longer appeared to be in the room, it could still listen to and record everything.

“This token can also be used to extract and search through public data. They offer a search endpoint that can return millions of records and then lets me slowly page through all the records. The search also allows using wildcards, so that means I can just search for a non-specific term and get all accounts back,” he explained to CyberNews.

According to Mittermayr, Clubhouse did not respond to issue reports he made on February 9 and 13.
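The search behaviour Mittermayr describes, wildcard queries combined with unlimited paging, is the kind of thing a server can restrict before running a query. The following Python sketch shows one possible guard. It is an assumption-laden illustration, not a description of Clubhouse's API: the function name validate_search and every limit in it are hypothetical.

```python
from typing import Tuple

# Hypothetical limits chosen for illustration only.
MIN_QUERY_LENGTH = 3          # meaningful characters required in a query
MAX_PAGE_SIZE = 20            # largest page a client may request
MAX_RESULTS_PER_QUERY = 100   # how deep a client may page into one query
WILDCARD_CHARS = "*%?_"       # characters treated as wildcards


def validate_search(query: str, page: int, page_size: int) -> Tuple[bool, str]:
    """Decide whether a search request is allowed.

    Pages are numbered from 1. Returns (allowed, reason).
    """
    # Ignore wildcards and whitespace when measuring how specific the query is.
    meaningful = "".join(
        ch for ch in query if ch not in WILDCARD_CHARS and not ch.isspace()
    )
    if len(meaningful) < MIN_QUERY_LENGTH:
        return False, "query is too broad"
    if page < 1 or page_size < 1:
        return False, "invalid paging parameters"
    if page_size > MAX_PAGE_SIZE:
        return False, "page size exceeds the limit"
    if page * page_size > MAX_RESULTS_PER_QUERY:
        return False, "paging depth exceeds the limit"
    return True, "ok"
```

Under these assumed limits, a request such as validate_search("*", 1, 20) is rejected as too broad, while a specific search is served only for its first 100 results. That does not make scraping impossible, but it removes the option of retrieving every account through a single non-specific term.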

Cybercriminals scrape public data to find new targets


While the data associated with the Clubhouse user base was not acquired as a result of a breach, allowing ‘anyone with an API’ to download public Clubhouse profile information on a mass scale can backfire.

For example, data scraping is often used by spammers and phishers to find new victims: they aggregate public contact details and use them for spam lists, robocalls, or social engineering attacks. This is why web applications typically use scraping mitigation tools to protect against hostile data collection by bots and threat actors.
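One of the most common scraping mitigations is per-client rate limiting. The sketch below shows a simple token bucket limiter in Python. It is a generic illustration rather than any particular vendor's tool, and the class name TokenBucketLimiter and its parameters are hypothetical.

```python
import time
from typing import Dict, Optional, Tuple


class TokenBucketLimiter:
    """Allow short bursts per client while capping the sustained request rate."""

    def __init__(self, capacity: int, refill_per_second: float) -> None:
        self.capacity = capacity                    # maximum burst size
        self.refill_per_second = refill_per_second  # sustained requests per second
        # client_id -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        """Return True if the client may make a request right now."""
        current = time.monotonic() if now is None else now
        # New clients start with a full bucket.
        tokens, last = self._buckets.get(client_id, (float(self.capacity), current))
        # Refill in proportion to elapsed time, never beyond capacity.
        tokens = min(
            float(self.capacity),
            tokens + (current - last) * self.refill_per_second,
        )
        if tokens >= 1:
            self._buckets[client_id] = (tokens - 1, current)
            return True
        self._buckets[client_id] = (tokens, current)
        return False
```

A limiter like this would typically be keyed on an account, token, or IP address. An ordinary user browsing profiles rarely hits the cap, while a bot trying to page through millions of records is slowed to the refill rate, which raises the cost of mass collection considerably.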

The fact that Clubhouse does not seem to implement any in-app anti-scraping measures could point either to an oversight or to a deliberate decision on the part of a company whose peculiar data practices have been slammed by privacy advocates in the past. Such a permissive attitude towards user data could have made it much easier for cybercriminals to get their hands on massive amounts of user-related information, as demonstrated by the recent posting on the hacker forum.

Thankfully, this time the information released by the threat actor was not deeply sensitive. We can only hope that there won’t be a next time.

