Deleted GitHub data is forever accessible to anyone, researchers claim


Microsoft-owned GitHub’s design makes repository data forever available, potentially enabling malicious actors to access sensitive information such as API keys and secrets even after users think they’ve deleted it.

Researchers at Truffle Security dubbed the GitHub vulnerability a Cross Fork Object Reference (CFOR). The flaw manifests when “one repository fork can access sensitive data from another fork, including data from private and deleted forks.”

“You can access data from deleted forks, deleted repositories, and even private repositories on GitHub. And it is available forever. This is known by GitHub and intentionally designed that way,” researchers claim.

Forks, or copies of other users’ work, are kept in GitHub’s network. When changes, such as deletions or visibility changes, are made, the data remains accessible through other parts of the network.

In other words, GitHub keeps copies of data without users realizing it. With over 100 million users, GitHub is an essential collaboration tool for software developers, meaning that a number of organizations may have their sensitive data unintentionally exposed.

According to the researchers, after a fork is deleted, the committed data remains accessible via the original repository. Additionally, after users delete the entire forked repository, the data can still be accessed via any existing fork. Finally, after users make a private repository public, with some internal forks kept private, sensitive data may be accessible via the public repository.
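
The first scenario can be illustrated with a toy model. The sketch below is not GitHub's implementation; it simply assumes, as the researchers describe, that every repository in a fork network draws on one shared pool of commits. The class `ForkNetwork`, its methods, and the repository names are hypothetical and exist only for illustration.

```python
import hashlib


class ForkNetwork:
    """Toy model (an assumption, not GitHub's code): all repositories
    in a fork network share a single object store."""

    def __init__(self):
        self.objects = {}  # commit hash -> content, shared by every repo
        self.repos = {}    # repo name -> set of commit hashes it references

    def create_repo(self, name, fork_of=None):
        # A fork starts with the same references as its parent.
        self.repos[name] = set(self.repos[fork_of]) if fork_of else set()

    def commit(self, repo, content):
        sha = hashlib.sha1(content.encode()).hexdigest()
        self.objects[sha] = content  # stored once, network-wide
        self.repos[repo].add(sha)
        return sha

    def delete_repo(self, name):
        # Only the repo's references disappear; shared objects stay.
        del self.repos[name]

    def fetch(self, via_repo, sha):
        # Any surviving repo can serve any object if the hash is known.
        if via_repo not in self.repos:
            raise KeyError(via_repo)
        return self.objects.get(sha)


network = ForkNetwork()
network.create_repo("upstream")
network.create_repo("fork", fork_of="upstream")
leaked = network.commit("fork", "API_KEY=example-not-a-real-key")
network.delete_repo("fork")

# The fork is gone, yet its commit is still reachable via the original.
recovered = network.fetch("upstream", leaked)
print(recovered)
```

In this model, deleting the fork removes only its name from the network. The commit itself stays in the shared store, so anyone who knows its hash can still retrieve it through the original repository.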

“The implication here is that any code committed to a public repository may be accessible forever as long as there is at least one fork of that repository,” the report’s authors claim.

Commits, the snapshots of an in-progress project, are identified on GitHub by hashes, which are meant to keep users from exposing their projects. However, researchers claim that the hashes can be brute-forced or retrieved via GitHub’s public events API.
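
A rough sense of why brute-forcing may be feasible comes from the size of the search space. The sketch below assumes that a commit can be looked up by an abbreviated hash prefix, and that four hexadecimal characters is the shortest accepted prefix, as in Git's own abbreviation rules. Whether and how GitHub honors such short prefixes is an assumption here, not something this article confirms. The function names are hypothetical, and the code makes no network requests.

```python
from itertools import product

HEX_DIGITS = "0123456789abcdef"


def candidate_prefixes(length=4):
    """Yield every possible hex prefix of the given length."""
    for combo in product(HEX_DIGITS, repeat=length):
        yield "".join(combo)


def search_space(length):
    """Number of distinct hex strings of the given length."""
    return len(HEX_DIGITS) ** length


prefixes = list(candidate_prefixes(4))
print(len(prefixes))     # size of the 4-character prefix space
print(search_space(40))  # size of the full 40-character SHA-1 space
```

Under these assumptions, a four-character prefix leaves only 65,536 candidates to try, whereas guessing a full 40-character SHA-1 hash would be impractical.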

Truffle Security’s researchers note that GitHub is not secretive about keeping the data accessible, outlining the whole process in its documentation. However, many GitHub users are likely unaware that separating private and public repositories doesn’t guarantee privacy.

“The average user views the separation of private and public repositories as a security boundary and understandably believes that any data located in a private repository cannot be accessed by public users. Unfortunately, as we documented above, that is not always true,” the report’s authors claim.

CFOR may pose risks to organizations because confidential information that developers thought they had secured remains accessible. That may include API keys, passwords, and proprietary code, leaving organizations unwittingly open to data breaches.

Earlier this week, researchers from Check Point unveiled a “never seen before” sophisticated malicious operation on GitHub. The phishing ring, dubbed Stargazers Ghost Network, spreads malware and targets gamers, social media enthusiasts, and crypto holders via malicious repositories.