
Meta has released an open-source tool powered by artificial intelligence (AI) that finds and labels sensitive documents, which could help prevent data leaks.
Large tech companies like Meta often hold troves of sensitive data that hackers would love to get their hands on.
Rather than letting that data fall through the cracks, Meta built a tool that lets the company locate sensitive data and secure it before attackers can reach it.
The AI tool, called “Automated Sensitive Document Classification,” was initially designed for Meta’s internal use, but has since been made publicly available, according to Help Net Security.
With the help of AI and a toolkit of custom classification rules, the tool can sift through documents to find sensitive information and automatically apply labels so that it is easy to locate later.
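As a rough illustration of how custom classification rules might be handed to an AI model, the sketch below turns a small rule list into a classification prompt. The `Rule` type, the label names, and `build_prompt` are hypothetical and are not taken from Meta's tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    label: str        # hypothetical label applied when the rule matches
    description: str  # plain-language definition the model is asked to apply


# Illustrative rules only; a real deployment would define its own.
RULES = [
    Rule("CONFIDENTIAL", "Contains credentials, API keys, or unreleased financial results."),
    Rule("PERSONAL_DATA", "Contains names paired with contact details or government IDs."),
]


def build_prompt(rules, document_text, max_chars=4000):
    """Assemble a prompt asking a model to choose exactly one label."""
    lines = ["Classify the document using exactly one of these labels:"]
    for rule in rules:
        lines.append(f"- {rule.label}: {rule.description}")
    lines.append("- NONE: No rule applies.")
    lines.append("")
    lines.append("Document:")
    lines.append(document_text[:max_chars])  # truncate to keep the prompt bounded
    return "\n".join(lines)
```

Keeping the rules as data rather than hard-coding them in the prompt means new categories can be added without touching the classification code.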

Labels are useful because, once sensitive information has been identified, those working with the documents can protect it from unauthorized access.
This sensitive information can also be excluded from AI systems, which suggests that AI models would not be trained on it.
The tool uses Apache Tika to extract content from Google Docs, Sheets, and Slides. It then uses Llama, Meta’s own AI model, to identify sensitive content, and the Google Drive API to label the affected files accordingly, Help Net Security writes.
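The three steps described above (extract, classify, label) can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not Meta's implementation: `run_pipeline`, `Decision`, and the `fake_*` functions are hypothetical stand-ins, and no real Tika, Llama, or Google Drive calls are made.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Decision:
    file_id: str
    label: str


def run_pipeline(
    file_ids: Iterable[str],
    extract_text: Callable[[str], str],
    classify: Callable[[str], str],
    apply_label: Callable[[str, str], None],
) -> List[Decision]:
    """Extract, classify, then label each file; return every decision made."""
    decisions = []
    for file_id in file_ids:
        text = extract_text(file_id)     # stand-in for the text extraction step
        label = classify(text)           # stand-in for the AI model call
        if label != "NONE":
            apply_label(file_id, label)  # stand-in for the labelling API call
        decisions.append(Decision(file_id, label))
    return decisions


# Toy stand-ins so the sketch runs without any external service.
FAKE_DRIVE = {
    "doc-1": "Q3 revenue forecast, password: hunter2",
    "doc-2": "Lunch menu",
}
applied = {}


def fake_extract(file_id):
    return FAKE_DRIVE[file_id]


def fake_classify(text):
    # Keyword check used only as a placeholder for a model's judgement.
    return "CONFIDENTIAL" if "password" in text.lower() else "NONE"


def fake_apply(file_id, label):
    applied[file_id] = label


decisions = run_pipeline(FAKE_DRIVE, fake_extract, fake_classify, fake_apply)
```

Passing the three steps in as functions keeps the loop independent of any particular extractor, model, or storage service, and returning every decision, including the unlabelled ones, leaves a record that can be reviewed later.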
Robin Franklin, a Security Engineer at Meta, told Help Net Security that data loss prevention is a massive issue in security and privacy.

“To meet our scalability and accuracy goals, we decided to build an LLM-based solution, which also ensured seamless auditability in our deployment,” Franklin said.
What’s unique about this tool is that it doesn’t just automatically label content that’s deemed sensitive – it also helps companies like Meta keep track of where this information is, so it can be secured.
The release of the tool comes just after hackers claimed to have obtained a database of 1.2 billion user records, supposedly scraped from Meta-owned Facebook.
The hackers supposedly achieved this by abusing one of the social media platform’s application programming interfaces (APIs).