Google can use public data for AI training, policy update says


Google has adjusted its privacy policy and can now use public data to help train and create its artificial intelligence products, including Bard.

As of July 1st, the tech giant's newly updated policy reads: "Google uses information to improve our services and to develop new products, features and technologies that benefit our users and the public. For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities."

Previously, the policy stated that public data could be used only to train Google’s “language models” and mentioned Google Translate once.

Google has updated its privacy policy. Courtesy of Google.

The user experience shouldn’t change, but the adjustment signals that Google is leaning more heavily into AI. It also suggests that the general public’s search behavior is an important factor in further product development.

Google recently showed off its latest AI innovations, including a more powerful version of Bard, at its annual keynote address in California. Google’s chief executive Sundar Pichai insisted that the products were being developed responsibly.

However, critics have raised concerns about companies using information posted online to train their large language models for generative AI.

Just recently, OpenAI, the creator of the wildly popular generative AI chatbot ChatGPT, was hit with a proposed class action lawsuit alleging it stole people’s data.

The nearly 160-page complaint alleges that personal data, including “essentially every piece of data exchanged on the internet it could take,” was seized by the company without notice, consent or “just compensation.” Moreover, the suit claims that this data scraping occurred at an “unprecedented scale.”

Some social media sites have also been taking action either to restrict the use of their data in the AI boom or to profit from it. Reddit has started charging for access to its Application Programming Interface (API), and Twitter’s owner Elon Musk threatened to sue Microsoft for using Twitter data to train its AI.

Twitter also restricted how many tweets a user can see per day and blamed the move on “extreme levels of data scraping and system manipulation,” although other factors might also be involved.

In the case of Google’s policy update, copyright lawsuits can also be expected. Still, data scraping isn’t illegal if only data that’s publicly available on the internet is vacuumed up, and Google surely knows that.
