Google’s going to scrape the entire public Internet to train its AI tools and there’s nothing we can do about it

The gatekeepers of online content doing what everyone expected.


What you need to know

Google’s latest privacy policy update isn’t necessarily surprising, but it does set off some alarm bells, particularly for those who already have their doubts about the AI revolution.

As highlighted by Gizmodo, the latest statement on the search giant’s privacy policy contains a key update relating to AI:

“For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities.”

The most recent policy prior to this only made mention of “language models” and, specifically, Google Translate. The latest update makes it clear that Google is going to feed anything public on the Internet into its AI tools like Bard.

Is this surprising? Not at all. Google is the gatekeeper to the Internet, especially for publishers like us and our parent company. Playing the game of getting your content to rank well in Google is exhausting, but also critical. And now all of that content is going to be fed into Google AI. All of it.

It’s certainly going to stoke the flames of debate. Recently we’ve seen issues on Reddit with regard to access to its API, the losers of which were basically the users of Reddit. Twitter’s owner, Elon Musk, has also been vocal about scraping, claiming the recent disaster on the platform with rate limits is in response to that (even if it might not be 100% true).

This move is only going to further stoke the debate, and the backlash over the training of AI tools. OpenAI has already had its fair share of backlash over the data used to train the GPT model, the same that powers Microsoft’s Bing Chat. Microsoft also has a search engine, but its reach pales in comparison to that of Google Search.


The legality will also come into question. We’re in uncharted, murky waters with all this stuff. The EU already has issues with Google Bard, and quite how this will align with the territory’s GDPR rules will be interesting to find out. Until it’s technically illegal, maybe Google is just going to do what Google does. Which is whatever it wants.

AI models need to be trained somehow. But Google’s latest policy doesn’t seem to indicate the company is willing to compensate any of the creators of that content. Everyone needs their stuff to be surfaced in Google, and it does feel kind of like Google is abusing that to its own ends.

Buckle up, it’s going to be a bumpy ride.

Richard Devine is a Managing Editor at Windows Central with over a decade of experience. A former Project Manager and long-term tech addict, he joined Mobile Nations in 2011 and has been found on Android Central and iMore as well as Windows Central. Currently, you’ll find him steering the site’s coverage of all manner of PC hardware and reviews. Find him on Mastodon at mstdn.social/@richdevine
