TikTok dethrones Google to become the world’s most-scraped website


TikTok now tops the list of the world’s most-scraped websites, pushing tech giants such as Google down the rankings. A new report highlights dramatic changes not only in the rankings but also in how companies collect online data for AI training and market analysis.

Data scraping is evolving as quickly as the data itself. A few years ago, scraping could be seen as simply collecting text. It has since moved past that limit and is now used to harvest videos, images, and audio, all to feed the ever-data-hungry AI machine.

As the need for data to train AI systems, track trends, and better understand audiences grows, it is not only the types of media scrapers target that are changing, but also the big tech companies behind those media.

New research shows that video and social media platforms are the dominant data collection sources in 2025, accounting for 38% of all scraping activity. The report, published by web-data firm Decodo, details which platforms are scraped the most, how the types of data collected are changing, and what these trends mean for businesses.

“We're seeing a clear move toward websites that have lots of different types of content instead of just basic info. The biggest reason for this shift is that everyone needs tons of varied, good-quality data to train AI chatbots, language models, and other smart tools,” says Vaidotas Juknys, Chief Commercial Officer at Decodo.

“Companies operating in various industries are also realizing that the best insights come from mixing different kinds of content together – videos, text, images, and how people interact with certain platforms,” he adds.

So, what’s on the list?

The top 3: TikTok, Google, and Amazon

According to the researchers, TikTok’s jump to first place among the world’s most-scraped websites comes as a massive surprise: last year, the China-rooted company did not even make the top 10.

“With over 1.5B active users and a unique algorithm-driven discovery system, this change reflects the AI industry's appetite for short-form video content and cultural trend analysis to train next-generation multimodal models,” the report states.

Companies are increasingly turning to the platform to gather video content and metadata, hashtag trends, user-engagement metrics, audio and music usage patterns, creator analytics, comment sentiment, and location-based trends.

[Image: list of the top 10 most-scraped websites, including TikTok, Amazon, YouTube, Google, eBay, and Airbnb. Image by Decodo.]

Although Google has dropped to second place, the report notes that it remains “absolutely critical for a range of use cases”.

“The platform processes over 13.7B searches daily, providing insights into global search trends, consumer behavior patterns, and real-time market demand across every industry and geography,” states the report.

The most commonly collected datasets include search result rankings and featured snippets, local business listings and customer reviews, Google Shopping product listings and pricing, and image search outputs.

Users also gather news aggregation results and auto-suggest keyword data, which offer real-time insight into consumer interests, trending topics, and search behavior across global markets.

Amazon fell to third place but continues to play a crucial role in e-commerce intelligence, from pricing trends and consumer reviews to market competition.

Another platform that has made its way into the top 10 is YouTube.

In YouTube’s case, companies tend to use the platform’s data to understand speech, recognize objects, analyze facial expressions, and even pick up on cultural nuances from visual storytelling.

“The platform's mix of languages, accents, and content types gives AI developers exactly what they need to build systems that can actually understand how humans communicate through sight and sound, not just text,” the report states.

Meanwhile, Walmart was already in the top 10, but this year it slipped one position to fifth place. According to the report, this reflects the overall finding of the research: data scrapers are turning to video-first platforms.

However, America’s largest retailer holds a strong position on the list. Data scraped from the company’s website plays a key role in market research, pricing strategies, and retail intelligence, and becomes even more powerful when combined with data from other retail giants such as Amazon or Target.

