Friday, December 20, 2024

AI Crawlers Account For 28% Of Googlebot’s Traffic, Study Finds


A report released by Vercel highlights the growing impact of AI bots in web crawling.

OpenAI’s GPTBot and Anthropic’s Claude generate nearly 1 billion requests monthly across Vercel’s network.

The data indicates that GPTBot made 569 million requests in the past month, while Claude accounted for 370 million.

Additionally, Applebot made 314 million requests and PerplexityBot contributed 24.4 million.

Together, these four AI crawlers made roughly 1.3 billion requests, which is approximately 28% of Googlebot’s volume of 4.5 billion fetches over the same period.
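The 28% figure can be checked from the request counts quoted above. The following is a quick sketch that uses only the numbers reported in this article, expressed in millions of requests; the variable names are illustrative.

```python
# Monthly request counts quoted in the article, in millions.
ai_crawler_requests = {
    "GPTBot": 569,
    "Claude": 370,
    "Applebot": 314,
    "PerplexityBot": 24.4,
}
googlebot_requests = 4500  # 4.5 billion fetches

# Sum the AI crawler traffic and express it relative to Googlebot's volume.
total_ai = sum(ai_crawler_requests.values())
share_of_googlebot = total_ai / googlebot_requests * 100

print(f"Combined AI crawler requests: {total_ai:.1f} million")
print(f"Relative to Googlebot: {share_of_googlebot:.1f}%")
```

The combined total is about 1.28 billion requests, or roughly 28.4% of Googlebot’s volume. Note that this is a comparison of volumes, not a share of Googlebot’s own traffic.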

Here’s what this could mean for SEO.

Key Findings On AI Crawlers

The analysis looked at traffic patterns on Vercel’s network and across various web architectures. It identified several key characteristics of AI crawlers:

  • Major AI crawlers do not render JavaScript, though they do fetch JavaScript files.
  • AI crawlers are often inefficient, with ChatGPT and Claude spending over 34% of their requests on 404 pages.
  • The type of content these crawlers focus on varies. ChatGPT prioritizes HTML (57.7%), while Claude focuses more on images (35.17%).

Geographic Distribution

Unlike traditional search engines that operate from multiple regions, AI crawlers currently maintain a concentrated U.S. presence:

  • ChatGPT operates from Des Moines (Iowa) and Phoenix (Arizona)
  • Claude operates from Columbus (Ohio)

Web Almanac Correlation

These findings align with data shared in the Web Almanac’s SEO chapter, which also notes the growing presence of AI crawlers.

According to the Web Almanac, websites now use robots.txt files to set rules for AI bots, telling them what they can or cannot crawl.

GPTBot is the most mentioned bot, appearing on 2.7% of mobile sites studied. The Common Crawl bot, often used to collect training data for language models, is also frequently noted.
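As a rough illustration of how such robots.txt rules behave, the sketch below parses a hypothetical robots.txt with Python’s standard urllib.robotparser module and checks which paths each bot may fetch. The rules and paths are invented for the example and are not taken from either report; GPTBot, ClaudeBot, and CCBot are the user-agent tokens those crawlers publish. Keep in mind that robots.txt is advisory, so it only works for crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep two AI crawlers out of /private/,
# block the Common Crawl bot entirely, and allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /private/

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if the parsed rules let this user agent fetch the path."""
    return parser.can_fetch(user_agent, path)


for bot in ("GPTBot", "ClaudeBot", "CCBot", "Googlebot"):
    print(bot, is_allowed(bot, "/private/report.html"), is_allowed(bot, "/blog/post"))
```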

Both reports stress that website owners need to adjust to how AI crawlers behave.

3 Ways To Optimize For AI Crawlers

Based on recent data from Vercel and the Web Almanac, here are three ways to optimize for AI crawlers.

1. Server-Side Rendering

AI crawlers don’t execute JavaScript. This means any content that relies on client-side rendering might be invisible.

Recommended actions:

  • Implement server-side rendering for critical content
  • Ensure main content, meta information, and navigation structures are present in the initial HTML
  • Use static site generation or incremental static regeneration where possible
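One way to sanity-check these recommendations is to look at a page the way a non-rendering crawler would: read the raw HTML response and ignore anything that only appears after scripts run. The sketch below is a minimal approximation using Python’s standard html.parser. The function name, the sample pages, and the list of required phrases are all hypothetical, and real crawlers may extract text differently.

```python
from html.parser import HTMLParser


class InitialHTMLText(HTMLParser):
    """Collects the text present in raw HTML, skipping script-like elements."""

    SKIPPED = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that sits outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def missing_from_initial_html(html, required_phrases):
    """Return the phrases that do not appear in the un-rendered HTML text."""
    parser = InitialHTMLText()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return [phrase for phrase in required_phrases if phrase.lower() not in text]


# Hypothetical pages: one rendered on the client, one rendered on the server.
csr_page = '<html><body><div id="root"></div><script>render("Pricing plans")</script></body></html>'
ssr_page = "<html><body><h1>Pricing plans</h1><p>Starter tier</p></body></html>"

print(missing_from_initial_html(csr_page, ["Pricing plans"]))
print(missing_from_initial_html(ssr_page, ["Pricing plans"]))
```

In the client-rendered page the phrase exists only inside a script, so it is reported as missing; in the server-rendered page it is found.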

2. Content Structure & Delivery

Vercel’s data shows distinct content type preferences among AI crawlers:

ChatGPT:

  • Prioritizes HTML content (57.70%)
  • Spends 11.50% of fetches on JavaScript files

Claude:

  • Focuses heavily on images (35.17%)
  • Dedicates 23.84% of fetches to JavaScript files

Optimization recommendations:

  • Structure HTML content clearly and semantically
  • Optimize image delivery and metadata
  • Include descriptive alt text for images
  • Implement proper header hierarchy
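Two of these checks, alt text and heading hierarchy, are easy to automate. The sketch below uses Python’s standard html.parser to list images without alt text and places where the heading level jumps by more than one. The class and function names are hypothetical, and the rules are deliberately simple: an empty alt attribute can be legitimate for decorative images, so treat the output as a list to review rather than a list of errors.

```python
from html.parser import HTMLParser


class StructureAudit(HTMLParser):
    """Records images lacking alt text and the sequence of heading levels."""

    def __init__(self):
        super().__init__()
        self.images_missing_alt = []
        self.heading_levels = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img":
            if not (attributes.get("alt") or "").strip():
                self.images_missing_alt.append(attributes.get("src") or "(no src)")
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))


def audit_structure(html):
    """Return images without alt text and heading jumps such as h1 -> h3."""
    audit = StructureAudit()
    audit.feed(html)
    audit.close()
    levels = audit.heading_levels
    skipped = [(prev, cur) for prev, cur in zip(levels, levels[1:]) if cur > prev + 1]
    return {
        "images_missing_alt": audit.images_missing_alt,
        "skipped_heading_levels": skipped,
    }


# Hypothetical page fragment with one skipped heading level and one image lacking alt text.
sample = (
    "<h1>Guide</h1><h3>Details</h3>"
    '<img src="a.png" alt="Chart of crawler traffic"><img src="b.png">'
)
report = audit_structure(sample)
print(report)
```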

3. Technical Considerations

High 404 rates from AI crawlers mean you need to keep these technical considerations top of mind:

  • Maintain updated sitemaps
  • Redirect moved or retired URLs properly, and keep redirect chains short
  • Use consistent URL patterns
  • Audit 404 errors regularly
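A 404 audit focused on AI crawlers can start from server access logs. The sketch below is one possible approach, assuming logs in the common "combined" format and identifying crawlers by substrings of their user-agent strings; the function name, the regular expression, and the sample log lines are illustrative, and the pattern would need adjusting for other log formats.

```python
import re
from collections import Counter

# User-agent substrings for the crawlers named in the article.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Applebot")

# Assumes the "combined" access log format:
# ... "METHOD /path PROTOCOL" status bytes "referrer" "user agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def crawler_404_report(log_lines):
    """Return per-crawler 404 rates and the most requested missing paths."""
    totals, not_found, missing_paths = Counter(), Counter(), Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # Skip lines that do not fit the assumed format.
        agent = match["agent"].lower()
        bot = next((t for t in AI_CRAWLER_TOKENS if t.lower() in agent), None)
        if bot is None:
            continue  # Not one of the AI crawlers being tracked.
        totals[bot] += 1
        if match["status"] == "404":
            not_found[bot] += 1
            missing_paths[match["path"]] += 1
    rates = {bot: not_found[bot] / totals[bot] for bot in totals}
    return rates, missing_paths.most_common(10)


# Hypothetical log lines for illustration.
sample_log = [
    '203.0.113.5 - - [20/Dec/2024:10:00:00 +0000] "GET /old-page HTTP/1.1" 404 153 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '203.0.113.5 - - [20/Dec/2024:10:00:05 +0000] "GET /blog/post HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '198.51.100.7 - - [20/Dec/2024:10:00:09 +0000] "GET /missing HTTP/1.1" 404 153 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/131.0"',
]
rates, top_missing = crawler_404_report(sample_log)
print(rates, top_missing)
```

The most requested missing paths are good candidates for redirects or sitemap cleanup.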

Looking Ahead

For search marketers, the message is clear: AI chatbots are a new force in web crawling, and sites need to adapt their SEO accordingly.

Although AI bots may rely on cached or dated information now, their capacity to parse fresh content from across the web will grow.

You can help ensure your content is crawled and indexed with server-side rendering, clean URL structures, and updated sitemaps.




