GEO for Drupal: Complete Implementation Guide

intermediate

Drupal implements GEO with three contributed modules: Metatag (meta tags and Open Graph), Schema.org Metatag (JSON-LD generation), and Simple XML Sitemap (a sitemap with lastmod support). Edit robots.txt in the Drupal web root to allow AI crawlers. Drupal's server-rendered HTML is crawlable without extra work.

Drupal renders HTML on the server by default, so AI crawlers can read pages without executing JavaScript. The primary GEO work is module configuration and robots.txt setup.

Required Modules

Install via Composer:

composer require drupal/metatag
composer require drupal/schema_metatag
composer require drupal/simple_sitemap
drush en metatag metatag_open_graph schema_metatag schema_article simple_sitemap -y
drush cr

Metatag Module Configuration

The Metatag module manages the standard meta tags. Its Open Graph submodule (metatag_open_graph) must also be enabled; it adds the Open Graph and article date tags.

Configuration path: Admin → Configuration → Search and metadata → Metatags

Global defaults

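Set these defaults so they apply to every content type. The [node:...] tokens only resolve on node pages, so enter them in the Content defaults rather than in the site-wide Global entry: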

Title: [node:title] | [site:name]
Description: [node:summary]
Canonical URL: [node:url]

Open Graph (OG) tab

OG: Type: article
OG: Title: [node:title]
OG: Description: [node:summary]
OG: URL: [node:url]
OG: Site Name: [site:name]
OG: Image: [node:field_image:entity:url]
OG: Locale: en_US

Article tags (critical for recency)

Article: Published Time: [node:created:html_datetime]
Article: Modified Time: [node:changed:html_datetime]
Article: Author: [node:author:url]
Article: Section: [node:field_category:entity:name]

Note: [node:created:html_datetime] outputs an ISO 8601 timestamp (for example 2026-04-18T00:00:00+0000 with Drupal's default html_datetime format), which is suitable for GEO recency signals.
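To see what that format looks like in practice, here is a small plain-PHP sketch. It assumes the html_datetime date format still uses the pattern Drupal core ships with (Y-m-d\TH:i:sO) and compares it with PHP's built-in c format, which the programmatic example later in this guide uses. Both are valid ISO 8601; they differ only in whether the UTC offset contains a colon.

```php
<?php
// Sketch: compare the html_datetime pattern (assumed to be Drupal's default,
// 'Y-m-d\TH:i:sO') with PHP's 'c' format for the same UTC timestamp.

// gmmktime(hour, minute, second, month, day, year) builds a UTC timestamp.
$timestamp = gmmktime(0, 0, 0, 4, 18, 2026);

$html_datetime = gmdate('Y-m-d\TH:i:sO', $timestamp);
$iso_c = gmdate('c', $timestamp);

echo $html_datetime, PHP_EOL; // 2026-04-18T00:00:00+0000
echo $iso_c, PHP_EOL;         // 2026-04-18T00:00:00+00:00
```

If a consumer of your metadata requires the colon form, a custom date format using the c pattern is one option.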

Schema.org Metatag for JSON-LD

The Schema.org Metatag module adds JSON-LD structured data. After enabling it together with its Article submodule (schema_article), go to Admin → Configuration → Search and metadata → Metatags → Edit (for the Article content type).

Article JSON-LD configuration

In the Schema.org tab, configure Article schema:

@type: Article
headline: [node:title]
description: [node:summary]
datePublished: [node:created:html_datetime]
dateModified: [node:changed:html_datetime]
author @type: Person
author name: [node:author:display-name]
author url: [node:author:url]
publisher @type: Organization
publisher name: [site:name]
publisher logo url: https://yoursite.com/logo.png
mainEntityOfPage @id: [node:url]

This generates complete JSON-LD for every Article node automatically.
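As a rough picture of the result, the sketch below builds the same Article structure by hand in plain PHP. It is illustrative only: the function name and the array keys of the input are hypothetical, all values are placeholders, and the module's real output may be structured differently (for example, wrapped in an @graph array).

```php
<?php
// Illustrative only: approximates the Article JSON-LD implied by the field
// mapping above. geo_example_article_jsonld() is a hypothetical helper, not
// part of the Schema.org Metatag module.

function geo_example_article_jsonld(array $node): string {
  $data = [
    '@context' => 'https://schema.org',
    '@type' => 'Article',
    'headline' => $node['title'],
    'description' => $node['summary'],
    'datePublished' => $node['created'],
    'dateModified' => $node['changed'],
    'author' => [
      '@type' => 'Person',
      'name' => $node['author_name'],
      'url' => $node['author_url'],
    ],
    'publisher' => [
      '@type' => 'Organization',
      'name' => $node['site_name'],
      'logo' => ['@type' => 'ImageObject', 'url' => $node['logo_url']],
    ],
    'mainEntityOfPage' => ['@id' => $node['url']],
  ];
  // JSON_THROW_ON_ERROR guarantees a string result or an exception.
  return json_encode(
    $data,
    JSON_THROW_ON_ERROR | JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE
  );
}

// Placeholder values standing in for the resolved tokens.
$json = geo_example_article_jsonld([
  'title' => 'Example guide',
  'summary' => 'A short summary of the guide.',
  'created' => '2026-04-18T00:00:00+0000',
  'changed' => '2026-04-20T00:00:00+0000',
  'author_name' => 'Jane Doe',
  'author_url' => 'https://yoursite.com/user/1',
  'site_name' => 'Your Site',
  'logo_url' => 'https://yoursite.com/logo.png',
  'url' => 'https://yoursite.com/example-guide',
]);

echo $json, PHP_EOL;
```

Comparing this shape with the JSON-LD in your page source is a quick way to spot a token that failed to resolve.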

robots.txt

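Drupal's robots.txt lives in the web root (the web/ directory on Composer-based installs). Edit it to add AI crawlers. On Composer-managed sites the core scaffold plugin can overwrite this file during composer install or update, so exclude it from scaffolding or reapply your changes afterwards.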

# AI Crawlers — required for GEO
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: BingBot
Allow: /

# Existing Drupal rules below
User-agent: *
Crawl-delay: 10
Disallow: /admin/
Disallow: /comment/reply/
Disallow: /filter/tips
Disallow: /node/add/
Disallow: /search/
Disallow: /user/register/
Disallow: /user/password/
Disallow: /user/login/
Disallow: /user/logout/

Sitemap: https://yoursite.com/sitemap.xml
# llms.txt is not a sitemap, so it is not listed here; it is served at /llms.txt

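Important: On managed hosting (Acquia, Pantheon, etc.) where you can't edit robots.txt directly, use the RobotsTxt module (drupal/robotstxt), which lets you manage the file from the admin UI. The module serves robots.txt dynamically, so the static file has to be removed for it to take effect.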
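Once the file is deployed, it can help to confirm that each crawler really has its own group with Allow: /. The following plain-PHP sketch does that against a robots.txt string. The function names are hypothetical, and the parser is deliberately simplified: it only understands User-agent, Allow and Disallow lines and is not a full implementation of the Robots Exclusion Protocol.

```php
<?php
// Sketch: a simplified robots.txt check. The geo_example_* functions are
// hypothetical helpers, not part of Drupal or any module.

// Parse robots.txt into [lowercased user-agent => list of [field, value]].
function geo_example_parse_robots(string $robots): array {
  $groups = [];
  $current_agents = [];
  $in_rules = FALSE;
  foreach (preg_split('/\R/', $robots) as $line) {
    // Drop comments and surrounding whitespace.
    $line = trim(preg_replace('/#.*$/', '', $line));
    if ($line === '' || !str_contains($line, ':')) {
      continue;
    }
    [$field, $value] = array_map('trim', explode(':', $line, 2));
    $field = strtolower($field);
    if ($field === 'user-agent') {
      // A User-agent line that follows rules starts a new group.
      if ($in_rules) {
        $current_agents = [];
        $in_rules = FALSE;
      }
      $agent = strtolower($value);
      $current_agents[] = $agent;
      $groups[$agent] ??= [];
    }
    elseif ($field === 'allow' || $field === 'disallow') {
      $in_rules = TRUE;
      foreach ($current_agents as $agent) {
        $groups[$agent][] = [$field, $value];
      }
    }
  }
  return $groups;
}

// Return the crawlers that lack an explicit "Allow: /" in their own group.
function geo_example_missing_crawlers(string $robots, array $crawlers): array {
  $groups = geo_example_parse_robots($robots);
  $missing = [];
  foreach ($crawlers as $crawler) {
    $rules = $groups[strtolower($crawler)] ?? [];
    if (!in_array(['allow', '/'], $rules, TRUE)) {
      $missing[] = $crawler;
    }
  }
  return $missing;
}

$sample = "User-agent: GPTBot\nAllow: /\n\nUser-agent: *\nDisallow: /admin/\n";
$missing = geo_example_missing_crawlers($sample, ['GPTBot', 'ClaudeBot']);
print_r($missing); // Lists ClaudeBot only.
```

To check a live site, pass in the contents of your deployed robots.txt and the eight crawler names from the listing above.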

llms.txt via Custom Module

Create a small custom module to serve llms.txt. Besides the two files below, the module needs a geo_llms.info.yml file before it can be enabled.

# web/modules/custom/geo_llms/geo_llms.routing.yml
geo_llms.txt:
  path: '/llms.txt'
  defaults:
    _controller: '\Drupal\geo_llms\Controller\LlmsController::generate'
  requirements:
    _access: 'TRUE'

<?php
// web/modules/custom/geo_llms/src/Controller/LlmsController.php
namespace Drupal\geo_llms\Controller;

use Drupal\Core\Controller\ControllerBase;
use Symfony\Component\HttpFoundation\Response;

/**
 * Serves a plain-text llms.txt listing recently updated content.
 */
class LlmsController extends ControllerBase {

  public function generate(): Response {
    $site = $this->config('system.site');
    $site_name = $site->get('name');
    $site_slogan = $site->get('slogan');

    // Load the 30 most recently changed published nodes that the
    // requesting user is allowed to view.
    $storage = $this->entityTypeManager()->getStorage('node');
    $nids = $storage->getQuery()
      ->condition('status', 1)
      ->sort('changed', 'DESC')
      ->range(0, 30)
      ->accessCheck(TRUE)
      ->execute();
    $nodes = $storage->loadMultiple($nids);

    $output = "# {$site_name}\n";
    if (!empty($site_slogan)) {
      $output .= "> {$site_slogan}\n";
    }
    $output .= "\n## Main Content\n";

    foreach ($nodes as $node) {
      $title = $node->getTitle();
      $url = $node->toUrl()->setAbsolute()->toString();
      // The body summary is optional; cast to string so strip_tags()
      // never receives NULL.
      $summary = $node->hasField('body')
        ? trim(strip_tags((string) $node->get('body')->summary))
        : '';
      $output .= $summary !== ''
        ? "- [{$title}]({$url}): {$summary}\n"
        : "- [{$title}]({$url})\n";
    }

    return new Response($output, 200, ['Content-Type' => 'text/plain; charset=utf-8']);
  }

}

Simple XML Sitemap Configuration

After enabling the module:

  1. Admin → Configuration → Search and metadata → Simple XML Sitemap
  2. Enable all content types you want indexed
  3. Set regeneration to “On entity save” — this updates lastmod automatically
  4. Set priority for different content types (0.9 for key guides, 0.7 for blog posts)

The module generates /sitemap.xml with <lastmod> from Drupal’s entity changed timestamp.
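To verify that every URL actually carries a lastmod value, a check along these lines can be run against the generated XML. This is a plain-PHP sketch using the DOM extension; the function name is hypothetical, and it assumes a single urlset document rather than a sitemap index split into several files.

```php
<?php
// Sketch: list sitemap URLs that have no <lastmod>. The helper name is
// hypothetical; it expects a <urlset> document, not a sitemap index.

function geo_example_urls_missing_lastmod(string $sitemap_xml): array {
  $doc = new DOMDocument();
  if (!$doc->loadXML($sitemap_xml)) {
    throw new InvalidArgumentException('Sitemap XML could not be parsed.');
  }

  $missing = [];
  foreach ($doc->getElementsByTagName('url') as $url) {
    $loc = $url->getElementsByTagName('loc')->item(0);
    $lastmod = $url->getElementsByTagName('lastmod')->item(0);
    // Report the URL when <lastmod> is absent or empty.
    if ($loc !== NULL && ($lastmod === NULL || trim($lastmod->textContent) === '')) {
      $missing[] = trim($loc->textContent);
    }
  }
  return $missing;
}

// Inline sample standing in for the contents of /sitemap.xml.
$sample = <<<XML
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yoursite.com/guide</loc>
    <lastmod>2026-04-18T00:00:00+00:00</lastmod>
  </url>
  <url>
    <loc>https://yoursite.com/about</loc>
  </url>
</urlset>
XML;

$missing = geo_example_urls_missing_lastmod($sample);
print_r($missing); // Lists https://yoursite.com/about only.
```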

Programmatic Meta Tags (Alternative to Module UI)

For developers who prefer code-based configuration:

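<?php
// In a custom module or theme's preprocess_html
use Drupal\node\NodeInterface;

function mymodule_preprocess_html(&$variables) {
  $node = \Drupal::routeMatch()->getParameter('node');
  // Only act on node pages; on some routes the parameter is absent or is
  // a raw ID rather than a loaded node entity.
  if (!$node instanceof NodeInterface) {
    return;
  }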

  $published = date('c', $node->getCreatedTime());
  $modified = date('c', $node->getChangedTime());

  $variables['page']['#attached']['html_head'][] = [
    [
      '#tag' => 'meta',
      '#attributes' => [
        'property' => 'article:published_time',
        'content' => $published,
      ],
    ],
    'article_published_time',
  ];

  $variables['page']['#attached']['html_head'][] = [
    [
      '#tag' => 'meta',
      '#attributes' => [
        'property' => 'article:modified_time',
        'content' => $modified,
      ],
    ],
    'article_modified_time',
  ];
}

GEO Checklist for Drupal

  • Metatag module: installed and configured with Open Graph and article dates
  • Schema.org Metatag: Article schema with publisher and dates
  • robots.txt: all 8 AI crawlers explicitly allowed
  • llms.txt: endpoint created (custom module or static file)
  • Simple XML Sitemap: configured with lastmod for all content types
  • Sitemap regeneration on entity save enabled
  • Canonical URL configured in Metatag module
  • Article description/summary field populated for all nodes
  • Inverted pyramid: answer in first paragraph of body field
  • Core Web Vitals: LCP < 2.5s, INP < 200ms, CLS < 0.1
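
Several of the checklist items can be spot-checked from a page's rendered HTML. The sketch below is one way to do that in plain PHP with the DOM extension. The function name and the exact set of checks are assumptions chosen to mirror this guide, not an established tool; it only tests for the presence of each tag, not whether its value is correct.

```php
<?php
// Sketch: report which GEO-related tags are absent from an HTML document.
// geo_example_missing_tags() is a hypothetical helper.

function geo_example_missing_tags(string $html): array {
  $doc = new DOMDocument();
  // Real-world HTML5 triggers libxml warnings; collect them silently.
  $previous = libxml_use_internal_errors(TRUE);
  $doc->loadHTML($html);
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  $xpath = new DOMXPath($doc);
  $checks = [
    'canonical link' => '//link[@rel="canonical"][@href]',
    'og:title' => '//meta[@property="og:title"][@content]',
    'article:published_time' => '//meta[@property="article:published_time"][@content]',
    'article:modified_time' => '//meta[@property="article:modified_time"][@content]',
    'JSON-LD script' => '//script[@type="application/ld+json"]',
  ];

  $missing = [];
  foreach ($checks as $label => $query) {
    if ($xpath->query($query)->length === 0) {
      $missing[] = $label;
    }
  }
  return $missing;
}

// Inline sample standing in for a fetched article page.
$sample = '<html><head>'
  . '<link rel="canonical" href="https://yoursite.com/guide">'
  . '<meta property="og:title" content="Guide">'
  . '<meta property="article:published_time" content="2026-04-18T00:00:00+0000">'
  . '</head><body><p>Answer first.</p></body></html>';

$missing = geo_example_missing_tags($sample);
print_r($missing); // Lists article:modified_time and JSON-LD script.
```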