robots.txt #36

Open
opened 2026-05-06 08:34:34 +02:00 by inhji · 0 comments
Owner
  • Generate /sitemap.xml listing canonical URLs, keep it updated on publish, and reference it from /robots.txt.
  • Add explicit User-agent entries for AI crawlers (GPTBot, OAI-SearchBot, Claude-Web, Google-Extended) with allow/disallow rules that match your policy.
  • Add Content-Signal directives to /robots.txt declaring preferences for ai-train, search, and ai-input. For example: Content-Signal: ai-train=no, search=yes, ai-input=no
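The three items above could be sketched roughly as follows. This is only an illustration in Python, not the project's actual implementation: the function names `render_robots_txt` and `render_sitemap_xml`, the `https://example.com` base URL, and the allow/disallow choice per crawler are all assumptions made for the example. The placement of `Content-Signal` inside the `User-agent: *` group is also an assumption about how the directive is usually written.

```python
from xml.etree import ElementTree as ET

# Hypothetical policy for illustration only: set each crawler to
# "allow" or "disallow" according to your own policy.
AI_CRAWLER_POLICY = {
    "GPTBot": "disallow",
    "OAI-SearchBot": "allow",
    "Claude-Web": "disallow",
    "Google-Extended": "disallow",
}

# Example value taken from the issue text.
CONTENT_SIGNAL = "ai-train=no, search=yes, ai-input=no"


def render_robots_txt(base_url, policy=AI_CRAWLER_POLICY,
                      content_signal=CONTENT_SIGNAL):
    """Return robots.txt text with per-crawler rules and a sitemap reference."""
    # Default group: applies to every crawler not named explicitly below.
    lines = ["User-agent: *", f"Content-Signal: {content_signal}", "Allow: /", ""]
    # One explicit group per AI crawler.
    for agent, rule in policy.items():
        directive = "Allow" if rule == "allow" else "Disallow"
        lines += [f"User-agent: {agent}", f"{directive}: /", ""]
    # Reference the sitemap with an absolute URL.
    lines.append(f"Sitemap: {base_url.rstrip('/')}/sitemap.xml")
    return "\n".join(lines) + "\n"


def render_sitemap_xml(canonical_urls):
    """Return a minimal sitemap.xml listing the given canonical URLs."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in canonical_urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical site and URLs; call both renderers again on every publish.
    print(render_robots_txt("https://example.com"))
    print(render_sitemap_xml(["https://example.com/", "https://example.com/notes/1"]))
```

Regenerating both files from the same publish hook would keep the sitemap current without a separate job.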
Reference
inhji/hajur#36