Three tools, one goal: make sure AI models can read your site and describe it accurately. Here's what each tool does, when to use it, and how to get the most out of it.
AI-generated answers are replacing the top of search. When someone asks ChatGPT, Claude, Perplexity, or Google's AI Overviews a question, the models decide which sources to cite based on two things: whether they're allowed to read your site, and whether they understand what your site is about.
The AI Bot Access toolkit covers both sides of that equation:
- **robots.txt**: Our Scanner tells you which AI crawlers your current rules are blocking.
- **llms.txt**: a plain-Markdown file that tells AI models what your site does and which pages matter. Our Free Generator and AI Builder help you create one.

"llms" stands for Large Language Models, the AI systems behind ChatGPT, Claude, Perplexity, and Google's Gemini.
## robots.txt Scanner

What it does: Fetches your site's robots.txt, parses it with the same logic Google's robots parser uses, and resolves each major AI crawler against the ruleset. For each bot you see Allowed, Partially allowed, or Blocked, plus an explanation of why.
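That resolution step can be sketched with Python's standard-library robots parser. The bot names below are real AI crawlers, but the sample policy is illustrative, and note that `urllib.robotparser` uses simpler matching than Google's longest-match parser, so edge cases can differ:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: a wildcard group plus explicit AI-crawler groups.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: CCBot
Disallow: /
"""

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "CCBot"]

def check_access(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Resolve each AI bot against the ruleset, like the Scanner does."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_BOTS}

for bot, ok in check_access(SAMPLE_ROBOTS_TXT).items():
    print(f"{bot}: {'Allowed' if ok else 'Blocked'}")
# GPTBot: Allowed
# ClaudeBot: Allowed
# PerplexityBot: Allowed   (no specific group, falls back to User-agent: *)
# CCBot: Blocked
```

PerplexityBot has no group of its own, so it inherits the wildcard rules, which is exactly why a bare `Disallow: /` under `User-agent: *` blocks every bot you haven't explicitly allowed.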
When to use it:
- You've just edited your robots.txt and want to verify the new rules do what you intended.
- You want a robots.txt snippet that explicitly allows the AI crawlers you want.

The report also flags llms.txt presence as a positive signal.

How to use it:
1. Enter your domain (example.com or https://example.com).
2. The tool fetches /robots.txt via a server-side proxy (no CORS issues) and analyzes it.
3. Copy the recommended robots.txt section and paste the result into your site's robots.txt file at the root of your domain.

Partially allowed means some paths are disallowed (/admin, /cart, etc., usually fine). If a bot shows Blocked, the usual causes are: (1) User-agent: * with Disallow: /, (2) a WAF/firewall rule at your CDN (Cloudflare, etc.) that blocks bots before they even see robots.txt, or (3) a stale cached robots.txt served by your CDN.

## Free Generator

What it does: Reads your site's sitemap.xml and homepage, ranks the most likely important pages, and drops them into a valid llms.txt template you can edit, copy, and publish. Purely template-based: no AI call, no cost.
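A minimal sketch of that template step, assuming a standard sitemap: pull the `<loc>` URLs out of sitemap.xml, keep the ones matching common key paths, and fill an llms.txt skeleton. The function names and the ranking rule here are illustrative, not the tool's actual logic:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
KEY_PATHS = ("/about", "/pricing", "/docs", "/features", "/blog")

def urls_from_sitemap(xml_text: str) -> list:
    """Extract every <loc> URL from a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")]

def build_template(site_name: str, summary: str, urls: list) -> str:
    """Fill the llms.txt skeleton with likely-important pages."""
    key_urls = [u for u in urls if any(p in u for p in KEY_PATHS)]
    lines = ["# " + site_name, "> " + summary, "", "## Key pages"]
    for u in key_urls:
        label = u.rstrip("/").rsplit("/", 1)[-1].title()
        lines.append(f"- [{label}]({u}): replace this with a one-line description")
    return "\n".join(lines) + "\n"

sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yoursite.com/about</loc></url>
  <url><loc>https://yoursite.com/pricing</loc></url>
  <url><loc>https://yoursite.com/tag/misc</loc></url>
</urlset>"""

print(build_template("Your Site", "One-line summary from your meta description",
                     urls_from_sitemap(sitemap)))
```

Because the step is pure string and XML work, there's no model call anywhere, which is why the Generator is free.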
When to use it:
- You don't have an llms.txt yet and want to ship one today.
- Your site has standard key pages: /about, /pricing, /docs, /features, /blog, etc.

The output looks like this:

```
# Your Site
> One-line summary from your meta description

## Key pages
- [Home](https://yoursite.com): replace this with a one-line description...
- [About](https://yoursite.com/about): replace this...
```
Publish the result at https://yoursite.com/llms.txt. llms.txt belongs at your domain root, the same place as robots.txt. In WordPress, use a plugin or a rule in your theme's functions file. In Next.js/Astro/static sites, drop it in the public/ folder. In Nginx/Apache, it's a file in your docroot.

## AI Builder

What it does: Crawls up to 200 of your pages, reads their actual content, and hands everything to Claude or GPT. The AI writes a full llms.txt with real, tailored, brand-aware descriptions: no templates, no placeholders. You bring your own Anthropic or OpenAI API key, so you pay only the provider's per-token cost (typically pennies per site).
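What the Builder hands to the model can be imagined roughly like this. The prompt wording, field names, and helper function are hypothetical (the tool's actual prompt isn't published); the commented-out call shows where a real Anthropic request would go, using your own key:

```python
def build_llms_prompt(site: str, pages: list) -> str:
    """Assemble crawled page excerpts into one generation prompt (hypothetical)."""
    parts = [
        f"Write an llms.txt file for {site}.",
        "Use the page excerpts below to write one-line, brand-aware descriptions.",
        "",
    ]
    for p in pages:
        parts.append(f"URL: {p['url']}\nTitle: {p['title']}\nExcerpt: {p['excerpt']}\n")
    return "\n".join(parts)

pages = [
    {"url": "https://example.com/pricing", "title": "Pricing",
     "excerpt": "Simple per-seat pricing with a free tier."},
]
prompt = build_llms_prompt("https://example.com", pages)

# With the anthropic package and your own key (sk-ant-...), the call would be:
# import anthropic
# client = anthropic.Anthropic(api_key="sk-ant-...")
# reply = client.messages.create(
#     model="<current-claude-model-id>",  # substitute a current model id
#     max_tokens=2000,
#     messages=[{"role": "user", "content": prompt}],
# )
# print(reply.content[0].text)
print(prompt)
```

The per-token cost scales with how much page content goes into the prompt, which is why crawling is capped and pages are excerpted rather than sent whole.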
When to use it:
- You want a polished llms.txt for a large content site.
- You produce llms.txt as a deliverable for multiple clients.

How to use it:
1. Have your Anthropic API key (sk-ant-...) or OpenAI API key (sk-...) ready. Keys are entered in the tool and sent only to your chosen provider; we never store them.
2. Enter your domain (https://example.com).
3. Review the generated llms.txt. Edit freely, then Copy or Download.
4. Publish the result at /llms.txt.

Ship a good-enough llms.txt; don't let perfect be the enemy of deployed.

| Goal | Best tool |
|---|---|
| Check if AI can read my site | robots.txt Scanner |
| Diagnose why a specific bot is blocked | robots.txt Scanner |
| Get a quick llms.txt template for a small site | Free Generator |
| Write llms.txt descriptions by hand with a good starting point | Free Generator |
| Produce a polished llms.txt for a large content site | AI Builder |
| Run llms.txt as an agency deliverable | AI Builder |
| Generate multiple client llms.txt files fast | AI Builder |
Here's the end-to-end sequence we suggest for any site:
1. Scan your domain with the robots.txt Scanner.
2. Fix your robots.txt and verify with a re-scan. No point writing an llms.txt if bots can't read the site.
3. Generate an llms.txt with the Free Generator or the AI Builder.
4. Publish the llms.txt to your domain root so it's live at /llms.txt.
5. Re-scan to confirm the llms.txt is discoverable.

## FAQ

**Why does the Scanner show every bot as blocked?** Most often this is a bare User-agent: * + Disallow: / combination, or a security plugin/firewall that injects rules. Start by viewing your robots.txt in a browser (yoursite.com/robots.txt) and checking the raw content. If it's empty but bots still fail, suspect a CDN/WAF rule blocking by user-agent at a layer above robots.txt.
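The first cause, a bare wildcard block, is easy to detect mechanically once you have the raw file. A rough sketch (the helper name is made up, and real robots.txt grouping has more edge cases than this):

```python
def has_wildcard_full_block(robots_txt: str) -> bool:
    """True if a 'User-agent: *' group contains 'Disallow: /'."""
    agents, blocked, in_rules = [], False, False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            agents, in_rules = [], False     # blank line ends the group
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:                     # a new group is starting
                agents, in_rules = [], False
            agents.append(value)
        elif field == "disallow":
            in_rules = True
            if value == "/" and "*" in agents:
                blocked = True
    return blocked

print(has_wildcard_full_block("User-agent: *\nDisallow: /"))        # True
print(has_wildcard_full_block("User-agent: *\nDisallow: /admin/"))  # False
```

If this returns False but bots still fail, the block is happening above robots.txt, at the CDN or firewall layer.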
**Does allowing AI crawlers change anything for Google or Bing?** No. AI crawlers (GPTBot, ClaudeBot, etc.) are separate from traditional search crawlers (Googlebot, Bingbot). Allowing one does not affect the other.
**What if my site has no sitemap?** If there's no sitemap at /sitemap.xml or /sitemap_index.xml, the generator builds a template from your homepage alone. You can still edit the result to add key URLs manually.
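That fallback order can be sketched in a few lines, with a hypothetical reachability check standing in for the real HTTP request:

```python
CANDIDATE_SITEMAPS = ["/sitemap.xml", "/sitemap_index.xml"]

def pick_source(exists) -> str:
    """Return the first reachable sitemap path, else fall back to the homepage."""
    for path in CANDIDATE_SITEMAPS:
        if exists(path):   # `exists` would be an HTTP HEAD/GET in practice
            return path
    return "/"             # homepage-only fallback

print(pick_source(lambda p: p == "/sitemap_index.xml"))  # /sitemap_index.xml
print(pick_source(lambda p: False))                      # /
```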
**How do I get an access code?** Reach out via the waitlist on the landing page. Codes are limited during early access.
**What happens to my API key?** Your key is sent only to your chosen provider (Anthropic or OpenAI). It's stored in your browser's localStorage if you tick "Remember key on this device" so you don't have to re-enter it. Our serverless proxy relays the one request to the provider and does not log request bodies or keys. If you prefer, untick "Remember" after each run, or click Sign out to clear everything.
**Can I edit the output?** The output is always shown in an editable textarea. Tighten the voice, reorder sections, remove pages you don't want featured, or re-run with different instructions. The Regenerate button reuses your domain and settings for a fresh pass.
**Can I use these tools for client work?** Yes, especially the AI Builder. Nothing in the workflow identifies you or your clients to us, and the output is yours to ship under your own brand.
**How often should I regenerate llms.txt?** When you add or remove major pages, when your site's messaging changes materially, or every 6–12 months as a refresh. Unlike robots.txt, llms.txt isn't fetched on every crawl; models pull it occasionally as context, so minor page changes don't warrant re-generation.