AI Visibility | Updated 1 March 2026 | 8 min read | Originally published February 2026

AI Discovery Files: New Data Shows 93% of Websites Are Invisible to AI

Research across 1,460 top websites confirms 93.5% have no AI Discovery Files. Of the 6.5% that do, 56% are invalid. A free WordPress plugin and open specifications can fix that.

Mark McNeece Founder & Managing Director, 365i
[Figure: AI Discovery Files visibility standards, showing structured data versus raw HTML for AI systems]

New research across 1,460 of the world's top websites confirms what we suspected: 93.5% have no AI Discovery Files. When ChatGPT, Gemini, Claude, or Perplexity visit these sites, they find raw HTML and guesswork. No structured identity. No guidance on services, location, or brand voice. Nothing an AI system can confidently quote or recommend.

For business owners, that's not a warning. It's an open door. If your competitors haven't prepared for AI discovery (and the data says they almost certainly haven't), creating quality AI Discovery Files today puts you in the top 6.5% of websites on the planet. We've built a free WordPress plugin to make it even easier.

[Figure: Before and after comparison. Left: what AI sees without AI Discovery Files (raw HTML, confusion). Right: structured data it can trust and cite.]

What AI Systems Actually Look For

When an AI assistant answers a question about a business, it doesn't work like Google. There's no page of ten results. There's one answer, maybe two. SOCi's 2026 Local Visibility Index found that ChatGPT recommends just 1.2% of local business locations. Compare that with 35.9% appearing in Google's local 3-pack. AI visibility is, as SOCi's research concluded, "three to 30 times harder to achieve than ranking well in traditional local search."

So what decides which businesses make the cut? AI systems look for structured, machine-readable information they can trust. Your website's HTML gives them some of that, but not much. They need to know who you are, what you do, where you operate, how you want to be described, and what questions your customers commonly ask.

That's what AI Discovery Files do. They're a set of structured text and JSON files placed on your website, each with a specific job:

  • llms.txt - Your business identity written for language models. Not marketing copy. Clear facts: services, expertise, differentiators.
  • identity.json - Canonical business data in structured format. Name, address, services, contact details, social profiles.
  • brand.txt - Rules for how your brand should be represented. Correct name, approved descriptions, things to avoid.
  • faq-ai.txt - Pre-verified Q&A pairs that AI can quote directly when answering questions about your business.
  • ai.txt - Your permissions and preferences for how AI systems should use your content.

Ten file types in total, all documented at the AI Discovery File Specifications. They were created by Mark McNeece at 365i and adopted by platforms including Yoast, Wix, and Cloudflare.
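To make the idea of "canonical business data in structured format" concrete, here is a minimal Python sketch that assembles an identity.json-style document. The field names (`name`, `url`, `services`, `address`, `contact`, `social_profiles`), the helper names, and the example business are all assumptions made for illustration. The published specification defines the real schema, so check it before shipping anything.

```python
import json


def build_identity(name, url, services, address=None, contact=None, social=None):
    """Assemble canonical business facts (hypothetical field names, not the official schema)."""
    identity = {"name": name, "url": url, "services": list(services)}
    # Only include optional sections that were actually supplied,
    # so the file never carries empty placeholder values.
    if address:
        identity["address"] = address
    if contact:
        identity["contact"] = contact
    if social:
        identity["social_profiles"] = list(social)
    return identity


def to_json(identity):
    """Serialise with stable key order so changes between versions are easy to review."""
    return json.dumps(identity, indent=2, sort_keys=True, ensure_ascii=False)


# Hypothetical example business, used purely for illustration.
example = build_identity(
    name="Example Plumbing Ltd",
    url="https://example.com",
    services=["Boiler repair", "Emergency call-outs"],
    contact={"email": "hello@example.com"},
)
print(to_json(example))
```

Leaving out empty sections is a deliberate choice here: placeholder content is one of the things the research counts against a file.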

The difference matters. Without these files, an AI system has to guess what your business does by scraping your homepage HTML. With them, it gets structured, verified data it can cite with confidence.

Quality Is What Separates Winners from Wasters

Having the files is only half the story. The Q1 2026 AI Discovery File Adoption Research validated every file against its published specification. The result: 56% of AI Discovery Files on top websites are invalid.

What counts as invalid? Files that don't follow the spec. Auto-generated URL dumps with no real business context. HTML error pages disguised as discovery files (soft 404s). Placeholder content that tells an AI system nothing useful. If you're not sure whether your own files pass, our guide to validating AI discovery files and testing what AI actually sees walks through the process step by step.

Two earlier studies by OtterlyAI and SE Ranking tested llms.txt and found zero measurable impact on AI visibility. But as our analysis of those studies pointed out, neither one checked whether the files actually worked. They counted existence and stopped there. When most of the files being counted are broken, it is no wonder the studies measured no effect.

This research took a different approach. Every file was validated and classified as Complete, Minimal, Invalid, or Not Found. The crawler detects soft 404s (servers returning HTTP 200 with an error page rather than the actual file). Other studies count those as "present." This one doesn't. The full validation methodology is published openly for anyone to scrutinise or reproduce.
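The four-way classification and the soft-404 check can be sketched roughly as follows. This is not the study's crawler: the function name `classify_response`, the 200-character threshold, and the HTML heuristics are assumptions chosen to illustrate the idea. The published methodology is the reference for how files were actually judged.

```python
def classify_response(status, content_type, body, min_chars=200):
    """Classify one fetched discovery file as Complete, Minimal, Invalid, or Not Found.

    The threshold and heuristics are illustrative assumptions,
    not the published validation rules.
    """
    if status in (404, 410):
        return "Not Found"
    if status != 200:
        return "Invalid"

    text = body.strip()
    head = text[:500].lower()

    # Soft 404: the server answers 200 but returns an HTML page
    # (often an error template) instead of the requested file.
    looks_like_html = (
        "text/html" in content_type.lower()
        or head.startswith(("<!doctype html", "<html"))
    )
    if looks_like_html or not text:
        return "Invalid"

    # Very short files exist but carry little usable business context.
    if len(text) < min_chars:
        return "Minimal"
    return "Complete"


print(classify_response(200, "text/html", "<html><body>Page not found</body></html>"))
```

A study that only checks for an HTTP 200 status would count that last response as "present". Inspecting the body is what lets a validator reject it.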

"Today websites are not just used to provide information to people, but they are also used to provide information to large language models."

Jeremy Howard, Answer.AI

I remember reading this when Jeremy first published the llms.txt specification in 2024. It felt obvious at the time, the way good ideas always do once someone actually says them out loud. But working with UK businesses since then has shown me just how few have acted on it. The research numbers confirm what we were seeing anecdotally: businesses know AI matters, but they haven't changed their websites to reflect that. Not yet.

He was right about the need. The data shows most businesses attempting to meet it are failing. That's your opportunity to do it properly.

[Figure: WordPress plugin settings interface with toggle switches. The AI Discovery Files WordPress plugin generates structured AI-readable files from your existing site content.]

A Free WordPress Plugin to Create Your Files

For the 43% of the web that runs on WordPress, we've built a plugin that handles the whole process.

It's called AI Discovery Files, and it's free on the WordPress plugin repository. The plugin reads your site's existing content (pages, posts, settings, business information) and generates properly formatted AI Discovery Files automatically. No manual file editing. No guessing at JSON structures. It follows the specifications so you don't have to memorise them.

It joins our existing WordPress plugins: 365i Queue Optimizer for background task processing, 365i Performance Optimizer for speed and Core Web Vitals, and 365i Environment Indicator to stop you editing the wrong environment. You can find them all at wordpress.org/plugins/search/365i/.

Free WordPress Plugin

Generate AI Discovery Files from your dashboard

Using WordPress? Install the plugin and create all 10 files in minutes. No coding, no configuration files to edit manually.

Not on WordPress? The specifications are open and free to implement on any platform. If you're on our managed WordPress hosting, we can help you get set up.

The Early Mover Window Is Closing

AI Discovery Files sit at 6.5% adoption among the world's top websites. That already outpaces humans.txt (2.5%, proposed in 2011) and is approaching security.txt (12.7%, RFC published 2022). For a standard published in 2024, that growth rate is fast. We've mapped this trajectory against robots.txt (1994) and sitemaps (2005) in detail.

But here's the number that should focus your attention: only 22 sites out of 1,460 scored AI-Ready on the research's five-tier scale. Shopify, Stripe, Opera, Reed.co.uk, ScotRail, English Heritage, Mailchimp, and SourceForge are among them. Zero sites reached the highest tier. The peak is empty.

Getting your business to AI-Ready status puts you ahead of 98.5% of the top domains on the internet. Not a marginal edge. A structural one. With ChatGPT now serving ads from the first message, organic slots in AI answers carry real commercial value.
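The arithmetic behind those percentages is simple to check. The short Python sketch below uses only the figures quoted in this article. The count of sites with files is derived from the rounded 6.5% figure, so treat it as an approximation and not as a number reported by the research.

```python
TOTAL_SITES = 1460      # sites in the study
AI_READY_SITES = 22     # sites that scored AI-Ready
ADOPTION_RATE = 0.065   # share of sites with any AI Discovery Files

ai_ready_share = AI_READY_SITES / TOTAL_SITES * 100
ahead_of_share = 100 - ai_ready_share
# Derived from a rounded percentage, so this is approximate.
approx_sites_with_files = round(ADOPTION_RATE * TOTAL_SITES)

print(f"AI-Ready share: {ai_ready_share:.1f}%")                      # 1.5%
print(f"Ahead of: {ahead_of_share:.1f}%")                            # 98.5%
print(f"Sites with files (approx.): {approx_sites_with_files}")      # 95
```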

[Figure: AI Discovery File adoption rate chart showing 6.5 percent of top websites with files, 93.5 percent without. AI Discovery Files are at the early growth stage of the adoption curve. The steep climb lies ahead, and early movers capture the advantage.]

That advantage won't last forever. Yoast, Wix, and Cloudflare have already built native AI Discovery File support into their platforms. As more CMS tools add it, adoption will accelerate. The window for early movers is measured in months, not years.

"Structured data is one of the most powerful things you can add to your website. It's a direct line to search engines and AI."

John Mueller, Search Advocate, Google Search Central

John's been saying variations of this for years, and it keeps becoming more relevant. When we first started implementing structured data for clients back in 2015 or so, most businesses didn't see the point. Now structured data is table stakes for search. AI Discovery Files are the next step in that same direction: giving machines the structured information they need to represent you accurately. The businesses that move first on each wave of structured data always benefit most.

Check Where You Stand Right Now

Three things you can do today, all free:

  1. Run the checker. The AI Visibility Checker scans your site for all ten AI Discovery File types, validates each one, and shows a live ChatGPT snapshot of how AI describes your business right now. Takes under a minute.
  2. Read the specifications. The full AI Discovery File specifications are published openly. Start with llms.txt and identity.json. These two files give AI systems the most useful structured data about your business.
  3. Get listed. Submit your site to the AI Discovery Files Directory for a verified listing with a dofollow backlink and a badge you can display on your site.
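If you want to see what a scan involves before running the checker, the first step is simply working out which URLs to request. The Python sketch below builds candidate URLs for the five file names mentioned in this article. It assumes every file is served from the site root, which is an assumption: confirm each file's expected location in the specifications. The other five file types are not named here, so the sketch leaves them out instead of guessing. The function name `candidate_urls` is hypothetical.

```python
from urllib.parse import urljoin

# The five file names mentioned in this article. The remaining five types
# are not listed here, so they are left out rather than guessed.
NAMED_FILES = ["llms.txt", "identity.json", "brand.txt", "faq-ai.txt", "ai.txt"]


def candidate_urls(site, filenames=NAMED_FILES):
    """Return root-level URLs to probe for each file (root placement is an assumption)."""
    base = site if "://" in site else "https://" + site
    # A leading slash makes urljoin resolve against the site root,
    # even when the caller passes a deeper page URL.
    return [urljoin(base, "/" + name) for name in filenames]


for url in candidate_urls("example.com"):
    print(url)
```

Each URL can then be fetched and validated. The checker does this for all ten file types and validates the results against the specification.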

The full research breakdown with all the data tables is on our sister site. And the raw research data is available under CC BY 4.0 at ai-visibility.org.uk/research/.

Your competitors haven't started. 93.5% of the top websites on the internet prove it. The specifications are open, the free WordPress plugin is ready, and the early mover window is still open. Walk through it.

Frequently Asked Questions

What are AI Discovery Files?

AI Discovery Files are structured text and JSON files placed on your website that tell AI systems who your business is, what you offer, how to represent your brand, and what questions your customers commonly ask. There are ten types, including llms.txt, identity.json, brand.txt, and faq-ai.txt. The full specifications are published at ai-visibility.org.uk/specifications/.

Why do AI Discovery Files matter for my business?

AI assistants like ChatGPT, Gemini, Claude, and Perplexity increasingly recommend businesses to users. Without AI Discovery Files, these systems guess about your services from raw HTML. With them, AI gets structured data it can cite confidently. SOCi's 2026 research found that ChatGPT recommends just 1.2% of local business locations, so the ones AI does recommend need clear, machine-readable identities.

How many websites currently have AI Discovery Files?

6.5% of the world's top 1,460 websites. That means 93.5% have nothing. Of the 6.5% that do have files, 56% fail validation. Only 22 sites (1.5%) qualified as AI-Ready, and zero achieved the highest tier.

What does the AI Discovery Files WordPress plugin do?

The plugin reads your WordPress site's existing content (pages, posts, settings, business information) and then generates properly formatted AI Discovery Files automatically. It creates llms.txt, identity.json, ai.txt, brand.txt, and other file types following the published specifications. No manual file editing or JSON knowledge required.

Where can I get the AI Discovery Files plugin?

The plugin is free on the WordPress plugin repository at wordpress.org/plugins/ai-discovery-files/. Install it directly from your WordPress dashboard by searching for "AI Discovery Files", or download it from the plugin homepage.

How do I check if my website is visible to AI?

Use the free AI Visibility Checker at 365i.co.uk. It scans your website for all ten AI Discovery File types, validates each one against the specification, checks your robots.txt AI policy, and runs a live ChatGPT snapshot showing what AI says about your business. Takes under a minute.
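The robots.txt part of that check can be approximated with Python's standard library. The sketch below parses a robots.txt body and reports whether a few AI crawler user agents may fetch a path. The agent list is an illustrative assumption and not the checker's actual list, and the sample rules are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Illustrative crawler tokens; extend or replace to match the agents you care about.
AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def ai_policy(robots_txt, path="/", agents=AI_AGENTS):
    """Map each AI user agent to True (allowed) or False (blocked) for the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


# Made-up robots.txt that blocks one AI crawler and allows everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

for agent, allowed in ai_policy(sample).items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

A site that blocks AI crawlers in robots.txt while publishing AI Discovery Files is sending mixed signals, which is why the two are worth checking together.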

Which AI systems benefit from AI Discovery Files?

Any AI system that visits your website and needs to understand your business. That includes ChatGPT, Gemini, Claude, Perplexity, Microsoft Copilot, and emerging AI agents. Google is already indexing between 30,000 and 60,000 llms.txt files globally, which suggests search engines are at least crawling the format.

Can I create AI Discovery Files without the WordPress plugin?

Yes. The specifications are open and free to implement on any platform. You can create the files manually by following the documentation at ai-visibility.org.uk/specifications/. Start with llms.txt and identity.json as they provide the most value. The plugin simply automates this for WordPress sites.
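For a sense of what creating a file by hand involves, here is a minimal Python sketch that renders an llms.txt following the basic layout of the llms.txt proposal: a title heading, a one-line blockquote summary, then sections of links with optional notes. The function name `render_llms_txt`, the example business, and its URLs are hypothetical, and the AI Discovery File specifications may require more than this skeleton.

```python
def render_llms_txt(name, summary, sections):
    """Render a basic llms.txt: '# name', '> summary', then '## section' link lists.

    `sections` maps a heading to a list of (title, url, note) tuples;
    pass an empty note to omit it.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            entry = f"- [{title}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    # Exactly one trailing newline keeps the file tidy for diffs and parsers.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical example business and URLs, for illustration only.
text = render_llms_txt(
    "Example Plumbing Ltd",
    "Family-run plumbing and heating firm serving the Example area.",
    {
        "Services": [
            ("Boiler repair", "https://example.com/boiler-repair", "Fixed-price repairs"),
            ("Emergency call-outs", "https://example.com/emergency", ""),
        ]
    },
)
print(text)
```

Write clear facts, not marketing copy, into the summary and notes. A file with the right shape but no real business context is still one that validation can reject.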

Make Your Business Visible to AI

Our managed WordPress hosting includes expert support to help you implement AI Discovery Files and stay ahead of the competition. 7-day support, UK data centres, and a team that understands where the web is heading.
