How to Rank in AI Search: A Technical Guide for ChatGPT, Perplexity, and Gemini
To rank in AI search, you need your key answers in the first third of your page, written in plain declarative sentences that an AI can lift and cite verbatim. Passage-level relevance, not domain authority, decides whether you get cited.
By Samer Shaker

Why AI Search Is a Different Game
AI-referred sessions grew 527% year-over-year in the first five months of 2025. Most sites got zero of that traffic because AI systems can't find a quotable answer anywhere on their pages.
This is not SEO with a new name. The mechanics are different from the ground up.
AI uses RAG: passage-level retrieval, not page-level ranking
Google's ranking system scores whole pages. AI systems use RAG (Retrieval-Augmented Generation). Before generating a response, the model retrieves specific passages from an external index, scores each one for semantic relevance, and pulls the strongest match into its answer.
Your domain authority doesn't travel with the passage. Each section of your page competes on its own. A well-structured paragraph on page 4 of Google results can still get cited if it answers the query more directly than anything on page 1.
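To make passage-level scoring concrete, here is a toy sketch: a bag-of-words cosine score over individual passages, where page identity plays no role. Real systems use dense embeddings, and the passages, query, and URLs below are invented for illustration.

```python
import math
import re
from collections import Counter

def bow(text: str) -> Counter:
    """Bag-of-words vector from lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Each passage competes on its own -- the domain it lives on is irrelevant here.
passages = [
    ("bigbrand.com/page1", "Our award-winning platform empowers synergy."),
    ("smallsite.com/page4", "To rank in AI search, put a direct answer in the first third of the page."),
]

query = "how to rank in AI search"
q = bow(query)
best = max(passages, key=lambda p: cosine(q, bow(p[1])))
print(best[0])  # the passage that answers the query wins, not the bigger brand
```

The point of the sketch: the second passage gets retrieved because its words overlap the query, even though it sits on a smaller site.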
65% of Google SERPs now show AI Overviews
As of March 2026, 65.07% of Google results pages include an AI Overview. That box answers the query before the user sees your title. If your content isn't in it, most users never scroll to your link.
AI-referred sessions grew 527% YoY - and your analytics probably missed it
That 527% growth is almost certainly undercounted. In one documented case, a site received 490 ChatGPT citation references but GA4 recorded only 3 sessions. Standard referral tracking doesn't capture most AI traffic. The sessions are real. Your dashboard just doesn't show them.
ChatGPT, Perplexity, and Gemini Pull From Different Sources

Those missing sessions aren't just a tracking problem. They're a signal that three different AI engines are pulling from three different places. If you optimize for one, you may be invisible to the other two.
ChatGPT: Bing index, 31% of prompts trigger web search, 2 sub-queries per prompt
ChatGPT doesn't search the web on every prompt. It triggers a live web search on about 31% of prompts, and when it does, it runs an average of 2 sub-queries, each around 5-6 words long. Those sub-queries pull from Bing's index, not Google's. If your site has weak Bing presence, ChatGPT's live results won't find you, regardless of where you rank on Google.
Perplexity: live-crawls, over-indexes Reddit and third-party listicles
Perplexity works differently. It live-crawls at query time rather than pulling from a pre-built index. An analysis of 1,400+ Perplexity citations found it over-indexes Reddit threads, third-party listicles, and review platforms relative to brand websites. Your homepage doesn't rank here. Someone else's Reddit comment about your category might. This is worth knowing before you spend more time polishing your own site's copy.
Gemini: Google-indexed entities and E-E-A-T signals - organic rank still matters here
Gemini stays closest to traditional SEO logic. It prioritizes Google-indexed entities and E-E-A-T signals (experience, expertise, authoritativeness, and trustworthiness). Organic ranking still moves the needle for Gemini in a way it doesn't for the other two. But even here, the picture is more complicated than “rank #1, get cited.” Google AI Mode pulls 50% of its citations from content that doesn't appear on page one of traditional search results. Structure and entity clarity matter as much as position.
The practical implication: brands are 6.5x more likely to get cited by AI platforms through third-party sources than through their own websites. That gap is where most brands are leaving coverage on the table.
Robots.txt: Let the AI Crawlers In
That gap closes faster than most people expect. One of the quickest fixes lives in a file you already have.
If your robots.txt blocks AI crawlers, those platforms cannot retrieve your content. Content quality does not matter. Citation potential does not matter. You simply do not exist to them.
Which bots to allow: exact user-agent strings
Copy this block and add it to your robots.txt:
User-agent: GPTBot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: Claude-SearchBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Google-Extended
Allow: /

These seven entries cover the main platforms that determine how to rank in AI search today: ChatGPT, Claude, Perplexity, and Google's AI features. Each has its own user-agent string, and you need all of them listed separately. A wildcard Allow: / does not reliably reach every bot.
What a blocking robots.txt costs you
LLM bots now crawl sites 3.6x more than Googlebot. GPTBot alone went from appearing on 2.9% of sites in 2024 to 4.5% in 2025, a 55% jump in one year. That traffic is only growing.
79% of major publishers currently block AI training bots. Many do it as a reflex carry-over from blocking scrapers. The cost is real: blocked crawlers cannot pull your pages for retrieval-augmented generation, which is how most AI platforms answer questions in real time. Blocking training data is a separate call. Blocking RAG retrieval hurts your visibility directly.
Check your robots.txt today. If any of those seven user-agents hit a Disallow: /, fix it before anything else.
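One way to spot-check this is with Python's standard-library robots.txt parser. The sample robots.txt below is illustrative (it deliberately blocks one crawler); swap in the contents of your own file.

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot",
           "Claude-SearchBot", "PerplexityBot", "Google-Extended"]

# Example robots.txt that blocks one AI crawler -- replace with your own file's contents.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

for bot in AI_BOTS:
    status = "OK" if parser.can_fetch(bot, "/") else "BLOCKED"
    print(f"{bot}: {status}")
```

Run this against your real file and any line reading BLOCKED is a gap to close first.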
Schema Markup: What Actually Moves the Needle for AI Citations
Pages with well-organized headings are 2.8x more likely to earn citations in AI search results. That single stat should tell you where to focus your schema effort: not on adding every type Google supports, but on making your content structure legible to machines.
FAQPage and Article schema: the two types with direct citation evidence
Most schema advice is generic. Here, two types have actual citation evidence behind them.
FAQPage schema has seen steady growth in adoption specifically because AI systems heavily cite FAQ-formatted content. When your Q&A pairs are marked up, the crawler can pull a precise answer without parsing your prose.
Article schema signals publication date, author, and content type. AI systems use that metadata to assess freshness and source credibility, two factors that influence whether your page gets cited or skipped.
Skip the rest for now. LocalBusiness, BreadcrumbList, and HowTo schema serve other purposes. They do not have the same citation evidence behind them.
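For reference, a minimal Article markup along the lines described above. Every value here is a placeholder; fill in your page's real headline, author, and dates.

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "How to Rank in AI Search",
  "author": { "@type": "Person", "name": "Samer Shaker" },
  "datePublished": "2025-06-01",
  "dateModified": "2025-06-15"
}
```

The datePublished and dateModified fields are the freshness signals; keep dateModified current when you update the page.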
Pages with organized headings are cited 2.8x more often
This stat points at structure, not just schema. Pages with 120–180 words between headings receive 70% more AI citations than pages with irregular spacing. AI systems evaluate content at the passage level. Each section gets scored on its own. Long walls of text between headings force the crawler to guess where one idea ends and another begins.
Keep your H2 and H3 sections tight. One idea per heading block.
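A quick way to audit your own spacing is to count words between headings. This sketch assumes markdown-style H2/H3 markers; the sample content is made up and deliberately too short, so both sections get flagged.

```python
import re

def words_between_headings(markdown: str) -> list[tuple[str, int]]:
    """Return (heading, word count of the following section) pairs."""
    sections = []
    heading = None
    buf = []
    for line in markdown.splitlines():
        if re.match(r"#{2,3} ", line):  # H2/H3 markers
            if heading is not None:
                sections.append((heading, len(" ".join(buf).split())))
            heading = line.lstrip("# ")
            buf = []
        else:
            buf.append(line)
    if heading is not None:
        sections.append((heading, len(" ".join(buf).split())))
    return sections

sample = """## First idea
one two three four five

## Second idea
alpha beta gamma
"""

for title, count in words_between_headings(sample):
    flag = "ok" if 120 <= count <= 180 else "check"
    print(f"{title}: {count} words ({flag})")
```

Anything outside the 120–180 band is a candidate for splitting or merging sections.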
How to implement: minimal valid markup
Drop this in your <head> for a FAQ section:
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How do you rank in AI search?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Use FAQPage and Article schema, keep 120–180 words between headings, and make sure your content is crawlable by AI user-agents."
      }
    }
  ]
}
</script>

That is the whole thing. Validate it with Google's Rich Results Test before publishing. If it passes there, it is readable by AI crawlers too.
Measure First: Setting a Citation Tracking Baseline

Once your schema and content are in place, you need to know if AI systems are actually citing you. This is where most teams hit a wall, not because the citations aren't happening, but because their tools can't see them.
Why standard analytics miss AI citations: the 490 ChatGPT references vs 3 GA4 sessions gap
One site recorded 490 ChatGPT references over a measurement period. GA4 logged 3 sessions from that same window. The other 487 interactions left no trace in standard analytics. That gap isn't a bug. It's structural. AI assistants don't pass referrer strings the way browsers do. They pull your content, synthesize an answer, and the user never clicks through. No click means no session. No session means your dashboard shows nothing.
On top of that, 61.7% of LLM citations are ghost citations. Your URL appears as a source link, but your brand name never shows up in the actual response text. You're getting retrieved without getting named.
What to track: referrer strings, UTM gaps, direct dark traffic
Start with what you can catch. Check your referrer logs for strings like chatgpt.com, perplexity.ai, and claude.ai. Watch your direct traffic for spikes that don't correlate with campaigns. That's dark traffic, and a chunk of it is AI-referred. Tag your landing pages with UTMs specific to AI channels so you can separate them from organic. None of this captures everything, but it gives you a floor.
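That referrer check can be a few lines of code. The sketch below scans access-log lines for the AI referrer domains named above; the log lines themselves are illustrative, and in practice you would read them from your server logs.

```python
from collections import Counter

AI_REFERRERS = ["chatgpt.com", "perplexity.ai", "claude.ai"]

# Example access-log lines; in practice, read these from your server logs.
log_lines = [
    '1.2.3.4 - - [date] "GET /post HTTP/1.1" 200 "https://chatgpt.com/" "Mozilla"',
    '5.6.7.8 - - [date] "GET /post HTTP/1.1" 200 "https://www.google.com/" "Mozilla"',
    '9.9.9.9 - - [date] "GET /post HTTP/1.1" 200 "https://www.perplexity.ai/search" "Mozilla"',
]

hits = Counter()
for line in log_lines:
    for ref in AI_REFERRERS:
        if ref in line:
            hits[ref] += 1

for ref, n in hits.items():
    print(f"{ref}: {n} AI-referred hit(s)")
```

This is a floor, not a total: it only counts sessions where the assistant passed a referrer at all.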
The four KPIs worth tracking consistently: AI Visibility Score, Share of Voice, Average Position within the AI response, and Sentiment. Know your baseline on all four before you change anything.
Running a GEO audit to check schema coverage, llms.txt, and crawler access
To get a structured read on where you stand, run a GEO audit on your URL. The AI Visibility Kit checks schema coverage, your llms.txt status, and whether AI crawlers can actually reach your content. It outputs a GEO score and flags the specific gaps: missing schema types, blocked crawlers, absent llms.txt. That list becomes your next sprint. Fix the flags, re-run, and watch the score move. That's the loop for how to rank in AI search: measure, fix what the audit surfaces, repeat.
llms.txt: Honest Verdict First, Setup Second
The verdict: no major LLM officially uses it, zero crawler visits recorded
As of July 2025, only 951 domains had published an llms.txt file. Server log analysis showed zero visits from GPTBot, PerplexityBot, or ClaudeBot to those files. OpenAI, Google, and Anthropic have not officially adopted the standard. Google's John Mueller compared it to the old keywords meta tag and said “question everything.” Ahrefs put it plainly: “There's no evidence that llms.txt improves AI retrieval, boosts traffic, or enhances model accuracy.”
That's the honest read.
Why you should set it up anyway: defensive positioning
It takes 10 minutes and costs nothing. If major LLMs do adopt the standard, you're already positioned. If they don't, you've lost nothing. Credibility comes from not overselling it, so don't. Just have it.
What to put in it
A minimal working llms.txt lives at your domain root and points to the pages that matter most:
# YourSite llms.txt
> A brief one-line description of what your site covers.
## Key pages
- [Homepage](https://yoursite.com/)
- [About](https://yoursite.com/about/)
- [Best post title](https://yoursite.com/your-best-post/)

Plain text. No special syntax required. Link to your highest-value pages and keep the description factual.
Frequently Asked Questions
Does ranking on Google page 1 guarantee AI citations?
No. Google AI Mode pulls 50% of its citations from content that doesn't rank on page 1. Structure and semantic relevance matter more than your organic position.
How long does it take to appear in AI search results after making changes?
There's no fixed timeline, and no one can give you one honestly. That said, 85% of AI Overview citations come from content published within the last two years, so fresh and updated pages move faster than stale ones.
Can a small site compete with big brands in AI search?
Yes, through third-party signals. Brands are 6.5x more likely to get cited by AI platforms via third-party sources than through their own sites. Build your presence on Trustpilot, G2, Capterra, and niche directories. Review platform profiles alone increase AI selection likelihood by 3x. You don't need a massive domain. You need the right mentions in the right places.
What does the iMakeMVPs GEO score measure?
It audits four technical layers: schema coverage, llms.txt status, robots.txt crawler access, and citation signals. Each gap gets a specific flag so you know exactly what to fix rather than guessing.
Want us to run this for your site?
We audit your GEO coverage, fix the gaps, and set up the tracking. Book a free strategy call to see where you stand.
Book a Free Call →