Keyword monitoring

How AY Rank Finds Keywords That Get Cited by AI Search

We treat keyword research as two parallel inputs: classic Google demand (Ahrefs + SERP) and AI-search demand (real prompts pulled from ChatGPT, Perplexity, and Gemini). The output is one ranked backlog scored by citation readiness, not by search volume alone.

Cadence: Weekly sweep + ad-hoc
Output: Ranked keyword + prompt backlog
Category: Keyword monitoring

The workflow: 6 steps · 10 components

Diagram nodes: Brand site, Competitors, GSC + GA4, Ahrefs DB, Seed harvest, AI prompt extraction, Intent + readiness scoring, Backlog store, Slack digest, Striking-distance alerts

The problem

Most agencies still build keyword lists from Google volume only. That misses 30 to 60% of the queries that drive AI-cited recommendations (long prompts, comparison questions, jurisdiction-specific asks) because they never crossed a 50 MSV threshold in Ahrefs. We rebuild the funnel from both sides.

The workflow.

7 steps · runs as a weekly sweep + ad-hoc

01
Seed harvest from the brand
Pull every product, service, persona, pain point, and industry from the client site. We extract entities from page H1s, schema.org/Service blocks, and existing FAQ schema, not from a brainstorm doc that goes stale in two weeks.
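As a rough illustration of the seed harvest, the sketch below pulls H1 text, schema.org Service names, and FAQ questions out of one page's HTML using only the Python standard library. The names `SeedParser` and `harvest_seeds` are hypothetical, and the sketch assumes `@type` is a plain string and that FAQ questions sit under `mainEntity`; real pages vary.

```python
import json
from html.parser import HTMLParser


class SeedParser(HTMLParser):
    """Collects H1 text and JSON-LD blocks from one HTML page."""

    def __init__(self):
        super().__init__()
        self._capture = None  # "h1", "jsonld", or None
        self._buf = []
        self.h1s = []
        self.jsonld = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._capture, self._buf = "h1", []
        elif tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._capture, self._buf = "jsonld", []

    def handle_data(self, data):
        if self._capture:
            self._buf.append(data)

    def handle_endtag(self, tag):
        text = "".join(self._buf).strip()
        if tag == "h1" and self._capture == "h1":
            self.h1s.append(text)
            self._capture = None
        elif tag == "script" and self._capture == "jsonld":
            try:
                self.jsonld.append(json.loads(text))
            except json.JSONDecodeError:
                pass  # skip malformed JSON-LD instead of failing the harvest
            self._capture = None


def harvest_seeds(html):
    """Return de-duplicated seed phrases in the order they appear."""
    parser = SeedParser()
    parser.feed(html)
    parser.close()
    seeds = list(parser.h1s)
    for block in parser.jsonld:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if not isinstance(item, dict):
                continue
            if item.get("@type") == "Service" and item.get("name"):
                seeds.append(item["name"])
            elif item.get("@type") == "FAQPage":
                questions = item.get("mainEntity", [])
                if isinstance(questions, dict):
                    questions = [questions]
                for question in questions:
                    if isinstance(question, dict) and question.get("name"):
                        seeds.append(question["name"])
    return list(dict.fromkeys(s for s in seeds if s))
```

Running this over every crawled page and merging the results would give the seed list that the next step expands.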
02
Ahrefs expansion + competitor gap
For each seed, expand via Ahrefs Keywords Explorer (Matching terms, Related, Questions) and pull the top 5 competitors' organic and paid keywords. Filter by intent + KD vs the client's DR. Surface striking-distance (positions 4–15) as a separate stream.
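The split into a striking-distance stream and a gap stream can be sketched as below. This is a minimal illustration, not the production filter: `KeywordRow`, `split_streams`, and the rule "keep a gap keyword only when KD does not exceed the client's DR plus a margin" are assumptions made for the example. Only the 4–15 position window comes from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeywordRow:
    keyword: str
    position: Optional[int]  # client's current organic position, None if unranked
    kd: int                  # keyword difficulty, 0-100


def split_streams(rows, client_dr, kd_margin=0):
    """Separate striking-distance keywords from winnable gap keywords."""
    striking, gaps = [], []
    for row in rows:
        if row.position is not None and row.position < 4:
            continue  # already in the top 3, nothing to chase
        if row.position is not None and row.position <= 15:
            striking.append(row)  # positions 4-15
        elif row.kd <= client_dr + kd_margin:
            gaps.append(row)      # unranked or deep, but difficulty is in reach
    return striking, gaps
```

Keeping the two streams apart lets striking-distance keywords feed alerts while gap keywords feed new content.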
03
Search Console + GA4 first-party signals
Pull the client's own Google Search Console queries (impressions, CTR, position deltas) and GA4 landing-page conversion data. Queries the brand already ranks for but doesn't convert on, or impressions with no clicks, become the highest-priority backlog candidates because the demand is already proven.
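One way to pick out these first-party candidates is sketched below. It assumes the Search Console and GA4 exports have already been joined into one row per query (for example via the landing page), with hypothetical keys `impressions`, `clicks`, and `conversions`; the `min_impressions` cut-off is likewise an illustrative default.

```python
def first_party_candidates(rows, min_impressions=100):
    """Flag queries with proven demand that the site fails to capture."""
    candidates = []
    for row in rows:
        if row["impressions"] < min_impressions:
            continue  # too little demand to call it proven
        if row["clicks"] == 0:
            reason = "impressions_no_clicks"
        elif row["conversions"] == 0:
            reason = "clicks_no_conversions"
        else:
            continue  # already converting, not a backlog candidate
        candidates.append({**row, "reason": reason})
    # Highest-demand misses first.
    candidates.sort(key=lambda r: r["impressions"], reverse=True)
    return candidates
```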
04
AI prompt extraction
For the top 20 seeds, run scripted prompts against ChatGPT, Perplexity, Gemini, and Google AI Overviews. Log every cited source and every follow-up question the model suggests. Those follow-ups become the next round of seeds. Repeat until expansion plateaus.
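The "repeat until expansion plateaus" loop could look roughly like this. The `run_prompt` callable is a hypothetical stand-in for whatever actually queries an engine. It is assumed to return a pair of (cited sources, suggested follow-up questions). The `max_rounds` and `min_new` stopping parameters are also assumptions.

```python
def expand_seeds(seeds, run_prompt, max_rounds=5, min_new=1):
    """Run prompts round by round, feeding unseen follow-ups back in as seeds.

    Returns a dict mapping each prompt that was run to its cited sources.
    """
    seen = {s.strip().lower() for s in seeds}
    frontier = [s.strip() for s in seeds]
    citations = {}
    for _ in range(max_rounds):
        next_frontier = []
        for prompt in frontier:
            sources, follow_ups = run_prompt(prompt)
            citations[prompt] = list(sources)
            for follow_up in follow_ups:
                key = follow_up.strip().lower()
                if key and key not in seen:
                    seen.add(key)
                    next_frontier.append(follow_up.strip())
        if len(next_frontier) < min_new:
            break  # expansion has plateaued
        frontier = next_frontier
    return citations
```

Passing the engine call in as a function keeps the loop testable with canned responses and independent of how each engine is actually driven.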
05
Intent classification
Every keyword and prompt is auto-tagged as informational, commercial, navigational, or transactional by an Anthropic model with the brand's ICP as context. Mismatched intent (e.g. a transactional prompt routed to a blog post) is flagged before the content brief is written.
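The mismatch check that follows classification can be illustrated without the model call itself. In this sketch the intent label is assumed to be already assigned, and the `ALLOWED_PAGE_TYPES` mapping and the `page_type` field are hypothetical. An actual mapping would be set per client.

```python
# Hypothetical mapping from intent label to page types that can serve it.
ALLOWED_PAGE_TYPES = {
    "informational": {"blog", "guide", "faq"},
    "commercial": {"comparison", "category", "guide"},
    "navigational": {"home", "product", "contact"},
    "transactional": {"product", "pricing", "landing"},
}


def flag_mismatches(items):
    """Return items whose target page type does not fit their intent."""
    flagged = []
    for item in items:
        allowed = ALLOWED_PAGE_TYPES.get(item["intent"])
        if allowed is None:
            raise ValueError(f"unknown intent label: {item['intent']!r}")
        if item["page_type"] not in allowed:
            flagged.append(item)
    return flagged
```

With this mapping, a transactional prompt pointed at a blog post would be returned by `flag_mismatches`, while the same prompt pointed at a pricing page would pass.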
06
Citation readiness score
A 0–100 score per query: combines AI citation density (how many AI engines surface a result at all), source-share entropy (is one source dominating, or is it open), commercial intent, and the client's current presence. We ship the highest-score gaps first.
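A minimal sketch of how four such inputs might combine into a 0–100 score is shown below. The weights, the use of normalized Shannon entropy for source share, and the choice to reward low current presence (so that gaps score higher) are all assumptions for illustration. The text above does not specify the actual formula.

```python
import math
from collections import Counter


def source_share_entropy(cited_domains):
    """Normalized Shannon entropy of citation share: 0 = one source dominates, 1 = evenly contested."""
    counts = Counter(cited_domains)
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(counts))


def readiness_score(engines_citing, engines_total, cited_domains,
                    commercial_intent, client_presence,
                    weights=(0.3, 0.2, 0.3, 0.2)):
    """Combine four inputs, each scaled to 0-1, into a 0-100 score.

    commercial_intent and client_presence are assumed to be pre-scaled to 0-1.
    """
    density = engines_citing / engines_total
    openness = source_share_entropy(cited_domains)
    gap = 1.0 - client_presence  # assumption: less presence means a bigger gap
    w_density, w_open, w_intent, w_gap = weights
    raw = (w_density * density + w_open * openness
           + w_intent * commercial_intent + w_gap * gap)
    return round(100 * raw / sum(weights), 1)
```

Dividing by the sum of the weights keeps the result on the 0–100 scale even if the weights are later retuned and no longer add up to 1.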
07
Ship to backlog
Top 20 per sprint land in Supabase tagged with target page, target schema type, and intent. The blog pipeline and programmatic SEO workflows pull from this backlog, so keyword work never becomes a static spreadsheet that nobody opens.
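Selecting and tagging the sprint batch might be sketched as follows. The field names and the `status` value are hypothetical, and the step that writes the rows to the store is deliberately left out. The function only prepares the records.

```python
def build_sprint_batch(scored_queries, sprint_size=20):
    """Pick the highest-scoring queries and shape them as backlog rows."""
    ranked = sorted(scored_queries, key=lambda q: q["score"], reverse=True)
    return [
        {
            "query": q["query"],
            "score": q["score"],
            "intent": q["intent"],
            "target_page": q["target_page"],
            "target_schema_type": q["target_schema_type"],
            "status": "queued",  # hypothetical field for downstream pipelines
        }
        for q in ranked[:sprint_size]
    ]
```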

The stack.

Ahrefs API: Volume, KD, SERP, gap, striking-distance
SEMrush: Competitor keyword + paid gap data
Google Search Console: Impressions, CTR, position deltas per query
Google Analytics 4: Landing-page conversion data per query
Google Trends: Seasonality + emerging-query detection
Playwright: ChatGPT / Perplexity / Gemini prompt runs
Brave Search API: Cited source URLs at scale
Anthropic Claude: Intent classification + entity extraction
Supabase: Keyword backlog + citation history store
Slack: Weekly digest + alerting

A representative cadence.

Schedule: 5 weekly runs

Mon (weekly): Ahrefs sweep + competitor gap pull
Tue (weekly): AI prompt runs across 4 engines, dedupe + tag
Wed (weekly): Intent classification + readiness scoring
Thu (weekly): Backlog ranking, brief generation, hand-off
Fri (weekly): Slack digest, striking-distance alerts

What you get.

Ranked keyword + prompt backlog in Supabase, refreshed weekly
Striking-distance keyword alerts (positions 4–15) in Slack
Per-query citation readiness score with source-share breakdown
Intent-tagged content briefs ready for the writing pipeline

Free playbook

Get the keyword research playbook.

The Ahrefs pulls, striking-distance filters, intent classification rules, and outreach-grade backlink scoring we run every week.


Common questions.

Why pull keywords from ChatGPT and Perplexity directly?

Because the queries people type into LLMs do not look like the queries they typed into Google. They are longer, more conversational, more comparison-driven, and frequently below classic search volume thresholds. Pulling real prompt traffic from the models themselves is the only honest input for AI-search content.

How is citation readiness scored?

Four weighted inputs: AI citation density (out of the 4 engines we run, how many cite anyone at all for this query), source-share entropy (is one source dominating or is the slot contested), commercial intent (how close to revenue), and current presence (where the client already ranks or gets cited). The result is a 0–100 score that we sort the backlog by.

How often does the backlog get refreshed?

Weekly for active clients. The cadence matters because AI engines re-index citation sources faster than classic search: a Reddit thread or a competitor blog post can take a slot in days, not months. A weekly refresh keeps the backlog honest.

Do you still use traditional Google search volume?

Yes. It feeds prioritization alongside the AI-search signals, but it is not the only input. A 2,000 MSV query with zero AI citations can rank above or below a 50 MSV query with daily AI Overview placement, depending on commercial intent. The score balances both.

Related workflows.

AI citations: Daily AI citation tracking »
AI citations: Brand mention monitoring across the LLM web »
AI citations: Backlink building for AI search »

Want this workflow running on your brand?

Book a free GEO audit. We'll tell you whether this play moves the needle for your category and what citation share is realistic in 90 days.

Book a free GEO audit»
