Key takeaways
- To track brand mentions in AI answers, build a fixed set of 10 to 20 buyer-stage prompts, run them across ChatGPT, Perplexity, Gemini, Claude and Copilot on a set cadence, and log mention rate, position, sentiment, and whether you were cited or only named.
- AI answers are non-deterministic, so the same prompt can return different brands twice in a row, and most surfaces send no referrer, so your analytics cannot fill the gap. You measure with a panel of prompts and report ranges, not single numbers.
- Perplexity is the easiest platform to track because every answer shows numbered source links. ChatGPT and Gemini name brands far more often than they link, so you have to read the answer text, not just the citations.
- Google AI Overviews are the outlier worth tracking on their own. They appear inside normal search results rather than a chat box, and they cite source links. Search Console gives you only a partial, proxy signal for them, so you track them with manual checks and proxy metrics rather than prompts alone.
- The metric that actually moves client conversations is share of voice against named competitors, plus the “kingmaker” third-party pages AI keeps citing to recommend them.
- Paid trackers (Profound, Otterly, Peec, AthenaHQ, Goodie, SE Ranking) automate the repetitive runs. A point-in-time GEO audit like BlueJar measures where you stand today and hands back a prioritized fix plan. The free manual method below is where everyone should start.
I ran the prompt “best AI visibility tools for agencies” through Perplexity, ChatGPT, Gemini and Claude on the same afternoon last month. Four answers, four different shortlists. One platform named a brand the other three never mentioned. That is the whole problem with trying to track brand mentions in AI search: there is no ranking report, no Search Console, and often no referrer telling you the visit came from ChatGPT at all.
And yet this is where buyers are forming shortlists now. ChatGPT crossed 800 million weekly active users in October 2025, up from 500 million in March, according to OpenAI CEO Sam Altman. When someone asks an assistant “who should I use for X,” the brands that come back become the consideration set before that person visits a single website.
This guide is the practical version. You get a free manual method you can run this week, the handful of metrics worth logging, an honest look at the paid tools (with current pricing and links so you can check it yourself), and what to actually do with the gaps you find. No fluff about the “AI revolution.” Just the system.
What “tracking brand mentions” actually means in AI answers
A brand mention in an AI answer is any time an assistant names your company, product, or domain in a response. That sounds simple until you try to count it, because mentions come in three flavors that carry very different weight.
- Cited: the AI links to a page (usually yours, sometimes a third party) as a source. Perplexity, Google AI Mode, and Google AI Overviews do this constantly. A citation to your own page is the highest-value outcome because it sends a click and signals the model trusts you.
- Recommended: the AI names you as a solution without a link. ChatGPT does this far more than it links. There is no click to measure, but your name just entered someone’s shortlist, which is the point.
- Compared: you show up inside an AI-generated comparison, ranked next to alternatives. These can be flattering or brutal, and both are worth knowing about before a prospect reads them.
The reason people keep asking “is there a quick way to do this” in places like r/aeo and r/SEO_for_AI is that none of the three show up in the analytics they already have. Google Search Console will not tell you that Gemini described your product as “a budget option” last Tuesday. You have to go look.
Why it is genuinely hard (and why one number lies to you)
Before the method, it helps to know what you are fighting, because half the bad advice out there ignores these constraints.
Outputs are non-deterministic. Ask the same question twice and you can get two different brand lists. In commercial categories the variance is large enough that a single run tells you almost nothing. This is why you run a panel of prompts and report a rate, not a yes or no.
There is no native analytics layer. None of these platforms gives brands a dashboard of “you appeared in 4,000 answers this week.” You are reconstructing visibility from the outside by asking questions and reading answers.
Most surfaces send no referrer. Plenty of AI traffic arrives with no UTM and no identifiable source, so server logs and GA4 undercount it badly. Tracking what the AI says is currently more reliable than tracking the clicks it sends.
Answers are personalized. Location, account history and the exact phrasing all shift results. Two people asking the “same” question can see different brands. You reduce this by logging out, using a clean profile, and keeping prompt wording fixed run to run.
The takeaway: a brand-mention program is a measurement discipline, not a single check. The good news is the discipline is simple, and you can start it for free.

The free manual method you can run this week
This is the method I hand to anyone asking “how do I start monitoring brand mentions in ChatGPT” who does not want to buy a tool yet. It costs nothing but an afternoon, and it gives you an unfiltered view of exactly what each assistant says about you. Here is the repeatable version.
- Build a fixed prompt set across funnel stages. Aim for 10 to 20 prompts in three buckets. Brand-direct (“what is [your brand]”, “[your brand] pricing”, “[your brand] reviews”). Category and commercial (“best [category] tools for [use case]”, “top [category] platforms for [audience]”). Problem and comparison (“how do I [solve the problem you solve]”, “alternatives to [main competitor]”). Lock the wording. The prompt set is your measuring stick, so it has to stay the same every run.
- Run every prompt across each platform. ChatGPT, Perplexity, Gemini, Claude and Copilot at minimum. Turn on web access where the platform offers it so you are testing live behavior, not stale training data. For Perplexity, the sources are listed and numbered, so capture them. For ChatGPT and Gemini, read the full answer text, because they name brands they never link to. For your category and problem prompts, also run a plain Google search in an incognito window and record whether an AI Overview appears and whether it names or cites you, because that surface is separate from the Gemini app and behaves differently.
- Log the full result, not just yes or no. For each prompt and platform, record whether you appeared, your position in any list, the mention type (cited, recommended, or compared), the sentiment, which sources were cited, and which competitors showed up alongside you. Paste the actual answer text into a notes column. You will want the receipts later.
- Score the panel. Total your appearances to get a mention rate per platform (“named in 4 of 15 prompts on ChatGPT = 27%”). Count competitor appearances on the same prompts to get share of voice. Note where you are cited versus only recommended. This is the snapshot.
- Set a cadence and re-run. A single audit is one frame of a moving picture. Pick a rhythm you will actually keep: a small high-intent set weekly, the full set monthly, and a targeted re-run two to four weeks after you publish anything new to see if it got picked up. Same prompts, same conditions, every time.
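One way to keep the wording locked between runs is to store the prompt set as data instead of retyping it. The following is a minimal sketch under stated assumptions: the template wording comes from the three buckets above, while `TEMPLATES`, `build_panel`, and the sample values (`ExampleCo`, `Competitor A`) are hypothetical names used only for illustration.

```python
# Hypothetical helper: expands fixed templates into a locked prompt panel.
TEMPLATES = {
    "brand": ["what is {brand}", "{brand} pricing", "{brand} reviews"],
    "category": [
        "best {category} tools for {use_case}",
        "top {category} platforms for {audience}",
    ],
    "problem": ["how do I {problem}", "alternatives to {competitor}"],
}


def build_panel(values):
    """Return (bucket, prompt) pairs in a stable order so every run matches."""
    panel = []
    for bucket, templates in TEMPLATES.items():
        for template in templates:
            # str.format ignores unused keys, so one values dict serves all templates.
            panel.append((bucket, template.format(**values)))
    return panel


# Placeholder values, not real brands.
values = {
    "brand": "ExampleCo",
    "category": "AI visibility",
    "use_case": "agencies",
    "audience": "agencies",
    "problem": "track brand mentions in AI answers",
    "competitor": "Competitor A",
}

panel = build_panel(values)
for bucket, prompt in panel:
    print(f"{bucket}: {prompt}")
```

A real panel would carry more templates per bucket to reach 10 to 20 prompts. The point of the design is that the templates never change between runs, only the date you run them.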
A simple tracking sheet does the heavy lifting. The column structure that has held up for me:
| Prompt | Platform | Mentioned? | Position | Type | Sentiment | Source cited | Competitors present | Date |
|---|---|---|---|---|---|---|---|---|
| best AI visibility tools for agencies | Perplexity | Yes | 2nd | Cited | Neutral | g2.com/categories/… | Competitor A, Competitor B | 2026-05-20 |
| best AI visibility tools for agencies | ChatGPT | No | n/a | n/a | n/a | none | Competitor A, Competitor C | 2026-05-20 |
| alternatives to [Competitor A] | Gemini | Yes | 4th | Recommended | Positive | none (no link) | Competitor A, Competitor D | 2026-05-20 |
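If you would rather log from a script than type into a spreadsheet, the same columns can be sketched as a small record type. This is an illustration, not a prescribed schema: `MentionRow` and `rows_to_csv` are hypothetical names, the field names are my own mapping of the columns above, and `answer_text` stands in for the notes column where you paste the full answer.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from typing import Optional


@dataclass
class MentionRow:
    """One prompt run on one platform, mirroring the sheet columns."""
    prompt: str
    platform: str
    mentioned: bool
    position: Optional[int]  # None when the brand did not appear
    mention_type: str        # "cited", "recommended", "compared", or "n/a"
    sentiment: str
    source_cited: str
    competitors: str
    date: str
    answer_text: str = ""    # paste the full answer here for later reference


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(MentionRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


rows = [
    MentionRow("best AI visibility tools for agencies", "Perplexity", True, 2,
               "cited", "neutral", "g2.com/categories/…",
               "Competitor A; Competitor B", "2026-05-20"),
    MentionRow("best AI visibility tools for agencies", "ChatGPT", False, None,
               "n/a", "n/a", "none", "Competitor A; Competitor C", "2026-05-20"),
]

csv_text = rows_to_csv(rows)
print(csv_text.splitlines()[0])  # header line only
```

Writing `csv_text` to a dated file after each run gives you an append-only history you can re-score later.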
If you only have time for one platform, start with Perplexity. Because every answer exposes its numbered sources, it is the fastest place to see not just whether you appear but exactly which pages are getting you there. If you want to go deeper on that specific platform, the mechanics in how to rank in Perplexity AI map directly onto what your tracking sheet will surface.
Google AI Overviews need a slightly different routine. They show up inside ordinary search results, so you track them by searching, not by prompting: run your category and problem queries in an incognito window and watch for an Overview that names or cites you. Search Console helps here in a way it does not for the chat tools, but only as a proxy. Google does not give Overviews their own Search Console filter the way it now does for AI Mode, so the tell is a query that holds its impressions while its click-through rate quietly drops, which usually means an Overview moved in above your result. Getting into Overviews in the first place is a separate project, covered in how to get your site into Google’s AI Overviews.
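The Search Console proxy described above (impressions hold, click-through rate drops) can be checked mechanically once you export two periods of query data. A rough sketch, assuming each period is a dict of query to `(impressions, clicks)`. The function name, the input shape, and both thresholds are my own assumptions, not values from Google, so tune them to your own data. A flagged query is a prompt to go look, not proof that an Overview appeared.

```python
def flag_possible_overviews(before, after, min_impression_ratio=0.9, min_ctr_drop=0.25):
    """Return queries whose impressions held while CTR fell.

    before/after: dict mapping query -> (impressions, clicks).
    min_impression_ratio: later impressions must be at least this share of earlier ones.
    min_ctr_drop: relative CTR decline required to flag the query.
    Both thresholds are illustrative assumptions.
    """
    flagged = []
    for query, (imp_before, clicks_before) in before.items():
        if query not in after or imp_before == 0 or clicks_before == 0:
            continue
        imp_after, clicks_after = after[query]
        if imp_after == 0:
            continue
        ctr_before = clicks_before / imp_before
        ctr_after = clicks_after / imp_after
        impressions_held = imp_after / imp_before >= min_impression_ratio
        ctr_dropped = (ctr_before - ctr_after) / ctr_before >= min_ctr_drop
        if impressions_held and ctr_dropped:
            flagged.append(query)
    return flagged


# Placeholder numbers for illustration only.
before = {"best crm for agencies": (1000, 50), "crm pricing": (800, 40)}
after = {"best crm for agencies": (980, 25), "crm pricing": (790, 39)}

print(flag_possible_overviews(before, after))
```

Pair every flagged query with an incognito search before concluding anything, since ranking changes and seasonality can produce the same pattern.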
The metrics worth logging (and the ones that mislead)
Raw answer text is useful for screenshots, but you need numbers to show movement and to brief a client who wants to know if last month’s work did anything. These five do the job. The first four are signal. The fifth is the one most people skip and then wonder why their content never gets cited.
- Mention rate. Share of your prompt set where you appear on a given platform. Track it per platform, because a brand can be strong on Perplexity and invisible on ChatGPT.
- Position. When you do appear in a list, are you first or buried fifth? First-named recall is worth a lot more, so log average position.
- Share of voice. Your mentions divided by all brand mentions (yours plus competitors’) on the same prompts, as a percentage. This is the single most useful number for competitive positioning, and the one clients immediately understand.
- Sentiment. Positive, neutral, or negative, with a one-line reason. “Described as expensive” is a different problem from “not mentioned at all,” and the fix is different too.
- Citation quality. On platforms that link (Perplexity, Google AI Mode, and AI Overviews), are the citations pointing at your pages or at a third party you do not control? If Perplexity recommends you but cites a G2 listing instead of your site, your own content is not citation-ready yet.
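The first three metrics reduce to simple arithmetic over the logged rows. A minimal sketch, assuming each row is a dict with `platform`, `mentioned`, `position`, and a `competitors` list. The function names and row shape are hypothetical, and the sample rows mirror the three example rows in the tracking sheet.

```python
def mention_rate(rows, platform):
    """Share of prompts on one platform where the brand appeared."""
    subset = [r for r in rows if r["platform"] == platform]
    if not subset:
        return None
    return sum(r["mentioned"] for r in subset) / len(subset)


def average_position(rows):
    """Mean list position across the answers where the brand appeared."""
    positions = [r["position"] for r in rows
                 if r["mentioned"] and r["position"] is not None]
    return sum(positions) / len(positions) if positions else None


def share_of_voice(rows):
    """Own mentions divided by all brand mentions (own plus competitors)."""
    own = sum(r["mentioned"] for r in rows)
    total = own + sum(len(r["competitors"]) for r in rows)
    return own / total if total else None


rows = [
    {"platform": "Perplexity", "mentioned": True, "position": 2,
     "competitors": ["Competitor A", "Competitor B"]},
    {"platform": "ChatGPT", "mentioned": False, "position": None,
     "competitors": ["Competitor A", "Competitor C"]},
    {"platform": "Gemini", "mentioned": True, "position": 4,
     "competitors": ["Competitor A", "Competitor D"]},
]

print(mention_rate(rows, "ChatGPT"))  # 0.0
print(average_position(rows))         # 3.0
print(share_of_voice(rows))           # 0.25
```

With only three rows these numbers are noise. They become meaningful once the full prompt set has been run on each platform.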
One number that lies: a single “AI visibility score” with no range. Given the variance, any honest report shows a band (“share of voice 18 to 26% across three runs”), not a false-precision single figure. If a tool or a consultant hands you one clean number with no range and no run count behind it, be skeptical. For the structural side of why some pages get cited and others do not, citation readiness for GEO covers the on-page signals that move that citation-quality metric.
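Reporting a band instead of a single figure takes one small helper. This is a sketch with a hypothetical function name. The low and high values echo the share-of-voice example above, and the middle value is a placeholder I added to make three runs.

```python
def report_band(percentages):
    """Format repeated-run percentages as a low-to-high band."""
    low, high = min(percentages), max(percentages)
    return f"{low:.0f} to {high:.0f}% across {len(percentages)} runs"


# Share of voice from three runs of the same panel (placeholder values).
runs = [18.0, 22.0, 26.0]
print(report_band(runs))  # 18 to 26% across 3 runs
```

A plain min-to-max band is the simplest honest summary. It makes no statistical claim beyond "this is what we saw".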

Find the kingmaker sources doing the citing
Assistants do not invent opinions. They synthesize from training data and, on live-search platforms, from the current web. So when Perplexity recommends a competitor, look at what it cited to get there. A small number of third-party pages tend to do most of the citation work for any category. I call them kingmaker sources, and finding them is usually the single most actionable output of an audit.
For most B2B and local-service categories, the usual suspects are:
- Review aggregators like G2, Capterra, Trustpilot and Product Hunt. AI leans on these for “best of” answers, and many brands are cited from outdated or half-empty profiles.
- Editorial roundups and comparison posts (“best [category] tools”, “[Brand] vs [Competitor]”) from publishers the model already trusts.
- Community threads on Reddit and Quora. Community consensus shapes AI recommendations more than most brands expect, which is exactly why these “is there a good tool for Perplexity” threads exist in the first place.
- Your own structured pages, when they are built to be quoted: clear headings, FAQ blocks, explicit data with sources.
The line that ends most audits I run sounds like this: “AI is citing this specific page to recommend your competitor, and you are not on it.” That sentence turns a fuzzy visibility problem into a concrete to-do list.
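Finding those pages is a counting exercise over the sources you logged. A minimal sketch, assuming each row carries a `sources` list of cited URLs alongside `mentioned` and `competitors`. The function names, row shape, and `.example` URLs are all hypothetical. It tallies the domains cited in answers that named a competitor but not you, counting each domain once per answer.

```python
from collections import Counter
from urllib.parse import urlparse


def cited_domain(url):
    """Extract a bare domain from a logged source URL."""
    # urlparse only fills netloc when a scheme is present, so add one if missing.
    if "://" not in url:
        url = "https://" + url
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def kingmaker_sources(rows, top_n=5):
    """Rank domains cited in answers that named competitors but not you."""
    counts = Counter()
    for row in rows:
        if row["competitors"] and not row["mentioned"]:
            # A set counts each domain once per answer.
            counts.update({cited_domain(u) for u in row["sources"]})
    return counts.most_common(top_n)


# Placeholder rows for illustration only.
rows = [
    {"mentioned": False, "competitors": ["Competitor A"],
     "sources": ["https://www.reviews.example/best", "https://forum.example/thread/1"]},
    {"mentioned": False, "competitors": ["Competitor A", "Competitor C"],
     "sources": ["reviews.example/top-tools"]},
    {"mentioned": True, "competitors": ["Competitor B"],
     "sources": ["https://blog.example/post"]},
]

print(kingmaker_sources(rows))
```

The domains at the top of that list are the first places to check whether you are listed, and how current your listing is.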
The paid tools, and what each one is for
The manual method is free and clarifying, but it does not scale. Running 20 prompts across 5 platforms by hand, every week, gets old by about week three. That is the job paid trackers do: they re-run your panel automatically and chart the trend so you stop copy-pasting answers into a sheet. Here is an honest map of the category as of May 2026. Prices move around and a few vendors hide exact figures behind a sales call, so each row links to the live pricing page so you can check it yourself.
| Tool | Entry price (May 2026) | Model | Best for |
|---|---|---|---|
| Otterly AI | $29/mo (15 prompts), $189/mo (100 prompts), $489/mo (400 prompts) | Prompt-based tracking | Solos and small teams wanting a cheap entry point |
| Peec AI | from EUR89/mo; EUR199/mo Pro; EUR499+/mo Enterprise | Prompt-based, daily tracking | European marketing teams, multi-model tracking |
| Profound | from $99/mo (ChatGPT-only); $399/mo Growth; custom Enterprise | Prompt and credit tracking | Larger brands and enterprises wanting depth |
| AthenaHQ | from $295/mo (3,600 credits = AI responses) | Credit-based (1 credit = 1 response) | Brands comfortable with usage-metered spend |
| Goodie AI | from ~$399/mo (no free tier) | Prompt and action credits | Teams wanting tracking plus optimization in one suite |
| SE Ranking | AI Visibility Tracker as an add-on to the SEO suite | Mentions and links inside a rank tracker | SEO teams already living in SE Ranking |
| BlueJar | Free tier, then per-analysis | Point-in-time GEO audit + fix plan | Agencies and consultants who need a client-ready diagnosis, not just a chart |
A few honest notes on choosing. The pure trackers (Otterly, Peec, Profound, AthenaHQ) are built to answer “how is my mention rate trending,” and they are good at it. The catch is that most run a generic prompt set unless you customize it, and a generic panel measures the wrong queries. Whatever you buy, put the work into the prompt set; the panel is the product.
BlueJar sits in a different spot. It is not a 24/7 monitor that watches the platforms for you. It is a point-in-time GEO audit: you run it, and it measures how your brand shows up across the major AI platforms right now, tells you whether you are cited or only mentioned, surfaces the kingmaker sources doing the citing, estimates the lost opportunity in dollars, and hands back a prioritized fix plan and a client-ready proposal. Think diagnosis and prescription on a cadence you choose, rather than a dashboard you stare at. It also has a genuinely free tier, which most of the trackers above do not. If you are weighing the broader category, the state of AI search in 2026 has the market context behind all of these tools.
Turning mention gaps into GEO actions
Tracking without acting is just a spreadsheet that makes you anxious. Here is how each pattern in your sheet maps to a fix.
- Cited from an aggregator you do not control: claim and complete your G2, Capterra and Product Hunt profiles. A surprising share of brands are recommended from stale listings.
- Missing from category roundups: find the top “best [category]” posts that already rank and are likely feeding the models, then pitch for inclusion or publish a better, more current version yourself.
- Competitors appear on a problem prompt and you do not: that usually means you have no page that directly answers that question. Build it, with proper structure and schema.
- You cover the topic but never get cited: the problem is almost always structure, not effort. Clear H2/H3 hierarchy, FAQ blocks, and explicit sourced data are what make a page quotable. The mechanics in how to get cited by ChatGPT walk through this directly.
- Strong on Perplexity, invisible on Google’s surfaces: different engines weight different signals. Closing an AI Overviews gap is its own project, covered in how to get your site into Google’s AI Overviews, and the conversational AI Mode surface is covered in how to optimize for Google AI Mode in 2026.
If you want a single composite to anchor the before-and-after for a client, a structured rubric like the one behind a GEO score turns the scattered findings into one defensible number you can re-measure after the fixes ship.
Why this is worth the effort
Two numbers make the business case. First, the traffic is real and growing: Semrush reports AI search traffic grew 527% year over year, and projects it may surpass traditional search traffic by 2028. Second, the clicks are getting harder to win the old way: Bain & Company found in 2025 that about 80% of consumers now rely on zero-click results in at least 40% of their searches, which Bain estimates is cutting organic web traffic by 15 to 25%.
The visits that do come through tend to be better qualified. A 12-month GA4 analysis of 94 ecommerce sites by Visibility Labs, reported by Search Engine Land, found ChatGPT traffic converted at 1.81% versus 1.39% for non-branded organic, about 31% higher, and outperformed organic in 10 of 12 months. Being the brand the assistant names is starting to matter as much as ranking, and you cannot improve what you are not measuring.
Want to see where you stand without building the spreadsheet first? Run a free GEO audit at bluejar.ai to measure how your brand shows up across the major AI platforms and get a prioritized fix plan you can act on this week.
Frequently asked questions
What is the quickest way to track brand mentions in Perplexity?
Run a fixed set of 8 to 10 buyer-intent prompts in Perplexity with web focus on, and log whether your brand appears, your position, and which numbered sources it cited. Perplexity is the easiest platform to track because every answer exposes its sources, so you can see exactly which pages are earning you the mention. Re-run the same prompts weekly to spot movement.
How do I track whether my brand appears in Google AI Overviews?
Because AI Overviews appear inside normal search results, you track them by searching rather than prompting: run your category and problem queries in an incognito window and record whether an Overview appears and whether it names or cites you. For an ongoing signal, watch Search Console for queries that keep their impressions but lose click-through rate, which usually means an Overview moved in above your result. Google does not yet give AI Overviews a dedicated Search Console filter the way it does for AI Mode, so treat this as a proxy and pair it with manual checks.
How do I start monitoring brand mentions in ChatGPT?
Start with a prompt library of 10 to 20 questions across brand-direct, category, and problem-based queries, then run each one in ChatGPT with browsing enabled and record the full answer. ChatGPT names brands far more often than it links to them, so read the response text, not just any citations. Log mention rate and which competitors showed up, then repeat on a set cadence.
Can I track AI brand mentions for free?
Yes. The manual method in this guide costs nothing but time: a prompt set, a tracking spreadsheet, and a regular cadence of running the prompts across ChatGPT, Perplexity, Gemini, Claude and Copilot. Paid tools save the repetitive runs once manual tracking gets too time-consuming, but the free method is the right starting point for almost everyone.
What is the difference between being cited and being mentioned in an AI answer?
Being cited means the AI links to a specific page as a source, which sends a click and signals trust. Being mentioned (or recommended) means the AI names your brand without a link, which still puts you in the buyer’s consideration set. Track both, because they require different fixes: citations are an on-page structure problem, while recommendations are usually a third-party authority problem.
How often should I check my AI brand mentions?
A practical cadence is a small high-intent prompt set weekly, the full set monthly, and a targeted re-run two to four weeks after publishing new content to see whether it gets picked up. Because AI answers vary run to run, consistency matters more than frequency. Keep the prompts and conditions identical so the numbers are comparable.
Does BlueJar continuously monitor my brand in AI search?
No. BlueJar is a point-in-time GEO audit, not an always-on monitor. You run an audit and it measures how your brand appears across the major AI platforms at that moment, shows whether you are cited or only mentioned, identifies the third-party sources driving competitor recommendations, estimates the lost opportunity in dollars, and returns a prioritized fix plan plus a client-ready proposal. You choose when to re-run it, for example after shipping fixes.
Why do I get different brands when I run the same AI prompt twice?
AI answers are non-deterministic, and personalization based on location, account history, and phrasing adds more variation. In commercial categories the same prompt can return noticeably different brand lists between runs. This is why you measure with a panel of prompts and report a range rather than trusting any single answer.
Do AI platforms show up in my Google Analytics?
Often not. Much AI-driven traffic arrives with no identifiable referrer, so GA4 and server logs undercount it. That is why tracking what the assistant says about you, through prompts and answers, is currently more reliable than trying to count the clicks it sends. Treat any AI referral number in analytics as a floor, not the full picture.
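For the visits that do carry a referrer, a small lookup can label them so you at least know the floor. This is a sketch, not a complete solution: the hostname list is my own assumption of commonly seen assistant domains, it is not exhaustive, and the hostnames can change. `classify_referrer` is a hypothetical helper.

```python
from urllib.parse import urlparse

# Assumed hostnames for illustration; verify against your own logs.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "claude.ai": "Claude",
    "copilot.microsoft.com": "Copilot",
}


def classify_referrer(referrer):
    """Label a referrer URL as a known assistant, other, or missing."""
    if not referrer:
        return "unknown (no referrer)"
    host = urlparse(referrer).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRER_HOSTS.get(host, "other")


print(classify_referrer("https://chatgpt.com/"))              # ChatGPT
print(classify_referrer("https://www.perplexity.ai/search"))  # Perplexity
print(classify_referrer(""))                                  # unknown (no referrer)
```

The "unknown" bucket is where the undercounted AI visits land, which is why this number is a floor rather than a total.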