GEO is barely 18 months old: how to tell good AI-search advice from hype

Key takeaways

The practice of getting your brand named inside AI answers (ChatGPT, Perplexity, Gemini, Copilot) is roughly 18 months old, so nobody has a multi-year track record yet. The term “Generative Engine Optimization” traces to a November 2023 Princeton-led research paper.
Is GEO worth it? The behavior is real and large: ChatGPT hit 800 million weekly active users by October 2025. But the confident playbooks, including mine, are educated guesses refined over months, not laws.
Three tells of hype: “guaranteed AI rankings,” “our proven system,” and big confident case-study numbers presented as certainties. Nobody controls how engines pick sources, and the rules change every few weeks.
What is genuinely real and worth doing now: the basics (clear answers, structured pages, getting mentioned where AI pulls from) plus measurement discipline. Treat it like an experiment on your own brand, not a recipe.

If you are asking “is GEO worth it,” start with a date. The term Generative Engine Optimization first appeared in a research paper submitted to arXiv on November 16, 2023 by a Princeton-led team. The job most people mean by GEO, getting your brand named inside a live ChatGPT or Perplexity answer, is younger than that. So when someone sells you a “proven, multi-year GEO system,” check the math. The surfaces it claims to have mastered mostly shipped in the last 18 months.

That is not a reason to dismiss GEO. The demand is real. It is a reason to read AI-search advice the way you would read advice about any young field, with interest and a filter. This post hands you the filter. You get the actual timeline with sources, a clear line between what holds up and what is hype, and an honest look at how I run GEO at BlueJar. I build this stuff for a living, and I will tell you which parts are still guesses.

How old is GEO, really

There are two clocks here: the academic one and the practitioner one.

The academic clock started in late 2023. The paper “GEO: Generative Engine Optimization” was submitted to arXiv in November 2023 and later published at the ACM SIGKDD conference in 2024. It coined the term, introduced a benchmark called GEO-bench, and reported that certain optimization methods raised a source’s visibility in generated answers by up to 40% in their tests. That is where the word comes from.

The practitioner clock, the one that matters if you are about to spend money, started later. Getting cited inside a live AI answer was barely a job until the engines could read the live web and put answers in front of millions of people. Those catalysts landed through 2024 and 2025. Count from there and the working practice is roughly 18 to 24 months old.

The timeline, with sources

Here is the sourced version, so you can date any claim you hear against it.

Date	Event	Why it matters for GEO
Nov 2023	The GEO paper is submitted to arXiv (Princeton-led)	The term is coined. The category gets an academic origin point.
May 14, 2024	Google AI Overviews launch in the US	AI answers appear above classic results for a large share of searches. “Am I in the AI answer” becomes a live question.
Oct 31, 2024	ChatGPT search launches	ChatGPT can read the live web and cite sources. Getting named in its answers becomes a real, repeatable job.
Feb 5, 2025	ChatGPT search reaches all users, including logged-out	The audience for AI answers goes mass-market, not just paying subscribers.
May 20, 2025	Google AI Mode rolls out to everyone in the US	A full conversational search tab ships. New surface, new behavior, barely a year of data.

Look at that last row. A core surface of AI search rolled out about a year ago. Any “proven system” for it has, at most, a year of evidence behind it, and the engines have changed their behavior several times since.
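
If you want to check the math yourself, the arithmetic is short. The snippet below is a minimal sketch, not a definitive tool: the launch dates come from the timeline above, while the function names and the `as_of` reference date are placeholders I picked for illustration. Swap in today's date when you run it.

```python
from datetime import date

# Launch dates taken from the timeline above.
LAUNCHES = {
    "GEO paper on arXiv": date(2023, 11, 16),
    "Google AI Overviews (US)": date(2024, 5, 14),
    "ChatGPT search": date(2024, 10, 31),
    "ChatGPT search for all users": date(2025, 2, 5),
    "Google AI Mode (US)": date(2025, 5, 20),
}


def whole_months_between(start: date, end: date) -> int:
    """Count full calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the current month is not complete yet
    return max(months, 0)


def max_track_record(as_of: date) -> dict[str, int]:
    """Upper bound, in months, on the evidence anyone can hold per surface."""
    return {
        name: whole_months_between(launched, as_of)
        for name, launched in LAUNCHES.items()
    }


# Hypothetical reference date, chosen only for illustration.
as_of = date(2026, 6, 1)
for name, months in max_track_record(as_of).items():
    print(f"{name}: at most {months} months of evidence")
```

Any claim of experience longer than the number printed for a surface is a claim about something that did not exist yet.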

Why nobody has a multi-year track record

The honest version is simple. The engines that decide whether your brand gets named are moving targets. ChatGPT, Perplexity, Gemini, and Copilot all change how they retrieve and rank sources on a regular cadence, sometimes every few weeks. A tactic that earned citations in one quarter can stop working the next, not because you did anything wrong, but because the engine changed which sources it pulls.

I have watched this happen on our own pages. A format that was getting BlueJar named in answers for a cluster of queries quietly stopped, and the content had not changed at all. The engine had. That is the field you are buying into.

So when you read confident GEO advice, including mine, know what it actually is: a set of educated guesses, refined over months of testing, that are probably directionally right. That is genuinely useful. It is not the same as a law of physics, because the field is too young to have laws yet.

I will say this plainly, because it is the point of the whole post. Our playbooks at BlueJar are guesses too. Good guesses, tested against hundreds of prompts per audit and updated when the data moves. But the moment any vendor, us included, tells you their system is “proven” with the confidence of a decade-old discipline, your skeptic filter should switch on.

Tells of hype vs what is real

Skepticism is not cynicism. The goal is not to reject GEO. It is to catch the overclaims and keep the parts that hold up. Here is the side-by-side I use.

Tell of hype	What is actually real
“Guaranteed AI rankings” or “guaranteed citations”	Nobody controls how engines pick sources. You can improve your odds, not guarantee an outcome. Treat any guarantee as a red flag.
“Our proven, multi-year system”	The practice is roughly 18 months old. No multi-year proof exists yet. “Tested over months and updated as engines change” is the honest version.
A big confident case-study number (“we got 300% more mentions”)	Read case studies as early signals, not laws. One brand in one category for one quarter is a data point, not a promise for yours.
“GEO is a totally new discipline, throw out your SEO”	The inputs overlap heavily with good SEO. The new part is the measurement layer, not a brand-new rulebook.
A single visibility score with no ranges or dates	AI answers are non-deterministic. A credible number is a range measured across many prompts, stamped with the date it was taken, not one figure.
“Set it and forget it”	Because engines change, GEO is a re-run-on-a-cadence practice. A point-in-time check today does not stay true forever.

The SEO overlap, and the part that is new

The most common skeptic line is “GEO is just rebranded SEO.” It is half right, and the honest answer is more useful than either the hype or the dismissal.

Concede the overlap first, because it is true. What you optimize for AI answers is close to what good SEO already asks for: clear answers high on the page, structured content, consistent entity information, citation-ready statistics, real backlinks, schema. If you have done citation-readiness work, you have already done most of the on-page part of GEO. Do your SEO first. Most “GEO courses” that are really repackaged SEO are not wrong about the fundamentals. They are just overpriced for what they teach.

Now the part that is actually new, and it sits on the measurement side. In SEO you check a search results page and get a clean rank: position 4, done. In GEO, the same query asked twice can return different answers and different sources, because the engines are non-deterministic. A single check is a vanity metric. To know whether anything worked, you run a panel of prompts across variations and across engines, then report a consistency rate or a range, not one number. That measurement discipline, plus the off-site work of getting placed in the third-party sources AI keeps citing, is the genuinely new workstream. The deeper GEO-versus-SEO breakdown walks through why the engines diverge.
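
To make “a range, not one number” concrete, here is a minimal sketch of one way to report it. It assumes you have already recorded, for each run of each prompt, whether the brand was named. The Wilson score interval is my choice of range for illustration (it behaves sensibly on small samples); it is not something any engine or standard prescribes, and the sample data and function names are hypothetical.

```python
import math
from collections import defaultdict


def wilson_interval(hits: int, runs: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion."""
    if runs == 0:
        return (0.0, 1.0)  # no data: every rate is still possible
    p = hits / runs
    denom = 1 + z * z / runs
    center = (p + z * z / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


def mention_ranges(observations):
    """observations: iterable of (engine, prompt, brand_was_named) tuples.

    Returns {engine: (observed_rate, low, high)}.
    """
    hits = defaultdict(int)
    runs = defaultdict(int)
    for engine, _prompt, named in observations:
        runs[engine] += 1
        hits[engine] += int(named)
    return {e: (hits[e] / runs[e], *wilson_interval(hits[e], runs[e])) for e in runs}


# Hypothetical recorded results: the same prompt asked several times per engine.
sample = [
    ("chatgpt", "best invoicing tool for freelancers", True),
    ("chatgpt", "best invoicing tool for freelancers", False),
    ("chatgpt", "best invoicing tool for freelancers", True),
    ("perplexity", "best invoicing tool for freelancers", False),
    ("perplexity", "best invoicing tool for freelancers", False),
    ("perplexity", "best invoicing tool for freelancers", True),
]

for engine, (rate, low, high) in mention_ranges(sample).items():
    print(f"{engine}: named in {rate:.0%} of runs, plausible range {low:.0%} to {high:.0%}")
```

With only a handful of runs the range comes out very wide, which is the point: a single check cannot tell a real change from noise.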

One more distinction SEO cannot make, and it is the one I find most useful. You can rank number one organically and still be invisible the moment someone asks the same question inside an AI tool. Rank is not citation. The click your ranking was built for never happens if the reader gets an answer and never scrolls to a blue link.

What is worth doing right now

None of this means wait. It means act like a scientist, not a shopper. Here is the short list of things that are genuinely real and worth doing today, whichever way the engines drift next.

Do the basics. Clear, direct answers near the top of each page. Structured, scannable content. Consistent entity information across the web. These help in AI answers and in classic search, so the downside is close to zero.
Get mentioned where AI pulls from. Engines lean on a small, repeatable set of third-party sources per topic: comparison posts, roundups, community threads, review sites. Being present in those sources is a parallel job to your own pages. See how to get cited by ChatGPT for the off-site side.
Measure before and after, with discipline. Run a panel of real prompts across engines, record what names you and what does not, then re-run after you change something. One spot check tells you nothing.
Test on your own brand first. Do not trust a generic playbook on faith. Run it on your site, measure what actually moves, keep that, drop the rest. The cost of testing is low and the learning is yours.
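
The before-and-after step in the list above can be sketched in a few lines. This is an illustration under stated assumptions, not a finished tool: each panel is a plain dict mapping a prompt to (times named, times asked), the `min_shift` threshold of 0.5 is an arbitrary cutoff I chose so small wobbles are not counted as wins or losses, and the prompts are made up.

```python
def compare_panels(before, after, min_shift=0.5):
    """Compare two panel runs on the prompts they share.

    before / after: {prompt: (times_named, times_asked)}.
    Only prompts present in both runs are compared, so the
    comparison stays like-for-like.
    """
    shared = sorted(before.keys() & after.keys())
    gained, lost = [], []
    b_hits = b_asked = a_hits = a_asked = 0
    compared = 0
    for prompt in shared:
        bh, bn = before[prompt]
        ah, an = after[prompt]
        if bn == 0 or an == 0:
            continue  # nothing recorded for this prompt in one of the runs
        compared += 1
        b_hits, b_asked = b_hits + bh, b_asked + bn
        a_hits, a_asked = a_hits + ah, a_asked + an
        shift = ah / an - bh / bn
        if shift >= min_shift:
            gained.append(prompt)
        elif shift <= -min_shift:
            lost.append(prompt)
    change = (a_hits / a_asked - b_hits / b_asked) if compared else 0.0
    return {
        "compared": compared,
        "gained": gained,
        "lost": lost,
        "overall_change": change,
    }


# Hypothetical panels: each prompt was asked five times per run.
before = {"best crm for agencies": (0, 5), "crm with invoicing": (5, 5), "cheap crm": (2, 5)}
after = {"best crm for agencies": (4, 5), "crm with invoicing": (1, 5), "cheap crm": (3, 5)}

result = compare_panels(before, after)
print(result["gained"], result["lost"], round(result["overall_change"], 3))
```

Reading the gained and lost lists alongside the overall change matters: a flat headline number can hide one prompt you won and another you lost.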

That is the whole method. Test, measure, keep what works. It is boring on purpose, and boring beats confident-but-unproven in a field this young.

How we approach this at BlueJar

We built BlueJar around the experiment-first idea, because the alternative, selling certainty in an 18-month-old field, is the exact thing this post is warning you about. An audit runs a structured set of 400-plus prompts across ChatGPT, Perplexity, Gemini, and Copilot and reports where you are named, cited, or invisible, per zone and per intent. It is a point-in-time measurement on purpose. We do not pretend a single number stays true forever, and we tell you to re-run it on a cadence instead of treating it as a one-time verdict.

It will not guarantee you a citation, because nobody can. What it does is swap guessing for measurement on your own brand, which is the one move that survives every engine update. If you want to compare what is on the market before deciding, our roundup of GEO audit tools is a reasonable starting point, and the state of AI search piece has the adoption context.

Want to separate real from hype on your own site? Run a free GEO audit at app.bluejar.ai. It is a point-in-time measurement across ChatGPT, Perplexity, Gemini, and Copilot, so you can see what is actually true for your brand today before you trust anyone’s playbook, mine included.

Frequently asked questions

Is GEO worth it, or is it a fad?

The underlying behavior is real and large. ChatGPT reached 800 million weekly active users by October 2025, and AI answers now sit above classic results for many searches. So GEO is worth doing. The catch is that the playbooks are roughly 18 months old, so do it like an experiment on your own brand rather than buying a “proven system” on faith.

How old is GEO?

The term comes from a November 2023 Princeton-led paper, so the academic origin is about two and a half years old. The practical job of getting named in live AI answers is younger, because its main surfaces (Google AI Overviews in May 2024, ChatGPT search in October 2024, Google AI Mode in May 2025) are recent. The working practice is roughly 18 to 24 months old.

What are the warning signs of GEO hype?

Three tells: anyone promising “guaranteed AI rankings” (nobody controls how engines pick sources), anyone selling a “proven multi-year system” (no multi-year proof exists yet), and big confident case-study numbers presented as certainties rather than early signals. Treat all three as a reason to ask harder questions.

Is GEO just rebranded SEO?

Partly. The inputs overlap heavily: clear answers, structured content, entity consistency, real backlinks, schema. The genuinely new parts are the measurement layer and the off-site work of getting into the sources AI keeps citing. AI answers are non-deterministic, so you measure a panel of prompts across engines and report ranges, not one rank.

What should I actually do about GEO right now?

Do the basics (clear answers, structured pages, getting mentioned where AI pulls from), then measure before and after with a panel of real prompts. Test any playbook on your own brand first, keep what moves the numbers, and drop what does not. Skip anything that asks you to trust a generic recipe without measuring.

Can any tool guarantee my brand gets cited by ChatGPT?

No. The engines change how they retrieve and rank sources on a regular cadence, so no tool or agency controls the outcome. A credible tool measures your current visibility and shows you the gaps. BlueJar takes a point-in-time, experiment-first approach for exactly this reason: it measures what is true for your brand today rather than promising a citation it cannot control.

Why does GEO advice keep changing?

Because the engines keep changing. ChatGPT, Perplexity, Gemini, and Copilot update their source selection regularly, so tactics that earned citations one quarter can fade the next. This is why a re-run-on-a-cadence approach beats “set it and forget it,” and why early case-study numbers should be read as signals rather than permanent laws.