Synthetic panels vs live audience tests: a decision tree

01, Definitions

What each tool actually is

Most teams conflate three different jobs. They're not interchangeable, and the cost of confusing them is real money spent on the wrong instrument.

RoastIQ score: a creative-quality verdict. Five KPIs (Beat the Skip, Get Noticed, Brand Impact, Sell Proposition, Build Brand), a composite, a benchmark percentile, and a Scale / Sharpen / Rebuild verdict, in about ninety seconds. Answers the question: is this creative likely to perform?
Synthetic Users: a persona-based buyer-rejection diagnosis. Thirty-six LLM-modelled personas walk through the ad, react in the language of their own segment, and surface objection patterns. Answers the question: why would a segment resist?
Live audience panel: real humans, recall and recognition surveys, biometric or eye-tracking measurement. Answers the question: how does a real population remember and react? Kantar Link AI is the canonical live methodology here and the gold-standard recall instrument.

Three different questions. Three different price tags. The decision tree below is about not paying live-panel prices for a question synthetic users can answer, and not pretending synthetic users can answer a question only humans can.

02, Capability

What synthetic users can do

After roughly a year of running the synthetic pipeline against real briefs, four things are consistently useful:

Segment-specific objection surfacing. A 32-year-old urban-mum persona will refuse a product for different reasons than a 24-year-old early adopter, and the model is good at voicing each in their own register.
Persona-style language for the re-brief. The most useful output isn't the score; it's the verbatims. They feed straight into a creative re-brief without translation.
Speed and volume. Thirty-six personas in under five minutes. You can test six variants of the same ad against the same panel in an afternoon.
Cheap variants. The marginal cost of a synthetic run is small enough that you can simulate failures without flinching.

"The score told us the ad was Sharpen. The synthetic panel told us why, and gave us the exact phrasing the segment used to reject it."

03, Limits

What synthetic users cannot do

Equally important, and the part most synthetic-panel vendors quietly skip:

Sample real-world heterogeneity. Thirty-six personas are a curated sketch of a segment, not a probabilistic sample of a population. They will miss the long tail of weird, contradictory, situational human reactions that a real panel catches.
Replace ethnographic depth. If the strategic question is "how do mothers in Casablanca actually use this product on a Sunday morning," no language model knows. Go observe.
Predict in-market sales, ROAS, or attributed conversion. Full stop. SaliencyLab validates against public engagement and click intent, not business outcomes.
Replicate biometric responses. No skin conductance, no eye fixation duration, no facial-coding emotion read. A live panel with the right instrumentation does this; we don't.

04, Decision tree

The decision tree

Start with the RoastIQ verdict. The verdict, not your gut, decides whether a synthetic panel is worth the five minutes, and whether a live panel is worth the five figures.

Scale

Composite ≥ 70, no KPI < 55

Ship it. The model is confident, no KPI is dragging, and you have better uses for your panel budget. A synthetic run here is curiosity, not decision support.

Action: ship · no panel

Sharpen

Sharpen + one obvious weak KPI

The diagnosis is already in the score. Fix the weak KPI (e.g. Get Noticed → re-edit the first two seconds), re-score, re-decide. A synthetic panel adds cost without adding information.

Action: fix + re-score · no panel

Sharpen

Sharpen + unclear cause

Multiple KPIs are middling and you can't tell which one is the bottleneck. Run a synthetic panel: the verbatims will tell you which objection is loudest in which segment.

Action: synthetic panel · then revise

Rebuild

Composite < 55 or two+ KPIs < 45

Don't re-brief from a blank page. Run a synthetic panel before the re-brief: the rejection language from the personas is what the new brief should respond to. This is where synthetic users earn their keep.

Action: synthetic panel · then re-brief
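
The branches above can be sketched as a small routing function. This is a minimal sketch, not RoastIQ's implementation. The thresholds are the ones stated in the tree, but the names (Score, verdict, next_action) are hypothetical, and two readings are assumptions made for illustration: Sharpen is treated as "neither Scale nor Rebuild", and "one obvious weak KPI" is treated as "exactly one KPI below 55".

```python
from dataclasses import dataclass
from typing import Dict

# Thresholds taken from the decision tree; constant names are hypothetical.
SCALE_COMPOSITE_MIN = 70    # Scale: composite >= 70 ...
KPI_FLOOR = 55              # ... and no KPI < 55
REBUILD_COMPOSITE_MAX = 55  # Rebuild: composite < 55 ...
REBUILD_KPI_MAX = 45        # ... or two or more KPIs < 45


@dataclass
class Score:
    composite: float
    kpis: Dict[str, float]  # KPI name -> score, e.g. {"Get Noticed": 48}


def verdict(score: Score) -> str:
    """Map a score to Scale / Sharpen / Rebuild using the stated thresholds."""
    very_weak = [name for name, v in score.kpis.items() if v < REBUILD_KPI_MAX]
    if score.composite < REBUILD_COMPOSITE_MAX or len(very_weak) >= 2:
        return "Rebuild"
    no_kpi_dragging = all(v >= KPI_FLOOR for v in score.kpis.values())
    if score.composite >= SCALE_COMPOSITE_MIN and no_kpi_dragging:
        return "Scale"
    # Assumption: anything that is neither Scale nor Rebuild is Sharpen.
    return "Sharpen"


def next_action(score: Score) -> str:
    """Return the action the tree recommends for this score."""
    v = verdict(score)
    if v == "Scale":
        return "ship, no panel"
    if v == "Rebuild":
        return "synthetic panel, then re-brief"
    # Assumption: "one obvious weak KPI" means exactly one KPI below the floor.
    weak = [name for name, val in score.kpis.items() if val < KPI_FLOOR]
    if len(weak) == 1:
        return f"fix {weak[0]} and re-score, no panel"
    return "synthetic panel, then revise"
```

Note that no branch returns "live panel": in this article the live-panel decision depends on the stakes and the question being asked, not on the score alone.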

05, Pay for humans

When you should still pay for a live panel

There are categories where simulation isn't enough and synthetic confidence becomes false confidence. Four cases where I'd still recommend writing the cheque, followed by two where synthetic is enough:

Situation	Why simulation is not enough	Verdict
High-stakes category launch	One ad carries multi-million-euro media weight. Cost of being wrong dwarfs panel cost.	Live
Regulated industry messaging	Pharma, finance, kids: claim recall and comprehension must be measured, not modelled.	Live
Brand-recall hypothesis	If the question is "will the brand be remembered tomorrow," only a delayed-recall instrument with humans answers it.	Live
Multi-market positioning study	Cross-cultural nuance, native speakers, lived experience: synthetic personas are a starting point, not a substitute.	Live + synthetic
A/B variant pre-screen	Synthetic is faster and cheaper, and it surfaces the differences a live panel would also catch.	Synthetic
Tactical social cutdown	Decision is reversible and cheap. Score it, ship it, learn live.	Synthetic

06, The Kantar comparison

The Kantar comparison, said honestly

I spent multiple years managing Kantar as a vendor at L'Oréal Groupe. Kantar Link AI is a serious instrument and the live-survey gold standard for recall-and-recognition testing. I respect what it does.

SaliencyLab is not a Kantar replacement for that job. We're a different methodology with a different price point and a different validation surface:

Kantar Link AI: live respondents, survey instruments, validated against recall, recognition, and persuasion constructs. Days-to-weeks turnaround, four to five figures per ad.
SaliencyLab: model prediction, validated against public engagement signals (likes, shares, comments, view counts) and click intent (CTR percentile). Ninety seconds per ad, roughly €0.005 in compute. Held-out out-of-sample (OOS) Spearman ρ of +0.30 to +0.32.

Different construct, different cost, different decision. We exist for the pre-spend filter: the moment before you'd commission a Kantar study. If a creative can't pass our filter, it almost certainly shouldn't be put in front of human respondents at all.
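
For readers unfamiliar with the Spearman ρ figure quoted above: it is a rank correlation, meaning it asks whether ads the model ranks higher also tend to rank higher on the observed signal, ignoring the size of the gaps. A ρ around +0.3 is a modest positive association, which fits a filter better than a forecast. The sketch below shows how such a number is computed. The function names and the six data points are invented for illustration and are not SaliencyLab data or code.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Spearman rho is the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(predicted), average_ranks(observed))


# Hypothetical toy data: six model scores and six observed engagement rates.
predicted = [62, 48, 75, 55, 70, 40]
observed = [0.031, 0.020, 0.028, 0.035, 0.050, 0.011]
print(round(spearman_rho(predicted, observed), 2))  # 0.6 for this toy data
```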

07, Honest caveat

The honest caveat

Synthetic personas are language-model simulations of buyer logic. They are not consciousness, not memory, not lived experience. Treated as a thinking aid that surfaces structured objections, they are extremely useful. Treated as a replacement for ethnographic research, they will mislead you.

Three usage rules we enforce in product, not just in copy:

Every synthetic_panel_runs record carries a from_roastiq_run_id. A panel never runs without a scored ad behind it; the score frames the question the panel is asked to answer.
Verbatims are shown as persona-grounded simulated reactions, never as quotes from real consumers.
The product surface labels synthetic confidence dynamically (e.g. "Directional, n=36 personas") and never as "validated" or "representative."
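
The three rules above could be enforced at the data-model level roughly as follows. This is a sketch under stated assumptions, not the product's schema: only the from_roastiq_run_id field name and the "Directional, n=36 personas" label come from the text. The class, the other fields, and the helper function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticPanelRun:
    """Hypothetical in-memory shape of one synthetic_panel_runs record."""
    run_id: str
    from_roastiq_run_id: str  # field name from the text; required, never empty
    persona_count: int = 36

    def __post_init__(self) -> None:
        # Rule 1: a panel never runs without a scored ad behind it.
        if not self.from_roastiq_run_id:
            raise ValueError("synthetic panel requires a from_roastiq_run_id")

    def confidence_label(self) -> str:
        # Rule 3: confidence is labelled as directional, built from the
        # actual persona count, never as "validated" or "representative".
        return f"Directional, n={self.persona_count} personas"


def present_verbatim(persona: str, text: str) -> str:
    # Rule 2: verbatims are always framed as simulated reactions,
    # never as quotes from real consumers.
    return f"Simulated reaction ({persona} persona): {text}"
```

Putting the check in the constructor rather than in the UI means a run without a parent score cannot be created at all, which is the sense of "enforced in product, not just in copy".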

Synthetic users do not survey real consumers. They are scenario simulation. Anyone telling you otherwise is selling you the wrong instrument.

Frequently asked

Can synthetic users replace a real consumer panel? No. Synthetic Users is scenario simulation: 36 LLM-modelled personas that surface buyer objections in the language of a segment. It does not measure recall, recognition, or in-market sales. For those, a live panel methodology (e.g. Kantar Link AI) remains the gold standard.
When should I run a synthetic panel at all? When the RoastIQ verdict is Sharpen and the cause is unclear, or when the verdict is Rebuild and you need persona-grounded objections before re-briefing. Skip it for Scale verdicts and for cases where a single weak KPI already tells you what to fix.
What is SaliencyLab actually validated against? Public engagement signals (likes, shares, comments, view counts) and click intent (CTR percentile bands) on TikTok and YouTube. Held-out out-of-sample Spearman ρ of +0.30 to +0.32. We do not validate against sales, ROAS, attributed conversion, or brand recall.
Does a synthetic panel run on its own? No. Every synthetic_panel_runs record carries a from_roastiq_run_id. Synthetic Users only opens from an existing RoastIQ result; the score frames the question the panel is asked to answer.