Methodology · RoastIQ

Know if the creative scales, before you spend.

RoastIQ is a pre-spend creative diagnostic. Upload an ad, get a scored verdict in under 90 seconds, calibrated against the public engagement and click-intent outcomes of 1,200+ real ads.


90s per scored ad

1,200+ ads in the calibration pool

ρ +0.30 to +0.32 held-out OOS correlation

Live specimen

Beat the Skip · 82

Sell Proposition · 61

Brand Impact · 88

The pre-spend gap

By the time the data comes back, the budget is already gone.

Most creative testing happens after launch. You spend, you wait, you measure, you cut. RoastIQ moves the verdict to the moment the file is exported.

The frozen five-KPI composite is calibrated against public engagement and click-intent outcomes, not customer-reported ROAS. We tell you what real ads with this signature actually got.

Day 0 · Brief
Concept signed off
Decision made on taste, not signal.
Day 7 · Final cut
Production locked
$30k–$300k committed.
Day 8 · Pre-spend
RoastIQ scores it
90 seconds. Verdict before media buy.
Day 14 · Live
In-market data
Now you know, but it's too late to change the cut.

How RoastIQ thinks

From a 60-second file to a scale-or-rebuild call in four layers of analysis.

One pipeline. Four passes. Each layer adds context the next layer needs.

Step 01

Ingest the creative

Drop in an MP4, MOV, JPG, PNG, or WebP file, with video up to 60 seconds. We extract frames, audio, transcript, and on-screen text so the same model sees what a viewer sees.

Step 02

Predict visual attention

A saliency model predicts where eyes go in the first 0–3 seconds. This is a prediction, not measured eye tracking; the model is calibrated against the gaze patterns of real viewers on similar formats.

Step 03

Score the five frozen KPIs

Beat the Skip · Get Noticed · Brand Impact · Sell Proposition · Build Brand. Five dimensions, fixed weights, never reordered. A multimodal model scores each against the calibration pool for your platform and category.

Step 04

Issue the verdict

The composite maps to one of three calls (Scale, Sharpen, or Rebuild), with the evidence trace, confidence label, and benchmark sample size attached. No black box.

Beat the Skip · 82 · 25% weight
Get Noticed · 79 · 20% weight
Brand Impact · 88 · 20% weight
Sell Proposition · 61 · 20% weight
Build Brand · 71 · 15% weight

Verdict

Scale

Composite 76: strong skip resilience and brand impact, no KPI under 55. Run it.

COMPOSITE 76 / 100

The composite

Five KPIs. Fixed weights. One number you can act on.

The composite is frozen. We don't let the model choose its own scoring weights; that's how scoreboards become marketing.

Beat the Skip · 25%
Get Noticed · 20%
Brand Impact · 20%
Sell Proposition · 20%
Build Brand · 15%

Composite = Beat the Skip ×0.25 + Get Noticed ×0.20 + Brand Impact ×0.20 + Sell Proposition ×0.20 + Build Brand ×0.15. The Goal Fit secondary lens reweights the same five scores for direct response, brand awareness, and other objectives, without ever touching this composite.
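The formula above can be sketched in a few lines of Python. This is a minimal illustration and not RoastIQ's implementation: the names `WEIGHTS`, `composite`, and `specimen` are hypothetical, and the scores are the specimen values shown earlier on this page. The page does not state how the composite is rounded for display, so the sketch prints the unrounded weighted sum (76.75 for the specimen, which the page displays as 76).

```python
# Fixed weights as published on this page; they sum to 1.0.
WEIGHTS = {
    "Beat the Skip": 0.25,
    "Get Noticed": 0.20,
    "Brand Impact": 0.20,
    "Sell Proposition": 0.20,
    "Build Brand": 0.15,
}


def composite(scores: dict[str, float]) -> float:
    """Weighted sum of the five KPI scores (each on a 0-100 scale)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing KPI scores: {sorted(missing)}")
    return sum(scores[kpi] * weight for kpi, weight in WEIGHTS.items())


# KPI scores from the specimen shown earlier on the page.
specimen = {
    "Beat the Skip": 82,
    "Get Noticed": 79,
    "Brand Impact": 88,
    "Sell Proposition": 61,
    "Build Brand": 71,
}

print(f"{composite(specimen):.2f}")
```

Keeping the weights in a single constant mirrors the "frozen" idea: a Goal Fit style lens would apply a different weight table to the same scores rather than modify this one.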

The decision

Three verdicts. One ladder. No ambiguity.

Every RoastIQ run lands on exactly one of these. The rule is mechanical.

composite ≥ 70 · no KPI < 55

Scale

Strong skip resilience, brand impact lands, no broken dimension. Real ads with this signature outperformed the cohort.

Action. Ship it. Allocate spend. Test variants only against this baseline.

composite 55–69

Sharpen

A clear signal is there but at least one KPI is dragging the composite down. The fix is usually one cut, one line, or one CTA.

Action. Iterate. The driver-analysis trace tells you which dimension to attack first.

composite < 55 · or two+ KPIs < 45

Rebuild

Multiple dimensions are broken. Iteration won't fix it: the concept itself is miscast for the format or the audience.

Action. Stop. Re-brief. Don't put media weight behind it.
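The three rules above can be written as a small decision function. This is a hedged sketch, not RoastIQ's code: the function name `verdict` and the evaluation order are assumptions. In particular, the ladder does not say what happens when the composite is 70 or higher but one KPI sits below 55, or when the composite falls between 69 and 70; the sketch assumes those cases fall through to Sharpen and checks Rebuild first.

```python
def verdict(composite: float, kpi_scores: list[float]) -> str:
    """Map a composite and its five KPI scores to Scale, Sharpen, or Rebuild."""
    weak_kpis = sum(1 for score in kpi_scores if score < 45)
    # Rebuild: composite under 55, or two or more KPIs under 45.
    if composite < 55 or weak_kpis >= 2:
        return "Rebuild"
    # Scale: composite of at least 70 and no KPI under 55.
    if composite >= 70 and min(kpi_scores) >= 55:
        return "Scale"
    # Assumption: everything else is Sharpen, including a composite of 70+
    # held back by a single KPI under 55, a case the ladder leaves unstated.
    return "Sharpen"


print(verdict(76, [82, 79, 88, 61, 71]))  # Scale
print(verdict(62, [70, 65, 58, 50, 66]))  # Sharpen
print(verdict(58, [80, 75, 40, 42, 60]))  # Rebuild (two KPIs under 45)
```

The first call uses the specimen from this page; the other two inputs are made up to exercise the remaining branches.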

Validation

The score is a prediction. Here's how we know it predicts.

We don't validate against customer-reported ROAS: too noisy, too political, too unfalsifiable. We validate against public engagement and click-intent outcomes on TikTok and YouTube.

Ads in the held-out calibration cohort

ρ +0.30 to +0.32

OOS Spearman · YouTube view counts, n=403 · TikTok engagement, n=700

6.5×

Top-quintile vs bottom-quintile lift

What this means in plain English. When we rank a fresh batch of ads by predicted RoastIQ score, the top 20% win 6.5× more public engagement than the bottom 20%. A Spearman ρ of +0.30 to +0.32 on held-out data is a correlation, not causation: it tells you the score's ranking aligns with how real audiences ranked these ads. Scaling curves (n=400 to n=1,200) show ρ growing 2–4×; we're not at saturation.

What this is not. Not a sales-lift prediction. Not a brand-recall measurement. Not a substitute for in-market measurement.
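For readers who want to see how the two validation statistics are computed, here is a minimal standard-library sketch. It is an illustration under stated assumptions, not RoastIQ's validation code: the helper names are hypothetical, the data is made-up toy data (not RoastIQ results), and the lift is assumed to be the ratio of mean outcomes between the top and bottom fifths, which the page does not specify.

```python
from statistics import mean


def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def quintile_lift(predicted, outcome):
    """Mean outcome of the top 20% by predicted score over the bottom 20%."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    k = max(1, len(order) // 5)
    bottom = mean(outcome[i] for i in order[:k])
    top = mean(outcome[i] for i in order[-k:])
    return top / bottom


# Made-up toy data, not RoastIQ results.
scores = [35, 42, 48, 55, 58, 63, 67, 72, 80, 88]
engagement = [120, 90, 300, 150, 260, 200, 700, 310, 640, 900]

print(round(spearman(scores, engagement), 3))
print(round(quintile_lift(scores, engagement), 2))
```

Because Spearman's ρ works on ranks, it only asks whether higher-scored ads tend to place higher on engagement; it says nothing about the size of the gap, which is what the quintile lift adds.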

Honest scope

What RoastIQ is, and what it isn't.

Pre-spend creative diagnostics work because they stay in their lane. We are explicit about ours.

RoastIQ does

Predict relative engagement and click-intent on TikTok and YouTube.
Predict visual attention from a saliency model (calibrated, not measured).
Detect creative attributes at ~85% accuracy and trace them to KPI scores.
Benchmark against the calibration pool for your format, platform, and category.
Issue a deterministic Scale / Sharpen / Rebuild verdict from the frozen composite.

RoastIQ does not

Predict in-market sales lift, ROAS, or attributed conversion.
Measure brand recall; that requires recall-survey methodology.
Run a real consumer panel (BuyerLens is scenario simulation, not a panel).
Measure eye tracking; heatmaps are predicted, not observed.
Replace post-launch measurement. It runs before the spend, not instead of it.

The next ad you ship: scale, sharpen, or rebuild?

Drop a file. Get a verdict in 90 seconds. Decide before the media plan goes live.
