Co5
Buyer's guide · 6 min read

The honest brand monitoring buyer's guide.

The twelve questions buyers wish they had asked — before they signed.

Most buyer's guides in this category are written by the vendors. The criteria they tell you to evaluate just happen to be the ones they score highest on. This guide is different. It is built from what buyers actually regret after signing.

We are not paid to flatter anyone. We are not paid by the analyst firms whose reports sit behind paywalls. We built this rubric from what reviewers actually say on G2, TrustRadius, and Capterra — after the contract was signed. The questions below name what they could not.

The vendor knew the answer at sale.

The buyer learned it at renewal.

Now you can ask before.

02 · The stakes

What buying wrong actually costs.

Every regret pattern in this guide is sourced — drawn from G2, TrustRadius, and Capterra reviews of the major brand monitoring vendors. Each one starts as a small omission in evaluation and lands as a real cost — in dollars, in hours, in the confidence the comms team and their leadership have in the work. The three patterns below are the most common.

The contract trap

The renewal window opened before you knew.

The cost · Dollars

Another year of seat fees, paid for a tool your team has stopped trusting.

“We notified them in writing well in advance. They held us to another year anyway.”

Pattern

Auto-renewal windows of 60 days to 5 months are buried in click-through ToS. Cancellation requires written submission via a specific portal inside a narrow window. Multiple major vendors share this pattern.

The question to ask

What is the exact cancellation procedure, in writing, and when does the renewal window open and close?

Source · TrustRadius · multi-vendor

The sentiment demo never happened

The accuracy claim went unchallenged.

The cost · Hours

Hours of manual correction every week. The trust in the dashboard, gone.

“Their sentiment said 80%. We spot-checked 50 of our own mentions and had to correct 18. That is 64%.”

Pattern

Vendors demo sentiment scoring on pre-cleaned tutorial brands. Buyers rarely run it on their own coverage. Real-world accuracy sits in the 60–75% range — sometimes lower — per the vendors’ own documentation.

The question to ask

Run sentiment scoring on 50 mentions of MY brand from the last 30 days, and let me grade it.

Source · vendor help docs

The coverage gap that surfaced after

“Global coverage” had a hole.

The cost · Confidence

The story you needed to see first. The board reads it in the Financial Times before your tool tells you — and the team’s confidence in the dashboard, and the board’s confidence in the team, goes with it.

“We bought ‘global coverage.’ Three months in we found out they do not index the Financial Times. That is not global.”

Pattern

Each vendor has known coverage holes — paywalled outlets, API-throttled social platforms, regional press categories — that surface only after the contract is signed. None of them lead with this disclosure.

The question to ask

Give me a written list of sources you DO NOT monitor in my industry and region.

Source · G2 reviews · multi-vendor

Every regret pattern in this guide is sourced. Every question is the one those buyers wish they had asked.

03 · The rubric

The twelve questions.

Three per category. Each backed by a regret pattern. No vendor names — only the questions themselves.

Category A · Mechanics
The contract reality
Q01 · Mechanics · Question 01
What is the exact cancellation procedure, in writing, and when does the renewal window open and close?
What good answers sound like

Procurement-ready clause sent during evaluation. Specific notice window. Named portal. Written confirmation receipt. Multi-year offer carries an explicit out-clause.

What bad answers reveal

Hedging. “Our team will guide you through that.” Refusal to put it in evaluation documentation. Auto-renewal buried in click-through ToS.
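
The dates are worth computing yourself before you sign. A minimal Python sketch of the arithmetic follows; the renewal date and the 150-day and 60-day limits are hypothetical contract terms, and real clauses may count business days or require receipt rather than sending.

```python
from datetime import date, timedelta

def cancellation_window(renewal_date, opens_days_before, closes_days_before):
    """Return (first_day, last_day) on which written notice is accepted."""
    if opens_days_before < closes_days_before:
        raise ValueError("the window must open before it closes")
    first_day = renewal_date - timedelta(days=opens_days_before)
    last_day = renewal_date - timedelta(days=closes_days_before)
    return first_day, last_day

# Hypothetical contract: renews 1 March 2026, notice accepted
# from 150 days before renewal until 60 days before renewal.
opens, closes = cancellation_window(date(2026, 3, 1), 150, 60)
print(f"window opens {opens.isoformat()}, closes {closes.isoformat()}")
```

Put both dates in the calendar of whoever owns the contract, not only in the contract folder.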

Q02 · Mechanics · Question 02
What are the total all-in costs — setup, per-seat, professional services, training, contractual escalators — at the 3-year horizon?
What good answers sound like

Itemized line items. Named professional-services SOWs if required to reach demo value. Contractual escalator percentage and trigger.

What bad answers reveal

“Pricing depends on usage.” Quoted seat price only. No escalator disclosed. Services billed separately as “transformation.”
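
To see how far the all-in figure can drift from the quoted seat price, here is a rough Python sketch. Every number in it is hypothetical, and it assumes the escalator compounds once per year on the seat price only; a real contract may apply it differently.

```python
def three_year_cost(seats, seat_price, setup_fee, services, training, escalator):
    """All-in cost over three years, with an annual escalator on the seat price."""
    total = setup_fee + services + training  # one-time costs, paid in year 1
    for year in range(3):
        # Year 0 is the quoted price; each later year compounds the escalator.
        total += seats * seat_price * (1 + escalator) ** year
    return total

# Hypothetical quote: 10 seats at $5,000 each, 7% annual escalator,
# plus $20,000 of one-time setup, services, and training fees.
quoted = 10 * 5_000 * 3
all_in = three_year_cost(10, 5_000, setup_fee=5_000, services=12_000,
                         training=3_000, escalator=0.07)
print(f"seat price x 3 years: ${quoted:,.0f}")
print(f"all-in at 3 years:    ${all_in:,.0f}")
```

The gap between the two printed numbers is the part of the budget a seat-price-only quote leaves out.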

Q03 · Mechanics · Question 03
Has this product been acquired or had majority ownership change in the last 24 months? Show me the integration roadmap and what changes for me.
What good answers sound like

Honest history. Roadmap with dates. Named SLAs that carry over. Clear answer on whether your tool will be folded into a parent suite.

What bad answers reveal

“Nothing changes for our customers.” Roadmap deferred. SLA changes hidden in renewal paperwork.

Category B · Coverage
The SLA reality
Q04 · Coverage · Question 04
Give me a written list of sources you DO NOT monitor in my industry and region — paywalled outlets, API-throttled platforms, regional press categories you miss.
What good answers sound like

A specific list. Named paywalled publications. Named regional gaps. Honest about API throttling on Meta/X.

What bad answers reveal

“We cover 1B+ sources.” Refusal to specify gaps. “We can add any source on request.” (You’ll be paying for that.)

Q05 · Coverage · Question 05
What does Day 1 look like — before the platform has 30 days of my data — and what does the tool admit it can’t yet tell me?
What good answers sound like

Calibrated humility. “Day 1 you get X. Days 7–30 you get Y. The platform tells you what it’s still learning.”

What bad answers reveal

“Day 1 you’re fully operational.” (No platform is. The vendor confident here is the vendor who will confidently make things up.)

Q06 · Coverage · Question 06
Does the data I download match the dashboard data my executive sees? Show me a side-by-side. Show me the gap.
What good answers sound like

Live side-by-side demo, with the gap explained. Documentation on which dashboard fields are computed versus raw.

What bad answers reveal

Discomfort with the question. “Dashboards optimize for speed; the export is authoritative.” (You’re about to send an exec a misleading screenshot.)
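
If the vendor will not run the side-by-side, you can run it yourself on a trial account. The sketch below is one way to do that in Python, assuming you have typed the dashboard figures and the export totals into two dictionaries; the field names and values are hypothetical.

```python
def parity_gaps(dashboard, export, tolerance=0.0):
    """Return {field: (dashboard_value, export_value)} where the two disagree."""
    gaps = {}
    for field in sorted(set(dashboard) | set(export)):
        d, e = dashboard.get(field), export.get(field)
        # A field missing on either side counts as a gap too.
        if d is None or e is None or abs(d - e) > tolerance:
            gaps[field] = (d, e)
    return gaps

# Hypothetical figures for the same brand and the same date range.
dashboard = {"mentions": 1200, "reach": 4_500_000, "negative_share": 0.18}
export = {"mentions": 1200, "reach": 3_900_000, "negative_share": 0.22}

for field, (d, e) in parity_gaps(dashboard, export).items():
    print(f"{field}: dashboard={d} export={e}")
```

Each field it prints is one to ask the vendor about: is the dashboard value computed, and from what?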

Category C · Sentiment + signal
The AI reality
Q07 · Sentiment + signal · Question 07
Run sentiment scoring live, on 50 mentions of MY brand from the last 30 days, and let me grade it before we sign anything.
What good answers sound like

Yes — here is the test. Real data. We will note where we disagree with your grading and why.

What bad answers reveal

Refusal. “We can show you our customer testimonials.” Sentiment shown only on staged-demo data.
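
Grading the test takes one division. A minimal Python sketch, using the reviewer's numbers quoted earlier in this guide (50 mentions, 18 corrected); the labels and the function name are illustrative assumptions, not any vendor's export format.

```python
def spot_check_accuracy(graded):
    """graded: list of (tool_label, your_label) pairs for mentions you graded."""
    if not graded:
        raise ValueError("grade at least one mention")
    agreed = sum(1 for tool, yours in graded if tool == yours)
    return agreed / len(graded)

# 50 mentions: the tool and the grader agree on 32, the grader corrects 18.
graded = [("positive", "positive")] * 32 + [("positive", "negative")] * 18
accuracy = spot_check_accuracy(graded)
print(f"{accuracy:.0%}")  # prints 64%
```

Fifty mentions is a small sample, so treat the result as a rough reading rather than a benchmark. It is still your data, graded by you.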

Q08 · Sentiment + signal · Question 08
What’s your false-positive rate on critical alerts in the last quarter, across your customer base? Show me the data.
What good answers sound like

Named percentage. Explanation of how false positives are detected and surfaced. A path to tune precision in the deployment.

What bad answers reveal

“We don’t track that.” Or: “Alerts are configurable.” (Which means: you’ll do this work, not them.)
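
A named percentage is easy to check if the vendor shares the alert log. The Python sketch below assumes that "false-positive rate" here means the share of fired critical alerts later judged to be false alarms (one minus precision); the strict statistical false-positive rate would also need data on events that never triggered an alert. The quarter's numbers are hypothetical.

```python
def false_alarm_share(alerts):
    """alerts: list of booleans, True if the alert was a real escalation."""
    if not alerts:
        raise ValueError("no alerts in the period")
    false_alarms = sum(1 for was_real in alerts if not was_real)
    return false_alarms / len(alerts)

# Hypothetical quarter: 40 critical alerts fired, 28 were real escalations.
alerts = [True] * 28 + [False] * 12
share = false_alarm_share(alerts)
print(f"false alarms: {share:.0%}, precision: {1 - share:.0%}")
```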

Q09 · Sentiment + signal · Question 09
What model powers your classification, what was the training data, what’s the documented accuracy, and how often is it retrained against my industry?
What good answers sound like

Named model family. Training-data composition. Public or auditable accuracy benchmark. Retraining cadence per industry vertical.

What bad answers reveal

“Our proprietary AI.” “We can’t disclose model specifics for competitive reasons.” (That means there is no model worth disclosing.)

Category D · Interpretation + decision
The value reality
Q10 · Interpretation + decision · Question 10
After your tool surfaces an insight, what does it tell me to DO about it — and what’s the doctrine or research grounding that recommendation?
What good answers sound like

A specific action (“stand down,” “respond now,” “amplify”) with a named doctrine (crisis-tier framework, frame-analysis research) grounding the recommendation.

What bad answers reveal

“We surface the insight; you decide.” (Fine — if the vendor admits this stops at insight. Red flag when they call this a feature instead of a limitation.)

Q11 · Interpretation + decision · Question 11
How is “normal” calibrated for MY brand specifically — anchored to what external data — and how does the platform distinguish “Tuesday for this brand” from “something is actually moving”?
What good answers sound like

Per-brand calibration. Named external anchor (reputation index, external research, calibrated baseline). Explicit math on what counts as a deviation from this brand’s normal.

What bad answers reveal

“We benchmark against industry averages.” (Industry averages don’t tell you what is normal for Michael J. Fox versus Tesla. The whole point is per-brand.)
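
"Explicit math" can be as simple as a z-score against the brand's own history. The Python sketch below is one possible form of that calculation, not any vendor's method; the daily counts and the two brands are hypothetical, and a real platform would likely also account for seasonality and trend.

```python
from statistics import mean, stdev

def deviation_from_normal(history, today):
    """z-score of today's value against this brand's own daily history."""
    if len(history) < 2:
        raise ValueError("need at least two days of history")
    spread = stdev(history)
    if spread == 0:
        raise ValueError("history has no variation; z-score is undefined")
    return (today - mean(history)) / spread

# Hypothetical daily negative-mention counts for two brands.
noisy_brand = [90, 110, 95, 105, 100, 120, 80]  # averages 100 per day
quiet_brand = [4, 6, 5, 5, 4, 6, 5]             # averages 5 per day

# The same jump of 15 mentions above each brand's average:
z_noisy = deviation_from_normal(noisy_brand, 115)
z_quiet = deviation_from_normal(quiet_brand, 20)
print(f"noisy brand: {z_noisy:.1f} standard deviations")
print(f"quiet brand: {z_quiet:.1f} standard deviations")
```

The same fifteen extra mentions sit close to an ordinary day for the noisy brand and far outside anything the quiet brand has seen. An industry average cannot show that difference.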

Q12 · Interpretation + decision · Question 12
From the moment a story breaks and my CCO asks "do we respond?" to the moment I have a defensible answer for the board — what is the typical clock time? Show me three real examples.
What good answers sound like

Stopwatch number with examples. Named customer outcomes ("CCO asked at 8:14am, board-ready answer at 8:46am").

What bad answers reveal

"It depends." "Our tool dramatically accelerates analyst workflows." (Workflows are not decisions.)
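
When the vendor does give you three examples, reduce them to one number. A small Python sketch, assuming times are written in 24-hour form; the first pair is the 8:14 to 8:46 example above, and the other two are hypothetical.

```python
from datetime import datetime
from statistics import median

def minutes_to_answer(asked, answered, fmt="%H:%M"):
    """Elapsed minutes between two same-day clock times such as '08:14'."""
    start = datetime.strptime(asked, fmt)
    end = datetime.strptime(answered, fmt)
    if end < start:
        raise ValueError("answer time precedes question time")
    return (end - start).total_seconds() / 60

examples = [
    ("08:14", "08:46"),  # the example quoted above
    ("13:05", "14:20"),  # hypothetical
    ("17:30", "18:02"),  # hypothetical
]
durations = [minutes_to_answer(asked, answered) for asked, answered in examples]
print(durations)
print(f"median: {median(durations):.0f} minutes")
```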

Twelve questions. Every regret pattern in this guide names one of them. Run them against any vendor in this category.

04 · The good answers

What good answers sound like, across every question.

Whichever question you are asking, the same five attributes separate credible answers from vendor-speak. If a vendor's answer to any of the twelve questions lacks these five qualities, the answer is coming from the marketing department, not from engineering.

Five universal attributes
  • Specific.

    Names the math. Names the data. Names the SLA. No “our AI handles that.”

  • Demoable.

    Can be tested on your actual brand, today, with your real data — not staged tutorial data.

  • Documented.

    Backed by documentation, customer references, third-party audits, or research citations the vendor can name.

  • Honest about limits.

    Names what the tool cannot do as readily as what it can. Vendors who admit limits are the ones who actually know their product.

  • Time-boxed.

    Answers a specific question in a specific window — not a “transformation journey.”

If the answer requires three months of professional services to materialize, the tool is the kit. You are the product.

05 · The bad answers

What bad answers reveal.

Five red flags surface in vendor evaluations. Each one signals the same thing — the vendor's marketing is doing the talking, and the engineering isn't there to back it up.

Five universal red flags
  • “Our AI handles that.”

    Stop. Ask for the model, the training data, the accuracy. Vendors who cannot name them are flagging that there isn’t one worth naming.

  • Hedging on contract mechanics.

    If the vendor will not put the cancellation procedure in writing during procurement, what do you think they will do at renewal?

  • NDA before performance data.

    If their performance numbers require an NDA, the numbers are weaker than their marketing.

  • “Custom solution” language for standard capabilities.

    Translation: this is not built. They will quote you services to build it.

  • Source-count one-upping.

    “We have 1B+ sources.” Fine — show me which ones you do not have in my space.

The pattern across every regret story: the vendor knew the answer at sale. The buyer learned it at renewal.

06 · The test

The 10-minute test — for any demo.

Three scenarios. Run them live in any vendor demo. If the vendor cannot, the rubric did its job.

Step 01 · 3 min

The Tuesday test.

The test

Pick a brand in your portfolio the vendor has never seen. Ask the vendor: “What is normal for this brand? Show me their last 30 days.”

The tell

A vendor with real calibration will hesitate or admit cold-start limits. A vendor without it will confidently make something up. The confidence-on-no-data move is the tell.

Step 02 · 4 min

The shape test.

The test

Show the vendor 30 days of coverage from a brand you know is in a chronic-negative pattern — constant low-grade complaints, no acute event. Ask: “Is this a problem?”

The tell

Good answer is contextual — “compared to what is normal for them.” Bad answer is a sentiment score. The sentiment-score-as-answer move is the tell.

Step 03 · 3 min

The move test.

The test

Pick one piece of coverage from your portfolio that landed in a real decision last quarter. Ask the vendor: “What would your tool recommend we do here?”

The tell

If the answer is a dashboard, the tool stops at insight. If the answer is a specific action with a doctrine grounding, the tool runs the chain. The dashboard-as-answer move is the tell.

Ten minutes. Three scenarios.

You will learn more than you would in two hours of feature deep-dives.

And one more category

What about AI brand monitoring?

You'll get pitched a “sixth category” in 2026 that the previous five categories didn't include: AI brand monitoring tools (Profound, Athena, Goodie, Brandlight, Peec, and a growing field). They promise to track what ChatGPT, Claude, Perplexity, and Gemini say about your brand, and to “optimize” how AI answers questions about you.

A real Peec user, in an independent review
“The data is useful, but it will not automatically hand you a complete execution plan.”

That's the category's structural limit, in one sentence. The tools surface what LLMs say about your brand. They don't tell you what to do about it. They don't show that what LLMs say is even moving the needle. It's a measurement layer over a signal that may or may not matter, with no built-in path to action.

The deeper limit is technical. Most LLMs answer brand queries from training data that's frozen between model releases — months apart. When they do retrieve from the web, they call the same search engines you've been optimizing for since 2005 (Bing, Google, Brave). There's no separate “AI optimization” lever to pull. Search Engine Land, Digiday, CXL, and Google's own representatives have all said versions of this since early 2025. The full technical analysis is in our companion piece:

Is AI brand monitoring worth it? →

Five questions to ask any AI brand monitoring vendor

If you're still in their pipeline, here's how to evaluate them.

  • 01

    What specific work does this tool replace that you cannot do already?

    The honest answer is usually "citation tracking and snippet diagnostics" — which are features, not a category. If the vendor answers with abstractions, they don't have a clear answer.

  • 02

    What is the cadence of the underlying signal — does it actually change daily?

    Most LLMs reason from training data that's frozen between model releases (months apart). Daily monitoring of a static signal is theater. Ask the vendor to show you a week of data that actually moved.

  • 03

    What is your contract structure and cancellation window?

    Customer-review data across the category shows the loudest complaint is not the product but the commercial model: auto-renewal traps, 60–90 day notice requirements, multi-year contracts. If the vendor will not put the cancellation procedure in writing during procurement, they will not respect it at renewal.

  • 04

    Can you show me one customer who attributes measurable traffic or revenue lift to your tool?

    Otterly.ai's own published testing found "no consistent correlation between AI brand mentions and actual traffic lifts." If the category leader has not produced this correlation in its own data, expect that the vendor pitching you cannot either.

  • 05

    If LLMs don't search the web for most queries, what daily action would your dashboard surface?

    This is the question that ends the meeting. Most vendors don't have a clean answer — because most LLM brand answers come from frozen training data, the daily action surface is genuinely thin.

The vendors who answer these questions plainly are the ones worth shortlisting. The vendors who deflect are the ones you should not be buying.

07 · The template

The twelve questions, RFP-ready.

Take the rubric into procurement. Copy it as a Markdown table, paste it into your RFP system, or hand it to legal. Ungated — no email, no form. The questions are yours.

RFP-TEMPLATE.MD · 1.4 KB · Markdown
# Brand Monitoring Vendor Evaluation Rubric

Twelve questions, four categories. Sourced from authentic
post-purchase regret patterns. Run all twelve against any
vendor in this category — media monitoring, social listening,
or brand intelligence.

## Category A — Mechanics

1. What is the exact cancellation procedure, in writing, and
   when does the renewal window open and close?

2. What are the total all-in costs — setup, per-seat,
   professional services, training, contractual escalators —
   at the 3-year horizon?

3. Has this product been acquired or had majority ownership
   change in the last 24 months? Show me the integration
   roadmap and what changes for me.

## Category B — Coverage

4. Give me a written list of sources you do not monitor in
   my industry and region…

… (Content truncated for preview. Full 12 questions copied
to clipboard on action.)

Use it in your RFP. Share it with the team. The questions are the questions, whoever asks them.

Common questions

What buyers ask before signing.

How do I evaluate a brand monitoring tool?

Run twelve questions across four categories — mechanics (cancellation, total cost, ownership stability), coverage (gaps the vendor admits to, day-one performance, dashboard-vs-export parity), sentiment + signal (live sentiment test, false-positive rate, model transparency), and interpretation + decision (recommended action grounding, brand-specific calibration, time-to-defensible-answer). Vendors who can't answer cleanly during evaluation rarely improve after the contract.

What are red flags in a brand monitoring vendor demo?

Refusal to put cancellation terms in writing during evaluation. Vague answers to coverage gaps ("we cover 1B+ sources" without naming what's missing in your industry). Sentiment shown only on staged demo data, not your real coverage. "Our proprietary AI" with no model disclosure. "You decide" as a recommendation framework instead of a doctrine-grounded specific action. Each one is a signal the vendor's confidence outruns what the platform can actually do.

What is the 10-minute calibration test?

A reusable diagnostic for any vendor demo. Three scenarios: the Tuesday test (ask the vendor about a brand they've never seen — listen for honesty about cold-start limits), the shape test (show them a brand in a chronic-negative pattern — "is this a problem?" — listen for context, not a sentiment score), and the move test ("what would your tool recommend we do here?" — listen for an action grounded in doctrine, not a dashboard). Ten minutes, run during the demo, reveals more than two weeks of RFP responses.

How is this guide different from a typical RFP?

Most RFPs ask product questions ("does it support X feature"). This guide asks reality questions — what the vendor admits to, what their data actually does when graded by you, what their AI is, and what their tool tells you to DO. The honest brand monitoring buyer's guide is built from the regret patterns of customers who already signed, not from feature lists vendors push.

How long does evaluating brand monitoring tools usually take?

Four to five weeks if you run a proper bake-off with the phases in sequence: initial demos (one week), the live sentiment test and the 10-minute calibration test on real data (one week), reference calls with named customers (one week), and procurement and contracting (one to two weeks). Faster paths are usually a vendor steering you past the test phases, the very phases that surface their limitations.

What's the difference between media monitoring, social listening, and brand intelligence?

Media monitoring counts coverage and tracks reach. Social listening scores sentiment across social. Brand intelligence reads what the coverage means for this specific brand against its own history and recommends a move grounded in research. The first two are data categories; brand intelligence is a comprehension category that sits on top of either or both.

Why should the data I export match what executives see in the dashboard?

Most platforms compute dashboard fields differently from raw export fields — dashboards optimize for visual clarity, exports return underlying data. When the two diverge silently, you end up sending the boardroom a screenshot the export contradicts. The right vendor will show you the gap live during evaluation and document which fields are computed vs raw.

08 · Where to go next

Every regret story above started the same way.

The buyer didn't ask.

Now you can.

Take the rubric into your next renewal. Into the next demo on your calendar. The vendors that answer well are the ones worth keeping — and the ones that can't will tell you why they couldn't, without meaning to.