Methodology

How Mirr scores AI brand visibility.

Every number in a Mirr report is either computed deterministically from public signals or scored by a named model against an explicit rubric. This page documents the full pipeline, every formula, and the rationale behind each design choice, so anyone can reproduce or challenge a result.

The four-stage pipeline

An audit runs through four stages. Stage 2 is fully deterministic: given the same brand name, website URL and competitor list, it returns the same numbers every time. Stage 1's metrics are computed deterministically from the stored responses, though the responses themselves can drift slightly between runs (see "What varies between runs" below). Stages 3 and 4 use AI, but operate on the data from stages 1 and 2 as ground truth, which keeps the final output anchored to facts.

Stage 1: Multi-LLM research

We run 15 standardised customer-discovery queries (for example, “best AI visibility audit tools for SaaS in 2026”) on four AI platforms: ChatGPT (OpenAI gpt-4o-mini), Perplexity (sonar), Claude (Haiku 4.5) and Gemini (1.5 Flash). That is 60 API calls per audit. Every response is stored verbatim so you can read them in the report's Raw Responses appendix.
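
Structurally, stage 1 is a plain fan-out: the same 15 queries against four independent clients, with every response stored verbatim. A minimal sketch in Python; run_research and the stub client are illustrative names, not Mirr's production code:

    from typing import Callable

    Asker = Callable[[str], str]

    def run_research(queries: list[str], platforms: dict[str, Asker]) -> dict[str, dict[str, str]]:
        """Return {platform: {query: verbatim response}} for the Raw Responses appendix."""
        if len(queries) != 15:
            raise ValueError("an audit uses exactly 15 queries")
        return {name: {q: ask(q) for q in queries} for name, ask in platforms.items()}

    # Usage with a stub client; real wrappers would sit in front of the vendor
    # SDKs behind gpt-4o-mini, sonar, Haiku 4.5 and 1.5 Flash.
    stub = lambda q: f"(verbatim response to: {q})"
    responses = run_research([f"query {i}" for i in range(1, 16)],
                             {"chatgpt": stub, "perplexity": stub,
                              "claude": stub, "gemini": stub})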

Stage 2: Web presence audit

We check nine signals AI crawlers rely on. Every check is a yes or no; there is no AI inference in this stage.

  • A Wikipedia page (checked via the official Wikipedia REST API, not inferred)
  • A verified LinkedIn company page
  • Schema.org structured data on your homepage
  • A meta title
  • A meta description
  • Open Graph tags
  • Twitter card tags
  • A canonical URL
  • Presence on eight review platforms (G2, Capterra, Trustpilot, Klantenvertellen, Clutch, DesignRush, Google Business and AlternativeTo)
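
Six of the homepage checks reduce to pattern matches over raw HTML, and the Wikipedia check is a single REST call. A standard-library sketch; the exact regexes are our illustrative assumptions, not the production checks:

    import re
    import urllib.error
    import urllib.parse
    import urllib.request

    def has_wikipedia_page(title: str) -> bool:
        # Official Wikipedia REST API: HTTP 200 if the page exists, 404 if not.
        url = ("https://en.wikipedia.org/api/rest_v1/page/summary/"
               + urllib.parse.quote(title))
        try:
            return urllib.request.urlopen(url, timeout=10).status == 200
        except urllib.error.HTTPError:
            return False

    def homepage_signals(url: str) -> dict[str, bool]:
        # Yes/no checks over the homepage HTML; heuristic regexes for illustration.
        html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
        return {
            "schema_org":       "application/ld+json" in html,
            "meta_title":       bool(re.search(r"<title[^>]*>[^<]+</title>", html, re.I)),
            "meta_description": bool(re.search(r'<meta[^>]+name=["\']description["\']', html, re.I)),
            "open_graph":       bool(re.search(r'<meta[^>]+property=["\']og:', html, re.I)),
            "twitter_card":     bool(re.search(r'<meta[^>]+name=["\']twitter:card["\']', html, re.I)),
            "canonical":        bool(re.search(r'<link[^>]+rel=["\']canonical["\']', html, re.I)),
        }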

Stage 3: Perception scoring

Claude Haiku 4.5 reads the aggregated research output from stage 1 and scores the brand on five categories, each 0-100 against an explicit rubric: Awareness (does AI know you exist?), Consideration (are you on the shortlist for relevant queries?), Decision (do recommendations end with you as the choice?), Sentiment (is the tone positive?) and Cultural relevance (does AI connect you to the right cultural signals?). The Visibility Score is the average of the five.
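
The only arithmetic in this stage is the final average. A sketch of that step, assuming the five rubric scores arrive as a plain mapping from the scoring model:

    from statistics import mean

    CATEGORIES = ("awareness", "consideration", "decision", "sentiment", "cultural")

    def visibility_score(rubric_scores: dict[str, int]) -> float:
        # Five 0-100 rubric scores in, unweighted mean out.
        assert set(rubric_scores) == set(CATEGORIES)
        assert all(0 <= v <= 100 for v in rubric_scores.values())
        return mean(rubric_scores[c] for c in CATEGORIES)

    # Example: a brand that is known but rarely the final recommendation.
    visibility_score({"awareness": 62, "consideration": 48, "decision": 35,
                      "sentiment": 71, "cultural": 40})   # 51.2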

Stage 4: Strategic synthesis

Claude Opus 4.7 combines the stage 1 research, the stage 2 signals and the stage 3 scores into an identity gap analysis (six dimensions, 0-10 each), a competitor benchmark with per-competitor strengths and weaknesses, and a seven-step action plan. Every action in the plan must reference a specific finding; generic advice is rejected by the prompt contract.
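
The prompt contract is enforced at generation time, but the same constraint can be checked mechanically after the fact. A hypothetical post-validator; the "id" and "references" field names are invented for illustration:

    def unreferenced_actions(actions: list[dict], finding_ids: set[str]) -> list[str]:
        # An action passes only if it cites at least one known finding id.
        return [a["id"] for a in actions
                if not set(a.get("references", [])) & finding_ids]

    # Example: the second action cites no finding, so it would be rejected.
    unreferenced_actions(
        [{"id": "a1", "references": ["f3"]}, {"id": "a2", "references": []}],
        {"f1", "f2", "f3"})   # ["a2"]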

Formulas

Every metric in a Mirr report is one of these:

  • Coverage % = (prompts mentioning your brand ÷ 15 total prompts) × 100. Regex word-boundary match against normalised text, case-insensitive. Deterministic. (A sketch of the deterministic formulas follows this list.)
  • Share of Voice % = (your brand mentions ÷ total brand mentions across all responses) × 100. Counts every occurrence of every provided brand name. Deterministic.
  • Web Presence Score = Wikipedia (25) + Schema.org (15) + LinkedIn (15) + meta title (10) + meta description (10) + Open Graph (8) + reviews (up to 8) + Twitter card (5) + canonical URL (4). Maximum 100. Deterministic.
  • Visibility Score = mean of five category scores (Awareness, Consideration, Decision, Sentiment, Cultural). Each category 0-100, scored by Claude Haiku against a rubric. Ranges: 0-20 Invisible, 20-50 Weak-Emerging, 50-75 Established, 75-100 Dominant.
  • Identity Gap = six 0-10 scores for Tone, Values, Audience, Market Position, Brand Promise, Emotional Association. Scored in stage 4 by comparing your intended positioning (optional input) to actual AI perception.
  • Per-LLM Coverage = same Coverage formula run independently per platform. Reveals whether your visibility is broad (balanced across ChatGPT, Perplexity, Claude, Gemini) or fragile (concentrated in a single corpus).
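
The deterministic formulas fit in a few lines. A sketch, with one flagged assumption: the text above says reviews contribute "up to 8" points, and the sketch awards one point per review platform found.

    import re

    def mentions(text: str, brand: str) -> int:
        # Case-insensitive, word-boundary count; text is assumed pre-normalised.
        return len(re.findall(rf"\b{re.escape(brand)}\b", text, re.IGNORECASE))

    def coverage_pct(responses: list[str], brand: str) -> float:
        # Share of prompts whose response mentions the brand at least once.
        return 100 * sum(mentions(r, brand) > 0 for r in responses) / len(responses)

    def share_of_voice_pct(responses: list[str], brand: str, brands: list[str]) -> float:
        # This brand's share of every brand mention across all responses.
        total = sum(mentions(r, b) for r in responses for b in brands)
        return 100 * sum(mentions(r, brand) for r in responses) / total if total else 0.0

    # Fixed weights summing to 92; review platforms add up to 8 more points
    # (assumed 1 per platform), for a maximum of 100.
    WEIGHTS = {"wikipedia": 25, "schema_org": 15, "linkedin": 15,
               "meta_title": 10, "meta_description": 10, "open_graph": 8,
               "twitter_card": 5, "canonical": 4}

    def web_presence_score(signals: dict[str, bool], review_hits: int) -> int:
        return sum(w for k, w in WEIGHTS.items() if signals.get(k)) + min(review_hits, 8)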

Why these four LLMs

Between them, ChatGPT, Perplexity, Claude and Gemini account for more than 90 percent of AI-assisted brand discovery today. Each has a different training corpus and update cadence, so a brand that appears in one may be invisible in another. Measuring all four is the only honest way to describe your AI visibility.

  • ChatGPT (OpenAI gpt-4o-mini) — largest consumer base, trained on a curated web corpus with periodic refreshes.
  • Perplexity (sonar) — research-first, fetches live web, cites sources. Strongest signal for freshness.
  • Claude (Haiku 4.5) — used widely by developers and enterprises, high reasoning quality.
  • Gemini (1.5 Flash) — integrated with Google Search and Android. Essential for consumer discovery.

Why 15 queries

The 15 queries come from a fixed template per brand category: five discovery-intent questions (for example, “what are the top tools for X”), three comparison questions (for example, “best alternatives to Y”), three decision-stage questions, and four edge-case or long-tail questions. Running fewer queries produces unstable Coverage numbers; running more adds four API calls per query for diminishing returns. Fifteen is the smallest number at which rerun variance drops below five points for most brands.
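
To make the template concrete, here is the shape such a generator takes. The phrasings are illustrative placeholders, not Mirr's production template:

    def build_queries(category: str, brand: str, competitor: str) -> list[str]:
        discovery = [f"what are the top {category} tools in 2026",
                     f"best {category} tools for SaaS",
                     f"which {category} tool should a startup pick",
                     f"most recommended {category} platforms",
                     f"leading {category} vendors right now"]
        comparison = [f"best alternatives to {competitor}",
                      f"{brand} vs {competitor}: which is better",
                      f"how does {brand} compare to other {category} tools"]
        decision = [f"is {brand} worth the price",
                    f"what do reviews say about {brand}",
                    f"should I buy {brand} for my team"]
        edge = [f"{category} tools that work for non-English markets",
                f"cheapest way to do {category}",
                f"open-source options for {category}",
                f"{category} for a one-person business"]
        queries = discovery + comparison + decision + edge
        assert len(queries) == 15   # 5 + 3 + 3 + 4
        return queries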

What varies between runs

Deterministic signals (web presence, coverage counts) are stable to within one point across reruns. AI-scored metrics (Visibility Score, identity gap) have run-to-run variance of roughly five points, which is why we quote the Visibility Score as a band (for example “Weak-Emerging”) rather than a single number. The honest number is the band; the specific integer is a directional indicator.
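
The band is derived from the integer using the fixed edges listed in the formulas above. A small sketch; how exact boundary scores (20, 50, 75) are assigned is our assumption:

    def band(score: float) -> str:
        if score < 20: return "Invisible"
        if score < 50: return "Weak-Emerging"
        if score < 75: return "Established"
        return "Dominant"

    band(51.2)   # "Established"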

When scores move after changes

Deterministic web presence changes (canonical URLs, Schema.org, meta tags) are reflected the moment crawlers refetch your site, typically within 7 days. Perplexity picks up fresh content in 24 to 48 hours. ChatGPT, Claude and Gemini use training snapshots updated on their own schedules; expect LLM coverage gains to lag deterministic gains by 4 to 8 weeks.

Verifiability

Every audit PDF includes a Raw Responses appendix containing the first eight queries in full, with every LLM response. The deterministic numbers for those queries are computable from the responses with nothing more than a regex match. If a result looks wrong, you can check the source material directly. This is deliberate: we score brands for AI visibility, so the report itself should be accountable to the same standard of verifiability it measures.

Last updated

22 April 2026

See the methodology applied to your brand

Run a full Mirr audit in under 5 minutes. 15-page PDF. €99.

Start your audit →