Data integrity

How we keep ScoreView’s data trustworthy

Audit completed .

01 / Why this matters

Housing professionals decide based on what ScoreView shows them.

Regulatory positioning, complaint-handling priorities, and board decisions all rest on these figures. We owe you confidence that every number is what it claims to be.

This page explains what we checked, what we changed, and how the platform defends itself against the kind of silent errors that creep into any data product.

02 / What we checked

Three layers, end to end.

  1. Ingestion. How Housing Ombudsman determinations move from the public record into our database.
  2. Analytics. How individual records become benchmarks, trends, and peer comparisons.
  3. Presentation. How those numbers, and the AI briefings derived from them, reach you.

Across these layers we surfaced more than 30 specific risks, from numbers that could be misread and records that could be silently mis-categorised to AI-generated briefings that could be mistaken for Ombudsman quotes.

03 / What we fixed

Six changes, one principle: never let a failure look like a finding.

Silent defaults killed

Previously, if our scraper failed to extract a determination’s outcome, the record fell back to “no maladministration”, making an extraction failure indistinguishable from a landlord being cleared. Likewise, a missing publication date silently became today’s date, so the record surfaced in trend charts as fresh news. Both fallbacks are gone. Extraction failures now carry an explicit parse-error flag and are excluded from every analytics view as an enforced invariant.
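To make the difference concrete, here is a minimal sketch of a record builder that flags a failed extraction instead of substituting a default. It is illustrative only and is not ScoreView's actual code: the `Determination` class, the `build_record` function, and the field names are hypothetical, and dates are assumed to arrive in ISO format.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Determination:
    # Hypothetical record shape, for illustration only.
    case_ref: str
    outcome: Optional[str]      # None when the outcome could not be extracted
    published: Optional[date]   # None when the date could not be extracted
    parse_error: bool = False


def build_record(case_ref: str,
                 raw_outcome: Optional[str],
                 raw_date: Optional[str]) -> Determination:
    """Build a record without inventing values for missing fields."""
    # No fallback to "no maladministration": a missing outcome stays missing.
    outcome = None
    if raw_outcome and raw_outcome.strip():
        outcome = raw_outcome.strip().lower()

    # No fallback to today's date: an unreadable date stays missing.
    published = None
    if raw_date:
        try:
            published = date.fromisoformat(raw_date.strip())
        except ValueError:
            published = None

    failed = outcome is None or published is None
    return Determination(case_ref, outcome, published, parse_error=failed)
```

The point of the design is that a missing value stays visibly missing, so downstream code has to decide what to do with it rather than inheriting a plausible-looking default.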

Failures filtered by enforced rule

Every analytics query (sector outcome rates, landlord rankings, time-to-determination percentiles, category-level remedy rates) filters out parse-error records. An automated check runs on every code change to prevent any new query from forgetting this filter.
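One simple way such a check could work is sketched below, assuming analytics queries are kept as named SQL strings and the flag lives in a column called `parse_error`. Both are assumptions made for illustration, as is the `queries_missing_filter` helper. A text match like this is crude, and a real check might inspect the parsed query instead.

```python
import re

# Assumed convention: every analytics query must contain "parse_error = FALSE"
# (or "= 0"). The column name is hypothetical.
REQUIRED_FILTER = re.compile(r"parse_error\s*=\s*(false|0)\b", re.IGNORECASE)


def queries_missing_filter(queries: dict[str, str]) -> list[str]:
    """Return the names of queries that do not filter out parse-error records."""
    return sorted(
        name for name, sql in queries.items()
        if not REQUIRED_FILTER.search(sql)
    )


if __name__ == "__main__":
    example = {
        "sector_outcome_rates":
            "SELECT outcome, COUNT(*) FROM determinations "
            "WHERE parse_error = FALSE GROUP BY outcome",
        "landlord_rankings":
            "SELECT landlord, COUNT(*) FROM determinations GROUP BY landlord",
    }
    missing = queries_missing_filter(example)
    # A CI job could fail the build whenever this list is non-empty.
    print(missing)
```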

Small samples are labelled

Benchmarks and peer comparisons built on fewer than 25 records are labelled. Fewer than 10: low‑confidence. Fewer than 5: very‑low‑confidence. AI briefings cannot declare sector trends from underpowered samples and must flag when the most recent quarter is still in progress.
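The thresholds above can be sketched as a single function. This is an illustration under stated assumptions, not the platform's code: the `sample_label` name is hypothetical, and the wording of the label for samples of 10 to 24 records is assumed, since this page does not name it.

```python
from typing import Optional


def sample_label(n: int) -> Optional[str]:
    """Return the confidence label for a benchmark built on n records."""
    if n < 0:
        raise ValueError("record count cannot be negative")
    if n < 5:
        return "very-low-confidence"
    if n < 10:
        return "low-confidence"
    if n < 25:
        return "small sample"  # assumed wording for the 10 to 24 band
    return None  # 25 or more records: no label needed
```

Checking the strictest threshold first means each record count falls into exactly one band.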

Provenance you can quote

Every CSV, PDF, PowerPoint, and Word export carries a methodology page: the corpus snapshot date, the exact filter set used (as a reproducible hash), and a link back to the Housing Ombudsman source for every record. Email digests and the determination detail page do the same. AI-generated summaries are labelled clearly. They are ScoreView’s interpretation of structured metadata, never the Ombudsman’s wording.
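As an illustration of what a reproducible filter hash can look like, the sketch below serialises a filter set to canonical JSON and hashes it with SHA-256. The choice of SHA-256, the `filter_hash` name, and the filter keys in the usage note are assumptions. This page does not specify how ScoreView computes its hash.

```python
import hashlib
import json


def filter_hash(filters: dict) -> str:
    """Hash a filter set so the same filters always give the same digest."""
    # Sorting keys and fixing separators makes the JSON text independent of
    # the order in which the filters were chosen.
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this approach, `filter_hash({"year": 2024, "category": "repairs"})` and `filter_hash({"category": "repairs", "year": 2024})` give the same digest, so two exports built from the same filters can be matched. The order of items inside a list value still matters, so list-valued filters would need sorting before hashing.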

Staleness is visible

Every dashboard shows when the corpus was last refreshed. If our scheduled ingestion fails or stalls beyond a 10-day window, a red warning appears on screen. You will know the data is stale before you cite it.
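The staleness rule amounts to a single comparison. A minimal sketch, assuming the last successful refresh is stored as a timezone-aware timestamp (the `is_stale` name is hypothetical):

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=10)  # the 10-day window described above


def is_stale(last_refreshed: datetime, now: datetime) -> bool:
    """True when the last successful refresh is more than 10 days old."""
    return now - last_refreshed > STALE_AFTER


if __name__ == "__main__":
    refreshed = datetime(2025, 1, 1, tzinfo=timezone.utc)
    print(is_stale(refreshed, datetime(2025, 1, 8, tzinfo=timezone.utc)))
    print(is_stale(refreshed, datetime(2025, 1, 12, tzinfo=timezone.utc)))
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the rule easy to test.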

Source links on every record

Every determination on every screen (search results, detail pages, exports, alert emails, sector digests) links back to the original record on the Housing Ombudsman website. Crown copyright is respected: we store structured metadata only, never the determination text.

04 / What we proved at launch

664 records corrected, recategorised, regenerated, or excluded as unparseable

We applied the audit retroactively to every existing record.

  1. 50 outcomes corrected
  2. 245 recategorised
  3. 191 summaries regenerated
  4. 178 honestly excluded

05 / How we keep it true

The audit was a moment. These controls are permanent.

Seven controls are now built into the platform:

06 / How to read what you see

Every figure on this platform is five things.

When you read a ScoreView number, you can take it as:

AI-generated content (synthesis briefings, summaries, weekly digests) is labelled, and works only from cleaned structured metadata. It is not the Ombudsman’s wording, and it is not a legal opinion.

07 / Questions and corrections

If a number looks wrong, we want to know.

Contact us through your account page. Quote the URL and the date. We will trace the figure back to source and respond.

For the underlying source data we draw from, see our Data Sources and Methodology page.