The OfferWise methodology — How we analyze California disclosures

01 — The corpus

What we trained on.

Every prediction OfferWise makes leans on a labeled training corpus. We didn't scrape a generic real estate dataset — we built ours from the ground up, focused on the California buyer.

257,804 labeled findings
1,324 contradiction pairs
13 active municipal portals
40 finding-category baskets

The corpus combines three data sources:

Public-records crawls from 13 California municipal portals — permit history, code violations, NFIP flood claims, tax records. Refreshed on a rolling schedule.
Inspection report findings labeled with category, severity, and repair cost ranges. These are extracted text findings — not scraped from third-party report providers — annotated against a fixed schema of 40 finding-category baskets.
Disclosure-vs-inspection contradiction pairs — 1,324 hand-labeled cases where what the seller disclosed and what the inspector found don't agree. This is the dataset behind our contradiction model.
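
To make these sources concrete, here is a minimal Python sketch of what one labeled finding and one contradiction pair might look like as records. The class names, field names, severity labels, and sample values are illustrative assumptions, not OfferWise's actual schema; only the presence of a category, a severity, and a repair cost range comes from the description above.

```python
from dataclasses import dataclass

# Hypothetical severity labels; the real scale is not described in this document.
SEVERITIES = ("minor", "moderate", "major")


@dataclass(frozen=True)
class LabeledFinding:
    """One inspection finding annotated against the fixed category schema."""
    text: str             # extracted finding text
    category: str         # one of the 40 baskets, e.g. "roof"
    severity: str         # assumed label set, see SEVERITIES
    cost_low_usd: float   # lower bound of the repair cost range
    cost_high_usd: float  # upper bound of the repair cost range

    def __post_init__(self) -> None:
        # Reject records that do not fit the (assumed) schema.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if not 0 <= self.cost_low_usd <= self.cost_high_usd:
            raise ValueError("cost range must satisfy 0 <= low <= high")


@dataclass(frozen=True)
class ContradictionPair:
    """A seller disclosure claim paired with a finding, plus a hand label."""
    disclosure_claim: str
    finding: LabeledFinding
    contradicts: bool  # hand label: True when the two do not agree


# Made-up example record, for illustration only.
example = ContradictionPair(
    disclosure_claim="Seller is not aware of any roof leaks.",
    finding=LabeledFinding(
        text="Active water staining on attic sheathing near the chimney.",
        category="roof",
        severity="moderate",
        cost_low_usd=800.0,
        cost_high_usd=2500.0,
    ),
    contradicts=True,
)
```

Validating the severity label and cost range at construction time is one way to keep hand-labeled data consistent with a fixed schema.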

We do not buy or scrape inspection reports from any third-party provider. Findings in the corpus come from publicly available documents, code-enforcement records, and user-uploaded reports analyzed with the user's permission.

02 — The models

What runs on every analysis.

An OfferWise analysis is not a single model. It's a pipeline of specialized models, each doing one job well. Their outputs are then synthesized into the buyer report.

Category classifier. Sorts each finding into one of 40 baskets (roof, foundation, plumbing, electrical, etc.). Built on sentence-transformer embeddings + XGBoost.
Severity ranker. Predicts how serious each finding is on a calibrated scale, accounting for inspector phrasing patterns (individual inspectors tend to underrate or overrate consistently).
Repair-cost estimator. Predicts the dollar cost range for the finding, calibrated by region and home age. Outputs confidence intervals, not point estimates.
Contradiction detector. Compares every disclosure claim against every inspection finding to surface mismatches.
Public-records cross-reference. Pulls permits, code violations, NFIP claims, and assessor records for the property and matches them against the inspection and disclosure.
Property similarity profile. Encodes the property as a high-dimensional vector so the system can find comparable homes in the corpus and ground predictions in actual comparable data.
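
The comparable-home lookup in the last item can be pictured as a nearest-neighbor search over property vectors. The sketch below uses cosine similarity in plain Python. The similarity measure, the three-dimensional toy vectors, and the names `cosine_similarity` and `most_similar` are assumptions for illustration, since this page does not say how the vectors are built or compared.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(
    query: Sequence[float],
    corpus: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k corpus entries closest to the query, best first."""
    scored = [(pid, cosine_similarity(query, vec)) for pid, vec in corpus.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy vectors; real property profiles would have many more dimensions.
corpus_vectors = {
    "home_a": [0.9, 0.1, 0.3],
    "home_b": [0.1, 0.8, 0.5],
    "home_c": [0.85, 0.15, 0.35],
}
query_vector = [0.88, 0.12, 0.32]
neighbors = most_similar(query_vector, corpus_vectors, k=2)
```

A brute-force scan like this is fine for a toy corpus; at a few hundred thousand findings, an approximate nearest-neighbor index would be the more usual choice.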

Each model is trained separately and evaluated against held-out portions of the corpus. We don't ask a single language model to do everything — and we don't trust a single language model with any of it.
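
One common way to keep a held-out portion stable across training cycles is to assign each property to a split by hashing its identifier, so that all findings from one property land on the same side. The sketch below assumes that approach. This page does not say how OfferWise draws its held-out set, and `is_held_out`, the 20% fraction, and the synthetic property IDs are all hypothetical.

```python
import hashlib


def is_held_out(property_id: str, holdout_fraction: float = 0.2) -> bool:
    """Deterministically assign a property to the held-out split."""
    digest = hashlib.sha256(property_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number between 0 and 1.
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < holdout_fraction


# Synthetic records: 500 findings spread over 50 made-up properties.
records = [(f"property-{i % 50}", f"finding {i}") for i in range(500)]

train = [r for r in records if not is_held_out(r[0])]
held_out = [r for r in records if is_held_out(r[0])]

# No property contributes findings to both splits.
train_ids = {pid for pid, _ in train}
held_out_ids = {pid for pid, _ in held_out}
overlap = train_ids & held_out_ids
```

Splitting by property rather than by individual finding avoids a model being scored on findings from a home it has already seen during training.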

03 — What we measure

Honest accuracy numbers.

These are the numbers from our most recent training cycle, measured on held-out data:

85.8% category accuracy
0.82 cost R² (MAE $1,915)
97.1% contradiction precision
65.4% severity accuracy

What each number means in practice, including the ones we are less proud of:

Category accuracy at 85.8% means roughly 1 in 7 findings ends up in the wrong category at first pass. We surface these to the buyer with category-confidence indicators so they know which findings to double-check.
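Cost estimation MAE of $1,915 means our repair-cost estimates miss the true value by about $1,900 on average. For a $30,000 roof, that is roughly 6%, which is tight. For a $500 caulking fix, an error of that size would exceed the repair itself, so the error band is proportionally much larger on small items.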
Contradiction precision at 97.1% means when our model flags a disclosure-vs-inspection mismatch, it is almost always real. We tune this model to favor precision over recall — we'd rather miss some real contradictions than cry wolf.
Severity accuracy at 65.4% makes the severity ranker our weakest model. Severity is genuinely subjective: a "moderate" foundation crack means different things to different inspectors. We use the severity prediction as a sanity check rather than a primary input to the offer math.
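
For readers who want the definitions behind these four numbers, here is a minimal sketch of how accuracy, precision, MAE, and R² are conventionally computed, followed by the arithmetic behind the "1 in 7" and roof examples. The function names are illustrative. This shows the standard formulas, not OfferWise's evaluation code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Share of predictions that match the label exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """Of everything flagged positive, the share that is truly positive."""
    true_pos = sum(t and p for t, p in zip(y_true, y_pred))
    false_pos = sum((not t) and p for t, p in zip(y_true, y_pred))
    flagged = true_pos + false_pos
    return true_pos / flagged if flagged else 0.0


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average size of the miss, in the same units as the target (dollars)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Share of variance in the true values explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Arithmetic behind the prose, using the figures reported above.
error_rate = 1 - 0.858                 # about 0.142
one_in_n = 1 / error_rate              # about 7.04, hence "roughly 1 in 7"
roof_relative_error = 1915 / 30_000    # about 0.064, i.e. 6.4% of a $30,000 roof
# If the average error applied unchanged to a $500 fix (an assumption):
caulk_relative_error = 1915 / 500      # 3.83, i.e. 383% of the repair
```

Precision says nothing about missed contradictions; that is what recall measures, which is why a precision-first model can score 97.1% while still missing some real mismatches.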

We retrain the models on a rolling schedule. As more properties are analyzed, the training corpus grows, and we expect accuracy to improve with it. The numbers above reflect the most recent training cycle.

04 — What we don't do

Honest limitations.

OfferWise is a strong tool for what it does. We try to be clear about what it doesn't do.

We are not licensed inspectors. We don't replace a physical home inspection. We make the report you already have more useful.
We are not lawyers, agents, or contractors. Our cost estimates and findings are inputs to your decision, not legal advice or contractor quotes.
Vision-based image analysis is partial. When inspection PDFs require visual extraction (scanned reports, image-heavy documents), our system describes embedded photos in detail. Most modern text-extractable PDFs bypass this path — image analysis on every report is on the roadmap, not the current default.
We are strongest in California. The corpus is California-focused. Out-of-state analyses run on the same models but with less locality-specific training data. National coverage is growing.
Predictions, not certainty. Every output is a probability or a range. We try to be calibrated, not confident. Use our reports as the start of your due diligence, not the end of it.

How we turn 50 pages of paperwork into an offer strategy.
