Political · Published 2026-05-24

Find your campaign's 50 lookalike counties in 30 seconds

Pick a reference county where your campaign or program is already strong, then ask which other U.S. counties resemble it demographically (comparable population, income, education, and age) so the next ad buy or canvass plan can target lookalikes. One call to find_similar_areas returns the ranked list.

What "similar" means here

You pick the variables. find_similar_areas z-normalizes each one across the whole collection, so variables on very different scales (people, dollars, years) contribute comparably, then computes the cosine distance from the reference row to every other row. Lower distance = more similar. For lookalike modeling in politics the canonical set is population + median household income + bachelor's degrees + median age. For retail it's population + commute time + employment mix. For healthcare it's uninsured rate + median age + low-income households.
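To make the math concrete, here is a minimal in-memory sketch of "z-normalize, then rank by cosine distance". It is not the tool's implementation: the function names (zNormalize, cosineDistance, rankBySimilarity) are hypothetical, the use of the population standard deviation is an assumption, and the four rows are made-up numbers rather than ACS values.

```javascript
// Illustrative only: a tiny version of "z-normalize, then cosine distance".
// All names here are hypothetical and the sample rows are made up.

// Z-normalize each column across all rows (assumes population std. deviation).
function zNormalize(rows) {
  const nCols = rows[0].length;
  const out = rows.map(() => new Array(nCols).fill(0));
  for (let c = 0; c < nCols; c++) {
    const col = rows.map((r) => r[c]);
    const mean = col.reduce((a, b) => a + b, 0) / col.length;
    const variance = col.reduce((a, b) => a + (b - mean) ** 2, 0) / col.length;
    const sd = Math.sqrt(variance) || 1; // constant column: avoid dividing by zero
    rows.forEach((r, i) => { out[i][c] = (r[c] - mean) / sd; });
  }
  return out;
}

// Cosine distance = 1 - cosine similarity: 0 is same direction, 2 is opposite.
function cosineDistance(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  const denom = Math.sqrt(na) * Math.sqrt(nb);
  return denom === 0 ? 1 : 1 - dot / denom; // zero vector: treat as unrelated
}

// Rank every non-reference row by ascending distance and keep the top `limit`.
function rankBySimilarity(rows, refIndex, limit) {
  const z = zNormalize(rows);
  return z
    .map((v, i) => ({ index: i, distance: cosineDistance(z[refIndex], v) }))
    .filter((h) => h.index !== refIndex)
    .sort((x, y) => x.distance - y.distance)
    .slice(0, limit);
}

// Columns: population, median income, bachelor's count, median age (made up).
const rows = [
  [1000000, 90000, 250000, 36], // reference row
  [950000, 88000, 240000, 37],  // close to the reference on every variable
  [20000, 45000, 2000, 48],     // small, lower-income, older
  [500000, 60000, 80000, 40],   // mid-sized
];
const ranked = rankBySimilarity(rows, 0, 2);
console.log(ranked);
```

Because cosine distance compares the direction of the z-scored vectors rather than their length, two counties match when they sit above or below the national average on the same variables in similar proportions.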

Step 1 — Materialize the variables once

The system Census cache covers the headline variables (B01001_001E population, B19013_001E median income, B15003_022E bachelor's, B01002_001E median age — and ~16 others). When the cache is warm, this step is read-only.

mcp.call("census_acs", {
  collection: "us-counties", version: "tiger-2024", year: 2023,
  variables: ["B01001_001E", "B19013_001E", "B15003_022E", "B01002_001E"],
  slug: "acs-counties-2023-headline",
})
  → 3,143 counties × 4 variables → workspace dataset

Step 2 — Find the lookalikes (one call)

mcp.call("find_similar_areas", {
  dataset_slug: "acs-counties-2023-headline",
  reference_external_id: "",
  variables: ["B01001_001E", "B19013_001E", "B15003_022E", "B01002_001E"],
  metric: "cosine",
  limit: 50,
})
  // → reference: { external_id, name, raw_values: { ...the input variables } }
  // → hits: [ { external_id, name, distance }, ... ]
  //   sorted ascending by cosine distance, capped at limit

That's the full modeling step. One premium MCP call returns the ranked list; the remaining steps only map and publish it.
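Once the hits come back, a common next move is to bucket them by state for media planning. The sketch below does that on hits in the shape shown above. It assumes external_id is a 5-digit county FIPS code whose first two digits identify the state; the source does not state the ID format, so treat that as an assumption. groupHitsByState is a hypothetical helper, and the sample hits are placeholders, not real output.

```javascript
// Assumption: external_id is a 5-digit county FIPS code, so its first two
// digits identify the state. The ID format is not confirmed by the tool docs.
function groupHitsByState(hits, maxDistance = Infinity) {
  const byState = new Map();
  for (const h of hits) {
    if (h.distance > maxDistance) continue; // drop weak matches
    const state = String(h.external_id).slice(0, 2);
    if (!byState.has(state)) byState.set(state, []);
    byState.get(state).push(h);
  }
  return byState;
}

// Placeholder hits in the documented shape; IDs, names, distances are made up.
const sampleHits = [
  { external_id: "48001", name: "Example County A", distance: 0.02 },
  { external_id: "48003", name: "Example County B", distance: 0.05 },
  { external_id: "13001", name: "Example County C", distance: 0.07 },
  { external_id: "08001", name: "Example County D", distance: 0.4 },
];
const grouped = groupHitsByState(sampleHits, 0.1);
for (const [state, counties] of grouped) {
  console.log(state, counties.map((c) => c.name));
}
```

The distance cutoff is optional. With limit: 50 the list is already capped, but the 50th hit can be far weaker than the 5th, so a cutoff keeps the plan focused on close matches.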

Step 3 — Render the lookalikes as a choropleth

Synthesize a derived dataset where each county's value is 1 - distance (so higher = more similar), then render as a sequential choropleth.

mcp.call("ingest_dataset", {
  slug: "wake-lookalikes-2023", name: "Wake County NC lookalikes",
  records: hits.map((h, i) => ({
    external_id: h.external_id,
    similarity: 1 - h.distance,
    rank: i + 1,
  })),
  collection: "us-counties", version: "tiger-2024",
})

mcp.call("render_map", {
  collection: "us-counties", version: "tiger-2024",
  dataset: "wake-lookalikes-2023", field: "similarity",
  palette: "sequential",
  classify: { method: "quantile", bins: 5 },
  title: "Lookalikes for Wake County, NC",
  subtitle: "Cosine similarity over pop, income, bachelor's, median age (ACS 2023)",
  pin_overlays: [{
    name: "Reference",
    color: "#dc2626",
    points: [{ lng: -78.6382, lat: 35.7796, label: "Wake County, NC" }],
  }],
})
  → SVG + breaks + categories
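For readers wondering what classify: { method: "quantile", bins: 5 } does to the similarity values, here is a rough sketch of quantile classification: pick breaks so each bin holds about the same number of counties. The interpolation rule is an assumption and the renderer's own breaks may differ; quantileBreaks and classify are hypothetical helpers, and the values are made up.

```javascript
// Sketch of quantile classification. Assumes linear interpolation between
// sorted values; the renderer's actual rule for breaks may differ.
function quantileBreaks(values, bins) {
  const sorted = [...values].sort((a, b) => a - b);
  const breaks = [];
  for (let k = 1; k < bins; k++) {
    const pos = (k * (sorted.length - 1)) / bins; // fractional rank of the k-th cut
    const lo = Math.floor(pos);
    const hi = Math.ceil(pos);
    breaks.push(sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo));
  }
  return breaks;
}

// Return the 0-based bin for a value: bin 0 is the least similar group.
function classify(value, breaks) {
  let bin = 0;
  while (bin < breaks.length && value > breaks[bin]) bin++;
  return bin;
}

// Made-up similarity scores (1 - distance) for eleven counties.
const similarities = [0.99, 0.97, 0.95, 0.9, 0.88, 0.8, 0.75, 0.7, 0.6, 0.5, 0.4];
const breaks = quantileBreaks(similarities, 5);
console.log(breaks, classify(0.99, breaks));
```

Quantile bins guarantee that every color class is populated, which suits a top-50 list where similarity scores cluster near 1 and equal-width bins would leave most classes empty.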

Step 4 — Publish for citation

mcp.call("publish_map", {
  view_id: …,
  freeze: true,
})
  → /v/wake-lookalikes-2023   (iframe · PNG · citation.json with the exact
                                filter that produced it)

Why this matters

Lookalike modeling in a Python notebook is about a week of work: pulling four ACS variables, joining them to counties, z-normalizing, computing cosine distance, sorting, joining back to county geometries for a choropleth, and re-running across vintages without breakage. Here, one MCP call does the modeling and four more cover the data pull, the derived dataset, the map, and publication. The math is deterministic: same input → same output, same colors, every time. Citation-grade.

Try it yourself

Free tier — 50 calls / month, no card.

Get an API key

Tools used: census_acs, find_similar_areas, ingest_dataset, render_map (with pin_overlays), publish_map. find_similar_areas is premium-gated.