Pick a reference county where your campaign or program is already strong, then ask which OTHER U.S. counties demographically resemble it — comparable population, income, education, age — so the next ad buy or canvass plan can target lookalikes. One call to find_similar_areas returns the ranked list.
You pick the variables. find_similar_areas z-normalizes them across the whole collection, then computes cosine distance from the reference row to every other row. Lower distance = more similar. For lookalike modeling in politics the canonical set is population + median household income + bachelor's degree holders + median age. For retail it's population + commute time + employment mix. For healthcare it's uninsured rate + median age + low-income households.
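To make the ranking concrete, here is a minimal sketch of the computation described above in plain JavaScript. It is not the tool's implementation: the function names, the toy rows, and the choice of population standard deviation are all assumptions made for illustration.

```javascript
// Illustrative sketch only. zNormalize, cosineDistance and rankSimilar are
// hypothetical helpers, not part of the MCP API.

// Z-normalize each column: subtract the column mean, then divide by the
// column standard deviation (population form assumed here).
function zNormalize(rows) {
  const nCols = rows[0].length;
  const means = [];
  const stds = [];
  for (let c = 0; c < nCols; c++) {
    const col = rows.map((r) => r[c]);
    const mean = col.reduce((a, b) => a + b, 0) / col.length;
    const variance =
      col.reduce((a, b) => a + (b - mean) ** 2, 0) / col.length;
    means.push(mean);
    stds.push(Math.sqrt(variance) || 1); // constant column: avoid divide-by-zero
  }
  return rows.map((r) => r.map((v, c) => (v - means[c]) / stds[c]));
}

// Cosine distance = 1 - cosine similarity; ranges from 0 to 2.
function cosineDistance(a, b) {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  if (na === 0 || nb === 0) return 1; // zero vector: treat as orthogonal
  return 1 - dot / Math.sqrt(na * nb);
}

// Rank every non-reference row by ascending distance, capped at `limit`.
function rankSimilar(rows, refIndex, limit) {
  const z = zNormalize(rows);
  return z
    .map((v, i) => ({ index: i, distance: cosineDistance(z[refIndex], v) }))
    .filter((h) => h.index !== refIndex)
    .sort((x, y) => x.distance - y.distance)
    .slice(0, limit);
}

// Made-up rows: [population, median income, bachelor's count, median age]
const toy = [
  [1100000, 90000, 250000, 36.5], // reference
  [1050000, 88000, 240000, 36.0], // near-twin of the reference
  [20000, 45000, 1500, 47.0],     // small, older, lower income
  [900000, 60000, 100000, 33.0],  // large but lower income and education
];
const ranked = rankSimilar(toy, 0, 3);
console.log(ranked);
```

On this toy data the near-twin row ranks first with a distance close to 0, while the two dissimilar rows land above 1. Because cosine distance ranges from 0 to 2, a derived 1 - distance score ranges from -1 to 1 rather than 0 to 1.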
The system Census cache covers the headline variables (B01001_001E population, B19013_001E median income, B15003_022E bachelor's, B01002_001E median age — and ~16 others). When the cache is warm, this step is read-only.
mcp.call("census_acs", {
collection: "us-counties", version: "tiger-2024", year: 2023,
variables: ["B01001_001E", "B19013_001E", "B15003_022E", "B01002_001E"],
slug: "acs-counties-2023-headline",
})
// → 3,143 counties × 4 variables → workspace dataset
mcp.call("find_similar_areas", {
dataset_slug: "acs-counties-2023-headline",
reference_external_id: "", // set to the reference county's external_id
variables: ["B01001_001E", "B19013_001E", "B15003_022E", "B01002_001E"],
metric: "cosine",
limit: 50,
})
// → reference: { external_id, name, raw_values: { ...the input variables } }
// → hits: [ { external_id, name, distance }, ... ]
// sorted ascending by cosine distance, capped at limit
That's the full pipeline: one census_acs pull to build the dataset, then one premium find_similar_areas call to return the ranked list.
Synthesize a derived dataset where each county's value is 1 - distance (so higher = more similar), then render as a sequential choropleth.
mcp.call("ingest_dataset", {
slug: "wake-lookalikes-2023", name: "Wake County NC lookalikes",
records: hits.map((h, i) => ({
external_id: h.external_id,
similarity: 1 - h.distance,
rank: i + 1,
})),
collection: "us-counties", version: "tiger-2024",
})
mcp.call("render_map", {
collection: "us-counties", version: "tiger-2024",
dataset: "wake-lookalikes-2023", field: "similarity",
palette: "sequential",
classify: { method: "quantile", bins: 5 },
title: "Lookalikes for Wake County, NC",
subtitle: "Cosine similarity over pop, income, bachelor's, median age (ACS 2023)",
pin_overlays: [{
name: "Reference",
color: "#dc2626",
points: [{ lng: -78.6382, lat: 35.7796, label: "Wake County, NC" }],
}],
})
// → SVG + breaks + categories
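For readers wondering what a quantile classification with 5 bins does to the similarity field, the following is a rough sketch. It assumes linear interpolation between sorted values, which may differ from the method render_map actually uses; quantileBreaks and classIndex are hypothetical helper names, and the sample values are made up.

```javascript
// Illustrative sketch only: quantile class breaks for a sequential choropleth.
// Not the render_map implementation; the interpolation rule is an assumption.

// Return bins - 1 break values that split `values` into roughly equal counts.
function quantileBreaks(values, bins) {
  const sorted = [...values].sort((a, b) => a - b);
  const breaks = [];
  for (let k = 1; k < bins; k++) {
    const pos = (k * (sorted.length - 1)) / bins; // fractional rank
    const lo = Math.floor(pos);
    const hi = Math.ceil(pos);
    // Linear interpolation between the two neighbouring sorted values.
    breaks.push(sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo));
  }
  return breaks;
}

// Map a value to its class index (0 = lowest class, breaks.length = highest).
function classIndex(value, breaks) {
  let i = 0;
  while (i < breaks.length && value > breaks[i]) i++;
  return i;
}

// Made-up similarity scores (1 - distance) for eleven counties.
const sims = [0.99, 0.97, 0.95, 0.9, 0.85, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3];
const breaks = quantileBreaks(sims, 5);
console.log(breaks);
console.log(sims.map((s) => classIndex(s, breaks)));
```

Quantile breaks put about the same number of counties in each color class, so the map stays readable even when most similarity scores cluster near the top of the range.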
mcp.call("publish_map", {
view_id: …,
freeze: true,
})
// → /v/wake-lookalikes-2023 (iframe · PNG · citation.json with the exact
//   filter that produced it)
Lookalike modeling in a Python notebook is a week of work: pulling four ACS variables, joining to counties, z-normalizing, computing cosine distance, sorting, joining back to county geometries for a choropleth, and re-running across vintages without breakage. Five MCP calls replace all of it. The math is deterministic: same input, same output, same colors, every time. Citation-grade.
Tools used: census_acs, find_similar_areas, ingest_dataset, render_map (with pin_overlays), publish_map. find_similar_areas is premium-gated.