Implementing H3 Resolution Scaling for City-Level Geofences

Mapping a whole city to one fixed H3 resolution forces an unwinnable choice: a fine grid blows up memory and rasterization time, while a coarse grid leaks events across the municipal boundary. This page sits under Uber H3 Hexagon Indexing for Mobility and the broader spatial index lookup layer, and it answers the narrow operational question mobility and logistics teams hit in production: how do you index a 1,000+ km² boundary so that runtime containment stays a sub-millisecond integer test while perimeter error stays within one fine cell (tens of meters at resolutions 10–11)? The technique is resolution scaling: using H3's native parent/child hierarchy to index the calm interior coarsely and the compliance-critical edge finely. Adaptive resolution is not an optimization here; for pipelines sustaining 10,000+ checks per second it is the difference between flat tail latency and recurring OOM kills.

Concept and specification

H3 rasterizes a polygon by enumerating every cell whose centroid (or coverage, depending on mode) falls inside it. Cell area shrinks by a factor of roughly seven for each resolution step, so the cell count needed to cover a fixed area $A$ grows geometrically with resolution. Relative to a baseline resolution $r_{0}$, the count at resolution $r$ scales as:

$N_{cells}(r) \approx \frac{A}{a_{0}} \cdot 7^{\,r - r_{0}}$

where $a_{0}$ is the average hexagon area at $r_{0}$. The consequence is brutal: moving a whole-city fill from resolution 7 to resolution 10 multiplies the cell set by $7^{3} = 343$. That is the explosion that pushes h3.polygon_to_cells past container memory limits and turns offline builds into multi-minute jobs.
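
To make the growth concrete, the relation above can be evaluated directly. This is a minimal sketch: `estimated_cells` is a hypothetical helper, and the 1,000 km² area and 5 km² baseline cell are round placeholder inputs, not values taken from the H3 resolution table.

```python
def estimated_cells(area_km2: float, base_cell_km2: float,
                    base_res: int, res: int) -> int:
    """Approximate cell count at `res` by scaling the baseline count by 7 per step."""
    base_count = area_km2 / base_cell_km2
    return round(base_count * 7 ** (res - base_res))


# Placeholder inputs (assumptions): 1,000 km² boundary, 5 km² cell at resolution 7.
for res in (7, 8, 9, 10):
    print(res, estimated_cells(1000.0, 5.0, base_res=7, res=res))
```

Each step multiplies the previous count by seven, so three steps multiply it by 343 regardless of the baseline chosen.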

Resolution scaling exploits the fact that this fine granularity is only needed where it changes a decision: along the boundary. The interior is decomposed into a coarse core set; only a thin perimeter buffer is rasterized finely. The runtime then membership-tests a query against both sets. Containment is decided by truncating the query cell to each set's resolution via cell_to_parent, an O(1) step, and looking the result up in that set, so a single fine query can probe a coarse core directly. Because cell_to_parent only coarsens, the query cell has to be at least as fine as the set it probes.

Parameter	Symbol	Typical value	Role
Core resolution	$r_{core}$	7	Coarse fill of the calm interior
Edge resolution	$r_{edge}$	10–11	Fine rasterization of the boundary band (cells tens of meters across)
Query resolution	$r_{q}$	9	Resolution GPS fixes are encoded at
Edge band width	—	1–2 rings	How deep the fine buffer reaches inward

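The H3 invariant that makes the cross-resolution probe sound: a child cell's parent at any coarser resolution is deterministic bit arithmetic, so cell_to_parent(cell, r_core) of a query fix is exactly the core cell that would contain it. The truncation only runs toward coarser resolutions, so the configuration needs $r_{core} \leq r_{edge}$ and a fix has to be encoded at $r_{edge}$ or finer before it is probed; when the fleet's nominal $r_{q}$ is coarser than $r_{edge}$, re-encode the raw coordinates at $r_{edge}$ for the lookup. An id compared against a set of a different resolution never matches, so a violated ordering shows up as silent misses.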
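
The "deterministic bit arithmetic" can be shown on raw 64-bit ids. The sketch below assumes the published H3 index layout (a 4-bit resolution field at bits 52–55 and fifteen 3-bit digits in the low 45 bits, with unused digits set to 7). `parent_int` is a hypothetical helper for illustration and does no validation of its input; production code should keep calling h3.cell_to_parent.

```python
RES_SHIFT = 52               # bits 52-55 hold the resolution
RES_MASK = 0xF << RES_SHIFT
MAX_RES = 15                 # an index carries 15 three-bit digits


def resolution_of(cell: int) -> int:
    """Read the resolution field of an H3 cell index."""
    return (cell >> RES_SHIFT) & 0xF


def parent_int(cell: int, parent_res: int) -> int:
    """Truncate a cell index to a coarser resolution using bit operations only."""
    res = resolution_of(cell)
    if not 0 <= parent_res <= res:
        raise ValueError(f"cannot truncate resolution {res} to {parent_res}")
    # Overwrite the resolution field with the parent's resolution.
    out = (cell & ~RES_MASK) | (parent_res << RES_SHIFT)
    # Mark every digit finer than the parent as unused (all ones = 7).
    for digit in range(parent_res + 1, res + 1):
        out |= 0b111 << (3 * (MAX_RES - digit))
    return out


# A resolution-9 cell and its resolution-8 parent.
print(hex(parent_int(0x8928308280FFFFF, 8)))  # 0x8828308281fffff
```

Asking for a resolution finer than the cell's own raises instead of guessing, which mirrors why a coarse fix cannot be mapped onto a fine edge set.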

Step-by-step implementation

Prerequisites. Python 3.11+, h3-py >= 4.0 (the v4 API: h3.polygon_to_cells, h3.latlng_to_cell, h3.cell_to_parent, h3.str_to_int), and numpy >= 1.24. Boundaries arrive as h3.LatLngPoly objects (GeoJSON rings converted once at ingest). The build runs offline; the runtime only reads its artifacts.

Split geometry into core and edge sets offline. Rasterize the boundary at both resolutions, then store each as a sorted array of 64-bit integers. Doing the sort at build time lets the runtime memory-map the files read-only and binary-search them with zero allocation.

```python
import numpy as np
import h3
from pathlib import Path

def precompute_city_hierarchy(
    boundary: "h3.LatLngPoly",
    core_res: int = 7,
    edge_res: int = 10,
    out_dir: Path = Path("/data/h3"),
) -> tuple[Path, Path]:
    """Decompose a municipal boundary into core/edge hex sets and persist
    them as sorted .npy arrays the runtime can mmap and binary-search."""
    core_hexes = h3.polygon_to_cells(boundary, core_res)
    edge_hexes = h3.polygon_to_cells(boundary, edge_res)

    # Contiguous uint64 for cache locality; str_to_int avoids Python str keys.
    core_ints = np.fromiter((h3.str_to_int(h) for h in core_hexes), dtype=np.uint64)
    edge_ints = np.fromiter((h3.str_to_int(h) for h in edge_hexes), dtype=np.uint64)

    # The edge set only needs cells NOT already covered by a core parent.
    core_set = set(core_ints.tolist())
    edge_only = np.fromiter(
        (e for e in edge_ints.tolist()
         if h3.str_to_int(h3.cell_to_parent(h3.int_to_str(int(e)), core_res)) not in core_set),
        dtype=np.uint64,
    )

    # Sort BEFORE persisting so the runtime never has to.
    core_ints.sort()
    edge_only.sort()

    out_dir.mkdir(parents=True, exist_ok=True)
    core_path, edge_path = out_dir / "core.npy", out_dir / "edge.npy"
    np.save(core_path, core_ints)
    np.save(edge_path, edge_only)
    return core_path, edge_path
```

Gotcha: deduplicating the edge band against core parents (not raw cells) is what prevents double-storing the boundary. A naive np.setdiff1d of same-resolution ids keeps every fine edge cell even when its res-7 parent is already in the core, inflating storage by the full $7^{\Delta r}$ factor you were trying to avoid.
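
The difference between the two deduplication strategies is easy to see in a toy hierarchy. The sketch below does not use real H3 ids: as a stated simplification, a cell id is a digit string and its parent is a prefix, which is enough to show why an id-level difference removes nothing.

```python
def parent(cell: str, res: int) -> str:
    """Toy hierarchy (not H3): a cell id is a digit string, its parent a prefix."""
    return cell[:res]


core = {"12", "13"}                          # coarse cells, toy resolution 2
fine = {"120", "121", "135", "140", "141"}   # fine cells, toy resolution 3

# Naive id-level difference: no fine id ever equals a coarse id, so nothing is removed.
naive = fine - core

# Parent-level dedup: drop fine cells whose coarse parent is already in the core.
by_parent = {c for c in fine if parent(c, 2) not in core}

print(len(naive), sorted(by_parent))
```

Only the cells under parents missing from the core survive the parent-level pass, which is the set the edge file actually needs.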

Memory-map the artifacts in the runtime. Loading with mmap_mode="r" keeps the arrays in the OS page cache, which worker processes on the same host share, so N workers do not each duplicate the city in heap.

```python
class GeofenceValidator:
    __slots__ = ("core", "edge", "core_res", "edge_res")

    def __init__(self, core_path: Path, edge_path: Path,
                 core_res: int = 7, edge_res: int = 10) -> None:
        self.core = np.load(core_path, mmap_mode="r")   # sorted uint64
        self.edge = np.load(edge_path, mmap_mode="r")
        self.core_res, self.edge_res = core_res, edge_res

    @staticmethod
    def _contains(arr: np.ndarray, value: np.uint64) -> bool:
        idx = int(np.searchsorted(arr, value))
        return idx < arr.size and bool(arr[idx] == value)
```
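
The membership test is a plain binary search followed by an equality check. As a dependency-free illustration, the sketch below swaps np.searchsorted for the standard library's `bisect_left` over an `array("Q")` of unsigned 64-bit ids; the ids are arbitrary example values and `contains` is a hypothetical stand-in for `_contains`.

```python
from array import array
from bisect import bisect_left


def contains(sorted_ids: array, value: int) -> bool:
    """Binary-search membership test over a sorted array of 64-bit ids."""
    idx = bisect_left(sorted_ids, value)
    # idx == len(sorted_ids) when value exceeds every element, so guard the index.
    return idx < len(sorted_ids) and sorted_ids[idx] == value


ids = array("Q", sorted([0x8828308281FFFFF, 0x8828308283FFFFF, 0x8828308285FFFFF]))

print(contains(ids, 0x8828308283FFFFF))  # present
print(contains(ids, 0x8828308282FFFFF))  # falls between two stored ids
print(contains(ids, 2**63))              # larger than every stored id
```

The bounds guard matters for the last case: the search returns an insertion point one past the end, and indexing it without the check would fail.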

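Probe both resolutions on the hot path. Encode the fix once, truncate to each set's resolution with cell_to_parent, and binary-search. The default h3 API returns string ids, so each probe converts to an integer with str_to_int before the search; importing h3.api.basic_int instead keeps the whole path on integers.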

```python
    def check(self, lat: float, lon: float, query_res: int = 9) -> bool:
        # cell_to_parent can only coarsen, so encode at least as finely as the
        # edge set; a coarser cell could never equal a stored edge id.
        cell = h3.latlng_to_cell(lat, lon, max(query_res, self.edge_res))
        core_parent = np.uint64(h3.str_to_int(h3.cell_to_parent(cell, self.core_res)))
        if self._contains(self.core, core_parent):
            return True
        # Truncate to edge_res (a no-op when the cell is already at edge_res).
        edge_key = np.uint64(h3.str_to_int(h3.cell_to_parent(cell, self.edge_res)))
        return self._contains(self.edge, edge_key)
```

Gotcha: cell_to_parent only coarsens, so a fix encoded at a resolution coarser than edge_res cannot be mapped to one fine edge cell; it covers several of them. Encode at edge_res or finer, pin one query resolution across the fleet, and assert core_res <= edge_res at load time. A coarser id compared against the edge array matches nothing and raises nothing, so the mistake shows up as silent misses rather than errors.
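
A load-time guard can turn that silent failure into a loud one. The sketch below is one possible shape, not part of the validator above: `plan_resolutions` and `assert_array_resolution` are hypothetical helpers, and reading the resolution from bits 52–55 assumes the published H3 index layout and that every stored entry is a valid cell index.

```python
from typing import Sequence

RES_SHIFT = 52  # bits 52-55 of an H3 index hold the resolution


def resolution_of(cell: int) -> int:
    return (int(cell) >> RES_SHIFT) & 0xF


def plan_resolutions(core_res: int, edge_res: int, query_res: int) -> int:
    """Validate the configured resolutions; return the resolution to encode fixes at."""
    if not 0 <= core_res <= edge_res <= 15:
        raise ValueError(f"need 0 <= core_res <= edge_res <= 15, got {core_res}, {edge_res}")
    if not core_res <= query_res <= 15:
        raise ValueError(f"query_res {query_res} is coarser than core_res {core_res}")
    # Parent truncation only coarsens, so encode at least as finely as the edge set.
    return max(query_res, edge_res)


def assert_array_resolution(ids: Sequence[int], expected: int, name: str) -> None:
    """Check a sorted id array against its declared resolution.

    The resolution bits sit above the base-cell and digit bits, so in a sorted
    array of cell ids the first and last entries bound every entry in between.
    """
    if len(ids) == 0:
        return
    for cell in (ids[0], ids[-1]):
        if resolution_of(cell) != expected:
            raise ValueError(f"{name} holds resolution {resolution_of(cell)}, expected {expected}")
```

Calling both helpers once in the validator's constructor keeps the hot path free of checks while still failing fast on a mismatched artifact.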

Parallelize the build, never the lookup. Rasterizing dozens of cities is embarrassingly parallel — distribute boundaries across a multiprocessing.Pool so each worker owns a city and the GIL never serializes the heavy polygon_to_cells calls. The runtime stays single-threaded per request and lock-free.

Benchmark and verification

The win is moving cell expansion off the request path entirely. Representative figures for a ~1,500 km² metropolitan boundary on a single vCore:

Metric	Fixed res-10 fill	Resolution-scaled (core 7 / edge 10)
Cell set size	~6.1M cells	~190k core + ~310k edge
Storage footprint	~520 MB	~12 MB core + ~85 MB edge (<100 MB)
Offline build time	~140 s	~9 s
P50 `check()`	~340 µs	~6 µs
P99 `check()` under 50k QPS	~140 ms (GC-bound)	< 0.9 ms
Sustained QPS per vCore	~3k (OOM-limited)	~50k

Verify correctness before trusting the speedup. Run a shadow worker that resolves every payload through both the scaled validator and a full-resolution reference polygon_to_cells set, and diff the verdicts. Promote the scaled path only at >99.9% verdict parity; residual disagreements should land exclusively in the perimeter band and shrink as you widen the edge ring. Profile with py-spy record to confirm time concentrates in latlng_to_cell (the only unavoidable cost), use tracemalloc to confirm the hot path retains no allocations, and time collections (for example via gc.callbacks, with gc.get_stats() for the counts) to confirm garbage collection stays below 0.01% of wall time.
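
The shadow comparison reduces to counting disagreements between two callables. The sketch below assumes both validators expose a `(lat, lon) -> bool` interface; `verdict_parity` is a hypothetical helper, and the two lambda validators in the usage lines are toy stand-ins rather than real geofence checks.

```python
from typing import Callable, Iterable

Fix = tuple[float, float]
Validator = Callable[[float, float], bool]


def verdict_parity(
    fixes: Iterable[Fix], scaled: Validator, reference: Validator
) -> tuple[float, list[Fix]]:
    """Return the share of fixes on which both validators agree, plus the disagreements."""
    total = 0
    disagreements: list[Fix] = []
    for lat, lon in fixes:
        total += 1
        if scaled(lat, lon) != reference(lat, lon):
            disagreements.append((lat, lon))
    parity = 1.0 if total == 0 else 1.0 - len(disagreements) / total
    return parity, disagreements


# Toy stand-ins: the "scaled" path is slightly stricter near the boundary at lat 1.0.
reference = lambda lat, lon: lat < 1.0
scaled = lambda lat, lon: lat < 0.9
parity, diffs = verdict_parity(
    [(0.5, 0.0), (0.95, 0.0), (1.5, 0.0), (0.1, 0.0)], scaled, reference
)
print(parity, diffs)
```

Keeping the disagreeing fixes, not just the ratio, is what lets you confirm they cluster in the perimeter band before promoting the scaled path.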

Failure modes and edge cases

Boundary drift at coarse resolutions. Indexing the perimeter too coarsely (res 6–7) produces false negatives along the municipal edge from hexagonal approximation, so legitimate ride-hailing or delivery events bypass compliance filters. The edge band exists precisely to absorb this; widen the buffer to 2 rings in jurisdictions with strict edge SLAs, and cross-check dropped fixes against Fallback Routing for GPS Dropouts.
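NaN or out-of-range coordinates. A corrupt fix (lat=NaN, lon outside ±180) can make latlng_to_cell raise or, depending on the value, be encoded into a cell far from the vehicle. Validate finiteness and range at ingest and route bad fixes to a dead-letter path so one ping cannot abort a batch or poison a verdict.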
Pentagon cells. Near H3’s 12 pentagons, ring enumeration returns a short, distorted neighbourhood. Never hard-code edge-band ring sizes; treat the returned iterable as authoritative when building the buffer.
Resolution mismatch. cell_to_parent cannot refine, so probing a set with a cell coarser than that set is either rejected by H3 or, if the id is compared as-is, silently matches nothing. Assert $r_{core} \leq r_{edge}$ at load, encode fixes at $r_{edge}$ or finer, and pin all three constants in one place.
GC pressure and GIL contention. Pure-Python orchestration over millions of pings re-enters the interpreter on every call, even if the H3 C extension releases the GIL inside its core primitives. Keep the hot path allocation-free (uint64 arrays, __slots__), move batch rasterization into ProcessPoolExecutor workers, and tune gc.set_threshold / gc.freeze() to suppress stop-the-world pauses during peak ingestion. This is the same heap discipline detailed in Reducing P99 Latency in Python Geofence Services.
Emergency bypass. If the scaled path degrades — P99 > 150 ms for over 60 s, or an upstream geometry rebuild fails — trip a circuit breaker to a precomputed res-6 fallback grid, accepting a 0.8–1.2% false-positive rate for a ~90% latency cut. Keep routing raw fixes to a background shadow validator for reconciliation, hold a 5-minute cooldown, and confirm cache warm-up before re-enabling edge-resolution checks. Never let the fallback block the event loop; gate it behind feature flags with atomic state transitions.
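
The emergency bypass is a small state machine: trip after a sustained breach, then hold the fallback through a cooldown. The sketch below uses the thresholds from the text as defaults; `LatencyBreaker` is a hypothetical class, it covers only the latency trigger (not rebuild failures, cache warm-up, or feature flags), and the injectable `clock` is an assumption added so the logic can be tested without waiting.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class LatencyBreaker:
    """Trips to the fallback grid after a sustained P99 breach, then cools down."""

    p99_limit_ms: float = 150.0
    trip_after_s: float = 60.0
    cooldown_s: float = 300.0
    clock: Callable[[], float] = time.monotonic  # injectable for tests
    _breach_since: Optional[float] = field(default=None, init=False)
    _tripped_at: Optional[float] = field(default=None, init=False)

    def observe(self, p99_ms: float) -> None:
        """Feed the latest P99 sample; trip once the breach outlasts trip_after_s."""
        now = self.clock()
        if p99_ms <= self.p99_limit_ms:
            self._breach_since = None
            return
        if self._breach_since is None:
            self._breach_since = now
        if self._tripped_at is None and now - self._breach_since > self.trip_after_s:
            self._tripped_at = now

    def use_fallback(self) -> bool:
        """True while tripped; resets only when healthy and the cooldown has elapsed."""
        if self._tripped_at is None:
            return False
        healthy = self._breach_since is None
        if healthy and self.clock() - self._tripped_at >= self.cooldown_s:
            self._tripped_at = None
            return False
        return True
```

A request handler would call `use_fallback()` to choose between the scaled validator and the res-6 grid, while a metrics loop feeds `observe()`; in a multi-threaded service the two state fields would need a lock or an atomic swap.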

Related

Uber H3 Hexagon Indexing for Mobility: parent overview of the H3 address layout and constant-time containment this page builds on
Comparing Geohash vs H3 for Low-Latency Routing: choosing the encoding primitive before you tune its resolution
Reducing P99 Latency in Python Geofence Services: the GC and allocation discipline the runtime path depends on
Handling Polygon Edge Cases in High-Frequency Telemetry: degenerate-geometry handling for the rasterization stage
