
Auto-detecting CRS from geojson bounds: WGS84, Web Mercator, or a local UTM zone


A user uploads a geojson and the first ten seconds decide whether the polygon lands on the right city block or somewhere in the middle of the ocean. Here is the bounds-to-CRS heuristic that catches the coordinate systems you actually see in the wild.

The decision this framework is for

Someone in your viewer clicks "Upload geojson", drops a file, and waits. The file has no crs member (most exports drop it), no sidecar .prj, no metadata. The first number pair might be [46.6753, 24.7136]. It might be [5198543.21, 2829110.55]. It might be [697350.12, 2734502.88]. Three pairs, three different planets. If you assume WGS84 and the file is actually UTM, MapLibre is handed a 100-meter polygon whose "latitude" is in the millions, nothing lands anywhere near the site, the user blames your viewer, and the support ticket arrives within the hour.

This framework is for the bit of code that runs in the half-second between the drop event and the layer appearing on the map. It must pick one of three coordinate reference systems — WGS84 (EPSG:4326), Web Mercator (EPSG:3857), or a projected meters-based zone (UTM) — using nothing but the numbers in the file. It must be wrong-resistant, not perfectly correct. The hard reality is that two CRS ranges overlap if you only look at one corner of the bounds, so the heuristic has to pick a "most likely" answer and surface it to the user as an editable guess rather than a silent assumption.

We use this exact heuristic in a Next.js 15 viewer (next@15.2.4, proj4@^2.19.5, @turf/turf@^7.2.0) that ingests partner geojsons uploaded through a POST /api/upload/geojson endpoint. The detector runs both server-side (in the route handler) and client-side (in a CoordinateSystemDetector component) so the same answer drives the conversion preview and the actual conversion. What follows is the framework, then the actual code that implements it, then where it breaks.

The framework

Four steps, in this order. Skip any of them and you ship bugs:

  1. Honor explicit CRS — if the geojson has a crs.properties.name you recognise, trust it and stop.
  2. Compute global bounds — walk every geometry once, get minX, minY, maxX, maxY. Cheaper than people think.
  3. Match the bounds against three range tests, in priority order: WGS84 first (smallest range, hardest to false-positive), then projected meters (UTM-shaped), then Web Mercator (the widest range, which would otherwise swallow the other two). First match wins.
  4. Fall through to a sensible default with a visible warning — if nothing matches, do not silently bail. Pick the most likely answer for your deployment region, mark it as a low-confidence guess in the response, and let the operator override.

The framework's whole reason to exist is step 4. Anyone can write a CRS detector that handles WGS84. The interesting question is what you do when the bounds are weird — and the only honest answer is "guess with a label that says GUESS."

Each step with one paragraph of explanation

Step 1 — honor explicit CRS. GeoJSON RFC 7946 removed the crs member, but some exports still emit it because older GDAL-based tools wrote it by default. If the file says EPSG:4326, do not run a heuristic to "double-check." You will only introduce a false negative when somebody uploads a dataset with unusual bounds and your range test thinks it's something else. The explicit declaration outranks the bounds.
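One practical wrinkle with step 1: GDAL-written files often carry the name in URN form, such as urn:ogc:def:crs:OGC:1.3:CRS84 or urn:ogc:def:crs:EPSG::32630, which a lookup keyed by 'EPSG:4326'-style codes would miss. A minimal sketch of a normaliser follows. The name normalizeCrsName is hypothetical and not part of the repo; it only rewrites the spelling and leaves the "do we recognise it" decision to the catalogue lookup.

```typescript
// Hypothetical helper (not from the repo): map common spellings of a CRS name
// to the "EPSG:<code>" form used as catalogue keys.
function normalizeCrsName(name: string): string | null {
  const trimmed = name.trim();

  // OGC CRS84 is WGS84 in lon/lat order, which is what GeoJSON positions use.
  if (/^urn:ogc:def:crs:OGC:[^:]*:CRS84$/i.test(trimmed)) return 'EPSG:4326';

  // "urn:ogc:def:crs:EPSG::32630" (the version segment is usually empty)
  const urn = /^urn:ogc:def:crs:EPSG:[^:]*:(\d+)$/i.exec(trimmed);
  if (urn) return `EPSG:${urn[1]}`;

  // "EPSG:32630" or "epsg:32630"
  const short = /^EPSG:(\d+)$/i.exec(trimmed);
  if (short) return `EPSG:${short[1]}`;

  return null; // unrecognised spelling: fall through to the bounds heuristic
}
```

The result would be used as the key for the catalogue lookup, so an unknown but well-formed code still falls through to the bounds check.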

Step 2 — compute global bounds in one pass. Recursion across Point, LineString, Polygon, MultiPolygon, and GeometryCollection is unavoidable. The trap is allocating intermediate arrays per feature; on a 100k-feature file with 10 vertices each that is a million temporary arrays. Use a single coordinates: number[][] accumulator passed by reference and reduce min/max in one sweep. The repo's extractAllCoordinates + calculateBounds pair does exactly this — boring code, but it stays sub-100ms on the largest files we let through.
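As an illustration of the one-pass idea, here is a sketch that goes one step further than an accumulator and keeps only the four running extremes. It is not the repo's extractAllCoordinates + calculateBounds pair; the type names GeometryLike and FeatureLike and the function names are assumptions made so the example stands alone.

```typescript
type Bounds = { minX: number; minY: number; maxX: number; maxY: number };

// Minimal structural types so the sketch stands alone (not the repo's types).
interface GeometryLike {
  type: string;
  coordinates?: unknown;
  geometries?: GeometryLike[];
}
interface FeatureLike {
  geometry: GeometryLike | null;
}

// Widen `b` in place with every position found under `coords`, at any nesting depth.
function extendBounds(coords: unknown, b: Bounds): void {
  if (!Array.isArray(coords)) return;
  const x: unknown = coords[0];
  const y: unknown = coords[1];
  if (typeof x === 'number' && typeof y === 'number') {
    if (x < b.minX) b.minX = x;
    if (x > b.maxX) b.maxX = x;
    if (y < b.minY) b.minY = y;
    if (y > b.maxY) b.maxY = y;
    return;
  }
  for (const child of coords) extendBounds(child, b);
}

// GeometryCollection nests geometries; every other type carries coordinates.
function visitGeometry(g: GeometryLike | null, b: Bounds): void {
  if (!g) return;
  if (g.geometries) {
    for (const child of g.geometries) visitGeometry(child, b);
    return;
  }
  extendBounds(g.coordinates, b);
}

function computeBounds(features: FeatureLike[]): Bounds | null {
  const b: Bounds = { minX: Infinity, minY: Infinity, maxX: -Infinity, maxY: -Infinity };
  for (const f of features) visitGeometry(f.geometry, b);
  return Number.isFinite(b.minX) ? b : null; // null when no coordinates were found
}
```

The trade-off is that zone inference or per-feature checks no longer have a coordinate list to look at, so this shape only fits when the bounds are all the later steps need.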

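Step 3 — match in priority order, narrowest range first. This is the non-obvious step. The WGS84 range ([-180, 180] x [-90, 90]) is a tiny rectangle in number-space. The Web Mercator range (±20,037,508.34 on both axes) is huge. The UTM range (100,000–1,000,000 easting, 0–10,000,000 northing) sits in between, and it lies entirely inside the Mercator square. If you check Web Mercator first, WGS84 and UTM files both pass its test (numerically, every WGS84 or UTM coordinate is also a valid Mercator coordinate) and the narrower tests never run. So you must check the smallest range first. The order is: WGS84, then UTM, then Mercator. The first range that contains the bounds wins. The price of that order is the overlap itself: a genuine Web Mercator file whose bounds fall inside the UTM box (roughly 1° to 9° east of Greenwich, north of the equator) is reported as UTM, which is one more reason the result is surfaced as an editable guess.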
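To make the nesting concrete, the sketch below runs all three range tests against one bounding box and returns every bucket that accepts it, listed narrowest first, so the first entry is the one a first-match-wins detector should return. The ranges are the ones quoted above; the names BUCKETS and matchingBuckets are illustrative only.

```typescript
type Bounds = { minX: number; minY: number; maxX: number; maxY: number };

const MERCATOR_LIMIT = 20037508.34;

// Buckets listed narrowest first: WGS84, then UTM-shaped, then Web Mercator.
const BUCKETS: { code: string; test: (b: Bounds) => boolean }[] = [
  {
    code: 'EPSG:4326',
    test: b => b.minX >= -180 && b.maxX <= 180 && b.minY >= -90 && b.maxY <= 90,
  },
  {
    code: 'UTM',
    test: b => b.minX > 100000 && b.maxX < 1000000 && b.minY > 0 && b.maxY < 10000000,
  },
  {
    code: 'EPSG:3857',
    test: b =>
      b.minX >= -MERCATOR_LIMIT && b.maxX <= MERCATOR_LIMIT &&
      b.minY >= -MERCATOR_LIMIT && b.maxY <= MERCATOR_LIMIT,
  },
];

// Every bucket whose range contains the bounds, in priority order.
function matchingBuckets(b: Bounds): string[] {
  return BUCKETS.filter(bucket => bucket.test(b)).map(bucket => bucket.code);
}

// A degree-sized box matches WGS84 and Mercator:
// matchingBuckets({ minX: 46, minY: 24, maxX: 47, maxY: 25 })
//   -> ['EPSG:4326', 'EPSG:3857']
// A UTM-sized box matches UTM and Mercator:
// matchingBuckets({ minX: 697000, minY: 2734000, maxX: 698000, maxY: 2735000 })
//   -> ['UTM', 'EPSG:3857']
```

Web Mercator shows up in both results, which is why it has to be tested last.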

Step 4 — fall through, do not fail. If a file's bounds don't match any of the three buckets, your detector can return null and force the user to pick. That's defensible. What's not defensible is silently treating the bounds as something they aren't. Watch for the opposite failure too: a national grid such as EPSG:2154 (Lambert-93, France) is a different projection, but its false easting of 700,000 m and false northing of 6,600,000 m put coordinates for much of France squarely inside the UTM-shaped box, so the file is misread rather than rejected. The repo's detector picks a configured fallback UTM zone for the deployment region when bounds look UTM-shaped but the zone can't be inferred. You can hard-code this fallback for your region (Zone 30N for Western Europe, Zone 18N for the US East Coast, Zone 51N for Eastern China) and document it loudly.
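One way to make the "guess with a label" idea concrete is to return a confidence tag alongside the CRS code. This is a sketch, not the repo's return type: the Detection shape, the Confidence values, and resolveUtmZone are all assumptions introduced here for illustration.

```typescript
// How the answer was reached, from most to least trustworthy.
type Confidence = 'declared' | 'inferred' | 'fallback';

interface Detection {
  code: string; // e.g. "EPSG:32630"
  confidence: Confidence;
  warning?: string; // shown to the operator when the answer is a guess
}

function utmCode(zone: number): string {
  // Northern-hemisphere codes only, matching the 326xx pattern used in the article.
  return `EPSG:326${String(zone).padStart(2, '0')}`;
}

// Use the inferred zone when there is one; otherwise fall back to the
// deployment's configured zone and say so in the result.
function resolveUtmZone(inferredZone: number | null, regionDefaultZone: number): Detection {
  if (inferredZone !== null) {
    return { code: utmCode(inferredZone), confidence: 'inferred' };
  }
  return {
    code: utmCode(regionDefaultZone),
    confidence: 'fallback',
    warning: `UTM zone could not be inferred; assumed regional default zone ${regionDefaultZone}N`,
  };
}
```

A client can then render 'fallback' results with a more prominent override control than 'declared' ones.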

Walk the framework through a real artifact

The detector lives at components/viewer/essentials/LeftInsiderMenu/partials/LayersPanel/partials/GeoJSONUpload/utils/coordinateConverter.ts. The catalogue of known CRS lives next door in types.ts. The full function is short enough to read at once:

// coordinateConverter.ts
import proj4 from 'proj4';
import { CoordinateSystem, ConversionResult, COMMON_CRS } from '../types';

// Initialize proj4 with common coordinate systems
Object.values(COMMON_CRS).forEach(crs => {
  proj4.defs(crs.code, crs.proj4);
});

export function detectCoordinateSystem(
  geojson: GeoJSON.FeatureCollection
): CoordinateSystem | null {
  // Check if CRS is explicitly defined in the GeoJSON
  const crs = (geojson as any).crs;
  if (crs && crs.type === 'name' && crs.properties && crs.properties.name) {
    const crsCode = crs.properties.name;
    if (COMMON_CRS[crsCode]) {
      return COMMON_CRS[crsCode];
    }
  }

  // Analyze coordinate ranges to detect CRS
  const coordinates = extractAllCoordinates(geojson);
  if (coordinates.length === 0) return null;

  const bounds = calculateBounds(coordinates);

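  if (isWGS84Range(bounds)) return COMMON_CRS['EPSG:4326'];

  // UTM must be tested before Web Mercator: the UTM box lies entirely inside
  // the Mercator square, so a Mercator-first check would shadow this branch.
  if (isUTMRange(bounds)) {
    const utmZone = detectUTMZone(coordinates);
    if (utmZone) {
      const utmCode = `EPSG:326${utmZone.toString().padStart(2, '0')}`;
      if (COMMON_CRS[utmCode]) return COMMON_CRS[utmCode];
    }
    // Fall through to the configured local UTM zone for this deployment
    return COMMON_CRS['EPSG:32630'];
  }

  if (isWebMercatorRange(bounds)) return COMMON_CRS['EPSG:3857'];

  // Nothing matched: report "unknown" so the caller asks the user,
  // rather than silently treating the file as WGS84.
  return null;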
}

(The actual default zone in our repo is whatever zone the deployment region sits in. I've shown EPSG:32630 — UTM Zone 30N, which covers much of Western Europe — as a neutral example. The point is the pattern, not the specific number: pick the zone your region actually uses and bake it as the fallback.)

The COMMON_CRS map is the static catalogue:

// types.ts
export const COMMON_CRS: Record<string, CoordinateSystem> = {
  'EPSG:4326': {
    code: 'EPSG:4326',
    name: 'WGS 84',
    proj4: '+proj=longlat +datum=WGS84 +no_defs',
    bounds: { minX: -180, minY: -90, maxX: 180, maxY: 90 }
  },
  'EPSG:3857': {
    code: 'EPSG:3857',
    name: 'Web Mercator',
    proj4: '+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs',
    bounds: { minX: -20037508.34, minY: -20037508.34, maxX: 20037508.34, maxY: 20037508.34 }
  },
  // ...projected UTM zones registered for the deployment region
};

The three range checks themselves are intentionally simple — the simpler they are, the easier they are to defend in code review and the less likely they harbour off-by-one surprises:

function isWGS84Range(b: { minX: number; minY: number; maxX: number; maxY: number }): boolean {
  return b.minX >= -180 && b.maxX <= 180 &&
         b.minY >= -90  && b.maxY <= 90;
}

function isWebMercatorRange(b: { minX: number; minY: number; maxX: number; maxY: number }): boolean {
  const mercatorLimit = 20037508.34;
  return b.minX >= -mercatorLimit && b.maxX <= mercatorLimit &&
         b.minY >= -mercatorLimit && b.maxY <= mercatorLimit;
}

function isUTMRange(b: { minX: number; minY: number; maxX: number; maxY: number }): boolean {
  // UTM eastings are centred on the 500,000 m false easting and stay inside 100k–1M; northings run up to ~10M
  return b.minX > 100000 && b.maxX < 1000000 &&
         b.minY > 0      && b.maxY < 10000000;
}

The UTM-zone inference is the weakest link, and the comment in the source admits it:

function detectUTMZone(coordinates: number[][]): number | null {
  // This is a simplified detection - in practice, you'd need more sophisticated logic
  if (coordinates.length === 0) return null;

  const [x] = coordinates[0];
  if (x > 100000 && x < 1000000) {
    // For real zone inference you need the longitude — which is what we're trying to find.
    // Without it, fall back to the deployment region's known zone.
    return REGION_DEFAULT_UTM_ZONE;
  }

  return null;
}

There is no honest way to recover the UTM zone from a pure easting/northing pair, because the false easting (500,000 at the central meridian) is the same in every zone. The northing doesn't rescue you either: southern-hemisphere zones add a 10,000,000 m false northing so values stay positive, which means a mid-range northing is valid in both hemispheres, and even a known hemisphere still leaves sixty possible zones. The standard cheat, and the one the repo uses, is to know which zone the customer's data is in because it's a regional product, and pin that zone as the default. If your viewer needs to support arbitrary worldwide UTM, you must ask the user. There is no free lunch here.

A more defensible UTM-zone calculator, if you have any longitude signal — say, a sibling layer in WGS84 — is the standard formula:

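function utmZoneFromLongitude(lon: number): number {
  // UTM zones are 6 degrees wide; zone 1 starts at -180, zone 60 ends at +180.
  // Clamp so that lon = 180 maps to zone 60 rather than a non-existent zone 61.
  // (The Norway and Svalbard zone exceptions are ignored here.)
  return Math.min(60, Math.max(1, Math.floor((lon + 180) / 6) + 1));
}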

function utmEpsgCode(lon: number, lat: number): string {
  const zone = utmZoneFromLongitude(lon);
  const hemisphere = lat >= 0 ? '326' : '327'; // 326xx = north, 327xx = south
  return `EPSG:${hemisphere}${zone.toString().padStart(2, '0')}`;
}

// utmZoneFromLongitude(-1.8)  -> 30   (Zone 30N covers much of Western Europe)
// utmZoneFromLongitude(-75.0) -> 18   (Zone 18N covers the US East Coast)
// utmZoneFromLongitude(45.0)  -> 38   (Zone 38N covers a swath of the Middle East)

You only get to call this when you already know roughly where on Earth the data is. For a pure projected-coordinates upload with no metadata, you cannot. Don't pretend you can.
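When a sibling WGS84 layer does exist, its bounding box is one usable longitude signal. The sketch below is deliberately conservative and makes two assumptions that are not in the repo: the sibling's bounds are passed in as lon/lat, and a box that straddles a zone boundary is treated as "ask the user" rather than guessed. The function names are hypothetical.

```typescript
type Bounds = { minX: number; minY: number; maxX: number; maxY: number };

// Standard 6-degree zone formula, clamped to the valid 1..60 range.
function zoneOf(lon: number): number {
  return Math.min(60, Math.max(1, Math.floor((lon + 180) / 6) + 1));
}

// Derive a UTM EPSG code from a sibling layer's WGS84 bounds (x = lon, y = lat).
// Returns null when the west and east edges fall in different zones.
function utmCodeFromWgs84Bounds(sibling: Bounds): string | null {
  const west = zoneOf(sibling.minX);
  const east = zoneOf(sibling.maxX);
  if (west !== east) return null; // ambiguous: let the user choose

  const midLat = (sibling.minY + sibling.maxY) / 2;
  const prefix = midLat >= 0 ? '326' : '327'; // 326xx = north, 327xx = south
  return `EPSG:${prefix}${String(west).padStart(2, '0')}`;
}

// utmCodeFromWgs84Bounds({ minX: 46.5, minY: 24.5, maxX: 46.9, maxY: 24.9 }) -> 'EPSG:32638'
// utmCodeFromWgs84Bounds({ minX: -0.5, minY: 51, maxX: 0.5, maxY: 52 })      -> null
```

Returning null on a straddle is stricter than common practice, where data slightly across a boundary is often kept in one zone; loosen it if that matches your partners' exports.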

The route handler that wires it together

The Next.js App Router handler at app/api/upload/geojson/route.ts is where this all becomes an HTTP contract. The detect-then-convert flow is small enough to fit in one screen:

// app/api/upload/geojson/route.ts
const detectedCRS = detectCoordinateSystem(parsedData);
if (!detectedCRS) {
  return NextResponse.json(
    { success: false, error: 'Could not detect coordinate system' },
    { status: 400 }
  );
}

let convertedData = parsedData;
let conversionApplied = false;

if (detectedCRS.code !== 'EPSG:4326') {
  const targetCRS = COMMON_CRS['EPSG:4326'];
  const conversionResult = convertCoordinates(parsedData, detectedCRS, targetCRS);

  if (!conversionResult.success) {
    return NextResponse.json(
      { success: false, error: 'Coordinate conversion failed',
        details: conversionResult.error },
      { status: 400 }
    );
  }

  convertedData = conversionResult.data!;
  conversionApplied = true;
}

The response always reports both coordinateSystem.detected and conversionApplied, so the client renders a confirmation panel — "We think this file is in EPSG:32630 (UTM Zone 30N). Convert to WGS84 for display? [Yes] [Pick a different CRS]." That panel is the safety net for every case where the heuristic guessed wrong. It is the difference between "we silently mangled your data" and "we showed our work."
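A sketch of what that contract might look like on the client side. Only the field names coordinateSystem.detected and conversionApplied come from the handler described above; the rest of the UploadResponse shape and the confirmationPrompt helper are assumptions for illustration.

```typescript
// Assumed response shape: only coordinateSystem.detected and conversionApplied
// are named in the article; the nested fields are a guess.
interface UploadResponse {
  success: true;
  coordinateSystem: { detected: { code: string; name: string } };
  conversionApplied: boolean;
}

// Build the text for the confirmation panel from the server's answer.
function confirmationPrompt(r: UploadResponse): string {
  const { code, name } = r.coordinateSystem.detected;
  return r.conversionApplied
    ? `We think this file is in ${code} (${name}). Convert to WGS84 for display?`
    : `We think this file is already in ${code} (${name}). No conversion needed.`;
}
```

Keeping the prompt a pure function of the response makes the "show our work" text easy to unit-test without rendering the panel.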

Converting with proj4 itself is the boring part — once you have source and target codes, it's a one-liner per point, applied recursively through Polygon, MultiPolygon, and GeometryCollection:

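function convertPoint(
  coordinate: number[],
  sourceCRS: CoordinateSystem,
  targetCRS: CoordinateSystem
): number[] {
  // Reproject x/y only; carry any extra members (e.g. elevation) through unchanged
  const [x, y, ...rest] = coordinate;
  const [convertedX, convertedY] = proj4(sourceCRS.code, targetCRS.code, [x, y]);
  return [convertedX, convertedY, ...rest];
}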
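The recursive part can be sketched independently of proj4 by passing the per-point function in as a parameter. This is an illustrative shape, not the repo's convertCoordinates; GeometryLike, mapCoords, and mapGeometry are names made up for the example.

```typescript
type Position = number[];

// Minimal structural geometry type so the sketch stands alone.
interface GeometryLike {
  type: string;
  coordinates?: unknown;
  geometries?: GeometryLike[];
}

// Apply `fn` to every position under `coords`, preserving the nesting depth
// (Point = 0 levels, LineString = 1, Polygon = 2, MultiPolygon = 3).
function mapCoords(coords: unknown, fn: (p: Position) => Position): unknown {
  if (!Array.isArray(coords)) return coords;
  if (typeof coords[0] === 'number') return fn(coords as Position);
  return coords.map(child => mapCoords(child, fn));
}

// Return a converted copy of the geometry; the input is not mutated.
function mapGeometry(g: GeometryLike, fn: (p: Position) => Position): GeometryLike {
  if (g.geometries) {
    return { ...g, geometries: g.geometries.map(child => mapGeometry(child, fn)) };
  }
  return { ...g, coordinates: mapCoords(g.coordinates, fn) };
}
```

In the real converter, fn would be a closure over the source and target CRS that calls the per-point conversion.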

The only non-obvious piece is the proj4.defs(...) registration at module load:

Object.values(COMMON_CRS).forEach(crs => {
  proj4.defs(crs.code, crs.proj4);
});

Without that, proj4 throws at runtime when asked to convert to a code it hasn't seen. Registering every CRS you might possibly need at startup is cheap and avoids the lazy-load race where the first user gets an error and the second user (after retry) does not.

Where the framework fails

Four edges, all of which we've hit in production.

1. Numerically-tiny UTM data. A geojson containing a single point near the central meridian of a UTM zone — easting around 500,000, northing in the low tens of thousands — passes isUTMRange but is so small the visual conversion looks identical to garbage. There is no algorithmic fix; surface the detected CRS to the user and let them confirm.

2. Mixed-CRS files. Some legacy exports mix WGS84 features and UTM features in the same FeatureCollection. The global bounds then stretch from degree-sized values to hundreds of thousands of meters, which fails the WGS84 and UTM tests but still fits inside the Web Mercator square. The detector reports EPSG:3857 with a straight face and the conversion mangles every feature. The right fix is to detect per-feature, not per-file. We have not shipped that yet because the file shape "one CRS per file" holds for our partners. If yours doesn't, plan for per-feature detection from day one.
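A minimal sketch of what per-feature detection could look like, assuming each feature's bounds have already been computed. The repo does not ship this; bucketOf and groupFeaturesByBucket are hypothetical names, and the range tests are the same three used throughout, applied narrowest first.

```typescript
type Bounds = { minX: number; minY: number; maxX: number; maxY: number };
type Bucket = 'EPSG:4326' | 'UTM' | 'EPSG:3857' | 'unknown';

const MERC_LIMIT = 20037508.34;

// Classify one feature's bounds, narrowest range first.
function bucketOf(b: Bounds): Bucket {
  if (b.minX >= -180 && b.maxX <= 180 && b.minY >= -90 && b.maxY <= 90) return 'EPSG:4326';
  if (b.minX > 100000 && b.maxX < 1000000 && b.minY > 0 && b.maxY < 10000000) return 'UTM';
  if (b.minX >= -MERC_LIMIT && b.maxX <= MERC_LIMIT &&
      b.minY >= -MERC_LIMIT && b.maxY <= MERC_LIMIT) return 'EPSG:3857';
  return 'unknown';
}

// Group feature indices by detected bucket. More than one key means the
// file mixes coordinate systems and should not be converted as a whole.
function groupFeaturesByBucket(perFeatureBounds: Bounds[]): Map<Bucket, number[]> {
  const groups = new Map<Bucket, number[]>();
  perFeatureBounds.forEach((b, index) => {
    const bucket = bucketOf(b);
    const list = groups.get(bucket) ?? [];
    list.push(index);
    groups.set(bucket, list);
  });
  return groups;
}
```

A result with a single key behaves exactly like the per-file detector; a result with several keys is the signal to warn the user or convert each group separately.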

3. The "swapped X and Y" trap. Some shapefile-to-geojson tools emit [lat, lon] instead of [lon, lat]. Swap the opening example's [46.6753, 24.7136] to [24.7136, 46.6753] and it still passes isWGS84Range, because the longitude now sitting in the Y slot happens to be under 90. The map renders the polygon in Romania instead of the customer's city. The range test only catches the swap when the longitude's absolute value exceeds 90; for everything else, the best you can do is render a small inset map next to the upload confirmation and let the eye catch it.
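One partial check is possible when the deployment has a known region bounding box: if the bounds fall outside the region as given but inside it with the axes flipped, a swap is the likely explanation. This is only a hint for the confirmation panel, not a detector, and both the region box and the axisOrderHint name are assumptions made for this sketch.

```typescript
type Bounds = { minX: number; minY: number; maxX: number; maxY: number };

function contains(outer: Bounds, inner: Bounds): boolean {
  return inner.minX >= outer.minX && inner.maxX <= outer.maxX &&
         inner.minY >= outer.minY && inner.maxY <= outer.maxY;
}

// 'ok'      : bounds sit inside the region as given
// 'swapped' : bounds only fit the region after exchanging x and y
// 'unknown' : neither orientation fits, so this check cannot say anything
function axisOrderHint(b: Bounds, region: Bounds): 'ok' | 'swapped' | 'unknown' {
  if (contains(region, b)) return 'ok';
  const flipped: Bounds = { minX: b.minY, minY: b.minX, maxX: b.maxY, maxY: b.maxX };
  if (contains(region, flipped)) return 'swapped';
  return 'unknown';
}
```

The check says nothing for regions that are roughly symmetric about the lon = lat diagonal, or for partners outside the configured region, so the inset map stays necessary.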

4. Region-locked false positives. Our fallback is a single UTM zone tied to the deployment region. If a partner uploads a geojson from a different country that happens to fit isUTMRange, we'll silently assign the wrong zone and the conversion will land hundreds of kilometers away. The mitigation is bounds-aware: after conversion, if the resulting WGS84 bounds sit outside the deployment region's bounding box, flag the layer with a warning before drawing it.
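The bounds-aware mitigation can be sketched as a small post-conversion check. The region box is a deployment setting you would supply yourself, and regionWarning is a hypothetical helper name; "outside" is interpreted here as "no overlap at all" with the region.

```typescript
type Bounds = { minX: number; minY: number; maxX: number; maxY: number };

// True when the two boxes share any area or edge.
function intersects(a: Bounds, b: Bounds): boolean {
  return a.minX <= b.maxX && a.maxX >= b.minX &&
         a.minY <= b.maxY && a.maxY >= b.minY;
}

// After converting to WGS84, return a warning when the layer does not touch
// the deployment region at all; return null when it looks plausible.
function regionWarning(convertedWgs84: Bounds, region: Bounds, assumedCode: string): string | null {
  if (intersects(convertedWgs84, region)) return null;
  return `Converted with ${assumedCode}, but the result lies outside the expected region. ` +
         `The source CRS may be a different zone.`;
}
```

Because a wrong UTM zone shifts data by multiples of six degrees of longitude, a region box that is only a few zones wide will catch most wrong-zone assignments.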

The framework catches maybe 95% of uploads cleanly. The remaining 5% is what your UX is for.

Trade-off

The trade-off this framework accepts is correctness-versus-determinism. You can build a more accurate detector by trying every plausible CRS, reprojecting each candidate to WGS84, and picking whichever result lands in a sane lat/lon window. That works — and it's about 60 times more expensive per upload, with edge cases of its own (overlapping reasonable-looking results). The bounds-only heuristic is fast, deterministic, easy to unit-test, and obviously wrong on a small known set of inputs. For an interactive viewer where the user can override the detection, that's the right trade. For an unattended batch pipeline where there's no human in the loop, you want the heavier approach.
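For comparison, here is a reduced sketch of the heavier approach with only two candidates, WGS84 and Web Mercator, so it runs without proj4. The inverse spherical Mercator formula is standard; the candidate list, the region box, and the plausibleCodes name are assumptions for illustration, and a real version would add one candidate per UTM zone via proj4.

```typescript
type Bounds = { minX: number; minY: number; maxX: number; maxY: number };
type LonLat = [number, number];

const EARTH_RADIUS_M = 6378137;

// Inverse spherical Mercator: meters -> degrees.
function webMercatorToLonLat(x: number, y: number): LonLat {
  const lon = (x / EARTH_RADIUS_M) * (180 / Math.PI);
  const lat = (2 * Math.atan(Math.exp(y / EARTH_RADIUS_M)) - Math.PI / 2) * (180 / Math.PI);
  return [lon, lat];
}

interface Candidate {
  code: string;
  toLonLat: (x: number, y: number) => LonLat;
}

const CANDIDATES: Candidate[] = [
  { code: 'EPSG:4326', toLonLat: (x, y) => [x, y] },
  { code: 'EPSG:3857', toLonLat: webMercatorToLonLat },
];

// Reproject one sample point under every candidate and keep the codes whose
// result lands inside the expected region. More than one survivor means the
// input is ambiguous and still needs a human.
function plausibleCodes(
  x: number,
  y: number,
  region: Bounds,
  candidates: Candidate[] = CANDIDATES
): string[] {
  return candidates
    .filter(c => {
      const [lon, lat] = c.toLonLat(x, y);
      return lon >= region.minX && lon <= region.maxX &&
             lat >= region.minY && lat <= region.maxY;
    })
    .map(c => c.code);
}
```

The function returns a list rather than a single winner on purpose: that is where the "overlapping reasonable-looking results" edge case shows up.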

Business impact

Every wrong CRS detection is either a support ticket or a silently-corrupted layer that the customer doesn't notice for days. Both are expensive. The first costs an hour of your time and erodes trust in the viewer. The second costs the customer's trust in the underlying data, which is much harder to rebuild. A four-step bounds heuristic plus a one-click override turns the average CRS upload from a 30-second customer-support exchange ("what projection is this in?") into a self-serve operation. For a B2B viewer with twenty partners uploading weekly geojsons, that is the difference between you being on the hook for ingestion forever and the partners owning it themselves.

What to do next

Open your own viewer's upload path. Find the line where you parse a geojson. Look at what happens to the coordinates. If you assume WGS84, run this check: take the geojson with the largest absolute coordinate values in your test data, run it through your current path, and look at where on the world map the result lands. If it lands somewhere sensible, you're fine. If it lands in the middle of the ocean, you have at minimum one customer with a UTM or Web Mercator file you haven't accounted for, and the next upload will surface it as a bug ticket.

The honest patch is the four-step framework above. The cheap patch is to refuse uploads without an explicit crs member and force every partner to declare. Both are defensible; both are better than the current silent guess.


