Uploading a geojson is not a file write — it is a five-step ingest

2/19/2025 • Backend Development • Geospatial Urban Analytics Platform • 489 views • 15 min read • by Kuray Karaaslan

POST /api/upload/geojson is the easy part. Validate the feature collection, reproject to WGS84 (EPSG:4326, the CRS a MapLibre GeoJSON source expects), persist the layer config, generate the renderable source, hand it to the map without a refresh. Five steps, one HTTP request.

A geospatial app that accepts user-uploaded geometry has one user-visible affordance — "drag a file here" — and roughly five things that must happen between the drop and the polygon showing on the map. Skip any of them and you get the same symptom: nothing renders, no error in the console, and a support ticket that says "the upload doesn't work." This post walks the actual upload route in a Next.js 15 / proj4 / MapLibre stack and treats the HTTP handler as what it really is: a small pipeline with five named stages, each of which can fail loudly.

The decision this framework is for

You are building or auditing a feature where a non-technical user uploads geometry (districts, parcels, transit lines, a heatmap grid) and expects it to appear on a basemap inside a dashboard. The temptation is to wire the file input to a POST /upload endpoint, save the bytes, return 200, and call it done. Then the user uploads a shapefile-derived .geojson that the GIS team exported in UTM and nothing renders, because a MapLibre GeoJSON source expects WGS84 longitude/latitude and the polygon is now sitting at coordinates (723000, 2740000) on a globe where longitude tops out at 180.

The five-step framework forces the right questions in the right order. It also gives you five concrete places to attach error messages a user can act on — which is the difference between "the upload failed" and "your file is in EPSG:32638, we converted it to EPSG:4326."

The framework

The route file at app/api/upload/geojson/route.ts is the canonical reference. The five stages, in order:

1. Validate the envelope: multipart parse, file presence, extension, size limit.
2. Validate the GeoJSON structure: FeatureCollection, features array, geometry types, coordinate shape.
3. Detect the coordinate reference system: explicit CRS header, then bounds-based heuristics over proj4.
4. Reproject to the map's CRS: convert every point through proj4 if the source is not already WGS84.
5. Build the renderable layer config: pick a layer type from the geometry types, generate paint and metadata, return the converted feature collection.

Stage 6, technically — handing the layer to the map without a page reload — happens client-side in index.tsx and matters because the framework is not done at HTTP 200; it is done when the polygon renders.

Each stage owns one failure mode and returns a different error. Conflating them is the most common mistake.

Each step with one paragraph of explanation

Envelope validation is the cheap stage. The handler parses the multipart body, confirms a file came in, and checks the extension and the byte size against DEFAULT_PROCESSING_OPTIONS.maxFileSize (80 MB in this repo). The point is not security; it is to reject obvious nonsense before the content is decoded and parsed. Because request.formData() has typically consumed the body by the time file.size can be read, this check saves the decode and the JSON parse, not the upload itself. A 600 MB upload should not become a JSON parse problem.

Structure validation runs the file content through validateGeoJSON, which walks every feature and every coordinate. It refuses to trust JSON.parse succeeding as evidence of a valid GeoJSON, because a syntactically valid JSON can still have type: "Polygon" with no coordinates, or features whose geometry is null. This is the stage where you decide whether GeometryCollection is supported and whether warnings are returned alongside errors.

CRS detection is where most teams get burned. detectCoordinateSystem first looks for an explicit crs field on the FeatureCollection, then falls back to bounds analysis: if every coordinate fits in [-180, 180] x [-90, 90], it is WGS84; if it fits in the Mercator envelope, it is Web Mercator; if the numbers look like meters in the six- or seven-digit range (UTM eastings and northings), it is a UTM zone. The heuristic is wrong sometimes, and that is the trade-off.

Reprojection is mechanical once the CRS is known. proj4 is initialized with every CRS definition in the COMMON_CRS table at module load, then convertCoordinates walks the geometry tree and rewrites every point. The result is a new feature collection in EPSG:4326. If the source already is EPSG:4326, the stage is a no-op and returns the input unchanged — important so you do not pay the cost twice.

Layer configuration is the stage where the API stops thinking like a parser and starts thinking like a map. It collects the distinct geometry types in the converted data, picks a MapLibre layer type (fill, line, or circle), generates a default paint, and stamps metadata onto the layer so the UI can later mark it as user-uploaded. The map cares about layer type, not about the file the user uploaded.

Walk the framework through a real artifact in the target repo

Here is stage 1, the envelope check, copied from the route:

// app/api/upload/geojson/route.ts
const formData = await request.formData();
const file = formData.get('file') as File;

if (!file) {
  return NextResponse.json(
    { success: false, error: 'No file provided' },
    { status: 400 }
  );
}

if (!file.name.toLowerCase().endsWith('.geojson') && !file.name.toLowerCase().endsWith('.json')) {
  return NextResponse.json(
    { success: false, error: 'Invalid file type. Only GeoJSON files are supported.' },
    { status: 400 }
  );
}

if (file.size > DEFAULT_PROCESSING_OPTIONS.maxFileSize) {
  return NextResponse.json({ success: false, error: `File size exceeds ...` }, { status: 400 });
}

Note that the size limit comes from a shared constant, not from a magic number in the route. DEFAULT_PROCESSING_OPTIONS is defined in GeoJSONUpload/types.ts and reused on the client to disable the submit button before the request ever leaves the browser. Two surfaces, one source of truth.
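To make the "two surfaces, one source of truth" idea concrete, here is a minimal sketch of a client-side pre-check that mirrors the route's envelope rules. The helper name checkEnvelope, the FileLike type, and the exact byte value for 80 MB are assumptions for illustration; the real shape of DEFAULT_PROCESSING_OPTIONS is not shown in this post.

```typescript
// Hypothetical shape: only maxFileSize (80 MB) is mentioned in the post,
// and the exact byte count used by the repo is an assumption.
const DEFAULT_PROCESSING_OPTIONS = {
  maxFileSize: 80 * 1024 * 1024,
} as const;

// Structural subset of the browser File object, so the check is easy to test.
interface FileLike {
  name: string;
  size: number;
}

// Returns null when the file may be submitted, otherwise a user-facing reason.
function checkEnvelope(file: FileLike | null): string | null {
  if (!file) return 'No file provided';

  const name = file.name.toLowerCase();
  if (!name.endsWith('.geojson') && !name.endsWith('.json')) {
    return 'Invalid file type. Only GeoJSON files are supported.';
  }

  if (file.size > DEFAULT_PROCESSING_OPTIONS.maxFileSize) {
    return 'File is larger than the configured limit.';
  }

  return null;
}
```

A form component could disable its submit button whenever checkEnvelope returns a string and show that string next to the drop zone. The server still runs its own check, since the client can be bypassed.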

Stage 2 — structure validation — is delegated to validateGeoJSON in geojsonValidator.ts. The interesting part is that it does not just check the top-level shape; it walks every feature:

// components/.../GeoJSONUpload/utils/geojsonValidator.ts
function validateGeoJSONStructure(data: any): string[] {
  const errors: string[] = [];

  if (data.type !== 'FeatureCollection') {
    errors.push('Invalid GeoJSON: must be a FeatureCollection');
    return errors;
  }

  if (!Array.isArray(data.features)) {
    errors.push('Invalid GeoJSON: features must be an array');
    return errors;
  }

  data.features.forEach((feature: any, index: number) => {
    const featureErrors = validateFeature(feature, index);
    errors.push(...featureErrors);
  });

  return errors;
}

The bet here is that you would rather pay the linear cost of walking every feature on upload than discover a malformed coordinate at render time, deep inside MapLibre, with a stack trace that says nothing useful. The cost is real on a 100k-feature file. The benefit is that bad rows get reported to the user as Feature 4729: invalid Polygon coordinate at ring 0, index 12 — actionable, specific, and produced by code you control.
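The per-feature validator is called above but not shown. A rough sketch of what such a function might look like follows, assuming it only needs to handle Point and Polygon and that it reports positions by ring and index. The body of validateFeature here is hypothetical, not the repo's implementation.

```typescript
type Position = number[];

// A position is [x, y] or [x, y, z], all finite numbers.
function isPosition(value: unknown): value is Position {
  return (
    Array.isArray(value) &&
    value.length >= 2 &&
    value.every((n) => typeof n === 'number' && Number.isFinite(n))
  );
}

// Hypothetical per-feature validator; covers Point and Polygon only.
function validateFeature(feature: any, index: number): string[] {
  const errors: string[] = [];
  const label = `Feature ${index}`;

  if (!feature || feature.type !== 'Feature') {
    return [`${label}: type must be "Feature"`];
  }

  // RFC 7946 allows a null geometry; this sketch rejects it because
  // nothing would render, matching the stance taken in the post.
  const geometry = feature.geometry;
  if (!geometry || typeof geometry.type !== 'string') {
    return [`${label}: geometry is missing or null`];
  }

  if (geometry.type === 'Point') {
    if (!isPosition(geometry.coordinates)) {
      errors.push(`${label}: invalid Point coordinate`);
    }
  } else if (geometry.type === 'Polygon') {
    if (!Array.isArray(geometry.coordinates)) {
      return [`${label}: Polygon coordinates must be an array of rings`];
    }
    geometry.coordinates.forEach((ring: unknown, ringIndex: number) => {
      if (!Array.isArray(ring)) {
        errors.push(`${label}: ring ${ringIndex} is not an array`);
        return;
      }
      ring.forEach((position: unknown, positionIndex: number) => {
        if (!isPosition(position)) {
          errors.push(
            `${label}: invalid Polygon coordinate at ring ${ringIndex}, index ${positionIndex}`
          );
        }
      });
    });
  }

  return errors;
}
```

The error strings carry the feature index, the ring, and the position index, which is what makes the message actionable for whoever exported the file.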

Stage 3 — CRS detection — is where the heuristics live:

// components/.../GeoJSONUpload/utils/coordinateConverter.ts
export function detectCoordinateSystem(
  geojson: GeoJSON.FeatureCollection
): CoordinateSystem | null {
  const crs = (geojson as any).crs;
  if (crs && crs.type === 'name' && crs.properties && crs.properties.name) {
    const crsCode = crs.properties.name;
    if (COMMON_CRS[crsCode]) {
      return COMMON_CRS[crsCode];
    }
  }

  const coordinates = extractAllCoordinates(geojson);
  if (coordinates.length === 0) return null;

  const bounds = calculateBounds(coordinates);

  if (isWGS84Range(bounds)) return COMMON_CRS['EPSG:4326'];
  if (isWebMercatorRange(bounds)) return COMMON_CRS['EPSG:3857'];
  // ... UTM fallback
  return COMMON_CRS['EPSG:4326'];
}

The order matters. Explicit CRS metadata wins, because it is the only signal that cannot be confused. Bounds analysis is the fallback, with the cheapest test first (WGS84 envelope), then Mercator, then UTM. The final return COMMON_CRS['EPSG:4326'] is the "assume WGS84" escape hatch, and it is the line that will, eventually, mis-classify a file and make a polygon appear in the wrong ocean. It is also the line that keeps the happy path simple. That is the trade-off, made explicit on one line.
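The helpers calculateBounds, isWGS84Range, and isWebMercatorRange are referenced above without their bodies. A minimal sketch of how they could be written, assuming a flat list of positions and a simple Bounds record (both hypothetical shapes):

```typescript
type Position = number[];

interface Bounds {
  minX: number;
  minY: number;
  maxX: number;
  maxY: number;
}

// Single pass over all positions; assumes the list is non-empty.
function calculateBounds(coordinates: Position[]): Bounds {
  const bounds: Bounds = { minX: Infinity, minY: Infinity, maxX: -Infinity, maxY: -Infinity };
  for (const [x, y] of coordinates) {
    bounds.minX = Math.min(bounds.minX, x);
    bounds.maxX = Math.max(bounds.maxX, x);
    bounds.minY = Math.min(bounds.minY, y);
    bounds.maxY = Math.max(bounds.maxY, y);
  }
  return bounds;
}

// Longitude in [-180, 180], latitude in [-90, 90].
function isWGS84Range(b: Bounds): boolean {
  return b.minX >= -180 && b.maxX <= 180 && b.minY >= -90 && b.maxY <= 90;
}

// Half-width of the Web Mercator square in meters: pi * 6378137.
const WEB_MERCATOR_LIMIT = Math.PI * 6378137;

function isWebMercatorRange(b: Bounds): boolean {
  return (
    b.minX >= -WEB_MERCATOR_LIMIT &&
    b.maxX <= WEB_MERCATOR_LIMIT &&
    b.minY >= -WEB_MERCATOR_LIMIT &&
    b.maxY <= WEB_MERCATOR_LIMIT
  );
}
```

One thing this sketch makes visible: a UTM coordinate such as (723000, 2740000) also fits inside the Web Mercator envelope, so an envelope test alone cannot tell the two apart. How the repo separates them before reaching the UTM fallback is not shown in this post.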

The CRS table itself is the schema that holds the framework together:

// components/.../GeoJSONUpload/types.ts
export const COMMON_CRS: Record<string, CoordinateSystem> = {
  'EPSG:4326': {
    code: 'EPSG:4326',
    name: 'WGS 84',
    proj4: '+proj=longlat +datum=WGS84 +no_defs',
  },
  'EPSG:3857': {
    code: 'EPSG:3857',
    name: 'Web Mercator',
    proj4: '+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 ...',
  },
  'EPSG:32638': {
    code: 'EPSG:32638',
    name: 'UTM Zone 38N',
    proj4: '+proj=utm +zone=38 +datum=WGS84 +units=m +no_defs',
  },
  // ... EPSG:32639
};

Every CRS the system can detect, every CRS it can project to and from, is in this object. proj4.defs(crs.code, crs.proj4) is called for each entry at module load. Adding support for another zone is a five-line PR; supporting an arbitrary EPSG code on demand requires either bundling the full EPSG database or hitting epsg.io at runtime, both of which the team decided not to do. The closed table is intentional.

Stage 4 — reprojection — is structurally boring and that is the point:

// components/.../GeoJSONUpload/utils/coordinateConverter.ts
export function convertCoordinates(
  geojson: GeoJSON.FeatureCollection,
  sourceCRS: CoordinateSystem,
  targetCRS: CoordinateSystem
): ConversionResult {
  try {
    if (sourceCRS.code === targetCRS.code) {
      return { success: true, data: geojson, originalCRS: sourceCRS.code, targetCRS: targetCRS.code, conversionApplied: false };
    }

    const convertedFeatures = geojson.features.map(feature => ({
      ...feature,
      geometry: convertGeometry(feature.geometry, sourceCRS, targetCRS),
    }));

    return {
      success: true,
      data: { type: 'FeatureCollection', features: convertedFeatures },
      originalCRS: sourceCRS.code,
      targetCRS: targetCRS.code,
      conversionApplied: true,
    };
  } catch (error) {
    return { success: false, error: error instanceof Error ? error.message : 'Unknown conversion error', /* ... */ };
  }
}

Two things to notice. First, the early return when source equals target — no allocation, no walk, just hand the object back. Second, the entire operation is wrapped in a try/catch that converts a proj4 throw into a structured ConversionResult. The route never sees a raw exception from the projection library; it sees { success: false, error: "..." } and decides what HTTP code to send.
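convertGeometry is the piece that actually walks the geometry tree. Below is a sketch of one way to write it, assuming the point transform is injected as a function. To keep the example self-contained, the transform used here is a hand-written inverse spherical Mercator (EPSG:3857 to EPSG:4326), a stand-in for the proj4 call the repo makes; the signature differs from the one in the post for the same reason.

```typescript
type Position = number[];
type PointTransform = (point: Position) => Position;

// Minimal geometry shape; GeometryCollection (which uses `geometries`)
// is deliberately not handled in this sketch.
interface Geometry {
  type: string;
  coordinates: unknown;
}

// Recursively rewrites every position in a nested coordinate array.
function mapCoordinates(coords: unknown, transform: PointTransform): unknown {
  if (Array.isArray(coords) && typeof coords[0] === 'number') {
    return transform(coords as Position);
  }
  if (Array.isArray(coords)) {
    return coords.map((child) => mapCoordinates(child, transform));
  }
  return coords;
}

// Returns a new geometry; the input object is not mutated.
function convertGeometry(geometry: Geometry | null, transform: PointTransform): Geometry | null {
  if (!geometry) return null;
  return { ...geometry, coordinates: mapCoordinates(geometry.coordinates, transform) };
}

// Stand-in transform: inverse spherical Mercator, meters to degrees.
const EARTH_RADIUS = 6378137;

function webMercatorToWGS84([x, y, ...rest]: Position): Position {
  const lon = (x / EARTH_RADIUS) * (180 / Math.PI);
  const lat = (2 * Math.atan(Math.exp(y / EARTH_RADIUS)) - Math.PI / 2) * (180 / Math.PI);
  return [lon, lat, ...rest]; // keep any elevation value untouched
}
```

Because the walker only cares about array depth, the same function covers Point, LineString, Polygon, and their Multi variants without a switch on geometry type.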

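Stage 5 (layer config) collapses the geometry types into a single MapLibre layer type. Points take priority over lines, and lines over polygons, so a mixed-geometry file still gets exactly one layer: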

// app/api/upload/geojson/route.ts
function generateLayerConfig(fileName: string, geojson: GeoJSON.FeatureCollection, originalCRS: any) {
  const layerId = `uploaded_${Date.now()}_${Math.random().toString(36).slice(2, 11)}`;
  const geometryTypes = new Set<string>();

  geojson.features.forEach(feature => {
    if (feature.geometry && feature.geometry.type) {
      geometryTypes.add(feature.geometry.type);
    }
  });

  let layerType = 'fill';
  if (geometryTypes.has('Point') || geometryTypes.has('MultiPoint')) {
    layerType = 'circle';
  } else if (geometryTypes.has('LineString') || geometryTypes.has('MultiLineString')) {
    layerType = 'line';
  }

  return {
    id: layerId, name: fileName.replace(/\.(geojson|json)$/i, ''), source: layerId,
    type: layerType, paint: generateLayerPaint(layerType), layout: generateLayerLayout(layerType),
    metadata: { isUploadedLayer: true, originalFile: fileName, coordinateSystem: originalCRS.code,
                featureCount: geojson.features.length, geometryTypes: Array.from(geometryTypes) },
  };
}

The metadata block is the bit that future-you will be grateful for. isUploadedLayer: true is what lets the layers panel render a delete button next to user-uploaded layers but not next to baseline layers shipped with the app. coordinateSystem records the original CRS, even though the data has already been reprojected — useful when a user asks why their export looks slightly off and the answer is "we converted from a zone that was never quite the right one for your bounding box."
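generateLayerPaint is called in the layer config but its body is not shown. A plausible sketch, assuming one fixed default style per layer type; the property names are standard MapLibre paint properties, while the colors and sizes are placeholder values chosen for illustration.

```typescript
type LayerType = 'fill' | 'line' | 'circle';

// Default paint per layer type. The values are illustrative placeholders.
function generateLayerPaint(layerType: LayerType): Record<string, string | number> {
  switch (layerType) {
    case 'fill':
      return {
        'fill-color': '#3b82f6',
        'fill-opacity': 0.4,
        'fill-outline-color': '#1d4ed8',
      };
    case 'line':
      return {
        'line-color': '#3b82f6',
        'line-width': 2,
      };
    case 'circle':
      return {
        'circle-color': '#3b82f6',
        'circle-radius': 5,
        'circle-stroke-color': '#ffffff',
        'circle-stroke-width': 1,
      };
  }
}
```

Keeping the defaults in one function means the UI can later let a user override a single property without the route needing to know about styling.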

The client-side companion in GeoJSONUpload/index.tsx is what actually mounts the result on the map without a page reload:

// components/.../GeoJSONUpload/index.tsx
async function addLayerToMapImmediate(map: any, layerConfig: LayerConfig, data: GeoJSON.FeatureCollection) {
  if (!map) throw new Error('Map instance is not available');

  const sourceId = layerConfig.source;
  if (map.getSource(sourceId)) {
    (map.getSource(sourceId) as any).setData(data);
  } else {
    map.addSource(sourceId, { type: 'geojson', data });
  }

  if (!map.getLayer(layerConfig.id)) {
    map.addLayer({
      id: layerConfig.id,
      type: layerConfig.type,
      source: sourceId,
      paint: layerConfig.paint,
      layout: layerConfig.layout,
      metadata: layerConfig.metadata,
    });
  }
}

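The getSource then setData branch is the layer-cache invalidation story. If the same layer config is applied again, the existing MapLibre source is updated in place; only the data buffer changes, the layer node stays, and the map repaints. No removeLayer/addLayer flicker, no source ID collisions. One caveat: generateLayerConfig mints a fresh id from Date.now() and a random suffix on every request, so a re-upload of the same file only hits this branch if the client reuses the earlier layer config instead of the new one.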

Where the framework fails

The CRS detection heuristic is the obvious soft spot. A small file with all coordinates in a tiny patch of [-180, 180] will be classified as WGS84 even if the source was technically Mercator with very small values. The "Default to zone 38" branch in detectUTMZone is a hard-coded fallback that assumes a particular operational region; deploy the same code in a different part of the world and that branch is wrong by hundreds of kilometers. The mitigation in this codebase is the explicit crs check first — files exported with a proper CRS header bypass the heuristic entirely. The advice to your GIS team is: always export with the CRS header set.
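The explicit crs check only helps if the declared name matches a key in COMMON_CRS. Exporters often write the name in URN form (for example urn:ogc:def:crs:EPSG::32638) rather than the short EPSG:32638 form, and a direct table lookup would miss those. A small normalizer could sit in front of the lookup; the function name normalizeCrsName is hypothetical and not part of the repo.

```typescript
// Normalizes the name found in a legacy GeoJSON `crs` member to an
// "EPSG:<code>" key, or returns null when the form is not recognized.
function normalizeCrsName(name: string): string | null {
  const trimmed = name.trim();

  // OGC CRS84 is WGS84 in longitude/latitude order, which is what GeoJSON
  // uses, so this sketch treats it as equivalent to EPSG:4326.
  if (/^urn:ogc:def:crs:OGC:[^:]*:CRS84$/i.test(trimmed)) {
    return 'EPSG:4326';
  }

  // URN form: urn:ogc:def:crs:EPSG:<version>:<code>; the version is often empty.
  const urn = /^urn:ogc:def:crs:EPSG:[^:]*:(\d+)$/i.exec(trimmed);
  if (urn) return `EPSG:${urn[1]}`;

  // Short form, case-insensitive: EPSG:<code>.
  const short = /^EPSG:(\d+)$/i.exec(trimmed);
  if (short) return `EPSG:${short[1]}`;

  return null;
}
```

With this in place the detector would look up COMMON_CRS[normalizeCrsName(crs.properties.name)] and fall through to the bounds heuristic only when the result is null or not in the table.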

The second failure is the 100k-feature ceiling encoded in DEFAULT_PROCESSING_OPTIONS.maxFeatures. The structure validator walks every feature, the reprojection step walks every coordinate, and the API serializes the converted FeatureCollection back to the client. At a million features you are not bottlenecked by HTTP; you are bottlenecked by JSON.stringify and by MapLibre's GeoJSON source not being designed for that scale. The honest answer for that scale is vector tiles — either pre-baked on a worker (a BullMQ job that calls tippecanoe and writes .mbtiles somewhere) or a tile server. The current pipeline is calibrated for "user drops a custom overlay on top of a dashboard," not "user wants to host a basemap."

Third — and this is a discipline failure, not a code failure — the route returns the entire converted feature collection in the HTTP response body. The client then hands that object back to MapLibre. That is fine when the file is 5 MB. It is wasteful when the file is 60 MB, because you are now shipping the same payload over the wire twice (upload, then download). A future iteration probably persists the converted FeatureCollection on the server (S3, the database as a Project-scoped artifact in schema.prisma, your choice) and returns a URL the client fetches.

CTA

The prompt that triggers this framework when reviewing your own upload route: open the route file and count the explicit error returns. If there are fewer than four, you are conflating stages. Each stage should own at least one return NextResponse.json({ success: false, error: ... }, { status: 400 }) line, with a different error string. If validation, CRS detection, and conversion all collapse into one "invalid file" message, your users cannot tell you which stage failed.

The second prompt is for the CRS detector specifically: open the heuristics, find the line that says "if we can't detect, assume X." That line is a single point of failure for every malformed upload. Decide whether the right answer is "assume WGS84" (the current choice — best for general-purpose dashboards) or "refuse the upload and ask the user to declare a CRS" (better when you cannot afford a polygon in the wrong ocean).
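The two options can be expressed as a policy switch rather than a hard-coded last line. The sketch below assumes the detector is changed to return null when neither metadata nor bounds produced a match; the names decideCrs, FallbackPolicy, and CrsDecision are hypothetical.

```typescript
type FallbackPolicy = 'assume-wgs84' | 'refuse';

type CrsDecision =
  | { ok: true; code: string; assumed: boolean }
  | { ok: false; error: string };

// `detected` is the CRS code found by metadata or bounds analysis,
// or null when neither produced a match.
function decideCrs(detected: string | null, policy: FallbackPolicy): CrsDecision {
  if (detected !== null) {
    return { ok: true, code: detected, assumed: false };
  }
  if (policy === 'assume-wgs84') {
    // The escape hatch, now visible to the caller through `assumed`.
    return { ok: true, code: 'EPSG:4326', assumed: true };
  }
  return {
    ok: false,
    error: 'Could not detect the coordinate system. Please declare a CRS and upload again.',
  };
}
```

Carrying the assumed flag through to the response lets the UI show a "CRS was guessed" notice even under the permissive policy.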

Trade-off

The five-step ingest costs more than a raw file write. Validation walks every feature; reprojection allocates a new geometry tree; the response body is roughly the size of the converted file. For a 50 MB upload with 80k features, the route is doing real work — somewhere between 800 ms and 3 seconds depending on geometry complexity. The trade-off the framework accepts is latency for correctness: by the time the route returns 200, the data on the client is renderable, in the map's CRS, with a layer config that MapLibre will not reject. The alternative — fire-and-forget the bytes to disk and let the client figure it out — is faster on the happy path and unrecoverable on the sad one.

Business impact

The audience for a tool that ingests user geometry is rarely the GIS team. It is a planner, an analyst, a project manager who exports something from a desktop tool and wants to see it on the dashboard the rest of the team is already looking at. If the first upload fails silently — wrong CRS, no error, blank map — that user goes back to the desktop tool and the dashboard does not get the second visit. The five-step ingest is an investment in the moment of first contact; every named stage and every specific error message exists so that "the upload didn't work" never appears in your support inbox without a clue attached to it.

What to do next

Pull up your own upload handler. Read the first 30 lines. Count the stages. If you cannot name them — envelope, structure, CRS, reprojection, layer config — the route is doing too much in one block. Refactor each stage into a function that returns either a typed result or a typed error, and let the route be the orchestrator that decides which HTTP code to send. The point is not the five names; the point is that every stage gets its own failure mode and every failure mode gets its own error string.
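A minimal sketch of that refactor, showing two of the five stages and the orchestrator that maps a stage failure to an HTTP response. All names here (StageResult, parseStructure, detectCrs, runUpload) and the list of known codes are assumptions for illustration, and the stages are simplified stand-ins for the real validators.

```typescript
type StageName = 'envelope' | 'structure' | 'crs' | 'reprojection' | 'layer-config';

type StageResult<T> =
  | { ok: true; value: T }
  | { ok: false; stage: StageName; error: string; status: number };

const ok = <T>(value: T): StageResult<T> => ({ ok: true, value });
const fail = (stage: StageName, error: string, status = 400): StageResult<never> =>
  ({ ok: false, stage, error, status });

interface Collection { type: string; features: unknown[]; }

// Stage 2, simplified: JSON parse plus top-level shape check.
function parseStructure(text: string): StageResult<Collection> {
  let data: any;
  try {
    data = JSON.parse(text);
  } catch {
    return fail('structure', 'File is not valid JSON');
  }
  if (!data || data.type !== 'FeatureCollection' || !Array.isArray(data.features)) {
    return fail('structure', 'Invalid GeoJSON: must be a FeatureCollection');
  }
  return ok(data as Collection);
}

// Stage 3, simplified: accept only declared codes from a closed list.
function detectCrs(declared?: string): StageResult<string> {
  const known = ['EPSG:4326', 'EPSG:3857', 'EPSG:32638'];
  if (declared !== undefined && !known.includes(declared)) {
    return fail('crs', `Unsupported coordinate system: ${declared}`);
  }
  return ok(declared ?? 'EPSG:4326');
}

interface UploadResponse {
  status: number;
  body: { success: boolean; stage?: StageName; error?: string; crs?: string; featureCount?: number };
}

// The orchestrator: runs stages in order and stops at the first failure.
function runUpload(text: string, declaredCrs?: string): UploadResponse {
  const structure = parseStructure(text);
  if (structure.ok === false) {
    return { status: structure.status, body: { success: false, stage: structure.stage, error: structure.error } };
  }
  const crs = detectCrs(declaredCrs);
  if (crs.ok === false) {
    return { status: crs.status, body: { success: false, stage: crs.stage, error: crs.error } };
  }
  return {
    status: 200,
    body: { success: true, crs: crs.value, featureCount: structure.value.features.length },
  };
}
```

Each failure carries the stage name alongside the message, so a support ticket that quotes the error also says which of the five stages produced it.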
