Evolving content without a destructive migration

5/29/2026 · Web Development · Block-Based Content Platform · 14 min read

How the v1-to-v2 migration (field renames, repeater normalization) runs in memory on load, keyed by schemaVersion.

The brief in one paragraph

A block-based content platform stores each marketing page as a JSON array of typed blocks in a single database column. Six months into production, the block format needed to change: three legacy field-type names were wrong, blocks needed a few new optional flags, and a new "repeater" field type meant some props stored as plain strings or single objects now had to be arrays of objects. The data lived in hundreds of rows of editor-authored JSON, much of it for live pages getting traffic. The brief: change the stored shape without a destructive database migration that rewrites every row in one transaction, and without breaking a single page nobody had touched. This is a walkthrough of the decision that made that possible: a per-page integer schemaVersion field plus a small migration ladder that runs in memory.

The constraints that shaped the technical decisions

The first constraint was blast radius. A global "migrate everything" script that rewrites every sections column is one transaction with one failure mode: a bug on an edge case in row 700 means you either roll the whole thing back or ship corrupted JSON to production pages. With editor-authored content the input is not uniform — every page is a different combination of sixty-plus block types, and the props inside each block are deliberately untyped. You cannot enumerate the inputs in advance, so you cannot fully trust a one-shot transform.

The second constraint was that old content has to keep rendering during and after the change. A page written against v1 that nobody re-saves still has to validate and display. That ruled out any approach making the new format mandatory at the database level.

The third was the audit problem: there had to be a cheap way to answer "which pages have not been upgraded yet?" — to reason about the rollout and eventually sweep the stragglers. A version number you cannot query is just a comment.

Those three constraints — bounded blast radius, backward-compatible reads, and a queryable progress signal — point at the same answer: version each document, migrate lazily, and index the version.

The architecture

The version lives on the row, as an integer, with an index. Here is the Prisma model:

model DynamicPage {
  dynamicPageId  String           @id @default(cuid())
  slug           String           @unique
  title          String
  description    String?
  keywords       String[]
  sections       Json             @default("[]")
  metadata       Json? // { ogTitle, ogDescription, ogImage, twitterTitle, twitterDescription, twitterCard }
  status         DynamicPageStatus @default(DRAFT)
  schemaVersion  Int              @default(2)
  createdAt      DateTime         @default(now())
  updatedAt      DateTime         @updatedAt

  translations   DynamicPageTranslation[]
  githubPromotion GitHubPromotion?

  @@index([slug])
  @@index([schemaVersion])
}

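Two indexes tell you what this table is queried on. slug is the public render path (the @unique attribute already creates a unique index on it, so the extra @@index([slug]) is redundant rather than harmful). schemaVersion is the migration sweep: find every page below the current version. The default of 2 means newly created pages are stamped with the current version and never need migrating; only pre-existing rows carry an older number. That assumes existing rows were explicitly backfilled with 1 when the column was added, since a plain column default would stamp them with 2 as well. Choosing a per-page integer over a global flag is the whole decision: each page records the version it was written against, independently, so the system can migrate them one at a time instead of all at once.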

The current version is a single constant, shared between the type layer and the migration layer:

/** The current block/page data schema version. Increment when BlockDataSchema or DynamicPageSchema changes in a breaking or additive way. */
export const CURRENT_SCHEMA_VERSION = 2 as const
export type SchemaVersion = number

export const BlockDataSchema = z.object({
  id: z.string(),
  type: z.string(),
  order: z.number(),
  props: z.record(z.unknown()),
  hidden: z.boolean().optional(),
  label: z.string().optional(),
  className: z.string().optional(),
})

CURRENT_SCHEMA_VERSION is the target every migration climbs toward, exported once so the database default, the validation schema, and the migration runner cannot drift apart. The detail that makes backward-compatible reads work is in the last three lines: hidden, label, and className are .optional(). That single keyword lets one schema accept both v1 and v2 data. A v1 block that never carried these fields still passes BlockDataSchema.safeParse, so untouched old pages validate and render exactly as before — no migration required just to be readable.
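To make the "one schema, two versions" point concrete, here is a small dependency-free sketch. It does not use zod: isBlockData is a hypothetical hand-written guard that only approximates the shape of BlockDataSchema, and the sample blocks are invented for illustration.

```typescript
// Hypothetical stand-in for BlockDataSchema.safeParse, written without zod
// so the sketch has no dependencies. Field names mirror the schema above.
type BlockData = {
  id: string
  type: string
  order: number
  props: Record<string, unknown>
  hidden?: boolean
  label?: string
  className?: string
}

// An optional field passes when it is absent or has the expected type.
const isOptional = (v: unknown, t: 'boolean' | 'string'): boolean =>
  v === undefined || typeof v === t

function isBlockData(input: unknown): input is BlockData {
  if (input === null || typeof input !== 'object' || Array.isArray(input)) return false
  const b = input as Record<string, unknown>
  const props = b.props
  return (
    typeof b.id === 'string' &&
    typeof b.type === 'string' &&
    typeof b.order === 'number' &&
    props !== null && typeof props === 'object' && !Array.isArray(props) &&
    isOptional(b.hidden, 'boolean') &&
    isOptional(b.label, 'string') &&
    isOptional(b.className, 'string')
  )
}

// A v1 block: written before hidden/label/className existed.
const v1Block = { id: 'b1', type: 'hero', order: 0, props: { title: 'Hello' } }
// A v2 block: carries some of the new optional flags.
const v2Block = { ...v1Block, id: 'b2', hidden: true, label: 'Hero (draft)' }
// A malformed block: the optional field is present but has the wrong type.
const badBlock = { ...v1Block, hidden: 'yes' }

const results = [v1Block, v2Block, badBlock].map(isBlockData)
console.log(results) // [ true, true, false ]
```

The v1 block passes without being touched, which is the property the lazy strategy depends on. Optional does not mean unchecked: a field that is present with the wrong type is still rejected.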

The hardest sub-problem and how it resolved

The version history comment records exactly what v1 to v2 had to do:

/**
 * Each entry transforms sections from the previous version to this version.
 *
 * Version history:
 *   v1 – initial schema (id, type, order, props); legacy FieldTypes included
 *        `menu`, `date`, `richtext`; no `repeater` field type
 *   v2 – added optional hidden/label/className to BlockData;
 *        FieldType renames (richtext → rich-text, menu → select, date → text);
 *        introduced `repeater` (props formerly stored as CSV/JSON strings,
 *        primitive arrays, or single objects are reshaped into arrays of objects
 *        when the block schema declares them as repeaters).
 */

The field renames are the easy half. A lookup table maps each legacy token to its replacement, and a pass over every prop swaps any string value that matches:

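const LEGACY_FIELD_TYPE_RENAMES: Record<string, string> = {
  richtext: 'rich-text',
  menu: 'select',
  date: 'text',
}

// Own-property check: a plain object lookup would also match inherited keys
// such as 'constructor' or 'toString' and overwrite the prop with a function.
const isLegacyToken = (v: unknown): v is string =>
  typeof v === 'string' && Object.prototype.hasOwnProperty.call(LEGACY_FIELD_TYPE_RENAMES, v)

const renameLegacyFieldTypeTokens = (sections: BlockData[]): BlockData[] =>
  sections.map((s) => {
    // Build a fresh props object so the input block is never mutated.
    const props: Record<string, unknown> = {}
    for (const [k, v] of Object.entries(s.props ?? {})) {
      props[k] = isLegacyToken(v) ? LEGACY_FIELD_TYPE_RENAMES[v] : v
    }
    return { ...s, props }
  })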

The transform is pure — it builds a new props object rather than mutating the old one, and it only touches string values that exactly match a legacy token. Anything else passes through untouched.
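A short usage sketch shows what "pure" and "exact match only" mean in practice. The function is repeated so the example runs on its own; the prop names fieldType and options and the sample blocks are hypothetical, not taken from the platform's real block types.

```typescript
type BlockData = { id: string; type: string; order: number; props: Record<string, unknown> }

const LEGACY_FIELD_TYPE_RENAMES: Record<string, string> = {
  richtext: 'rich-text',
  menu: 'select',
  date: 'text',
}

// Copy of the rename pass so the sketch runs on its own. The own-property
// check is a precaution: it keeps values such as 'constructor' from matching
// keys inherited from Object.prototype.
const renameLegacyFieldTypeTokens = (sections: BlockData[]): BlockData[] =>
  sections.map((s) => {
    const props: Record<string, unknown> = {}
    for (const [k, v] of Object.entries(s.props ?? {})) {
      const isLegacy =
        typeof v === 'string' && Object.prototype.hasOwnProperty.call(LEGACY_FIELD_TYPE_RENAMES, v)
      props[k] = isLegacy ? LEGACY_FIELD_TYPE_RENAMES[v as string] : v
    }
    return { ...s, props }
  })

// `fieldType` and `options` are hypothetical prop names used for illustration.
const before: BlockData[] = [
  { id: 'a', type: 'form-field', order: 0, props: { fieldType: 'richtext', label: 'Body' } },
  { id: 'b', type: 'form-field', order: 1, props: { fieldType: 'menu', options: ['date'] } },
]

const after = renameLegacyFieldTypeTokens(before)
const again = renameLegacyFieldTypeTokens(after)

// Top-level tokens are renamed; the 'date' nested inside an array is left alone.
console.log(after.map((s) => s.props))
// The input array still holds the legacy tokens.
console.log(before.map((s) => s.props))
// A second pass produces the same result as the first.
console.log(JSON.stringify(again) === JSON.stringify(after))
```

Note the limit of the pass: it only inspects top-level prop values, so a legacy token nested inside an array or object is not renamed.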

The hard half is the repeater. A repeater field is a list of structured rows, but in v1 the same data could have been stored as a CSV string, a JSON string, a flat array of primitives, or a single object — because the old field types had no concept of "a list of objects." There is no clean rule that covers all of those, so the normalizer is explicitly a best-effort heuristic that branches on the runtime shape of the value:

function coerceToRepeaterRows(
  value: unknown,
  fieldDef: FieldSchema,
): Array<Record<string, unknown>> {
  const subKeys = Object.keys(fieldDef.fields ?? {})
  const firstKey = subKeys[0] ?? 'value'

  const wrapPrimitive = (v: unknown): Record<string, unknown> => ({ [firstKey]: v })

  if (value === undefined || value === null || value === '') return []

  // String: try JSON first, then split by common separators
  if (typeof value === 'string') {
    const parsed = tryParseJson(value)
    if (parsed !== undefined) return coerceToRepeaterRows(parsed, fieldDef)
    return value
      .split(SPLIT_PATTERN)
      .map((s) => s.trim())
      .filter(Boolean)
      .map(wrapPrimitive)
  }

  // Array
  if (Array.isArray(value)) {
    return value.map((item) => {
      if (item && typeof item === 'object' && !Array.isArray(item)) {
        return item as Record<string, unknown>
      }
      return wrapPrimitive(item)
    })
  }

  // Single object → wrap in single-element array
  if (typeof value === 'object') {
    return [value as Record<string, unknown>]
  }

  // Primitive (number/boolean) → wrap
  return [wrapPrimitive(value)]
}

Each branch handles one legacy storage shape. A string is tried as JSON first; if that fails, it is split on commas, newlines, semicolons, or pipes, and each trimmed, non-empty fragment becomes a single-field row. An array is walked element by element, keeping objects as-is and wrapping primitives. A lone object becomes a one-element array. The sub-field key for wrapped primitives is the first key in the block's own field schema (fieldDef.fields), falling back to the literal key value when the schema declares no sub-fields. None of this is magic: it is a deliberate set of conversions for shapes that actually showed up in the data.
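The excerpt relies on two helpers it does not show, tryParseJson and SPLIT_PATTERN. The sketch below fills them in with guesses that match the prose (JSON first, then commas, newlines, semicolons, or pipes); both definitions, the trimmed FieldSchema type, and the label sub-field are assumptions made so the normalizer can be run against each legacy shape.

```typescript
type FieldSchema = { type: string; fields?: Record<string, unknown> }
type Row = Record<string, unknown>

// Assumed helpers: neither is shown in the article.
const SPLIT_PATTERN = /[,;|\n]/
function tryParseJson(s: string): unknown {
  try {
    return JSON.parse(s)
  } catch {
    return undefined // "not JSON"; JSON.parse itself never yields undefined
  }
}

// Condensed copy of coerceToRepeaterRows with the same branches.
function coerceToRepeaterRows(value: unknown, fieldDef: FieldSchema): Row[] {
  const firstKey = Object.keys(fieldDef.fields ?? {})[0] ?? 'value'
  const wrap = (v: unknown): Row => ({ [firstKey]: v })
  if (value === undefined || value === null || value === '') return []
  if (typeof value === 'string') {
    const parsed = tryParseJson(value)
    if (parsed !== undefined) return coerceToRepeaterRows(parsed, fieldDef)
    return value.split(SPLIT_PATTERN).map((s) => s.trim()).filter(Boolean).map(wrap)
  }
  if (Array.isArray(value)) {
    return value.map((item) =>
      item && typeof item === 'object' && !Array.isArray(item) ? (item as Row) : wrap(item),
    )
  }
  if (typeof value === 'object') return [value as Row]
  return [wrap(value)]
}

// Hypothetical repeater definition with a single `label` sub-field.
const def: FieldSchema = { type: 'repeater', fields: { label: { type: 'text' } } }

const fromCsv = coerceToRepeaterRows('red, green | blue', def)
const fromJson = coerceToRepeaterRows('[{"label":"a"},"b"]', def)
const fromArray = coerceToRepeaterRows(['x', 1], def)
const fromObject = coerceToRepeaterRows({ label: 'solo' }, def)
const noSubFields = coerceToRepeaterRows(true, { type: 'repeater' })

console.log(fromCsv)     // [{ label: 'red' }, { label: 'green' }, { label: 'blue' }]
console.log(fromJson)    // [{ label: 'a' }, { label: 'b' }]
console.log(fromArray)   // [{ label: 'x' }, { label: 1 }]
console.log(fromObject)  // [{ label: 'solo' }]
console.log(noSubFields) // [{ value: true }]
```

One consequence of trying JSON first is worth knowing: under this version of tryParseJson, a string such as '42' parses as the number 42 and is wrapped as a number rather than as the text '42'. Whether that is acceptable depends on the field.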

The repeater work is the hardest part because it needs the block's schema to know which props are even repeaters. That schema is passed in as context, which is also what makes the next requirement — idempotency — both necessary and tractable.

What shipped and what did not

The piece that ties the migration to the lazy strategy is idempotency. Because pages migrate in memory and are not written back automatically, the same page can be migrated many times across many requests without ever being persisted, and a transform that assumes "runs exactly once" would double-convert. The repeater normalizer is written so a second run is a no-op:

const props = { ...(s.props ?? {}) }
let changed = false
for (const [propKey, fieldDef] of Object.entries(blockSchema)) {
  if (fieldDef.type !== 'repeater') continue
  const current = props[propKey]
  const alreadyOk =
    Array.isArray(current) &&
    current.every((item) => item !== null && typeof item === 'object' && !Array.isArray(item))
  if (alreadyOk) continue
  props[propKey] = coerceToRepeaterRows(current, fieldDef)
  changed = true
  conversions++
}
return changed ? { ...s, props } : s

The alreadyOk check is the guard: if a prop is already an array of plain objects, it is skipped. After the first conversion, every subsequent run sees the converted shape, hits alreadyOk, and changes nothing. The function also returns a count of conversions and a list of which <blockType, propKey> pairs it touched, which feeds logging and the import UI.
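Idempotency is a property that can be checked mechanically rather than argued. The helper below is a hypothetical test utility, not part of the platform's code: it applies a transform twice and compares the results, using two toy transforms (one guarded in the spirit of alreadyOk, one naive) to show what the check catches.

```typescript
type Transform<T> = (input: T) => T

// True when applying `fn` a second time changes nothing.
// Comparison is by JSON serialization, which is adequate for plain JSON
// content such as page sections.
function isIdempotent<T>(fn: Transform<T>, input: T): boolean {
  const once = fn(input)
  const twice = fn(once)
  return JSON.stringify(once) === JSON.stringify(twice)
}

type Row = Record<string, unknown>
const isRow = (v: unknown): v is Row =>
  v !== null && typeof v === 'object' && !Array.isArray(v)

// Guarded: skips values already in the target shape (the alreadyOk idea).
const guarded: Transform<unknown> = (v) =>
  Array.isArray(v) && v.every(isRow) ? v : [{ value: v }]

// Naive: assumes it runs exactly once, so a second run wraps the result again.
const naive: Transform<unknown> = (v) => [{ value: v }]

console.log(isIdempotent(guarded, 'hello')) // true
console.log(isIdempotent(naive, 'hello'))   // false
```

Running a check like this over a sample of real page data, for every registered migration step, is a cheap way to confirm a transform is safe for the read path before it ships.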

The migration runner climbs from the page's stored version up to the current one, applying each registered step in order:

const migrations: Partial<Record<number, MigrationFn>> = {
  2: (sections, ctx) => normalizeRepeaters(renameLegacyFieldTypeTokens(sections), ctx).sections,
}

export function migrateSections(
  sections: BlockData[],
  fromVersion: number,
  ctx: MigrationContext = {},
): { sections: BlockData[]; schemaVersion: number; appliedMigrations: number[] } {
  if (fromVersion >= CURRENT_SCHEMA_VERSION) {
    return { sections, schemaVersion: fromVersion, appliedMigrations: [] }
  }

  const applied: number[] = []
  let current = sections
  for (let v = fromVersion + 1; v <= CURRENT_SCHEMA_VERSION; v++) {
    const fn = migrations[v]
    if (fn) {
      current = fn(current, ctx)
      applied.push(v)
    }
  }

  return { sections: current, schemaVersion: CURRENT_SCHEMA_VERSION, appliedMigrations: applied }
}
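To see how the ladder extends, here is a reduced sketch of the same runner with a made-up third step. Everything about v3 is hypothetical (the article only describes v1 and v2), the v2 entry is a simplified stand-in for the real one, and the context argument is omitted to keep the example short.

```typescript
type BlockData = { id: string; type: string; order: number; props: Record<string, unknown> }
type MigrationFn = (sections: BlockData[]) => BlockData[]

// Hypothetical: pretend the schema has moved on to v3.
const CURRENT_SCHEMA_VERSION = 3

const migrations: Partial<Record<number, MigrationFn>> = {
  // Simplified stand-in for the real v2 step (renames + repeater normalization).
  2: (sections) =>
    sections.map((s) =>
      s.props.fieldType === 'richtext' ? { ...s, props: { ...s.props, fieldType: 'rich-text' } } : s,
    ),
  // Hypothetical v3 step: rename the prop key `heading` to `title`.
  3: (sections) =>
    sections.map((s) => {
      if (!('heading' in s.props)) return s // already migrated, or never had it
      const { heading, ...rest } = s.props
      return { ...s, props: { ...rest, title: heading } }
    }),
}

function migrateSections(sections: BlockData[], fromVersion: number) {
  if (fromVersion >= CURRENT_SCHEMA_VERSION) {
    return { sections, schemaVersion: fromVersion, appliedMigrations: [] as number[] }
  }
  const applied: number[] = []
  let current = sections
  for (let v = fromVersion + 1; v <= CURRENT_SCHEMA_VERSION; v++) {
    const fn = migrations[v]
    if (fn) {
      current = fn(current)
      applied.push(v)
    }
  }
  return { sections: current, schemaVersion: CURRENT_SCHEMA_VERSION, appliedMigrations: applied }
}

const v1Page: BlockData[] = [
  { id: 'a', type: 'hero', order: 0, props: { fieldType: 'richtext', heading: 'Hi' } },
]

const fromV1 = migrateSections(v1Page, 1)
const fromV2 = migrateSections(v1Page, 2)
const fromV3 = migrateSections(v1Page, 3)

console.log(fromV1.appliedMigrations) // [ 2, 3 ]
console.log(fromV2.appliedMigrations) // [ 3 ]
console.log(fromV3.appliedMigrations) // []
```

A v1 page climbs both rungs, a v2 page only the new one, and a page already at v3 takes the early return and gets its original array back untouched.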

The early return is the cheap path: a page already at the current version does no work at all, so the migration system costs effectively nothing for the rows that do not need it. For older rows the loop applies every step between fromVersion and CURRENT_SCHEMA_VERSION, so adding a v3 later is one new entry in the migrations map — the ladder structure is already there. The source version is detected from the data, defaulting to v1 because v1 predates the field:

export function detectSchemaVersion(input: unknown): number {
  if (input && typeof input === 'object') {
    const v = (input as Record<string, unknown>).schemaVersion
    if (typeof v === 'number' && Number.isFinite(v) && v >= 1) return Math.floor(v)
  }
  return 1
}
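A few sample inputs make the detection rules easier to read than the conditionals do. The function is repeated so the example runs on its own; the inputs are invented.

```typescript
function detectSchemaVersion(input: unknown): number {
  if (input && typeof input === 'object') {
    const v = (input as Record<string, unknown>).schemaVersion
    if (typeof v === 'number' && Number.isFinite(v) && v >= 1) return Math.floor(v)
  }
  return 1
}

const inputs: unknown[] = [
  { schemaVersion: 2, sections: [] }, // explicit version is used
  { sections: [] },                   // field missing: treated as v1
  { schemaVersion: '2' },             // a string is not accepted: v1
  { schemaVersion: 2.9 },             // fractional values are floored to 2
  { schemaVersion: 0 },               // below 1: v1
  { schemaVersion: Number.NaN },      // not finite: v1
  null,                               // not an object: v1
]

const detected = inputs.map(detectSchemaVersion)
console.log(detected) // [ 2, 1, 1, 2, 1, 1, 1 ]
```

Every unrecognized input falls back to 1, so ambiguous data is always run through the full ladder rather than skipped. That fallback is only safe because the transforms are idempotent.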

The import route shows where this runs in practice. It resolves a fromVersion (an explicit override, or detection), builds the block-schema registry once, and runs the ladder per page:

const fromVersion =
  typeof body.fromVersion === 'number' && body.fromVersion >= 1
    ? Math.floor(body.fromVersion)
    : detectSchemaVersion(body.data)

const { sections: migratedSections, appliedMigrations } = migrateSections(
  normalized,
  opts.fromVersion,
  { blockSchemas: opts.blockSchemas },
)
const {
  sections: postRepeaters,
  conversions: repeaterConversions,
  converted: repeaterConvertedFields,
} = normalizeRepeaters(migratedSections, { blockSchemas: opts.blockSchemas })

The route reports fromVersion, toVersion, and the applied migrations in its response summary, so an operator running an import sees exactly which version each page came from and what was done to it. Idempotency is why running normalizeRepeaters again here, after migrateSections has already run it, is safe rather than destructive.

What did not ship: there is no cron job rewriting every row to v2 in the background. Migration happens on the paths that already load and validate a page, import being the clearest example. The @@index([schemaVersion]) exists precisely so the stragglers can be found later, on a deliberate schedule, rather than forcing a big-bang rewrite up front. The version field is also the seam for an optional AI pass on shapes the deterministic converters cannot normalize, but that is a fallback, not the core mechanism.
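The eventual straggler sweep is not shown in the article, so what follows is only a sketch of one possible shape for it. It assumes rows have already been loaded (in practice via a schemaVersion filter that uses the index), takes the migration function as a parameter, and leaves persistence to the caller; sweepStragglers, PageRow, and demoMigrate are all hypothetical names.

```typescript
type BlockData = { id: string; type: string; order: number; props: Record<string, unknown> }
type PageRow = { slug: string; schemaVersion: number; sections: BlockData[] }
type Migrate = (
  sections: BlockData[],
  fromVersion: number,
) => { sections: BlockData[]; schemaVersion: number }

const CURRENT_SCHEMA_VERSION = 2

// Migrates each outdated row independently. A failure is recorded against
// that one page and the loop moves on, so one bad row cannot block the rest.
function sweepStragglers(rows: PageRow[], migrate: Migrate) {
  const upgraded: PageRow[] = []
  const failed: Array<{ slug: string; error: string }> = []
  for (const row of rows) {
    if (row.schemaVersion >= CURRENT_SCHEMA_VERSION) continue // nothing to do
    try {
      const result = migrate(row.sections, row.schemaVersion)
      upgraded.push({ ...row, sections: result.sections, schemaVersion: result.schemaVersion })
    } catch (err) {
      failed.push({ slug: row.slug, error: err instanceof Error ? err.message : String(err) })
    }
  }
  return { upgraded, failed } // the caller decides when and how to write `upgraded` back
}

// Stand-in migration that rejects one shape, to exercise the failure path.
const demoMigrate: Migrate = (sections) => {
  if (sections.some((s) => s.props.explode === true)) throw new Error('unsupported shape')
  return { sections, schemaVersion: CURRENT_SCHEMA_VERSION }
}

const block = (props: Record<string, unknown>): BlockData => ({ id: 'b', type: 'hero', order: 0, props })
const rows: PageRow[] = [
  { slug: 'home', schemaVersion: 2, sections: [block({})] },
  { slug: 'pricing', schemaVersion: 1, sections: [block({ title: 'Plans' })] },
  { slug: 'legacy-promo', schemaVersion: 1, sections: [block({ explode: true })] },
]

const report = sweepStragglers(rows, demoMigrate)
console.log(report.upgraded.map((r) => r.slug)) // [ 'pricing' ]
console.log(report.failed)                      // [ { slug: 'legacy-promo', error: 'unsupported shape' } ]
```

The design choice mirrors the rest of the system: each page is its own unit of work, and the failed list doubles as the to-do list for manual review.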

The trade-off, stated plainly

Lazy in-memory migration accepts a real cost: the read path now carries the migration. Every load of an old page re-runs the transform instead of reading a finished result, and because nothing is written back automatically, that cost recurs until the row is eventually re-saved. The early return keeps it near-zero for current pages and the transforms are cheap, but it is not free — you trade a one-time write cost for a small, repeated read cost. The other half of the trade is that two storage shapes coexist in the database indefinitely: v1 and v2 rows live side by side, and the application has to keep understanding both for as long as any v1 row survives. The optional fields and the idempotent normalizer are the price of that dual-format world. The bet is that bounded blast radius and never breaking a live page are worth a recurring few milliseconds and a schema that two versions of code can read.

Business impact

For the business this is the difference between a content change that needs a maintenance window and one that does not. There is no scheduled downtime, no all-or-nothing migration script standing between the team and a format change, and no scenario where one bad row takes down every page at once — a failure is scoped to the single page being processed. The site keeps serving existing pages unchanged while the format evolves underneath them, so a content model can keep improving for years without the periodic "we have to migrate the database this weekend" event that usually decides whether a platform stays maintainable or quietly ossifies.

What to do next

If you are about to change a stored JSON shape, the first question to settle is not how to write the transform — it is where the version number lives and whether you can query it. Add a schemaVersion to the row, index it, and make the new fields optional so old data still validates before you migrate a single record. Then write the transform to be idempotent from the start, because the moment migration moves to the read path you lose the guarantee that it runs once. If you want a second pair of eyes on a content-model change — whether to migrate eagerly or lazily, and how to keep the old format readable — that is a conversation worth having before the first row gets rewritten, not after. The services and case studies pages have more on how these calls get made on real projects.
