
Determinism

What "byte-identical" actually means in seedkit, and the limits of the guarantee.


Running seedkit seed --seed my-fixture --from-cache produces byte-identical INSERTs across machines, over time, and across CI runs. This page explains how, and where the guarantee starts and stops.

What's deterministic

Within a single major CLI version:

  1. The cached SQL for a given key is byte-stable. Same key → same bytes.
  2. The cache key is content-addressed (see Seed cache). Same inputs → same key.
  3. Therefore: same inputs across teammates → identical SQL → identical inserts → identical row order.
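To make "content-addressed" concrete, here is a minimal sketch of how such a key could be derived. It is an illustration, not seedkit's actual key format: the function name cache_key, the JSON layout, the choice of SHA-256, and the generator names are all assumptions. The real derivation is described in Seed cache.

```python
import hashlib
import json


def cache_key(schema_text: str, seed_name: str, generator_versions: dict[str, str]) -> str:
    """Derive a hex key purely from the inputs (hypothetical layout, not seedkit's)."""
    payload = json.dumps(
        {
            "schema": schema_text,
            "seed": seed_name,
            "generators": generator_versions,
        },
        sort_keys=True,         # canonical ordering, so dict order cannot change the key
        separators=(",", ":"),  # no whitespace variation between machines
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


schema = "model User { id Int @id }"
renamed = "model User { user_id Int @id }"      # same model, one column renamed
versions = {"names": "3.0.1", "emails": "1.2.0"}  # hypothetical generator names

key_a = cache_key(schema, "my-fixture", versions)
# Same inputs in a different dict order: the key is unchanged.
key_b = cache_key(schema, "my-fixture", dict(reversed(list(versions.items()))))
# A renamed column is a different input, so it yields a different key.
key_c = cache_key(renamed, "my-fixture", versions)
```

Because the key depends only on the inputs, two teammates who never exchange files still compute the same key, provided their inputs match byte for byte.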

What's not (yet)

  • Across major versions: a major bump may change generator versions, which changes the cache keys. Old keys still resolve for old CLI versions, but the new CLI computes new keys and regenerates. We document major-version key changes in the changelog.
  • Across schemas: if you rename a column, the key changes and the SQL is regenerated. That's by design: the old data wouldn't satisfy the new schema anyway.
  • Across --scope / --prompt: the --scope and --prompt arguments are part of generation but not in the cache key, so they only matter on a cache miss. Changing them after the cache exists has no effect on the cached SQL. If you need different data per environment, give each environment a different --seed name.
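The --prompt behaviour can be surprising, so here is a toy model of it. Everything in this sketch is hypothetical (the in-memory dict, the helper names, the fake generator). It only illustrates the rule stated above: an argument that is not part of the key is consulted on a miss and ignored on a hit.

```python
import hashlib

_cache: dict[str, str] = {}  # key -> cached SQL text (in-memory stand-in for the seed cache)


def _key(schema_text: str, seed_name: str) -> str:
    # Note what is absent: the prompt does not feed the key.
    return hashlib.sha256(f"{schema_text}\n{seed_name}".encode("utf-8")).hexdigest()


def _generate(seed_name: str, prompt: str) -> str:
    # Stand-in for the real generator; here the prompt shapes the data.
    return f"INSERT INTO users (name) VALUES ('{seed_name}:{prompt}');"


def seed(schema_text: str, seed_name: str, prompt: str = "") -> str:
    key = _key(schema_text, seed_name)
    if key not in _cache:  # cache miss: the prompt is consulted
        _cache[key] = _generate(seed_name, prompt)
    return _cache[key]     # cache hit: the prompt is ignored


schema = "model User { id Int @id }"
first = seed(schema, "demo", prompt="french names")
second = seed(schema, "demo", prompt="german names")    # same key, so the cached SQL wins
third = seed(schema, "demo-de", prompt="german names")  # new seed name, new cache entry
```

In this model, first and second are identical even though the prompts differ, which is why a new --seed name is the way to get different data.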

Why a PRNG and not the LLM?

Sampling is driven by a per-seed xoroshiro128+ PRNG. The model returns a batch of candidate values for any given column; the PRNG decides the assignment. So even though an LLM is involved, the output is a deterministic function of (schema, seed_name, generator_versions) once the cache exists.
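As a rough sketch of the idea, the code below seeds a xoroshiro128+ generator from the seed name and uses it to pick one candidate per row. Only the choice of xoroshiro128+ comes from this page. The seeding scheme (SHA-256 of the seed name), the rotation constants (24, 16, 37, as in the 2018 reference implementation), the modulo mapping, and the assign helper are assumptions made for illustration.

```python
import hashlib

MASK64 = (1 << 64) - 1


def _rotl(x: int, k: int) -> int:
    """Rotate a 64-bit value left by k bits."""
    return ((x << k) | (x >> (64 - k))) & MASK64


class Xoroshiro128Plus:
    def __init__(self, seed_name: str) -> None:
        # Assumed seeding: take 128 bits of state from a hash of the seed name.
        digest = hashlib.sha256(seed_name.encode("utf-8")).digest()
        self.s0 = int.from_bytes(digest[:8], "little")
        self.s1 = int.from_bytes(digest[8:16], "little")
        if self.s0 == 0 and self.s1 == 0:
            self.s1 = 1  # the all-zero state is invalid for this generator

    def next_u64(self) -> int:
        s0, s1 = self.s0, self.s1
        result = (s0 + s1) & MASK64
        s1 ^= s0
        self.s0 = _rotl(s0, 24) ^ s1 ^ ((s1 << 16) & MASK64)
        self.s1 = _rotl(s1, 37)
        return result


def assign(seed_name: str, candidates: list[str], n_rows: int) -> list[str]:
    """Pick one candidate per row; the PRNG, not the model, makes the choice."""
    rng = Xoroshiro128Plus(seed_name)
    # Use the high 32 bits, since the low bits of xoroshiro128+ are the weakest.
    # The modulo has a slight bias, which is acceptable in a sketch.
    return [candidates[(rng.next_u64() >> 32) % len(candidates)] for _ in range(n_rows)]


candidates = ["Ada", "Grace", "Linus", "Margaret"]  # stand-in for a model-returned batch
run_1 = assign("my-fixture", candidates, 10)
run_2 = assign("my-fixture", candidates, 10)
```

Given the same candidate batch and the same seed name, run_1 and run_2 are identical on every machine, because nothing in assign depends on time, hardware, or model sampling.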

Practical recipes

One fixture across local + CI:

# locally, once:
seedkit new --schema schema.prisma --seed my-fixture

# CI, every run:
seedkit seed --schema schema.prisma --seed my-fixture --from-cache --reset

Two fixtures (small + big):

seedkit new --schema schema.prisma --seed demo-small --rows 500
seedkit new --schema schema.prisma --seed demo-large --rows 500000

Different names → different cache entries → independent regeneration.

Per-PR fixture, shared across environments:

# in a GitHub Actions workflow step (the ${{ ... }} expression expands to the PR number):
seedkit seed --schema schema.prisma --seed pr-${{ github.event.pull_request.number }}

Every PR gets a stable, named seed. CI, preview deploys, and a teammate pulling the branch all hit the same data.

See also