Features — seedkit

01 Synthetic data

Realistic data, on demand.

Seedkit reads your schema directly (a Prisma file, raw SQL DDL, or a live database) and combines an LLM with column-name and type heuristics to produce data that looks like it came from real users. Foreign keys resolve, distributions match the domain, and free-text fields don't say lorem ipsum. Generation is deterministic from a seed name, so the same input produces the same rows every time.

Schema introspection
Prisma · DDL · live DB
FK-correct across composite keys
Domain-aware generation
CRM · fintech · health · e-commerce
Deterministic seeding
byte-identical re-runs

How synthetic data works

// preview · 5 rows from `users`

Priya Mehta · priya@aventar.co · pro
Javier Soto · j.soto@kite-labs.io · enterprise
Naomi Fields · naomi@greenpath.dev · free
Elias Brandt · elias.b@northwind.io · pro
Maya Lin · maya@ironroot.studio · pro

✓ 14,820 rows across 8 tables · FK-correct · seed 9a2f4e01
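
One way to get the seed-name determinism described above is to hash the seed name into a PRNG seed and draw every value from that one generator. The sketch below is illustrative only and is not seedkit's implementation: the `users`/`orders` tables, the value pools, and the `generate` helper are all hypothetical, and the LLM step is omitted entirely. Parent rows are generated first so that every foreign key can point at a row that exists.

```python
import hashlib
import random

# Hypothetical value pools; seedkit's real generation is not shown in the source.
FIRST_NAMES = ["Priya", "Javier", "Naomi", "Elias", "Maya"]
PLANS = ["free", "pro", "enterprise"]


def rng_for(seed_name: str) -> random.Random:
    """Derive a private PRNG from the seed name (assumed scheme: SHA-256)."""
    digest = hashlib.sha256(seed_name.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def generate(seed_name: str, n_users: int = 5, n_orders: int = 10) -> dict:
    """Generate parent rows first so every child FK can point at a real row."""
    rng = rng_for(seed_name)
    users = [
        {"id": i, "name": rng.choice(FIRST_NAMES), "plan": rng.choice(PLANS)}
        for i in range(1, n_users + 1)
    ]
    user_ids = [u["id"] for u in users]
    orders = [
        {"id": i, "user_id": rng.choice(user_ids), "total_cents": rng.randint(500, 50_000)}
        for i in range(1, n_orders + 1)
    ]
    return {"users": users, "orders": orders}


first = generate("my-fixture")
second = generate("my-fixture")
print(first == second)  # True: same seed name, same rows
```

Using a private `random.Random` instance rather than the module-level functions keeps the output independent of any other code that touches the global generator.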

02 Database & API

A real database — and a real API — for whatever you're building.

Every seed can spin up an ephemeral Postgres database on Neon (a real connection string, psql-compatible, scale-to-zero between queries) and can be served as a CORS-enabled REST endpoint with pagination and optional auth. Use one, both, or neither, and re-provision either from any cached seed.

Ephemeral Postgres on Neon
EU-hosted by default
REST API per seed
pagination · auth · CORS
Token-gated or public
switch with one toggle
Revive expired DBs
from any saved seed

Ephemeral databases & APIs

[Image: seedkit dashboard with active ephemeral databases]

GET /api/mocks/a3fd/users?limit=2

HTTP/1.1 200 · application/json

{ "data": [{ "id": 1, "name": "Priya M." }, ...], "next": "?cursor=..." }
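
A client can walk the paginated endpoint by following the `next` value until it runs out. The sketch below assumes the response shape in the example above (`data` plus a `next` query string) and further assumes that the last page omits `next` or sets it to null, which the source does not state. The `fetch` callable is injected so the loop can be exercised without a network; `fake_fetch` and its cursor token are invented for illustration.

```python
from typing import Callable, Iterator


def iter_rows(fetch: Callable[[str], dict], path: str, limit: int = 2) -> Iterator[dict]:
    """Yield every row from a cursor-paginated endpoint.

    `fetch` takes a URL (path + query string) and returns the decoded JSON body.
    """
    url = f"{path}?limit={limit}"
    while url:
        page = fetch(url)
        yield from page["data"]
        next_query = page.get("next")  # e.g. "?cursor=..." (assumed absent on the last page)
        url = f"{path}{next_query}" if next_query else None


# Stand-in for a real HTTP call; pages and cursor tokens are invented.
PAGES = {
    "/api/mocks/a3fd/users?limit=2": {"data": [{"id": 1}, {"id": 2}], "next": "?cursor=abc"},
    "/api/mocks/a3fd/users?cursor=abc": {"data": [{"id": 3}], "next": None},
}


def fake_fetch(url: str) -> dict:
    return PAGES[url]


rows = list(iter_rows(fake_fetch, "/api/mocks/a3fd/users"))
print([r["id"] for r in rows])  # [1, 2, 3]
```

In real use, `fetch` would wrap an HTTP GET against the seed's endpoint and add the auth header when the API is token-gated.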

03 Reproducibility & CLI

Same data, every run.

Lock a seed name with --seed and seedkit caches the generated SQL server-side, keyed by name + schema hash. Anyone on your team — laptop, preview-per-PR, CI runner — gets byte-identical data on demand. The CLI is open source (MIT) and runs anywhere your pipeline does.

Searchable seed library
every run, saved & shareable
Open-source CLI
MIT · works without an account
First-class CI integration
GitHub Actions · GitLab · CircleCI
Stable across schema drift
version your seed alongside your migrations

Open seeds library

[Image: seedkit seed library catalog with cached datasets]

# .github/workflows/test.yml
- name: Seed test DB
  run: npx seedkit-cli seed --seed my-fixture --from-cache

✓ from cache · 14,820 rows in 1.4s
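
The cache is described above as keyed by seed name plus schema hash. A minimal sketch of such a key follows, assuming SHA-256 over whitespace-normalized schema text. The normalization rule, the key format, and the `cache_key` helper are hypothetical and may differ from what seedkit does server-side.

```python
import hashlib


def schema_hash(schema_text: str) -> str:
    """Hash the schema after collapsing runs of whitespace, so re-indenting
    alone does not change the key (an assumed normalization)."""
    normalized = " ".join(schema_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cache_key(seed_name: str, schema_text: str) -> str:
    """Combine the seed name and a short schema digest into one lookup key."""
    return f"{seed_name}:{schema_hash(schema_text)[:16]}"


ddl_v1 = "CREATE TABLE users ( id INT PRIMARY KEY, email TEXT );"
ddl_v1_reindented = "CREATE TABLE users (\n  id INT PRIMARY KEY,\n  email TEXT\n);"
ddl_v2 = "CREATE TABLE users ( id INT PRIMARY KEY, email TEXT, plan TEXT );"

# Same schema, different layout: the key is unchanged.
print(cache_key("my-fixture", ddl_v1) == cache_key("my-fixture", ddl_v1_reindented))
# A new column changes the schema hash, and therefore the key.
print(cache_key("my-fixture", ddl_v1) == cache_key("my-fixture", ddl_v2))
```

Under this scheme a schema change produces a new key, so a cached dataset is only reused against the schema it was generated for.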

04 Synthetic Data API

Realistic data — without a database.

One HTTP call returns deterministic JSON sampled from a curated pool of 500+ identities, companies, posts, products, and reviews, all internally coherent: every posts.author_id resolves to a real person, and every review points at a real product. Use it for frontend prototypes, demos, design reviews, and content fixtures, or anywhere else you want realistic data without setting up a schema. Pro+.

REST endpoint
/api/v1/data/<dataset>
Curated, coherent pool
FKs resolve across datasets
Deterministic via ?seed=
stable for fixtures
Pro+ with PAT auth
10K calls/mo on Pro

Synthetic Data API

$ curl -H "Authorization: Bearer sk_live_…" \
    "app.seedkit.dev/api/v1/data/identities?industry=healthcare&count=3"

HTTP/1.1 200 · application/json

{
  "data": [
    { "id": "maya-okonkwo", "job_title": "Pediatric Nurse", … },
    { "id": "dr-james-park", "job_title": "Cardiologist", … },
    { "id": "sofia-restrepo", "job_title": "Hospital Admin", … }
  ],
  "total_available": 33,
  "dataset": "identities"
}
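
The same call can be assembled with Python's standard library. This sketch only builds the request and does not send it. It assumes an `https://` scheme (the example above omits one), uses the `industry` and `count` parameters from the curl call plus the `seed` parameter named in the feature list, and takes the token as an argument. `build_request` is a hypothetical helper, not part of any seedkit SDK.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://app.seedkit.dev/api/v1/data"  # scheme is an assumption


def build_request(dataset: str, token: str, seed: Optional[str] = None, **filters) -> Request:
    """Build a GET request for /api/v1/data/<dataset> with a Bearer token."""
    params = dict(filters)
    if seed is not None:
        params["seed"] = seed  # "Deterministic via ?seed=" per the feature list
    query = urlencode(params)
    url = f"{BASE_URL}/{dataset}" + (f"?{query}" if query else "")
    return Request(url, headers={"Authorization": f"Bearer {token}"})


req = build_request("identities", token="YOUR_TOKEN", seed="demo", industry="healthcare", count=3)
print(req.full_url)
# To send it: urllib.request.urlopen(req), then json.load() on the response.
```

Building the query with `urlencode` avoids the shell-quoting problem that an unquoted `&` causes on the command line.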

Four things seedkit does, end-to-end.


Your next database is a sentence away.