Commit 8a63287

feat: add seed data documentation and dataset schema for type-safe data management
1 parent 988136b commit 8a63287

5 files changed

Lines changed: 806 additions & 0 deletions

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -210,6 +210,7 @@ The `skills/` directory contains domain-specific AI skill definitions. When work
 | **Quickstart** | `skills/objectstack-quickstart/SKILL.md` | Project creation, defineStack(), drivers, adapters, bootstrap |
 | **Plugin** | `skills/objectstack-plugin/SKILL.md` | Plugin lifecycle, DI, EventBus, Kernel config |
 | Schema Design | `skills/objectstack-schema/SKILL.md` | Designing Objects, Fields, Relations, Validations |
+| **Seed Data** | `skills/objectstack-seed/SKILL.md` | defineDataset(), seed fixtures, import modes, env scoping |
 | Query Design | `skills/objectstack-query/SKILL.md` | Filters, sorting, pagination, aggregation, joins |
 | API Design | `skills/objectstack-api/SKILL.md` | Designing REST/GraphQL endpoints |
 | UI Design | `skills/objectstack-ui/SKILL.md` | Designing Views, Dashboards, Apps |

content/docs/guides/meta.json

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@
   "packages",
   "metadata",
   "data-modeling",
+  "seed-data",
   "common-patterns",
   "airtable-dashboard-analysis",
   "---Building---",

content/docs/guides/seed-data.mdx

Lines changed: 304 additions & 0 deletions
@@ -0,0 +1,304 @@
---
title: Seed Data & Fixtures
description: Populate ObjectStack objects with bootstrap data, reference records, and demo fixtures using defineDataset()
---

# Seed Data & Fixtures

`defineDataset()` is the canonical way to define seed data in ObjectStack. It provides
compile-time type safety by inferring valid field keys directly from your object
definition, so typos in record field names are caught before the code runs.

Use seed data for:

- **System bootstrap** — default roles, admin users, system configuration
- **Reference data** — countries, currencies, ISO codes, standard picklist values
- **Demo / test fixtures** — realistic sample records for development and CI

---

## Quick Start

```typescript
import { defineDataset } from '@objectstack/spec/data';
import { Account } from './objects/account.object';

export const accountsSeed = defineDataset(Account, {
  externalId: 'name',       // field used as the upsert / idempotency key
  mode: 'upsert',           // create if new, update if found
  env: ['dev', 'test'],     // only load in dev and test environments
  records: [
    {
      name: 'Acme Corporation',
      type: 'customer',
      industry: 'technology',
      annual_revenue: 5000000,
    },
    {
      name: 'Globex Industries',
      type: 'prospect',
      industry: 'manufacturing',
      annual_revenue: 12000000,
    },
  ],
});
```

The first argument is the **object definition** (the exported constant from your
object file), not a string. This lets TypeScript validate every field name in
`records` against the object's `fields` map at compile time.

---

## Import Modes

The `mode` field controls how the seed runner behaves when it encounters an existing
record (matched by `externalId`).

| Mode | Behavior | Use Case |
|:-----|:---------|:---------|
| `upsert` | Create if new, update if found | Default — idempotent for most data |
| `insert` | Create only, throw on duplicate | Append-only tables, audit logs |
| `update` | Update only, skip if not found | Migration patches on existing rows |
| `ignore` | Create if new, silently skip duplicates | Bootstrap data that must not overwrite user edits |
| `replace` | Delete ALL records then insert | Cache / lookup tables rebuilt on each run |

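The behaviours in the table can be modelled with a small in-memory function. This is only a sketch of the semantics described above, not the ObjectStack seed runner: `applySeed`, `Row`, and the `Map`-backed table are hypothetical names, and merging fields on update is an assumption.

```typescript
// Hypothetical in-memory model of the five import modes (not the real seed runner).
type Mode = 'upsert' | 'insert' | 'update' | 'ignore' | 'replace';
type Row = Record<string, unknown>;

function applySeed(
  table: Map<string, Row>,   // existing rows, keyed by their externalId value
  records: Row[],
  externalId: string,
  mode: Mode,
): void {
  if (mode === 'replace') table.clear(); // delete ALL existing rows first

  for (const record of records) {
    const key = String(record[externalId]);
    const existing = table.get(key);

    if (existing === undefined) {
      // 'update' skips rows that are not found; every other mode creates them.
      if (mode !== 'update') table.set(key, { ...record });
      continue;
    }

    switch (mode) {
      case 'insert':
        throw new Error(`Duplicate ${externalId}: ${key}`);
      case 'ignore':
        break; // leave the existing row untouched
      default:
        // 'upsert' and 'update' overwrite matching fields (assumed merge behaviour).
        table.set(key, { ...existing, ...record });
    }
  }
}
```

Running the same records twice with `upsert` or `ignore` leaves the table unchanged the second time, which is what makes those modes safe to re-run.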
### `upsert` — Recommended Default

```typescript
defineDataset(Currency, {
  externalId: 'code',
  mode: 'upsert',
  records: [
    { code: 'USD', name: 'US Dollar', symbol: '$' },
    { code: 'EUR', name: 'Euro', symbol: '€' },
    { code: 'GBP', name: 'British Pound', symbol: '£' },
  ],
});
```

### `ignore` — Bootstrap Without Overwriting

```typescript
defineDataset(SystemRole, {
  externalId: 'code',
  mode: 'ignore',
  records: [
    { code: 'admin', label: 'Administrator' },
    { code: 'viewer', label: 'Viewer' },
  ],
});
```

### `replace` — Full Table Rebuild

```typescript
// ⚠️ Deletes ALL records in the object before inserting.
// Only use for cache or lookup tables with no user-generated data.
defineDataset(ExchangeRateCache, {
  externalId: 'key',
  mode: 'replace',
  env: ['dev'],
  records: [
    { key: 'USD_EUR', rate: 0.92 },
    { key: 'USD_GBP', rate: 0.79 },
  ],
});
```

---

## Environment Scoping

The `env` array controls which deployment environments receive the records. The
default is `['prod', 'dev', 'test']` — all environments.

```typescript
// Reference data — safe for all environments (default)
defineDataset(Country, {
  // env omitted → defaults to ['prod', 'dev', 'test']
  records: [
    { code: 'US', name: 'United States' },
    { code: 'GB', name: 'United Kingdom' },
  ],
});

// Demo data — never reaches production
defineDataset(Account, {
  env: ['dev', 'test'],
  records: [
    { name: 'Demo Corp', type: 'customer' },
  ],
});

// Automated test fixtures — CI/CD only
defineDataset(TestUser, {
  env: ['test'],
  records: [
    { email: 'ci-admin@example.com', role: 'admin' },
  ],
});
```

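To make the scoping rule concrete, the sketch below filters a list of datasets by the current environment. It assumes only what the text states (an optional `env` array that defaults to all three environments); `ScopedDataset`, `datasetsForEnv`, and the `object` label are hypothetical and are not part of the ObjectStack API.

```typescript
type Env = 'prod' | 'dev' | 'test';

// Hypothetical minimal shape: only the fields needed to illustrate env scoping.
interface ScopedDataset {
  object: string; // illustrative label, not a confirmed Dataset property
  env?: Env[];
}

// Documented default: a dataset without `env` loads everywhere.
const ALL_ENVS: Env[] = ['prod', 'dev', 'test'];

function datasetsForEnv<T extends ScopedDataset>(datasets: T[], current: Env): T[] {
  return datasets.filter((dataset) => (dataset.env ?? ALL_ENVS).includes(current));
}

// Usage: mirrors the three datasets in the example above.
const scoped: ScopedDataset[] = [
  { object: 'country' },                      // env omitted, loads everywhere
  { object: 'account', env: ['dev', 'test'] },
  { object: 'test_user', env: ['test'] },
];

const loadedInProd = datasetsForEnv(scoped, 'prod').map((d) => d.object);
// loadedInProd is ['country']: the demo and CI datasets are filtered out
```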
---

## Type Safety

`defineDataset()` infers valid field keys from the object definition you pass as the
first argument. If you reference a field that does not exist on the object, TypeScript
reports an error immediately.

```typescript
import { Account } from './objects/account.object';

defineDataset(Account, {
  records: [
    {
      name: 'Test Corp',
      typo_fild: 'value',
      // ^^^^^^^^^
      // TS Error: Object literal may only specify known properties,
      // and 'typo_fild' does not exist in type 'Partial<Record<keyof ...>>'
    },
  ],
});
```

This is a major advantage over writing plain JSON — always use `defineDataset()`
over the raw `DatasetSchema.parse()` call.

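The inference can be reproduced with a few lines of generic TypeScript. The sketch below is a simplified stand-in, not the library's implementation: `defineDatasetSketch`, `ObjectDefSketch`, `DatasetSketch`, and the `object` property on the returned value are all hypothetical. It uses the same `keyof TObj['fields']` constraint that appears in the API reference.

```typescript
// Hypothetical stand-in for defineDataset(); the real one ships with ObjectStack.
interface ObjectDefSketch {
  name: string;
  fields: Record<string, unknown>;
}

interface DatasetSketch<TObj extends ObjectDefSketch> {
  object: string; // assumed property name, for illustration only
  externalId: string;
  mode: 'insert' | 'update' | 'upsert' | 'replace' | 'ignore';
  // Record keys are limited to the keys of the object's `fields` map.
  records: Array<Partial<Record<keyof TObj['fields'], unknown>>>;
}

function defineDatasetSketch<TObj extends ObjectDefSketch>(
  objectDef: TObj,
  config: {
    externalId?: string;
    mode?: DatasetSketch<TObj>['mode'];
    records: DatasetSketch<TObj>['records'];
  },
): DatasetSketch<TObj> {
  return {
    object: objectDef.name,
    externalId: config.externalId ?? 'name', // documented default
    mode: config.mode ?? 'upsert',           // documented default
    records: config.records,
  };
}

// Illustrative object definition with two fields.
const AccountSketch = {
  name: 'account',
  fields: { name: { type: 'text' }, type: { type: 'select' } },
};

const typedSeed = defineDatasetSketch(AccountSketch, {
  records: [{ name: 'Test Corp', type: 'customer' }],
});

// Uncommenting the next call is expected to fail type checking, because
// 'typo_fild' is not a key of AccountSketch.fields:
// defineDatasetSketch(AccountSketch, { records: [{ typo_fild: 'value' }] });
```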
---

## Relationship Fields

For `lookup` fields that reference another object, supply the **natural key value**
of the related record (such as `name`, `email`, or `code`) — not its UUID. The seed
runner resolves natural keys to database IDs automatically at load time.

```typescript
// Step 1 — seed the parent object first
const accountsSeed = defineDataset(Account, {
  externalId: 'name',
  records: [
    { name: 'Acme Corporation', type: 'customer' },
  ],
});

// Step 2 — seed the child object, referencing the parent by natural key
const contactsSeed = defineDataset(Contact, {
  externalId: 'email',
  records: [
    {
      email: 'john.smith@acme.example.com',
      first_name: 'John',
      last_name: 'Smith',
      account: 'Acme Corporation', // natural key, not a UUID
    },
  ],
});

// Export in dependency order — parents before children
export const SeedData = [accountsSeed, contactsSeed];
```

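The resolution step might look roughly like the following. This is a sketch of the idea only: how the actual seed runner resolves keys is not shown in this guide, and `resolveLookups`, `KeyIndex`, and the `lookups` map are hypothetical. It also illustrates why parents must be loaded first: a key that has not been indexed yet cannot be resolved.

```typescript
type SeedRow = Record<string, unknown>;

// Hypothetical index built while loading parents:
// object name -> natural key value -> database id
type KeyIndex = Map<string, Map<string, string>>;

function resolveLookups(
  record: SeedRow,
  lookups: Record<string, string>, // lookup field name -> target object name
  index: KeyIndex,
): SeedRow {
  const resolved: SeedRow = { ...record };
  for (const [field, targetObject] of Object.entries(lookups)) {
    const naturalKey = record[field];
    if (naturalKey === undefined) continue; // lookup not set on this record

    const id = index.get(targetObject)?.get(String(naturalKey));
    if (id === undefined) {
      throw new Error(`No ${targetObject} record with key "${String(naturalKey)}"`);
    }
    resolved[field] = id; // swap the natural key for the database id
  }
  return resolved;
}

// Usage: the account was seeded first, so its id is already indexed.
const accountIds: KeyIndex = new Map([
  ['account', new Map([['Acme Corporation', 'generated-id-1']])],
]);

const resolvedContact = resolveLookups(
  { email: 'john.smith@acme.example.com', account: 'Acme Corporation' },
  { account: 'account' },
  accountIds,
);
// resolvedContact.account is now 'generated-id-1'
```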
---

## Organising Multiple Datasets

For applications with several objects, co-locate seed files under `src/data/` and
export a single aggregate array.

```
src/
  data/
    index.ts            ← exports SeedData array in dependency order
    accounts.seed.ts
    contacts.seed.ts
    leads.seed.ts
    products.seed.ts
```

```typescript
// src/data/index.ts
import { accountsSeed } from './accounts.seed';
import { contactsSeed } from './contacts.seed';
import { leadsSeed } from './leads.seed';
import { productsSeed } from './products.seed';

/** All seed datasets — order determines load sequence */
export const SeedData = [
  accountsSeed,   // no dependencies
  productsSeed,   // no dependencies
  contactsSeed,   // depends on accounts
  leadsSeed,      // no dependencies
];
```

---

## Best Practices

### Choose a stable `externalId`

The `externalId` field must be a **stable natural key** that does not change between
environments. Avoid using the auto-generated `id` (UUID) because UUIDs differ between
databases.

| Scenario | Recommended `externalId` |
|:---------|:------------------------|
| Named entities (countries, currencies) | `'code'` or `'slug'` |
| User records | `'email'` |
| Generic named records | `'name'` (default) |
| Externally sourced data | `'external_id'` |

### Scope demo data with `env`

Keep demo and test-only records out of production by setting `env: ['dev', 'test']`.
System bootstrap data that must exist in production should omit `env` (or explicitly
set `['prod', 'dev', 'test']`).

### Use `upsert` by default

`upsert` is idempotent and the safest default. Only change the mode when the use
case requires it — for example, `ignore` when you must not overwrite user edits to
system defaults, or `replace` for ephemeral cache tables.

### Seed in dependency order

Always export parent datasets before child datasets. If `contact` has a lookup to
`account`, `accountsSeed` must appear before `contactsSeed` in the exported array.

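When the number of datasets grows, the order can be derived instead of maintained by hand. The sketch below is a depth-first topological sort over a hand-written dependency map. `sortByDependencies` and the map itself are hypothetical helpers; nothing here suggests ObjectStack computes this ordering for you.

```typescript
// Maps each dataset name to the datasets it has lookups to (its parents).
type DependencyMap = Record<string, string[]>;

function sortByDependencies(deps: DependencyMap): string[] {
  const ordered: string[] = [];
  const state = new Map<string, 'visiting' | 'done'>();

  const visit = (name: string): void => {
    if (state.get(name) === 'done') return;
    if (state.get(name) === 'visiting') {
      throw new Error(`Circular dependency involving "${name}"`);
    }
    state.set(name, 'visiting');
    for (const parent of deps[name] ?? []) visit(parent); // parents first
    state.set(name, 'done');
    ordered.push(name);
  };

  Object.keys(deps).forEach(visit);
  return ordered;
}

const loadOrder = sortByDependencies({
  accounts: [],
  products: [],
  contacts: ['accounts'],
  leads: [],
});
// loadOrder is ['accounts', 'products', 'contacts', 'leads']
```

A cycle between two objects throws instead of looping forever, which surfaces the problem before any records are written.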
### Keep records realistic

Demo data appears in screenshots, documentation, and live demos. Use realistic
company names, email addresses, and values — not `foo`, `bar`, or `test123`.

### One file per object

Split large seed payloads into `{object}.seed.ts` files. A single `index.ts` that
re-exports and orders them keeps the entry point clean.

---

## `defineDataset()` API Reference

```typescript
function defineDataset<
  const TObj extends { name: string; fields: Record<string, unknown> }
>(
  objectDef: TObj,
  config: {
    externalId?: string;                                           // default: 'name'
    mode?: 'insert' | 'update' | 'upsert' | 'replace' | 'ignore';  // default: 'upsert'
    env?: Array<'prod' | 'dev' | 'test'>;                          // default: ['prod','dev','test']
    records: Array<Partial<Record<keyof TObj['fields'], unknown>>>;
  }
): Dataset
```

The returned `Dataset` object is a plain serialisable value. Pass it to your
stack's seed runner or store it in an export array.

---

**Next:** [Data Modeling Guide](./data-modeling)
