Database Seeding

Complete guide to seeding database data across the MyNATCA ecosystem. Covers platform data, PayChecker data, Discord data, and custom test scenarios.

Overview

The MyNATCA platform uses an orchestrated seeding workflow:

Platform seeds public schema - Core member and facility data from production
Subprojects seed their schemas - Each project manages its own test data
Orchestration via db-refresh.sh - Single command seeds everything

All seeding is coordinated through the platform's db-refresh.sh script, which:

Resets database (applies all migrations)
Seeds platform data (public schema)
Discovers linked projects from .linked-projects.json
Runs each project's seed script

Quick Start

Seed Everything (Recommended)

cd ~/dev/mynatca/platform
./scripts/db-refresh.sh

This single command:

Resets database and applies all migrations
Seeds platform data (~34,982 members from remote)
Seeds pay data (pay periods + test shifts + custom SQL)
Seeds discord data (if configured)

Seed Platform Only

cd ~/dev/mynatca/platform
npm run seed:from-remote  # Seed from staging/production
npm run seed:dry-run      # Preview without applying

Seed Pay Only

cd ~/dev/mynatca/pay
npm run seed                # Seed test data
npm run seed:from-remote    # Pull from staging/production
npm run seed:dry-run        # Preview what would be seeded

Platform Seeding (Public Schema)

Platform data is pulled from a remote source (staging or production).

Configuration

Set environment variables in platform/.env:

# Remote source for seeding
SEED_SOURCE_URL=https://[staging-ref].supabase.co
SEED_SOURCE_SERVICE_KEY=eyJ...

What Gets Seeded

Members - ~34,982 members (excludes auth0_user_id - environment-specific)
Facilities - ~623 facilities
Regions - 12 regions
Positions - ~1,247 positions
Position grants - Permission mappings

Environment-Specific Exclusions

The auth0_user_id column on the members table is NOT seeded because it's environment-specific:

Local development has different Auth0 tenant than staging
Staging has different Auth0 tenant than production
Each environment maintains its own user-to-member mappings

Seeding Commands

cd ~/dev/mynatca/platform
 
# Seed from remote (staging/production)
npm run seed:from-remote
 
# Preview what will be seeded (dry run)
npm run seed:dry-run
 
# These commands call: node scripts/seed-staging-from-production.mjs

Manual Seeding

The seed script can also be run directly:

cd ~/dev/mynatca/platform
node scripts/seed-staging-from-production.mjs
 
# With dry run flag
node scripts/seed-staging-from-production.mjs --dry-run

Pay Seeding (Pay Schema)

PayChecker supports multiple seeding approaches: Python scripts for dynamic data and SQL files for static data.

What Gets Seeded

Test Mode (Default):

209 pay periods (2018-2025)
18 test shifts for member 40671
Premium pay entries (CIC, OJTI, TOS/TNW)
Custom SQL seeds from supabase/seeds/*.sql (if present)

Remote Mode:

All pay periods from remote
All member shifts from remote
All LES data (earnings, deductions, leave balances)
All premium pay tracking
Configuration data

Seed Execution Order

When db-refresh.sh runs pay seeding, it executes in this order:

1. Python Scripts (dynamic data generation)
   ├─ seed_pay_periods.py → 209 pay periods
   └─ seed_dev_shift_data.py → 18 test shifts

2. SQL Files (static/custom data, alphabetical)
   ├─ 01_example_shifts.sql (commented examples)
   ├─ 02_example_active_shift.sql (commented examples)
   ├─ 03_test_custom_seed.sql (working test data)
   └─ [your custom *.sql files...]

Python Seed Scripts

Location: pay/scripts/dev/*.py

Active scripts:

seed_pay_periods.py - Generates 209 pay periods (2018-2025)
seed_dev_shift_data.py - Generates 18 test shifts for member 40671

How to create custom Python seeds:

cd ~/dev/mynatca/pay
 
# Copy template
cp scripts/dev/seed_example_template.py.example scripts/dev/seed_my_data.py
 
# Edit seed_my_data.py with your logic
# Modify the seed_data() function
 
# Add to scripts/seed.sh
echo "python3 scripts/dev/seed_my_data.py" >> scripts/seed.sh
 
# Test
cd ~/dev/mynatca/platform
./scripts/db-refresh.sh

SQL Seed Files

Location: pay/supabase/seeds/*.sql

SQL files are automatically executed in alphabetical order by the pay seed script.

Creating custom SQL seeds:

cd ~/dev/mynatca/pay
 
# Create seed file (numbered for execution order)
cat > supabase/seeds/50_my_test_data.sql << 'EOF'
-- My custom test data
 
INSERT INTO pay.shifts (
    membernumber,
    pp_id,
    shift_date,
    start_time,
    end_time,
    type,
    notes,
    created_by
) VALUES (
    '40671',  -- Member exists in platform seed data
    '202524', -- Nov 2-15, 2025 pay period
    '2025-11-12',
    '2025-11-12 08:00:00',
    '2025-11-12 16:00:00',
    'regular',
    'My test shift',
    '40671'
) ON CONFLICT (membernumber, shift_date, pp_id) WHERE deleted_at IS NULL DO NOTHING;
EOF
 
# Next db-refresh will automatically execute this file
cd ~/dev/mynatca/platform
./scripts/db-refresh.sh

SQL seed features:

Files executed alphabetically (use number prefixes to control order)
Runs AFTER Python scripts (can reference Python-created data)
Full SQL support (DO blocks, SELECT INTO, functions, etc.)
Use ON CONFLICT DO NOTHING for idempotency
*.sql files gitignored by default (use git add -f to commit examples)

File naming convention:

01_example.sql - Runs first
50_my_data.sql - Runs in middle
99_last.sql - Runs last

Remote Mode (Pull Production/Staging Data)

Pull real data from production or staging:

cd ~/dev/mynatca/pay
 
# Configure in .env:
# SEED_SOURCE_URL=https://[staging-ref].supabase.co
# SEED_SOURCE_SERVICE_KEY=eyJ...
 
# Pull all data from remote
npm run seed:from-remote
# OR
./scripts/seed.sh --from-remote
 
# Preview without making changes
npm run seed:dry-run
# OR
./scripts/seed.sh --dry-run

What gets pulled:

All pay periods
All member shifts
All LES data (earnings, deductions, leave balances, etc.)
All premium pay tracking
Configuration data

Discord Seeding (Discord Schema)

Discord seeding is project-specific. Configure a seed script in the discord repository.

Setting Up Discord Seeding

cd ~/dev/mynatca/discord
 
# Create seed script
cat > scripts/seed.sh << 'EOF'
#!/usr/bin/env bash
set -e
 
# Example: Seed verification test data
psql "$DATABASE_URL" << SQL
INSERT INTO discord.verification_requests (...) VALUES (...) ON CONFLICT DO NOTHING;
SQL
 
echo "Discord seed complete"
EOF
 
# Make executable
chmod +x scripts/seed.sh
 
# Link to platform
cd ~/dev/mynatca/platform
./scripts/link-project.sh
# Provide seed script path: ./scripts/seed.sh
 
# Test
./scripts/db-refresh.sh

Subproject Seed Integration

How Platform Discovers Seeds

Platform's db-refresh.sh reads .linked-projects.json:

[
  {
    "name": "pay",
    "path": "/Users/jason/dev/mynatca/pay",
    "schema": "pay",
    "seedScript": "./scripts/seed.sh"
  },
  {
    "name": "discord",
    "path": "/Users/jason/dev/mynatca/discord",
    "schema": "discord",
    "seedScript": "./scripts/seed.sh"
  }
]

Requirements for Subproject Seed Scripts

Your subproject's seed script should:

Be executable: chmod +x scripts/seed.sh
Handle its own schema seeding
Exit with code 0 on success
Output clear progress messages
Be idempotent (safe to run multiple times)

Three Seeding Approaches

Subprojects can use Python scripts, SQL files, npm/Node.js scripts, or any combination.

Option 1: Python Seed Scripts

Best for: Python-based projects, data science, complex calculations

Example structure:

pay/
├── scripts/
│   ├── seed.sh                    # Main entry point
│   └── dev/
│       ├── seed_pay_periods.py    # Generate pay periods
│       └── seed_dev_shift_data.py # Generate test shifts

Sample seed.sh:

#!/usr/bin/env bash
set -e
 
# Activate virtual environment
source .venv/bin/activate
 
# Run Python seed scripts
python3 scripts/dev/seed_pay_periods.py
python3 scripts/dev/seed_dev_shift_data.py

Option 2: npm/Node.js Seed Scripts

Best for: JavaScript/TypeScript projects, Node.js ecosystem

Example structure:

discord/
├── scripts/
│   ├── seed.sh                    # Main entry point
│   └── seeds/
│       ├── seed-verification-requests.js
│       └── seed-validation-logs.ts
├── package.json

Sample seed.sh:

#!/usr/bin/env bash
set -e
 
# Run npm seed scripts
npm run seed:verification
npm run seed:validation
 
# Or run node scripts directly
node scripts/seeds/seed-verification-requests.js
npx tsx scripts/seeds/seed-validation-logs.ts

Option 3: SQL Seed Files

Best for: Simple INSERT statements, static test data

Example structure:

pay/
├── supabase/
│   └── seeds/
│       ├── README.md
│       ├── 01_baseline_data.sql
│       ├── 50_test_shifts.sql
│       └── 99_custom_data.sql

Sample seed.sh that auto-executes SQL files:

#!/usr/bin/env bash
set -e
 
# Run SQL seed files if they exist
if [ -d "supabase/seeds" ]; then
    for sql_file in supabase/seeds/*.sql; do
        [ -f "$sql_file" ] || continue
        echo "Executing $(basename "$sql_file")..."
        psql "$DATABASE_URL" -f "$sql_file"
    done
fi

Option 4: Mix and Match (Recommended)

Example: Python + SQL (pay project)

#!/usr/bin/env bash
set -e
 
# 1. Run Python scripts for complex/dynamic data
source .venv/bin/activate
python3 scripts/dev/seed_pay_periods.py
python3 scripts/dev/seed_dev_shift_data.py
 
# 2. Run SQL files for simple/custom data
if [ -d "supabase/seeds" ]; then
    for sql_file in supabase/seeds/*.sql; do
        [ -f "$sql_file" ] || continue
        echo "Executing $(basename "$sql_file")..."
        psql "$DATABASE_URL" -f "$sql_file"
    done
fi

Creating Custom Test Data

For Platform (Public Schema)

Platform data is seeded from remote sources. To add custom test data:

Option 1: Modify the remote source (staging/production) Option 2: Create a post-seed script that adds additional test data

cd ~/dev/mynatca/platform
 
# Create post-seed script
cat > scripts/seed-custom-test-data.sh << 'EOF'
#!/usr/bin/env bash
psql "$DATABASE_URL" << SQL
-- Add custom test members
INSERT INTO public.members (...) VALUES (...) ON CONFLICT DO NOTHING;
SQL
EOF
 
chmod +x scripts/seed-custom-test-data.sh
 
# Run after main seed
npm run seed:from-remote
./scripts/seed-custom-test-data.sh

For Pay (Pay Schema)

SQL files (recommended for simple data):

cd ~/dev/mynatca/pay
 
cat > supabase/seeds/60_shutdown_scenario.sql << 'EOF'
-- Test scenario: Shutdown with multiple shifts
 
INSERT INTO pay.shifts (
    membernumber, pp_id, shift_date, start_time, end_time, type, notes
) VALUES
    ('40671', '202524', '2025-11-10', '2025-11-10 22:00:00', '2025-11-11 06:00:00', 'mid', 'Shutdown shift 1'),
    ('40671', '202524', '2025-11-11', '2025-11-11 22:00:00', '2025-11-12 06:00:00', 'mid', 'Shutdown shift 2'),
    ('40671', '202524', '2025-11-12', '2025-11-12 22:00:00', '2025-11-13 06:00:00', 'mid', 'Shutdown shift 3')
ON CONFLICT (membernumber, shift_date, pp_id) WHERE deleted_at IS NULL DO NOTHING;
 
INSERT INTO pay.cic_time (
    membernumber, pp_id, shift_date, start_time, end_time, hours
) VALUES
    ('40671', '202524', '2025-11-10', '2025-11-10 22:00:00', '2025-11-11 06:00:00', 8.0),
    ('40671', '202524', '2025-11-11', '2025-11-11 22:00:00', '2025-11-12 06:00:00', 8.0),
    ('40671', '202524', '2025-11-12', '2025-11-12 22:00:00', '2025-11-13 06:00:00', 8.0)
ON CONFLICT DO NOTHING;
EOF
 
# Next db-refresh will execute this
cd ~/dev/mynatca/platform
./scripts/db-refresh.sh

Python scripts (for complex/dynamic data):

cd ~/dev/mynatca/pay
 
# Copy template
cp scripts/dev/seed_example_template.py.example scripts/dev/seed_shutdown_scenario.py
 
# Edit to generate shutdown scenario data
# ... modify seed_data() function ...
 
# Add to seed.sh
echo "python3 scripts/dev/seed_shutdown_scenario.py" >> scripts/seed.sh
 
# Test
cd ~/dev/mynatca/platform
./scripts/db-refresh.sh

Best Practices

Make Seeds Idempotent

Use ON CONFLICT to safely re-run seeds:

INSERT INTO pay.shifts (membernumber, pp_id, shift_date, ...)
VALUES ('40671', '202524', '2025-11-10', ...)
ON CONFLICT (membernumber, shift_date, pp_id)
WHERE deleted_at IS NULL
DO NOTHING;

Reference Platform Data

Member numbers from platform seed data you can use:

40671 - Test member (used in pay Python seeds)
35973, 38264, 39829, 40104 - Additional test members

-- Get a random member from platform data
SELECT membernumber FROM public.members
WHERE verified_at IS NOT NULL
LIMIT 1;

Check Pay Periods

Before creating shift test data, check available pay periods:

-- View current pay period
SELECT pp_id, start_date, end_date
FROM pay.pay_periods
WHERE CURRENT_DATE BETWEEN start_date AND end_date;
 
-- View recent pay periods
SELECT pp_id, start_date, end_date
FROM pay.pay_periods
ORDER BY pp_id DESC
LIMIT 5;

Use Meaningful Test Data

Use realistic dates within valid pay periods
Use realistic shift times (not arbitrary)
Add descriptive notes to explain test scenarios
Create complete scenarios (not isolated data points)

Document Test Scenarios

Add comments to SQL seed files explaining the test scenario:

-- Test Scenario: Shutdown Event
--
-- Member: 40671 (Test Member)
-- Pay Period: 202524 (Nov 2-15, 2025)
-- Scenario: 3 consecutive mid shifts during facility shutdown
-- Expected: 24 hours regular pay + 24 hours CIC premium
--
-- This tests:
-- - Multi-day shutdown tracking
-- - CIC premium calculation
-- - Mid shift handling
 
INSERT INTO pay.shifts ...

Development Workflow

Daily Development

# Reset and seed everything
cd ~/dev/mynatca/platform
./scripts/db-refresh.sh
 
# Start working
cd ~/dev/mynatca/pay
npm run dev

Testing with Production Data

# 1. Configure pay/.env with SEED_SOURCE_* vars pointing to staging
 
# 2. Pull data from staging
cd ~/dev/mynatca/pay
./scripts/seed.sh --from-remote
 
# 3. Verify data
psql postgresql://postgres:postgres@127.0.0.1:54322/postgres \
  -c "SELECT COUNT(*) FROM pay.shifts;"

Creating Reusable Test Scenarios

# Create a SQL seed file for your scenario
cd ~/dev/mynatca/pay
cat > supabase/seeds/60_shutdown_scenario.sql << 'EOF'
-- Test scenario: Shutdown with multiple shifts
INSERT INTO pay.shifts (...) VALUES (...);
INSERT INTO pay.cic_time (...) VALUES (...);
EOF
 
# Commit it for the team (requires git add -f since *.sql is gitignored)
git add -f supabase/seeds/60_shutdown_scenario.sql
git commit -m "Add shutdown test scenario seed"
git push

Troubleshooting

Seed Fails with Foreign Key Errors

Cause: Platform seed didn't run first (pay data references members)

Fix:

cd ~/dev/mynatca/platform
./scripts/db-refresh.sh  # Seeds platform first, then pay

SQL Seed File Not Executing

Check:

File exists in pay/supabase/seeds/
File has .sql extension
File is not in .gitignore exclusion list
No SQL syntax errors

Verify:

cd ~/dev/mynatca/pay
./scripts/seed.sh  # Should show "Executing [filename]..."

Want to Start Fresh

cd ~/dev/mynatca/platform
./scripts/db-refresh.sh  # Resets database and seeds everything

Remote Seed Fails with Permission Error

Cause: Wrong API key or missing schema grants

Fix:

Check pay/.env has correct SEED_SOURCE_SERVICE_KEY
Verify source database has pay schema grants applied
Ensure not trying to write to production (safety check blocks this)

Pay Seed Not Running During db-refresh

Check .linked-projects.json:

cat ~/dev/mynatca/platform/.linked-projects.json
# Should contain pay entry with seedScript

Re-link if needed:

cd ~/dev/mynatca/platform
./scripts/link-project.sh

Python Virtual Environment Issues

cd ~/dev/mynatca/pay
rm -rf .venv
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Quick Reference

Commands

# Platform
cd ~/dev/mynatca/platform
npm run seed:from-remote    # Seed from staging/production
npm run seed:dry-run        # Preview
./scripts/db-refresh.sh     # Reset DB + seed everything
 
# Pay
cd ~/dev/mynatca/pay
npm run seed                # Seed test data
npm run seed:from-remote    # Pull from staging/production
npm run seed:dry-run        # Preview
./scripts/seed.sh           # Direct script execution
 
# Discord
cd ~/dev/mynatca/discord
./scripts/seed.sh           # If seed script configured

File Locations

Platform:

Seed script: scripts/seed-staging-from-production.mjs
Registry: .linked-projects.json

Pay:

Main seed: scripts/seed.sh
Python seeds: scripts/dev/*.py
SQL seeds: supabase/seeds/*.sql
Docs: docs/seed-data.md, scripts/README.md

Discord:

Seed script: scripts/seed.sh (if configured)

Migration from Old System

Old way (deprecated):

# Old: Manual steps
cd ~/dev/mynatca/platform
supabase db reset
npm run seed:staging
 
cd ~/dev/mynatca/pay
python scripts/dev/seed_pay_periods.py
python scripts/dev/seed_dev_shift_data.py

New way:

# New: Single command
cd ~/dev/mynatca/platform
./scripts/db-refresh.sh

Database Seeding

Overview

Quick Start

Seed Everything (Recommended)

Seed Platform Only

Seed Pay Only

Platform Seeding (Public Schema)

Configuration

What Gets Seeded

Environment-Specific Exclusions

Seeding Commands

Manual Seeding

Pay Seeding (Pay Schema)

What Gets Seeded

Seed Execution Order

Python Seed Scripts

SQL Seed Files

Remote Mode (Pull Production/Staging Data)

Discord Seeding (Discord Schema)

Setting Up Discord Seeding

Subproject Seed Integration

How Platform Discovers Seeds

Requirements for Subproject Seed Scripts

Three Seeding Approaches

Option 1: Python Seed Scripts

Option 2: npm/Node.js Seed Scripts

Option 3: SQL Seed Files

Option 4: Mix and Match (Recommended)

Creating Custom Test Data

For Platform (Public Schema)

For Pay (Pay Schema)

Best Practices

Make Seeds Idempotent

Reference Platform Data

Check Pay Periods

Use Meaningful Test Data

Document Test Scenarios

Development Workflow

Daily Development

Testing with Production Data

Creating Reusable Test Scenarios

Troubleshooting

Seed Fails with Foreign Key Errors

SQL Seed File Not Executing

Want to Start Fresh

Remote Seed Fails with Permission Error

Pay Seed Not Running During db-refresh

Python Virtual Environment Issues

Quick Reference

Commands

File Locations

Migration from Old System

Related Documentation