Troubleshooting Guide
This comprehensive troubleshooting guide covers common issues, diagnostic procedures, and solutions for the MyNATCA platform.
General Diagnostics
Quick Health Check
# Test all core services
curl https://discord.mynatca.org/api/health
curl https://api.mynatca.org/api/health
curl https://hub.mynatca.org/api/health
# Check database connectivity
npm run test:connections
# Verify environment variables
npm run validate:config
Log Analysis
# View recent logs
tail -f logs/app.log
tail -f logs/error.log
# Search for specific errors
grep -i "error" logs/app.log | tail -20
grep -i "auth0" logs/app.log | tail -10
# Monitor real-time activity
pm2 logs mynatca-discord --lines 50
Performance Monitoring
# Check system resources
htop
free -m
df -h
# Monitor application performance
pm2 monit
npm run health:detailed
Discord Bot Issues
Guild Access Errors (NEW - October 2025)
Symptoms
- /refresh command fails with "Cannot read properties of null (reading 'members')"
- Commands fail when trying to access guild resources
- "TypeError: interaction.guild is null" errors in logs
- Commands work sometimes but fail randomly
Root Cause
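Discord.js can return null for interaction.guild even when a command is executed in a guild context. interaction.guild is looked up in the client's guild cache, so it is null whenever that guild is not cached (for example, before the cache is populated or when the Guilds intent is not enabled). This occurs intermittently and is not related to bot permissions.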
Diagnostic Steps
# 1. Check logs for null guild errors
grep -i "guild is null" logs/discord-bot.log
# 2. Verify command is not being used in DMs
# Check if command has DM permission disabled
# 3. Test command in guild
# Use /refresh command in Discord server
Solutions
1. Implement Guild Access Fallback Pattern (REQUIRED)
All commands that access guild resources must use the fallback pattern. Falling back to the first cached guild is only safe when the bot serves a single guild:
// CORRECT: Guild access with fallback
const guild = interaction.guild || interaction.client.guilds.cache.first();
if (!guild) {
return interaction.reply({
content: 'This command can only be used in a server.',
ephemeral: true
});
}
// Now safely access guild resources
const member = await guild.members.fetch(userId);
2. Add DM Permission Protection
Prevent commands from being used in DMs:
const { SlashCommandBuilder } = require('discord.js');
module.exports = {
data: new SlashCommandBuilder()
.setName('refresh')
.setDescription('Refresh member roles')
.setDMPermission(false), // Prevent DM usage
async execute(interaction) {
const guild = interaction.guild || interaction.client.guilds.cache.first();
// ... command logic
}
};
3. Update Existing Commands
Apply the pattern to all affected commands:
- /refresh user - Refresh specific user roles
- /refresh member - Refresh by member number
- /refresh all - Refresh all verified members
- All administrative commands
4. Redeploy Commands
After updating command definitions:
npm run deploy
Prevention
For New Commands:
- Always use guild access fallback pattern
- Set .setDMPermission(false) for guild-only commands
- Validate guild exists before accessing resources
- Test commands thoroughly in guild context
Code Review Checklist:
- Guild access uses fallback pattern
- DM permissions set correctly
- Guild null check before resource access
- Error handling for guild access failures
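The fallback pattern and null check described above could be centralized so that individual commands cannot forget them. The following is a minimal sketch, not the bot's actual implementation: resolveGuild and withGuild are hypothetical helper names, and the code only assumes that interaction.client.guilds.cache behaves like a discord.js Collection (a Map with a first() method).

```javascript
// Hypothetical helper (not part of discord.js): resolve the guild for an
// interaction, falling back to the first cached guild, or return null.
function resolveGuild(interaction) {
  if (interaction.guild) return interaction.guild;
  const cache = interaction.client?.guilds?.cache;
  // The fallback is only meaningful for a single-guild bot.
  return cache && cache.size > 0 ? cache.first() : null;
}

// Hypothetical wrapper: run `handler` only when a guild could be resolved,
// otherwise reply with the same ephemeral message used in the pattern above.
function withGuild(handler) {
  return async (interaction) => {
    const guild = resolveGuild(interaction);
    if (!guild) {
      return interaction.reply({
        content: 'This command can only be used in a server.',
        ephemeral: true,
      });
    }
    return handler(interaction, guild);
  };
}
```

A command module could then export execute: withGuild(async (interaction, guild) => { ... }) and use guild directly without repeating the check.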
Bot Not Responding to Commands
Symptoms
- Commands don't appear when typing /
- Bot doesn't respond to executed commands
- Commands show as "Application did not respond"
Diagnostic Steps
# 1. Check bot status
curl https://discord.mynatca.org/api/health
# 2. Verify bot permissions in Discord
# Check bot role hierarchy and permissions
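# 3. Check environment variables (avoid printing the token itself)
[ -n "$DISCORD_TOKEN" ] && echo "DISCORD_TOKEN is set" || echo "DISCORD_TOKEN is missing"
echo $DISCORD_CLIENT_ID
echo $DISCORD_GUILD_ID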
# 4. Test Discord API connectivity
node -e "
const { Client } = require('discord.js');
const client = new Client({ intents: [] });
client.login(process.env.DISCORD_TOKEN)
.then(() => console.log('✅ Discord connection successful'))
.catch(err => console.error('❌ Discord connection failed:', err));
"
Solutions
- Redeploy Commands
  npm run deploy
- Check Bot Permissions
  - Ensure bot has "Use Slash Commands" permission
  - Verify bot role is above managed roles
  - Check channel-specific permissions
- Restart Bot Service
  pm2 restart mynatca-discord-bot
  # or
  npm run dev
- Regenerate Bot Token
  - Go to Discord Developer Portal
  - Navigate to Bot section
  - Reset token and update environment variables
Registration Help Messages in #verify Channel (NEW - October 2025)
Feature Overview
The Discord bot now provides automatic help messages in the #verify channel when users make common registration mistakes.
Behavior:
- Detects messages that aren't the /register command
- Posts helpful @mention messages with guidance
- Auto-deletes help messages after 30 seconds to keep channel clean
- Catches common mistakes like typing member numbers directly
Common Detected Mistakes:
- Member numbers without /register - e.g., typing "123456" instead of using /register
- Typos and spacing errors - e.g., "register 123456" instead of /register
- Wrong commands in #verify - Using other commands in verification channel
- Random messages - Any non-command text receives catch-all reminder
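The detection categories above can be sketched as a small pure function. This is an illustrative sketch only: classifyVerifyMessage and editDistance are hypothetical names, and the "within two edits of /register" threshold is an assumption chosen so that both "register" and "/regsiter" count as typos. The bot's real detection logic may differ.

```javascript
// Levenshtein edit distance using a single rolling row.
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value of D[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // value of D[i-1][j]
      row[j] = Math.min(
        above + 1,                                   // deletion
        row[j - 1] + 1,                              // insertion
        diagonal + (a[i - 1] === b[j - 1] ? 0 : 1)   // substitution
      );
      diagonal = above;
    }
  }
  return row[b.length];
}

// Classify a plain-text message posted in #verify (assumed categories).
function classifyVerifyMessage(content) {
  const text = content.trim();
  if (/^\d+$/.test(text)) return 'member_number';
  const firstWord = text.split(/\s+/)[0].toLowerCase();
  if (editDistance(firstWord, '/register') <= 2) return 'typo';
  return 'other';
}
```

Each category would map to one of the help messages shown in the scenarios; keeping the classifier free of Discord objects makes it easy to unit test.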
Expected User Experience
Scenario 1: User types member number directly
User: 123456
Bot: @User Hey! To register, use the /register command and follow the prompts. Don't just type your member number.
[Message auto-deletes after 30 seconds]
Scenario 2: User makes typo
User: /regsiter
Bot: @User I think you meant /register - use the slash command to start verification.
[Message auto-deletes after 30 seconds]
Scenario 3: Any other message
User: How do I register?
Bot: @User Only the /register command is allowed in this channel. Type /register to begin.
[Message auto-deletes after 30 seconds]
Troubleshooting Registration Help
Help messages not appearing:
# 1. Verify bot has permissions in #verify channel
# Required permissions:
# - Read Messages
# - Send Messages
# - Mention Everyone
# - Manage Messages (for auto-delete)
# 2. Check messageCreate event handler is running
grep "messageCreate" logs/discord-bot.log
# 3. Verify channel name/ID matches configuration
# Check VERIFY_CHANNEL_ID in environment variables
Help messages not auto-deleting:
# Verify bot has "Manage Messages" permission
# Check for errors in logs
grep "delete.*message" logs/discord-bot.log
False positives (help for valid commands):
# Check message detection logic
# Should only trigger for non-slash-command messages
# Verify: msg.content.startsWith('/') check exists
Implementation Reference
The registration help system is implemented in the messageCreate event handler:
// Event handler detects:
// - Messages in #verify channel
// - Messages that aren't slash commands
// - Common registration mistakes
// Response pattern:
// 1. Post @mention help message
// 2. Set 30-second auto-delete timer
// 3. Clean up to keep channel tidy
Verification Flow Issues
"Invalid Verification Link" Error
Symptoms
- Users receive "Invalid Verification Link" message
- Verification links expire immediately
- Database connection issues
Diagnostic Steps
# 1. Check Supabase connection
node -e "
const { createClient } = require('@supabase/supabase-js');
const client = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_KEY);
client.from('verification_requests').select('count').then(console.log).catch(console.error);
"
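# 2. Verify verification_requests table exists
npx supabase status
# WARNING: "db reset --linked" is destructive. It resets the linked remote database
# and must not be run against production as a diagnostic step.
# npx supabase db reset --linked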
# 3. Check verification creation logs
grep "verification" logs/app.log | tail -20
Solutions
- Create Missing Table
  -- Run in Supabase SQL editor
  CREATE TABLE IF NOT EXISTS verification_requests (
    id SERIAL PRIMARY KEY,
    verification_id UUID UNIQUE NOT NULL,
    discord_id TEXT NOT NULL,
    discord_username TEXT NOT NULL,
    status TEXT DEFAULT 'pending',
    auth0_user_id TEXT,
    member_number TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    expires_at TIMESTAMPTZ NOT NULL,
    completed_at TIMESTAMPTZ,
    updated_at TIMESTAMPTZ DEFAULT NOW()
  );
- Fix RLS Policies
  ALTER TABLE verification_requests ENABLE ROW LEVEL SECURITY;
  CREATE POLICY "Service role access" ON verification_requests
    FOR ALL USING (auth.role() = 'service_role');
- Clear Expired Verifications
  npm run cleanup:expired-verifications
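When links appear to "expire immediately", it can help to evaluate a stored row against the current time outside the application. The sketch below assumes a row shaped like the verification_requests table above (expires_at and completed_at columns); verificationState is a hypothetical helper name, not a function from the codebase.

```javascript
// Classify a verification_requests row relative to `now`.
// Returns 'missing', 'completed', 'expired', or 'valid'.
function verificationState(request, now = new Date()) {
  if (!request) return 'missing';
  if (request.completed_at) return 'completed';
  const expiresAt = new Date(request.expires_at).getTime();
  // An unparseable timestamp is treated as expired rather than valid.
  if (Number.isNaN(expiresAt) || expiresAt <= now.getTime()) return 'expired';
  return 'valid';
}
```

If a freshly created row already reports 'expired', the likely culprits are how expires_at is computed at creation time (for example, seconds versus milliseconds) or clock and timezone handling, rather than the lookup itself.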
Staff and RNAV Member Command Issues (NEW - October 2025)
Issue: /status Command Fails for Staff or RNAV Members
Symptoms:
- /status command works for regular members (membertypeid=6)
- /status fails or returns "not verified" for Staff (membertypeid=8)
- /status fails or returns "not verified" for RNAV (membertypeid=10)
- Error: "No member data found" for valid Staff/RNAV members
Root Cause:
The /status command query was using .eq('membertypeid', 6) which only returned regular members, excluding Staff and RNAV members.
Solution:
Update query to include all valid member types using .in():
// OLD (incorrect) - only returns regular members
const { data: memberData } = await supabase
.from('members')
.select('*')
.eq('discordid', userId)
.eq('membertypeid', 6) // ❌ Only regular members
.single();
// NEW (correct) - returns all member types
const { data: memberData } = await supabase
.from('members')
.select('*')
.eq('discordid', userId)
.in('membertypeid', [6, 8, 10]) // ✅ Regular, Staff, and RNAV
.single();
Verification:
# Test /status command for each member type
# In Discord, as a Staff member:
/status
# Should show:
# ✅ Verified Member
# Member Type: NATCA Staff
# Roles: NATCA Staff
# As RNAV member:
/status
# Should show:
# ✅ Verified Member
# Member Type: RNAV Member
# Roles: RNAV Member
Issue: Staff Position Nicknames Not Showing Correctly
Symptoms:
- Staff members don't get "(Staff)" in nickname
- Chief of Staff not showing "(Chief of Staff)" designation
- Staff role assigned but nickname uses facility format
Root Cause:
The setMemberNickname function is not receiving the positions parameter, or is not detecting Staff positions.
Solution:
Ensure setMemberNickname receives positions and checks for Staff:
// Correct implementation
async setMemberNickname(member, memberData, positions) {
// region and facility are assumed to be fields on memberData
const { firstname, lastname, region, facility } = memberData;
// Check for Chief of Staff position
const isChiefOfStaff = positions?.some(p =>
p.positiontype === 'staff' &&
p.position?.toLowerCase().includes('chief of staff') // tolerate a missing position title
);
// Check for any staff position
const hasStaffPosition = positions?.some(p => p.positiontype === 'staff');
let nickname;
if (isChiefOfStaff) {
nickname = `${firstname} ${lastname} (Chief of Staff)`;
} else if (hasStaffPosition) {
nickname = `${firstname} ${lastname} (Staff)`;
} else {
// Regular member nickname logic
nickname = `${firstname} ${lastname} (${region}/${facility})`;
}
await member.setNickname(nickname);
}
Verification:
# Check Staff member nickname in Discord
# Should show:
# - "John Doe (Chief of Staff)" for Chief of Staff
# - "Jane Smith (Staff)" for other staff positions
# Check logs for nickname assignment
grep -i "staff.*nickname" logs/discord-bot.log
Testing Staff and RNAV Support
Test Suite:
# Run command tests including Staff/RNAV scenarios
npm test
# Look for these test cases:
# ✓ /status command works for NATCA Members (membertypeid=6)
# ✓ /status command works for Staff (membertypeid=8)
# ✓ /status command works for RNAV (membertypeid=10)
# Run with verbose output for details
npm run test:verbose
Manual Testing:
- Create test Staff member in database (membertypeid=8)
- Create test RNAV member in database (membertypeid=10)
- Test /status command with each member type
- Verify role assignment includes "NATCA Staff" and "RNAV Member"
- Check nicknames match position types
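For the nickname checks above, the formatting rules can be isolated into a pure function that is testable without a Discord connection. This is a sketch under assumptions: buildNickname is a hypothetical name, region and facility are assumed to be fields on memberData, and the name is shortened so the result stays within Discord's 32-character nickname limit.

```javascript
const MAX_NICKNAME_LENGTH = 32; // Discord nickname length limit

// Build the nickname string from member data and positions.
function buildNickname(memberData, positions = []) {
  const { firstname, lastname, region, facility } = memberData;
  const staff = positions.filter((p) => p.positiontype === 'staff');
  const isChiefOfStaff = staff.some((p) =>
    (p.position ?? '').toLowerCase().includes('chief of staff'));

  let suffix;
  if (isChiefOfStaff) suffix = 'Chief of Staff';
  else if (staff.length > 0) suffix = 'Staff';
  else suffix = `${region}/${facility}`;

  // Keep the designation intact and shorten the name if needed.
  const tail = ` (${suffix})`;
  const room = Math.max(0, MAX_NICKNAME_LENGTH - tail.length);
  return `${firstname} ${lastname}`.slice(0, room) + tail;
}
```

setMemberNickname could then call member.setNickname(buildNickname(memberData, positions)), which keeps the Discord API call separate from the formatting logic under test.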
Role Assignment Problems
Symptoms
- Members verified but roles not assigned
- Incorrect roles assigned
- Role assignment errors in logs
- Staff role not assigned to Staff members
Diagnostic Steps
# 1. Check bot role hierarchy
# Bot role must be above all managed roles in Discord
# 2. Test role assignment manually
node scripts/test-role-assignment.js
# 3. Check member data
curl -H "Authorization: Bearer $AUTH0_TOKEN" \
"https://api.mynatca.org/api/members/123456"
# 4. Verify role mapping configuration
npm run test:role-mapping
Solutions
- Fix Role Hierarchy
  - Move bot role above managed roles in Discord
  - Ensure bot has "Manage Roles" permission
- Update Role Mapping
  // lib/roleManager.js
  const positionRoleMap = {
    'facrep': 'FacRep',
    'comchair': 'Committee Member',
    'neb': 'NEB',
    'regional': 'Regional Rep'
  };
- Refresh Member Roles
  # Refresh specific member
  curl -X POST \
    -H "Authorization: Bearer $SERVICE_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"discord_id": "123456789012345678"}' \
    https://discord.mynatca.org/api/discord/roles/refresh
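When roles are missing or incorrect, it is useful to see which of a member's positions have no entry in the role mapping. The sketch below reuses the positionRoleMap shown above; rolesForPositions is a hypothetical helper, and it assumes the map is keyed by each position's positiontype field, which the source does not confirm.

```javascript
const positionRoleMap = {
  facrep: 'FacRep',
  comchair: 'Committee Member',
  neb: 'NEB',
  regional: 'Regional Rep',
};

// Return the distinct roles for a list of positions, plus any position
// types that have no mapping (a common cause of "role not assigned").
function rolesForPositions(positions, roleMap = positionRoleMap) {
  const roles = new Set();
  const unmapped = [];
  for (const { positiontype } of positions) {
    if (Object.prototype.hasOwnProperty.call(roleMap, positiontype)) {
      roles.add(roleMap[positiontype]);
    } else {
      unmapped.push(positiontype);
    }
  }
  return { roles: [...roles], unmapped };
}
```

Logging the unmapped list during a role refresh would show immediately whether a failure is a mapping gap or a Discord permission problem.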
Auth0 Integration Issues
Critical: "secret" is required Error
Symptoms
- Application fails to start with error: "secret" is required
- Deployment succeeds but runtime errors occur
- Auth0 authentication endpoints return 500 errors
Root Cause
The Next.js Auth0 SDK requires the AUTH0_SECRET environment variable for session encryption. This is NOT the same as AUTH0_CLIENT_SECRET.
Immediate Solution
# 1. Generate AUTH0_SECRET (32+ characters required; this prints 64 hex characters)
openssl rand -hex 32
# 2. Add to Vercel environment variables
vercel env add AUTH0_SECRET production
# 3. Paste the generated value when prompted
# 4. Redeploy application
vercel --prod
Verification
# Check that AUTH0_SECRET is set
vercel env ls | grep AUTH0_SECRET
# Test deployment
curl -I https://discord.mynatca.org/api/auth/login
# Should return 302 redirect, not 500 error
AUTH0_CLIENT_SECRET vs AUTH0_SECRET
Important Distinction
These are two different secrets with different purposes:
| Variable | Purpose | Source | Format |
|---|---|---|---|
| AUTH0_CLIENT_SECRET | OAuth2 flow authentication | Auth0 Dashboard → App Settings | Base64/string from Auth0 |
| AUTH0_SECRET | Session cookie encryption | Generated by developer | 32+ random hex characters |
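The distinction in the table can be enforced at startup. The following sketch uses only Node's built-in crypto module; generateSessionSecret and checkAuth0Secrets are hypothetical helper names, and the 32-character minimum follows the requirement stated in this guide.

```javascript
const crypto = require('node:crypto');

// Equivalent of `openssl rand -hex 32`: 32 random bytes as 64 hex characters.
function generateSessionSecret() {
  return crypto.randomBytes(32).toString('hex');
}

// Return a list of configuration problems (empty when everything looks sane).
function checkAuth0Secrets(env) {
  const problems = [];
  const secret = env.AUTH0_SECRET;
  if (!secret) {
    problems.push('AUTH0_SECRET is not set');
  } else {
    if (secret.length < 32) {
      problems.push('AUTH0_SECRET is shorter than 32 characters');
    }
    if (secret === env.AUTH0_CLIENT_SECRET) {
      problems.push('AUTH0_SECRET must not reuse AUTH0_CLIENT_SECRET');
    }
  }
  if (!env.AUTH0_CLIENT_SECRET) {
    problems.push('AUTH0_CLIENT_SECRET is not set');
  }
  return problems;
}
```

Calling checkAuth0Secrets(process.env) early in server startup and logging the result turns the "secret" is required runtime error into an explicit configuration message.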
Common Mistake
# ❌ WRONG - Using client secret as session secret
AUTH0_SECRET=your_auth0_client_secret_from_dashboard
# ✅ CORRECT - Using generated random secret
AUTH0_SECRET=a1b2c3d4e5f6789012345678901234567890abcdef1234567890
Authentication Failures
Symptoms
- "Login Required" errors
- JWT token validation failures
- Redirect loops during login
- Session not persisting after login
Diagnostic Steps
# 1. Verify all Auth0 environment variables are set
echo "AUTH0_DOMAIN: $AUTH0_DOMAIN"
echo "AUTH0_CLIENT_ID: $AUTH0_CLIENT_ID"
echo "AUTH0_CLIENT_SECRET: [hidden]"
echo "AUTH0_SECRET: [hidden]"
echo "AUTH0_BASE_URL: $AUTH0_BASE_URL"
echo "AUTH0_ISSUER_BASE_URL: $AUTH0_ISSUER_BASE_URL"
# 2. Test Auth0 connectivity
curl https://natca-prod.us.auth0.com/.well-known/jwks.json
# 3. Check Auth0 endpoints
curl -I https://discord.mynatca.org/api/auth/login
curl -I https://discord.mynatca.org/api/auth/callback
# 4. Validate callback URLs in Auth0 dashboard
Solutions
- Missing AUTH0_SECRET
  # Generate and set AUTH0_SECRET
  openssl rand -hex 32
  # Add to environment variables
- Update Callback URLs
  - Check Auth0 dashboard settings
  - Ensure production URLs are configured:
    - Callback URL: https://discord.mynatca.org/api/auth/callback
    - Logout URL: https://discord.mynatca.org/api/auth/logout
  - Update development URLs for local testing
- Refresh Auth0 Secrets
  # Generate new session secret (different for each environment)
  AUTH0_SECRET_STAGING=$(openssl rand -hex 32)
  AUTH0_SECRET_PRODUCTION=$(openssl rand -hex 32)
  # Update environment variables
  vercel env add AUTH0_SECRET staging
  vercel env add AUTH0_SECRET production
- Fix Domain Configuration
  # Ensure consistent domain configuration
  AUTH0_DOMAIN=natca-prod.us.auth0.com
  AUTH0_ISSUER_BASE_URL=https://natca-prod.us.auth0.com
  AUTH0_BASE_URL=https://discord.mynatca.org
Session Persistence Issues (NEW - October 2025)
Issue: Sessions Not Persisting After Login on Production
Symptoms:
- User successfully authenticates with Auth0
- Immediately logged out after redirect
- Session cookie not being set in browser
- Works fine on localhost but fails on production (Digital Ocean, nginx, etc.)
Root Cause:
The platform is deployed behind a reverse proxy without Express's trust proxy setting enabled. Without it, Express does not honor the X-Forwarded-Proto header, treats every request as plain HTTP, and does not set secure cookies.
Diagnostic Steps:
# 1. Check if running behind reverse proxy
# X-Forwarded-* headers are added by the proxy to the request it forwards to the app,
# so they are not visible in curl's response output. Log them in the app instead,
# e.g. req.headers['x-forwarded-proto'] in a request handler.
# 2. Test cookie being set
curl -c cookies.txt -v https://platform.natca.org/api/auth/login 2>&1 | grep -i set-cookie
# Expected: Set-Cookie: platform.session=...; Secure; HttpOnly
# If the Set-Cookie header is missing, trust proxy is likely not configured
# 3. Check Express trust proxy setting
# In server.js logs, look for req.protocol and req.secure values
Solution: Enable Trust Proxy (REQUIRED for Production)
Add this line to server.js BEFORE session middleware:
// server.js
const express = require('express');
const session = require('express-session');
const app = express();
// CRITICAL: Trust first proxy (Digital Ocean, nginx, etc.)
app.set('trust proxy', 1);
// Now Express recognizes X-Forwarded-Proto: https
app.use(session({
cookie: {
secure: process.env.NODE_ENV === 'production' // Works correctly now
}
}));
Why This Works:
- Reverse proxy adds X-Forwarded-Proto: https header
- Express (without trust proxy) ignores it, sees HTTP
- Secure cookie requires HTTPS, fails to set
- With trust proxy enabled, Express recognizes HTTPS
- Secure cookie sets correctly, session persists
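The sequence above can be modeled in a few lines. This is a simplified sketch of the decision, not Express's actual code: effectiveProtocol and canSetSecureCookie are hypothetical names, and real Express additionally checks whether the connecting address counts as a trusted proxy.

```javascript
// Simplified model: which protocol does the app believe the request used?
// `req.encrypted` stands for a direct TLS connection to the app process.
function effectiveProtocol(req, trustProxy) {
  const direct = req.encrypted ? 'https' : 'http';
  if (!trustProxy) return direct; // header is ignored without trust proxy
  const forwarded = req.headers['x-forwarded-proto'];
  if (!forwarded) return direct;
  // With several proxies the header may be a comma-separated list;
  // the first entry is the protocol used by the original client.
  return forwarded.split(',')[0].trim();
}

// A cookie marked `secure` is only sent when the request counts as HTTPS.
function canSetSecureCookie(req, trustProxy) {
  return effectiveProtocol(req, trustProxy) === 'https';
}
```

For a request that reached the app over plain HTTP from a TLS-terminating proxy, the model returns false without trust proxy and true with it, which matches the login-then-logged-out symptom.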
Environment-Specific Configuration:
// Digital Ocean App Platform
app.set('trust proxy', 1);
// nginx reverse proxy
app.set('trust proxy', 1);
// Multiple proxies (nginx → load balancer → app)
app.set('trust proxy', 2); // Trust first 2 proxies
// Trust specific proxy IP
app.set('trust proxy', '127.0.0.1');
Issue: Redis Connection Failures
Symptoms:
- Intermittent session failures
- "Redis connection timeout" errors
- Sessions lost randomly
Solution:
// Robust Redis configuration with reconnection
const redis = require('redis');
const redisClient = redis.createClient({
url: process.env.REDIS_URL,
socket: {
connectTimeout: 10000,
commandTimeout: 5000,
reconnectStrategy: (retries) => {
if (retries > 10) {
logger.error('Redis max retries exceeded');
return new Error('Max retries reached');
}
// Linear backoff: 100ms, 200ms, 300ms, ... capped at 3 seconds
return Math.min(retries * 100, 3000);
}
}
});
// Error handling
redisClient.on('error', (err) => {
logger.error('Redis client error', err);
});
redisClient.on('connect', () => {
logger.info('Redis client connected');
});
Session Issues (General)
Symptoms
- Users logged out immediately after login
- Session cookies not persisting
- "Invalid session" errors
Diagnostic Steps
# 1. Check session configuration
echo "SESSION_SECRET: [hidden]"
echo "REDIS_URL: $REDIS_URL"
echo "NODE_ENV: $NODE_ENV"
# 2. Test session creation
curl -c cookies.txt https://platform.natca.org/api/auth/login
curl -b cookies.txt https://platform.natca.org/api/auth/session
# 3. Check cookie security settings
# Inspect browser dev tools → Application → Cookies
# Verify: Secure, HttpOnly, SameSite flags present
# 4. Test Redis connection
redis-cli -u $REDIS_URL ping
Solutions
- Enable Trust Proxy (Most Common Issue)
  // Add to server.js BEFORE session middleware
  app.set('trust proxy', 1);
- Ensure Unique SESSION_SECRET per Environment
  # Different secrets prevent cookie conflicts
  openssl rand -base64 32 # Generate new secret
  # Use different secrets for staging and production
- Configure Session Settings
  app.use(session({
    store: new RedisStore({ client: redisClient }),
    secret: process.env.SESSION_SECRET,
    name: 'platform.session',
    resave: false,
    saveUninitialized: false,
    rolling: true, // Refresh TTL on each request
    cookie: {
      maxAge: 7 * 24 * 60 * 60 * 1000, // 7 days
      httpOnly: true,
      secure: process.env.NODE_ENV === 'production',
      sameSite: 'lax'
    }
  }));
- Check Cookie Domain for Cross-Subdomain Sessions
  cookie: {
    domain: '.natca.org', // Note the leading dot
    path: '/'
  }
Management API Issues
Symptoms
- "Insufficient scope" errors
- User creation/update failures
- Metadata sync issues
Diagnostic Steps
# 1. Test Management API token
curl -X GET \
-H "Authorization: Bearer $M2M_TOKEN" \
"https://natca-prod.us.auth0.com/api/v2/users?per_page=1"
# 2. Check granted scopes
node -e "
const ManagementClient = require('auth0').ManagementClient;
const client = new ManagementClient({
domain: process.env.AUTH0_DOMAIN,
clientId: process.env.AUTH0_M2M_CLIENT_ID,
clientSecret: process.env.AUTH0_M2M_CLIENT_SECRET
});
client.getUsers({ per_page: 1 }).then(console.log).catch(console.error);
"
Solutions
- Update M2M Application Scopes
  - Go to Auth0 Dashboard > APIs > Management API
  - Select your M2M application
  - Grant required scopes:
    - read:users
    - update:users
    - create:users
    - update:user_metadata
    - update:user_app_metadata
- Regenerate M2M Credentials
  - Create new M2M application if needed
  - Update environment variables with new credentials
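To confirm which scopes an M2M token actually carries, the token payload can be decoded locally. The sketch below assumes the access token is a JWT whose payload has a space-separated scope claim; decodeJwtPayload and missingScopes are hypothetical helpers, and the code does not verify the signature, so it is suitable for diagnostics only.

```javascript
// Decode the payload (second segment) of a JWT without verifying it.
function decodeJwtPayload(token) {
  const parts = token.split('.');
  if (parts.length !== 3) {
    throw new Error('Not a JWT: expected three dot-separated parts');
  }
  return JSON.parse(Buffer.from(parts[1], 'base64url').toString('utf8'));
}

// Return the required scopes that the token does not grant.
function missingScopes(token, required) {
  const scopeClaim = decodeJwtPayload(token).scope ?? '';
  const granted = new Set(scopeClaim.split(' ').filter(Boolean));
  return required.filter((scope) => !granted.has(scope));
}
```

A non-empty result for the scopes listed above would point to the application's grant in the Auth0 dashboard rather than to the calling code.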
Database Issues
Supabase Connection Problems
Symptoms
- Database timeout errors
- Connection refused errors
- SSL certificate issues
Diagnostic Steps
# 1. Test Supabase connectivity
curl -H "apikey: $SUPABASE_KEY" \
"$SUPABASE_URL/rest/v1/members?select=count"
# 2. Check Supabase project status
npx supabase status --linked
# 3. Verify connection string
node -e "
const { createClient } = require('@supabase/supabase-js');
const client = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_KEY);
client.from('members').select('count').then(console.log).catch(console.error);
"
Solutions
- Update Supabase URL/Key
  - Check Supabase dashboard for correct values
  - Ensure using service key for server-side operations
- Fix RLS Policies
  -- Enable RLS on tables
  ALTER TABLE members ENABLE ROW LEVEL SECURITY;
  -- Create service role policy
  CREATE POLICY "Service role access" ON members
    FOR ALL USING (auth.role() = 'service_role');
- Reset Database Connection
  # Restart application
  pm2 restart all
  # Clear connection pool
  npm run db:reset-connections
MySQL Sync Issues
Symptoms
- Sync failures or timeouts
- "Connection lost" errors
- Data inconsistencies
Diagnostic Steps
# 1. Test MySQL connectivity
node -e "
const mysql = require('mysql2/promise');
mysql.createConnection({
host: process.env.MYSQL_HOST,
user: process.env.MYSQL_USER,
password: process.env.MYSQL_PASS,
database: process.env.MYSQL_DB
}).then(conn => {
console.log('✅ MySQL connected');
return conn.end();
}).catch(console.error);
"
# 2. Check sync status
npm run sync health
# 3. Run sync validation
npm run sync validate
# 4. Check data counts
npm run sync:compare-counts
Solutions
- Fix Connection Settings
  # Update MySQL configuration
  MYSQL_TIMEOUT=60000
  MYSQL_POOL_SIZE=10
  MYSQL_SSL=true
- Optimize Sync Parameters
  # The "--" separator passes the flags to the sync script instead of to npm
  # Reduce batch size for slow connections
  npm run sync -- sync-all --batch-size=500
  # Increase retry count
  npm run sync -- sync-all --retries=5
- Manual Sync Recovery
  # Clear stuck sync status
  npm run sync:clear-status
  # Restart sync process
  npm run sync -- sync-all --force
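Reducing the batch size, as suggested above, amounts to splitting the source rows into smaller chunks before writing them. A minimal sketch follows; batches is a hypothetical generator, and the sync scripts may implement batching differently.

```javascript
// Yield consecutive slices of `rows`, each at most `batchSize` long.
function* batches(rows, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer');
  }
  for (let start = 0; start < rows.length; start += batchSize) {
    yield rows.slice(start, start + batchSize);
  }
}
```

A sync loop can then iterate with for (const batch of batches(rows, 500)) and write one batch per round trip, so a timeout affects a single batch and not the whole run.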
Data Synchronization Issues (Updated October 2025)
Sync Process Failures
Symptoms
- Sync gets stuck in "running" state
- High failure rates
- Data inconsistencies between systems
- Production sync writing to dev schema instead of public schema
Diagnostic Steps
# 1. Check sync health
npm run sync -- health --json
# 2. Review sync logs - verify target schema
grep -i "syncing to" logs/sync.log | tail -50
# 3. Check database locks
npm run db:check-locks
# 4. Verify data integrity
npm run sync verify-data
# 5. Verify target schema in Supabase
psql -c "SELECT schemaname, relname, n_live_tup FROM pg_stat_user_tables WHERE schemaname IN ('public', 'dev');"
Solutions
- Production Sync Targeting Wrong Schema
  Symptom: Running node sync/sync-all.js positions --env=prod but data appears in dev.positions instead of public.positions
  Root Cause: Sync script not properly detecting or respecting the --env=prod flag
  Solution:
  # 1. Verify script has environment detection
  # Check sync-positions.js for:
  # const env = process.argv.includes('--env=prod') ? 'prod' : 'dev';
  # const targetSchema = env === 'prod' ? 'public' : 'dev';
  # 2. Look for schema confirmation in output
  node sync/sync-all.js positions --env=prod
  # Should display: "🎯 Syncing to public schema"
  # 3. If missing, update script to match pattern from sync-teams.js
  # See platform/sync/sync-teams.js for reference implementation
- Missing Production Schema Columns
  Symptom: Sync fails with "column does not exist" errors
  Common Missing Columns:
  - public.positions.enddate - End date for positions
  - Unique constraint on (membernumber, positiontype)
  Solution:
  # Run required migrations first
  cd platform
  psql -f migrations/add_positions_enddate.sql
  psql -f migrations/add_positions_unique_constraint.sql
  # Verify migrations applied
  psql -c "\d public.positions"
  # Then retry sync
  node sync/sync-all.js positions --env=prod
- Dependency Sync Failures
  Symptom: Foreign key constraint violations during sync
  Cause: Syncing dependent tables before base tables
  Solution:
  # Wrong: Skip deps on first sync
  node sync/sync-all.js positions --skip-deps --env=prod # May fail
  # Correct: Full sync respects dependencies
  node sync/sync-all.js --env=prod
  # Or sync dependencies first
  node sync/sync-all.js members --env=prod
  node sync/sync-all.js positions --env=prod
- Using --skip-deps Flag Correctly
  When to use --skip-deps:
  - Re-syncing after initial full sync completed
  - Testing individual sync scripts
  - Quick updates when dependencies haven't changed
  When NOT to use --skip-deps:
  - First-time environment setup
  - After schema migrations affecting multiple tables
  - When foreign key relationships changed
  Example:
  # Initial setup - do NOT use --skip-deps
  node sync/sync-all.js --env=prod
  # Later, quick position re-sync - OK to use --skip-deps
  node sync/sync-all.js positions --skip-deps --env=prod
- Clear Stuck Sync
  # Reset sync metadata
  npm run sync:reset-metadata
  # Force restart sync ("--" passes --force to the script instead of to npm)
  npm run sync -- sync-all --force
- Fix Data Inconsistencies
  # Compare record counts between schemas
  psql -c "SELECT 'public.members' as table, COUNT(*) FROM public.members UNION ALL SELECT 'dev.members', COUNT(*) FROM dev.members;"
  # Resync specific table
  node sync/sync-all.js members --env=prod
- Optimize Sync Performance
  # Adjust batch sizes in sync scripts
  SYNC_BATCH_SIZE=500 npm run sync sync-all
  # Use --skip-deps for faster individual table syncs
  node sync/sync-all.js positions --skip-deps --env=prod
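The dependency behavior described in these solutions can be pictured as a depth-first ordering over a table dependency map. The sketch below is illustrative only: syncOrder is a hypothetical function, and the dependency map is passed in by the caller because the actual dependency graph used by sync-all.js is not documented here.

```javascript
// Return the tables to sync, dependencies first, ending with `target`.
// With skipDeps the target is synced on its own.
function syncOrder(target, dependencies, skipDeps = false) {
  if (skipDeps) return [target];
  const order = [];
  const done = new Set();
  const inProgress = new Set();

  const visit = (table) => {
    if (done.has(table)) return;
    if (inProgress.has(table)) {
      throw new Error(`Dependency cycle detected at "${table}"`);
    }
    inProgress.add(table);
    for (const dep of dependencies[table] ?? []) visit(dep);
    inProgress.delete(table);
    done.add(table);
    order.push(table);
  };

  visit(target);
  return order;
}
```

With an assumed map of { members: [], positions: ['members'] }, syncing positions yields members first, which is the ordering that avoids the foreign key violations; passing skipDeps returns only positions and relies on members already being present.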
New Sync Commands (October 2025)
Teams Sync
Sync committees and councils data:
# Development
node sync/sync-all.js teams
# Production
node sync/sync-all.js teams --env=prod
Individual Table Sync with Environment Flag (Enhanced October 2025)
All sync commands now support --env=prod flag correctly:
# Positions sync (fixed October 2025 to respect --env=prod)
node sync/sync-all.js positions --env=prod
# Output should show: "🎯 Syncing to public schema"
# Members sync
node sync/sync-all.js members --env=prod
# Teams sync (committees and councils)
node sync/sync-all.js teams --env=prod
# Facilities sync
node sync/sync-all.js facilities --env=prod
# Regions sync
node sync/sync-all.js regions --env=prod
# Fast re-sync with --skip-deps flag (New October 2025)
node sync/sync-all.js positions --skip-deps --env=prod
# Skips dependency syncs, only updates positions table
Key Improvement (October 2025):
The positions sync script was fixed to properly respect the --env=prod flag. Previously, it would always sync to the dev schema regardless of the environment flag. Now it correctly syncs to the public schema when --env=prod is specified.
Verification: Look for this log message to confirm correct schema targeting:
🎯 Syncing to public schema
If you see "🎯 Syncing to dev schema" when using --env=prod, the script needs to be updated.
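The environment detection that each sync script is expected to contain can be written as a small testable function. This sketch follows the pattern quoted earlier in this guide (--env=prod maps to the public schema, anything else to dev); resolveTargetSchema and schemaBanner are hypothetical names.

```javascript
// Derive the environment and target schema from command-line arguments.
function resolveTargetSchema(argv) {
  const env = argv.includes('--env=prod') ? 'prod' : 'dev';
  const schema = env === 'prod' ? 'public' : 'dev';
  return { env, schema };
}

// Build the confirmation line that the sync output should contain.
function schemaBanner(schema) {
  return `🎯 Syncing to ${schema} schema`;
}
```

A script would call resolveTargetSchema(process.argv) once at startup and print schemaBanner(schema) before writing any rows, so a wrong target is visible before data is touched.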
Verify Sync Results
# Check record counts in production
psql -c "SELECT COUNT(*) FROM public.members;"
psql -c "SELECT COUNT(*) FROM public.positions;"
psql -c "SELECT COUNT(*) FROM public.teams;"
# Compare dev vs production counts
psql -c "
SELECT
'members' as table,
(SELECT COUNT(*) FROM public.members) as prod,
(SELECT COUNT(*) FROM dev.members) as dev
UNION ALL
SELECT
'positions',
(SELECT COUNT(*) FROM public.positions),
(SELECT COUNT(*) FROM dev.positions);
"
Network and Connectivity Issues
External Service Timeouts
Symptoms
- Timeout errors when connecting to Auth0/Discord/Supabase
- Intermittent connection failures
- SSL/TLS handshake failures
Diagnostic Steps
# 1. Test network connectivity
ping discord.com
ping auth0.com
ping supabase.co
# 2. Check DNS resolution
nslookup discord.mynatca.org
nslookup natca-prod.us.auth0.com
# 3. Test SSL connectivity
openssl s_client -connect discord.com:443 -servername discord.com
# 4. Check firewall/proxy settings
curl -v https://discord.com/api/v10/gateway
Solutions
- Increase Timeout Values
  HTTP_TIMEOUT=30000
  AUTH0_TIMEOUT=30000
  DISCORD_TIMEOUT=30000
- Configure Retry Logic
  // lib/http-client.js
  const retryConfig = {
    retries: 3,
    retryDelay: 1000,
    retryCondition: (error) => {
      return error.code === 'ECONNRESET' || error.code === 'ETIMEDOUT';
    }
  };
- Check Proxy Configuration
  HTTP_PROXY=http://proxy.company.com:8080
  HTTPS_PROXY=http://proxy.company.com:8080
  NO_PROXY=localhost,127.0.0.1,.internal
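The retry settings listed under "Configure Retry Logic" could be applied with a small wrapper like the one below. It is a sketch, not the contents of lib/http-client.js: isRetryable and withRetry are hypothetical names, and the sleep parameter exists only so the delay can be replaced in tests.

```javascript
const RETRYABLE_CODES = new Set(['ECONNRESET', 'ETIMEDOUT']);

// Same condition as retryCondition above: retry only transient network errors.
function isRetryable(error) {
  return RETRYABLE_CODES.has(error?.code);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `operation`, retrying up to `retries` times on retryable errors.
async function withRetry(
  operation,
  { retries = 3, retryDelay = 1000, sleep = defaultSleep } = {}
) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt >= retries || !isRetryable(error)) throw error;
      await sleep(retryDelay);
    }
  }
}
```

Usage would look like await withRetry(() => fetch(url)), with non-retryable failures such as HTTP 4xx handling left to the caller.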
Performance Optimization
Memory Issues
Symptoms
- Out of memory errors
- Gradual memory increase
- Application crashes
Diagnostic Steps
# 1. Monitor memory usage
node --inspect app.js
# Open chrome://inspect in Chrome
# 2. Check for memory leaks
npm run test:memory-leak
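# 3. Capture and analyze a heap snapshot
# --heapsnapshot-signal writes a .heapsnapshot file each time the process
# receives the given signal (kill -USR2 <pid>); open the file in Chrome DevTools.
node --heapsnapshot-signal=SIGUSR2 app.js
Solutions
- Optimize Batch Processing
  // Process smaller batches
  const BATCH_SIZE = 500; // Reduce from 1000
  // Clear references after processing
  batch = null;
  // global.gc is only defined when Node is started with --expose-gc
  if (global.gc) global.gc();
- Implement Connection Pooling
  // Configure database connection pools
  const poolConfig = {
    max: 10,
    min: 2,
    acquireTimeoutMillis: 30000,
    idleTimeoutMillis: 600000
  };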
CPU Optimization
Symptoms
- High CPU usage
- Slow response times
- Request timeouts
Solutions
- Implement Caching
  // Cache member data
  const NodeCache = require('node-cache');
  const memberCache = new NodeCache({ stdTTL: 300 }); // 5 minutes
- Use Worker Threads
  // Offload heavy processing
  const { Worker, isMainThread, parentPort } = require('worker_threads');
  if (isMainThread) {
    const worker = new Worker(__filename);
    worker.postMessage(data);
  }
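To make the caching suggestion concrete without an external dependency, here is a minimal time-to-live cache built on Map. It is a sketch of the idea, not a replacement for node-cache: TtlCache is a hypothetical class, and the injectable clock exists only to make expiry testable.

```javascript
// Minimal TTL cache: entries expire `ttlMs` milliseconds after being set.
class TtlCache {
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Return the cached value, or undefined when missing or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // drop stale entries lazily on read
      return undefined;
    }
    return entry.value;
  }
}
```

A member lookup would check cache.get(memberNumber) first and only query the database on a miss, storing the result with cache.set. A five-minute TTL corresponds to new TtlCache(5 * 60 * 1000).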
Security Issues
Token Validation Failures
Symptoms
- "Invalid token" errors
- "Token expired" messages
- Authentication bypasses
Solutions
- Implement Proper Token Validation
  // Verify JWT tokens properly
  const jwt = require('jsonwebtoken');
  const jwksClient = require('jwks-rsa');
  const client = jwksClient({
    jwksUri: `https://${process.env.AUTH0_DOMAIN}/.well-known/jwks.json`
  });
- Secure Environment Variables
  # Use strong secrets
  AUTH0_SECRET=$(openssl rand -base64 32)
  SESSION_SECRET=$(openssl rand -base64 32)
  # Rotate secrets regularly
  npm run rotate:secrets
Recovery Procedures
Disaster Recovery
Data Recovery
# 1. Restore from backup
npm run restore:database -- --date=2023-01-01
# 2. Verify data integrity
npm run verify:data-integrity
# 3. Restart all services
pm2 restart all
# 4. Run health checks
npm run health:full-check
Service Recovery
# 1. Check service status
pm2 status
# 2. Restart failed services
pm2 restart mynatca-discord-bot
pm2 restart mynatca-discord-web
# 3. Verify functionality
curl https://discord.mynatca.org/api/health
npm run test:integration
Rollback Procedures
Application Rollback
# 1. Identify previous working deployment
vercel ls
# 2. Promote previous deployment
vercel promote <deployment-url>
# 3. Update DNS if needed
# 4. Verify rollback success
Database Rollback
# 1. Stop sync processes
pm2 stop sync-scheduler
# 2. Restore database
npm run db:restore -- --backup=backup_2023_01_01
# 3. Restart services
pm2 restart all
This comprehensive troubleshooting guide provides systematic approaches to identifying and resolving issues across the MyNATCA platform.