ADR-020: Background Jobs & Scheduled Tasks

Status: Planning
Owner: @bilal @deen
Date: 2026-02-15

Why This Needs an ADR

Multiple accepted ADRs assume scheduled jobs exist but none defines the infrastructure:

ADR	Job Needed	Frequency
ADR-011 Regulatory Compliance	Data retention cleanup (purge expired messages, media)	Daily
ADR-011 Regulatory Compliance	Document expiry alerts (30-day warning)	Daily
ADR-013 Event-Driven Architecture	Retry failed event publishes	Every minute
ADR-010 Tenant Engine	AI summary generation for stale conversations	Every 4 hours
ADR-010 Tenant Engine	Conversation archival (inactive 24h)	Daily
Future	SLA breach detection (if not using Temporal)	Every 15 min

The CI/CD doc shows a Vercel Cron config, but that choice hasn’t been formally decided.

Options

Option A: Vercel Cron Jobs

// vercel.json
{
  "crons": [
    { "path": "/api/cron/cleanup", "schedule": "0 2 * * *" },
    { "path": "/api/cron/summaries", "schedule": "0 */4 * * *" },
    { "path": "/api/cron/retry-events", "schedule": "* * * * *" },
    { "path": "/api/cron/doc-expiry", "schedule": "0 9 * * *" }
  ]
}

Pros	Cons
Zero infrastructure — runs on existing Vercel deployment	10s execution limit on Hobby, 60s on Pro
Cron syntax, reliable scheduling	No retry on failure (must handle in code)
Already in the stack	Hobby plan caps cron frequency, so the every-minute retry job can’t run there
Serverless — no idle cost	Cold starts add latency
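
One way to live with the execution limit listed above is to have each invocation work through a bounded slice of items and stop before the deadline, leaving the rest for the next scheduled run. The sketch below is illustrative only: `processWithinBudget`, its parameters, and the injected clock are hypothetical names, not part of any Vercel API.

```typescript
// Hypothetical helper: handle items one by one until the time budget is spent.
// `now` is injectable so the behaviour can be tested without real waiting.
async function processWithinBudget<T>(
  items: T[],
  handle: (item: T) => Promise<void>,
  budgetMs: number,
  now: () => number = Date.now,
): Promise<{ processed: number; remaining: number }> {
  const deadline = now() + budgetMs
  let processed = 0
  for (const item of items) {
    // Check the clock before starting each item; an item already in flight is not interrupted.
    if (now() >= deadline) break
    await handle(item)
    processed++
  }
  return { processed, remaining: items.length - processed }
}
```

With a 10s limit, a budget of roughly 8000 ms would leave headroom for the response. The job has to be written so that leftover items are picked up again on the next run.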

Option B: Supabase pg_cron + Edge Functions

-- pg_cron for database-level jobs
SELECT cron.schedule('cleanup', '0 2 * * *', $$
  DELETE FROM messages WHERE created_at < NOW() - INTERVAL '90 days';
$$);

Edge Functions for application-level jobs (retry events, summaries).

Pros	Cons
Database jobs run at DB level (fast, no HTTP overhead)	Two systems (pg_cron + Edge Functions)
Edge Functions have 150s timeout	Less visibility than Vercel dashboard
No cold start for pg_cron	pg_cron requires Supabase Pro plan

Option C: Hybrid (Tentative Recommendation)

pg_cron for pure SQL jobs: retention cleanup, partition management
Vercel Cron for application jobs: event retry, summary generation, expiry alerts
Both hit the same database

This avoids adding infrastructure while using each tool where it’s strongest.

Key Design Questions

1. Failure Handling

What happens when a cron job fails?

Retry automatically? (Vercel doesn’t retry)
Log and alert? (Need ADR-021 Observability)
Idempotency — can the job safely run twice?
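
Since Vercel does not retry a failed invocation, one option is to retry inside the handler itself. The following is a minimal sketch of bounded retries with exponential backoff; `runWithRetry`, `Sleep`, and the default values are hypothetical, and it only makes sense for jobs that are idempotent, because a partially completed attempt will be run again.

```typescript
type Sleep = (ms: number) => Promise<void>

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

// Hypothetical helper: run `job` up to `maxAttempts` times, doubling the delay after each failure.
async function runWithRetry<T>(
  job: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
  sleep: Sleep = defaultSleep,
): Promise<T> {
  let lastError: unknown
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await job()
    } catch (err) {
      lastError = err
      if (attempt < maxAttempts) {
        // Delays grow as baseDelayMs, 2x, 4x, ...
        await sleep(baseDelayMs * 2 ** (attempt - 1))
      }
    }
  }
  // All attempts failed: surface the last error so the caller can log or alert.
  throw lastError
}
```

The total of all delays plus the job's own runtime still has to fit inside the platform's execution limit, so the attempt count and base delay need to be small.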

2. Job Locking

If a job takes longer than the interval:

Vercel Cron can trigger overlapping invocations
Need advisory lock or “last run” check

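// Simple lock pattern (inside the cron handler; assumes a cronLock row exists per job)
const staleBefore = new Date(Date.now() - 30 * 60 * 1000)
// Atomic claim: the update matches only if the previous run started over 30 min ago
const { count } = await prisma.cronLock.updateMany({
  where: { job: 'cleanup', startedAt: { lt: staleBefore } },
  data: { startedAt: new Date() },
})
if (count === 0) {
  return // Still running
}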
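
The advisory lock alternative mentioned above could look roughly like the sketch below, which uses PostgreSQL's `pg_try_advisory_lock` and `pg_advisory_unlock` functions. `withAdvisoryLock` and `QueryFn` are hypothetical names; the query function is injected so the sketch does not assume a particular database client. Session-level advisory locks belong to a single connection, so this assumes lock and unlock run on the same connection, which is not guaranteed behind a transaction-mode connection pooler.

```typescript
// Minimal query interface so the sketch is independent of any specific client library.
type QueryFn = (
  sql: string,
  params: unknown[],
) => Promise<{ rows: Array<Record<string, unknown>> }>

// Hypothetical helper: run `job` only if the advisory lock for `lockKey` is free.
async function withAdvisoryLock<T>(
  query: QueryFn,
  lockKey: number,
  job: () => Promise<T>,
): Promise<{ ran: true; result: T } | { ran: false }> {
  // pg_try_advisory_lock returns immediately with true/false instead of waiting.
  const res = await query('SELECT pg_try_advisory_lock($1) AS locked', [lockKey])
  if (res.rows[0]?.locked !== true) {
    return { ran: false } // Another invocation holds the lock; skip this run.
  }
  try {
    return { ran: true, result: await job() }
  } finally {
    // Release even if the job throws, so the next scheduled run is not blocked.
    await query('SELECT pg_advisory_unlock($1)', [lockKey])
  }
}
```

Unlike the timestamp check, this needs no stale-lock timeout: PostgreSQL releases the lock when the session ends, including after a crash.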

3. Monitoring

How do we know jobs are running?

Log completion to a cron_runs table?
Alert if a job hasn’t run in expected window?
Dashboard view of job history?
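
If completions are logged to a cron_runs table, the "hasn't run in the expected window" check reduces to comparing each job's latest completion time against an allowed gap. A minimal sketch, assuming the rows have already been loaded; `JobExpectation`, `LastRun`, and `findOverdueJobs` are hypothetical names, and the thresholds in the usage note are illustrative only.

```typescript
interface JobExpectation {
  job: string
  maxGapMs: number // longest acceptable time since the last successful run
}

interface LastRun {
  job: string
  finishedAt: Date
}

// Hypothetical helper: return the names of jobs that are overdue or have never run.
function findOverdueJobs(
  expectations: JobExpectation[],
  lastRuns: LastRun[],
  now: Date = new Date(),
): string[] {
  // Keep only the most recent completion per job.
  const latest = new Map<string, number>()
  for (const run of lastRuns) {
    const t = run.finishedAt.getTime()
    const previous = latest.get(run.job)
    if (previous === undefined || t > previous) latest.set(run.job, t)
  }
  return expectations
    .filter(({ job, maxGapMs }) => {
      const last = latest.get(job)
      return last === undefined || now.getTime() - last > maxGapMs
    })
    .map(({ job }) => job)
}
```

For a daily job, a gap somewhat above 24 hours (for example 25 hours) avoids false alarms from small scheduling jitter. The check itself has to run somewhere independent of the jobs it watches.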

4. The Retry Publisher Problem

ADR-013’s event retry needs to run every minute, but Vercel’s Hobby plan does not allow cron schedules that frequent (this ADR assumes hourly is the most it permits). Options:

Accept hourly retry (events delayed up to 1 hour on failure)
Use pg_cron for minute-level retry
Make the async publish more reliable (reduce need for retry)
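
Whichever schedule is chosen, each retry pass does the same work: take the events whose publish failed, try again, and record the outcome. The sketch below illustrates one such pass under stated assumptions. `FailedEvent`, `RetryOutcome`, `retryFailedEvents`, and the attempt cap are hypothetical and are not taken from ADR-013.

```typescript
interface FailedEvent {
  id: string
  payload: unknown
  attempts: number // publish attempts made so far
}

interface RetryOutcome {
  published: string[] // ids delivered on this pass
  retryLater: FailedEvent[] // copies with attempts incremented, for the next pass
  abandoned: FailedEvent[] // reached maxAttempts; needs manual attention
}

// Hypothetical helper: one retry pass over a batch of failed events.
async function retryFailedEvents(
  events: FailedEvent[],
  publish: (event: FailedEvent) => Promise<void>,
  maxAttempts = 5,
): Promise<RetryOutcome> {
  const outcome: RetryOutcome = { published: [], retryLater: [], abandoned: [] }
  for (const event of events) {
    try {
      await publish(event)
      outcome.published.push(event.id)
    } catch {
      // Do not mutate the input; return an updated copy for the caller to persist.
      const updated: FailedEvent = { ...event, attempts: event.attempts + 1 }
      if (updated.attempts >= maxAttempts) outcome.abandoned.push(updated)
      else outcome.retryLater.push(updated)
    }
  }
  return outcome
}
```

Under an hourly schedule an event that fails once waits up to an hour for its next attempt; under a per-minute schedule the same pass simply runs more often, so the function itself does not change with the scheduling decision.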

Minimum Viable Approach

For pre-deployment / MVP:

Vercel Cron for daily/hourly jobs
In-app error handling with console.error (no retry infrastructure)
Manual monitoring via Vercel logs

Add pg_cron and proper monitoring as the product scales.
