Plan: Issue Lifecycle Workflows & Conversation Actors

Status: Draft
Created: 2026-04-15
Depends on: multi-issue-conversations.md (Phase 3+)
Feature flags: automated_reminders (future), workflow_coordination (future)

Problem Statement

Issue lifecycle handling today is entirely manual or cron-based. Once an issue is created, there is no automated orchestration: no SLA enforcement, no vendor follow-up, no tenant status updates, no escalation chains. The landlord sees metrics after the fact (slaBreachCount in analytics), but nothing proactively prevents breaches.

Current state:

  • Issue created → email notification → nothing automated happens
  • SLA targets defined (EMERGENCY: 1d, HIGH: 3d, MEDIUM: 7d, LOW: 14d) but only measured, never enforced
  • Vendor management is a stub (no tables, no assignment logic, no notifications)
  • Three existing crons (rent chase, doc expiry, weekly digest) but none for issues
  • No follow-up reminders to tenants
  • No vendor acceptance tracking
  • No automated escalation on SLA breach

What this plan adds:

  • Durable, multi-step issue lifecycle orchestration via Cloudflare Workflows
  • SLA enforcement with automated escalation
  • Vendor assignment → acceptance → completion flow
  • Tenant follow-up notifications (proactive status updates)
  • Conversation Actor concept for real-time per-conversation state (future)

Architecture Decision: Why Cloudflare Workflows

| Option | Fit | Verdict |
| --- | --- | --- |
| Temporal | Needs persistent server + workers. We’re on CF Workers (stateless edge). Would require a second platform. | No |
| Vercel Workflow (AI SDK) | We’re not on Vercel. Solves LLM chaining, not multi-day orchestration. | No |
| BullMQ / pg-boss | Needs a persistent Node.js process or Redis. Not native to CF Workers. | No |
| More crons | Would work but brittle — polling every X minutes, no event-driven triggers, no durable state. Already have 3 crons. | Fallback only |
| CF Workflows | Native to our platform. Durable execution, sleep up to 365 days (free — no CPU cost), waitForEvent for vendor webhooks, automatic retries. No infra to manage. | Yes |

Deployment: Separate Worker (envo-orchestrator)

The dashboard runs on OpenNext with a 10 MB bundle limit. Workflows and Durable Objects should live in a separate Worker to avoid bundle pressure and maintain separation of concerns.

envo-dashboard (CF Worker)          envo-orchestrator (CF Worker)
├── Next.js App Router              ├── IssueLifecycleWorkflow
├── GraphQL API                     ├── ConversationActor (future)
├── Webhook handlers                └── Shared Hyperdrive binding
└── Cross-script bindings ──────────►

The dashboard triggers workflows via cross-script binding. The orchestrator reads/writes to the same Supabase database via its own Hyperdrive connection.
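One way the orchestrator's DB helpers (such as the getIssueStatus the workflow polls) could be written is sketched below. The injected QueryFn, the column names, and the missing-row fallback are assumptions for illustration; in the real worker the query function would wrap a Workers-compatible Postgres driver created from env.HYPERDRIVE.connectionString.

```typescript
// Hypothetical orchestrator DB helper. The query function is injected so
// the logic stays testable without a live Hyperdrive connection.
type QueryFn = (sql: string, params: unknown[]) => Promise<Array<Record<string, unknown>>>

interface IssueStatusRow {
  status: string
  resolvedAt: string | null
}

async function getIssueStatus(query: QueryFn, issueId: string): Promise<IssueStatusRow> {
  const rows = await query(
    "SELECT status, resolved_at FROM issues WHERE id = $1",
    [issueId],
  )
  if (rows.length === 0) {
    // Assumption: treat a deleted issue as cancelled so the workflow's
    // polling loop exits cleanly instead of erroring
    return { status: "CANCELLED", resolvedAt: null }
  }
  return {
    status: String(rows[0].status),
    resolvedAt: rows[0].resolved_at == null ? null : String(rows[0].resolved_at),
  }
}
```

Keeping the driver behind an injected function also keeps the dashboard and orchestrator free to use different Postgres clients against the same Hyperdrive pool.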


Part 1: Issue Lifecycle Workflow

Workflow Definition

// envo-orchestrator/src/workflows/issue-lifecycle.ts
 
import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from "cloudflare:workers"

// Helper functions (sendSMS, notifyVendor, notifyLandlord, recordIssueEvent,
// getIssueStatus, tryAutoAssignVendor) are assumed to be implemented elsewhere
// in the orchestrator; tryAutoAssignVendor is sketched in Part 5.
 
interface IssueWorkflowParams {
  issueId: string
  organisationId: string
  propertyId: string
  tenantId: string | null
  tenantPhone: string | null
  category: string
  urgency: 'LOW' | 'MEDIUM' | 'HIGH' | 'EMERGENCY'
  description: string
  conversationId: string | null
}
 
interface Env {
  ISSUE_WORKFLOW: Workflow<IssueWorkflowParams>
  HYPERDRIVE: Hyperdrive
}
 
// SLA deadlines in milliseconds
const SLA_DEADLINES: Record<string, number> = {
  EMERGENCY: 1 * 24 * 60 * 60 * 1000,   // 1 day
  HIGH:      3 * 24 * 60 * 60 * 1000,    // 3 days
  MEDIUM:    7 * 24 * 60 * 60 * 1000,    // 7 days
  LOW:       14 * 24 * 60 * 60 * 1000,   // 14 days
}
 
export class IssueLifecycleWorkflow extends WorkflowEntrypoint<Env, IssueWorkflowParams> {
  async run(event: WorkflowEvent<IssueWorkflowParams>, step: WorkflowStep) {
    const { issueId, urgency, tenantPhone, organisationId } = event.payload
    const slaDeadlineMs = SLA_DEADLINES[urgency]
    const createdAt = event.timestamp.getTime()
    const slaDeadline = new Date(createdAt + slaDeadlineMs)
 
    // ─── Step 1: Acknowledge to tenant (if via conversation) ───
    if (tenantPhone) {
      await step.do("tenant-ack", {
        retries: { limit: 3, delay: "5 seconds", backoff: "exponential" },
        timeout: "30 seconds",
      }, async () => {
        await sendSMS(tenantPhone, `Your issue has been logged (ref: ${issueId.slice(0,8)}). We'll keep you updated.`)
      })
    }
 
    // ─── Step 2: Auto-assign vendor (if vendor management enabled) ───
    const vendor = await step.do("auto-assign-vendor", async () => {
      return await tryAutoAssignVendor(this.env, event.payload)
    })
 
    if (vendor) {
      // ─── Step 3a: Notify vendor, wait for acceptance ───
      await step.do("notify-vendor", {
        retries: { limit: 3, delay: "10 seconds", backoff: "exponential" },
      }, async () => {
        await notifyVendor(vendor, event.payload)
      })
 
      // Wait for vendor to accept (or timeout)
      const acceptanceTimeout = urgency === 'EMERGENCY' ? "2 hours" : "24 hours"
      try {
        const acceptance = await step.waitForEvent("vendor-acceptance", {
          type: "vendor-accepted",
          timeout: acceptanceTimeout,
        })
 
        await step.do("record-acceptance", async () => {
          await recordIssueEvent(this.env, issueId, 'VENDOR_ACCEPTED', {
            vendorId: vendor.id,
            eta: acceptance.payload?.eta,
          })
        })
 
        // Notify tenant of vendor assignment
        if (tenantPhone) {
          await step.do("tenant-vendor-update", async () => {
            await sendSMS(tenantPhone,
              `Good news — a ${vendor.specialty} has been assigned to your issue and will be in touch.`)
          })
        }
      } catch {
        // Vendor didn't respond — escalate to landlord
        await step.do("vendor-timeout-escalate", async () => {
          await notifyLandlord(this.env, organisationId, {
            type: 'vendor_no_response',
            issueId,
            vendorName: vendor.name,
            message: `${vendor.name} did not respond within ${acceptanceTimeout}. Please assign manually.`,
          })
          await recordIssueEvent(this.env, issueId, 'NOTIFICATION_SENT', {
            type: 'vendor_timeout',
            vendorId: vendor.id,
          })
        })
      }
    }
 
    // ─── Step 4: SLA monitoring ───
    // Sleep until 75% of SLA deadline, then check
    const warningMs = Math.floor(slaDeadlineMs * 0.75)
    await step.sleep("sla-warning-wait", `${Math.floor(warningMs / 1000)} seconds`)
 
    const issueAtWarning = await step.do("check-at-sla-warning", async () => {
      return await getIssueStatus(this.env, issueId)
    })
 
    if (issueAtWarning.status !== 'COMPLETED' && issueAtWarning.status !== 'CANCELLED') {
      // SLA warning — notify landlord
      await step.do("sla-warning-notify", async () => {
        const remaining = slaDeadline.getTime() - Date.now()
        const hoursLeft = Math.floor(remaining / (60 * 60 * 1000))
        await notifyLandlord(this.env, organisationId, {
          type: 'sla_warning',
          issueId,
          message: `Issue approaching SLA deadline — ${hoursLeft} hours remaining. Urgency: ${urgency}.`,
        })
      })
 
      // Sleep until SLA deadline
      const remainingMs = slaDeadlineMs - warningMs
      await step.sleep("sla-deadline-wait", `${Math.floor(remainingMs / 1000)} seconds`)
 
      const issueAtDeadline = await step.do("check-at-sla-deadline", async () => {
        return await getIssueStatus(this.env, issueId)
      })
 
      if (issueAtDeadline.status !== 'COMPLETED' && issueAtDeadline.status !== 'CANCELLED') {
        // SLA BREACHED
        await step.do("sla-breach", async () => {
          await notifyLandlord(this.env, organisationId, {
            type: 'sla_breach',
            issueId,
            message: `SLA BREACHED — ${urgency} issue unresolved after ${SLA_DEADLINES[urgency] / (24*60*60*1000)} days.`,
          })
          await recordIssueEvent(this.env, issueId, 'NOTIFICATION_SENT', {
            type: 'sla_breach',
            urgency,
            deadlineExceededAt: new Date().toISOString(),
          })
        })
 
        // Notify tenant if they're waiting
        if (tenantPhone) {
          await step.do("tenant-delay-apology", async () => {
            await sendSMS(tenantPhone,
              `We're sorry your issue is taking longer than expected. Our team has been alerted and will prioritise this.`)
          })
        }
      }
    }
 
    // ─── Step 5: Post-resolution follow-up ───
    // Poll daily until resolved (up to 30 days)
    for (let day = 0; day < 30; day++) {
      const current = await step.do(`resolution-check-day-${day}`, async () => {
        return await getIssueStatus(this.env, issueId)
      })
 
      if (current.status === 'COMPLETED' || current.status === 'CANCELLED') {
        // Issue resolved — send satisfaction check
        if (tenantPhone && current.status === 'COMPLETED') {
          await step.do("tenant-resolution-notify", async () => {
            await sendSMS(tenantPhone,
              `Your ${event.payload.category.toLowerCase()} issue has been resolved. If you have any further problems, just reply to this number.`)
          })
        }
        // Workflow complete
        return { status: 'completed', resolvedAt: current.resolvedAt }
      }
 
      await step.sleep(`daily-wait-${day}`, "24 hours")
    }
 
    // 30 days without resolution — final escalation
    await step.do("stale-issue-escalation", async () => {
      await notifyLandlord(this.env, organisationId, {
        type: 'stale_issue',
        issueId,
        message: `Issue has been open for 30 days without resolution. Please review and close or reassign.`,
      })
    })
 
    return { status: 'stale', daysOpen: 30 }
  }
}

Triggering the Workflow

From the dashboard’s issue creation flow:

// In lib/tenant-engine/issue-creation.ts (after createIssueFromConversation)
// OR in lib/graphql/mutations/issue.ts (after manual issue creation)
 
async function triggerIssueLifecycle(issue: CreatedIssue, conversation?: ConversationContext) {
  // env.ISSUE_WORKFLOW is the cross-script binding to envo-orchestrator
  const instance = await env.ISSUE_WORKFLOW.create({
    id: `issue-${issue.id}`,
    params: {
      issueId: issue.id,
      organisationId: issue.organisationId,
      propertyId: issue.propertyId,
      tenantId: issue.tenantId,
      tenantPhone: conversation?.contactPhone ?? null,
      category: issue.category,
      urgency: issue.urgency,
      description: issue.description,
      conversationId: conversation?.id ?? null,
    },
  })
 
  // Store workflow instance ID on the issue for status queries
  await prisma.issue.update({
    where: { id: issue.id },
    data: { workflowInstanceId: instance.id },
  })
}

Sending Events to a Running Workflow

When a vendor accepts via webhook or dashboard action:

// In a vendor acceptance webhook handler or GraphQL mutation
async function onVendorAccepted(issueId: string, vendorId: string, eta?: string) {
  const issue = await prisma.issue.findUnique({ where: { id: issueId } })
  if (!issue?.workflowInstanceId) return
 
  const instance = await env.ISSUE_WORKFLOW.get(issue.workflowInstanceId)
  await instance.sendEvent({
    type: "vendor-accepted",
    payload: { vendorId, eta },
  })
}

Querying Workflow Status

From the dashboard (issue detail page):

async function getIssueWorkflowStatus(issueId: string) {
  const issue = await prisma.issue.findUnique({ where: { id: issueId } })
  if (!issue?.workflowInstanceId) return null
 
  const instance = await env.ISSUE_WORKFLOW.get(issue.workflowInstanceId)
  const status = await instance.status()
  // status.status: "queued" | "running" | "paused" | "waiting" | "complete" | "errored" | "terminated"
  return status
}

Cancelling a Workflow

When an issue is cancelled or manually resolved:

async function onIssueCancelled(issueId: string) {
  const issue = await prisma.issue.findUnique({ where: { id: issueId } })
  if (!issue?.workflowInstanceId) return
 
  const instance = await env.ISSUE_WORKFLOW.get(issue.workflowInstanceId)
  await instance.terminate()
}

Part 2: Wrangler Configuration

New Worker: envo-orchestrator/wrangler.jsonc

{
  "$schema": "node_modules/wrangler/config-schema.json",
  "name": "envo-orchestrator",
  "main": "src/index.ts",
  "compatibility_date": "2025-05-05",
  "compatibility_flags": ["nodejs_compat"],
 
  "workflows": [
    {
      "name": "issue-lifecycle",
      "binding": "ISSUE_WORKFLOW",
      "class_name": "IssueLifecycleWorkflow",
      "limits": { "steps": 25000 }
    }
  ],
 
  "hyperdrive": [
    {
      "binding": "HYPERDRIVE",
      "id": "d6e9fe00b05b44ef99eae5ecacf07892"
    }
  ],
 
  "vars": {
    "ENVIRONMENT": "production"
  }
}

Dashboard bindings: envo-dashboard/wrangler.jsonc additions

{
  // ... existing config ...
 
  "workflows": [
    {
      "name": "issue-lifecycle",
      "binding": "ISSUE_WORKFLOW",
      "class_name": "IssueLifecycleWorkflow",
      "script_name": "envo-orchestrator"
    }
  ]
}

Part 3: Database Changes

Migration: 0XX_workflow_tracking.sql

-- ============================================================
-- Workflow tracking for issue lifecycle orchestration
-- ============================================================
 
-- 1. Add workflow instance ID to issues
ALTER TABLE issues
    ADD COLUMN workflow_instance_id TEXT;
 
CREATE INDEX idx_issues_workflow_instance
    ON issues(workflow_instance_id)
    WHERE workflow_instance_id IS NOT NULL;
 
-- 2. Extend IssueEventType with workflow events
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'SLA_WARNING';
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'SLA_BREACH';
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'VENDOR_TIMEOUT';
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'TENANT_NOTIFIED';
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'WORKFLOW_STARTED';
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'WORKFLOW_COMPLETED';
 
-- 3. Vendor tables (prerequisite for auto-assignment)
CREATE TABLE vendors (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    organisation_id UUID NOT NULL REFERENCES organisations(id) ON DELETE CASCADE,
    name            TEXT NOT NULL,
    email           TEXT,
    phone           VARCHAR(20),
    specialties     TEXT[] NOT NULL DEFAULT '{}',
    -- Specialties map to IssueCategory values:
    -- PLUMBING, ELECTRICAL, HEATING, STRUCTURAL, etc.
    is_active       BOOLEAN NOT NULL DEFAULT true,
    rating          DECIMAL(2,1),  -- 1.0-5.0
    notes           TEXT,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);
 
CREATE INDEX idx_vendors_org ON vendors(organisation_id);
CREATE INDEX idx_vendors_specialties ON vendors USING GIN(specialties);
CREATE INDEX idx_vendors_active ON vendors(organisation_id) WHERE is_active = true;
 
ALTER TABLE vendors ENABLE ROW LEVEL SECURITY;
 
CREATE POLICY vendors_org_access ON vendors
    FOR ALL
    USING (organisation_id::text = auth.jwt() ->> 'organisation_id');
 
-- 4. Vendor assignments (issue → vendor relationship)
CREATE TABLE vendor_assignments (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    issue_id        UUID NOT NULL REFERENCES issues(id) ON DELETE CASCADE,
    vendor_id       UUID NOT NULL REFERENCES vendors(id) ON DELETE CASCADE,
    status          TEXT NOT NULL DEFAULT 'PENDING'
        CHECK (status IN ('PENDING', 'ACCEPTED', 'DECLINED', 'COMPLETED', 'CANCELLED')),
    assigned_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
    accepted_at     TIMESTAMPTZ,
    completed_at    TIMESTAMPTZ,
    eta             TEXT,           -- Vendor-provided ETA
    notes           TEXT,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
 
    UNIQUE (issue_id, vendor_id)
);
 
CREATE INDEX idx_vendor_assignments_issue ON vendor_assignments(issue_id);
CREATE INDEX idx_vendor_assignments_vendor ON vendor_assignments(vendor_id);
CREATE INDEX idx_vendor_assignments_status ON vendor_assignments(status);
 
ALTER TABLE vendor_assignments ENABLE ROW LEVEL SECURITY;
 
CREATE POLICY vendor_assignments_org_access ON vendor_assignments
    FOR ALL
    USING (
        EXISTS (
            SELECT 1 FROM issues i
            WHERE i.id = vendor_assignments.issue_id
            AND i.organisation_id::text = auth.jwt() ->> 'organisation_id'
        )
    );

Prisma additions (after pnpm db:pull)

model Vendor {
  id              String   @id @default(uuid()) @db.Uuid
  organisationId  String   @map("organisation_id") @db.Uuid
  name            String
  email           String?
  phone           String?  @db.VarChar(20)
  specialties     String[]
  isActive        Boolean  @default(true) @map("is_active")
  rating          Decimal? @db.Decimal(2, 1)
  notes           String?
  createdAt       DateTime @default(now()) @map("created_at")
  updatedAt       DateTime @default(now()) @updatedAt @map("updated_at")
 
  organisation    Organisation       @relation(fields: [organisationId], references: [id], onDelete: Cascade)
  assignments     VendorAssignment[]
 
  @@index([organisationId])
  @@map("vendors")
}
 
model VendorAssignment {
  id          String    @id @default(uuid()) @db.Uuid
  issueId     String    @map("issue_id") @db.Uuid
  vendorId    String    @map("vendor_id") @db.Uuid
  status      String    @default("PENDING")
  assignedAt  DateTime  @default(now()) @map("assigned_at")
  acceptedAt  DateTime? @map("accepted_at")
  completedAt DateTime? @map("completed_at")
  eta         String?
  notes       String?
  createdAt   DateTime  @default(now()) @map("created_at")
  updatedAt   DateTime  @default(now()) @updatedAt @map("updated_at")
 
  issue       Issue     @relation(fields: [issueId], references: [id], onDelete: Cascade)
  vendor      Vendor    @relation(fields: [vendorId], references: [id], onDelete: Cascade)
 
  @@unique([issueId, vendorId])
  @@map("vendor_assignments")
}

Part 4: Workflow Steps Breakdown

Step-by-step timeline for a MEDIUM urgency issue

Day 0, 00:00  ─── Issue created ───────────────────────────────
  │  Step: tenant-ack          → SMS: "Issue logged, ref: abc12345"
  │  Step: auto-assign-vendor  → Find vendor by category + location
  │  Step: notify-vendor       → SMS/email to vendor
  │  Step: waitForEvent        → Waiting for "vendor-accepted"
  │                               (timeout: 24 hours)
  │
Day 0, ~04:00  ─── Vendor accepts ─────────────────────────────
  │  Step: record-acceptance   → Issue event: VENDOR_ACCEPTED
  │  Step: tenant-vendor-update → SMS: "A plumber has been assigned"
  │  Step: sla-warning-wait    → sleep 5.25 days (75% of 7-day SLA)
  │
Day 5, ~10:00  ─── SLA warning ─────────────────────────────────
  │  (the fixed-duration SLA sleeps start after the vendor
  │   branch, so wall-clock times drift by the acceptance delay)
  │  Step: check-at-sla-warning → Query issue status
  │  Step: sla-warning-notify   → Email/SMS to landlord:
  │                                "~38 hours remaining on SLA"
  │  Step: sla-deadline-wait    → sleep 1.75 days
  │
Day 7, ~04:00  ─── SLA deadline ────────────────────────────────
  │  Step: check-at-sla-deadline → Query issue status
  │  IF still open:
  │  Step: sla-breach           → Email/SMS to landlord: "SLA BREACHED"
  │  Step: tenant-delay-apology → SMS: "Sorry for the delay"
  │
Day 7+  ─── Daily resolution checks ──────────────────────────
  │  Step: resolution-check-day-N → Query status
  │  Step: daily-wait-N           → sleep 24 hours
  │  ... repeat until COMPLETED/CANCELLED or day 30
  │
Day 30  ─── Stale issue escalation ────────────────────────────
  │  Step: stale-issue-escalation → "Open 30 days, please review"
  │  Workflow complete.

Cost for this workflow

  • Steps: ~15-75 depending on resolution time (the daily polling loop adds two steps per day, so a 30-day stale issue uses the most); well within the 25,000 limit
  • CPU time: < 100ms total (all steps are lightweight DB queries + API calls)
  • Sleep time: Free (no CPU charged during hibernation)
  • Estimated cost per issue: < $0.001

At 3,000 properties with ~2 issues/property/month = 6,000 workflows/month → < $6/month total.
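A quick sanity check on that arithmetic. All inputs are the plan's own estimates (the per-issue cost is the figure from the bullets above, not measured billing):

```typescript
// Rough monthly cost model built entirely from the plan's estimates
const properties = 3000
const issuesPerPropertyPerMonth = 2
const estCostPerIssueUsd = 0.001

const workflowsPerMonth = properties * issuesPerPropertyPerMonth
const estMonthlyCostUsd = workflowsPerMonth * estCostPerIssueUsd  // about $6
```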


Part 5: Auto-Assignment Logic

// envo-orchestrator/src/vendor-assignment.ts
 
interface AssignmentResult {
  id: string
  name: string
  phone: string | null
  email: string | null
  specialty: string
}
 
async function tryAutoAssignVendor(
  env: Env,
  issue: IssueWorkflowParams
): Promise<AssignmentResult | null> {
  // 1. Check if org has vendor management enabled
  const orgFeatures = await getOrgFeatures(env, issue.organisationId)
  if (!orgFeatures.includes('contractor_coordination')) return null
 
  // 2. Find active vendors matching this category
  const vendors = await queryVendors(env, {
    organisationId: issue.organisationId,
    specialty: issue.category,
    isActive: true,
  })
 
  if (vendors.length === 0) return null
 
  // 3. Pick best vendor (round-robin, or by rating, or by availability)
  // Start simple: highest rated active vendor for this category
  const sorted = [...vendors].sort((a, b) => (b.rating ?? 0) - (a.rating ?? 0))
  const chosen = sorted[0]
 
  // 4. Create assignment record
  await createVendorAssignment(env, {
    issueId: issue.issueId,
    vendorId: chosen.id,
    status: 'PENDING',
  })
 
  // 5. Update issue status
  await updateIssueStatus(env, issue.issueId, 'VENDOR_ASSIGNED')
 
  return chosen
}
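The selection rule in step 3 (highest-rated active vendor matching the category, unrated treated as 0) can be isolated as a pure, testable function. A sketch, not the final assignment policy; the pickVendor name and VendorRow shape are illustrative:

```typescript
interface VendorRow {
  id: string
  name: string
  rating: number | null
  isActive: boolean
  specialties: string[]
}

function pickVendor(vendors: VendorRow[], category: string): VendorRow | null {
  // Only active vendors whose specialties cover this issue category
  const eligible = vendors.filter(v => v.isActive && v.specialties.includes(category))
  if (eligible.length === 0) return null
  // Highest rating first; null ratings sink to the bottom
  return [...eligible].sort((a, b) => (b.rating ?? 0) - (a.rating ?? 0))[0]
}
```

Round-robin or availability-window policies could later replace the comparator without touching callers.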

Part 6: Conversation Actor (Concept)

What it is

A Durable Object that acts as a stateful, single-threaded actor for each conversation. Instead of the current pattern (load state from Postgres on every webhook → process → write back), the Conversation Actor holds state in memory between messages, with SQLite-backed persistence for durability.

Why consider it

| Current pattern | Conversation Actor |
| --- | --- |
| Every message: query Postgres for conversation + messages + issues | State in memory — zero DB reads for hot-path decisions |
| Identity gating: DB read + write per step | State transitions happen in-memory, batched to storage |
| Race conditions: two webhooks for same conversation can interleave | Single-threaded — messages processed strictly in order |
| Conversation timeout: cron-based or checked on next message | Alarm-based — DO wakes itself up after inactivity |
| No real-time presence | DO can hold WebSocket connections for live chat |

How it would work

// envo-orchestrator/src/actors/conversation-actor.ts
 
import { DurableObject } from "cloudflare:workers"
 
interface ConversationActorState {
  conversationId: string
  tenantPhone: string
  organisationId: string
  propertyId: string | null
  tenantId: string | null
  identityStatus: 'UNIDENTIFIED' | 'IDENTIFIED' | 'CONFIRMED' | 'ACTIVE'
  issues: Array<{
    issueId: string
    category: string
    gatheringState: string
    handledBy: 'AI' | 'HUMAN'
    isActive: boolean
  }>
  messageCount: number
  lastMessageAt: number
  aiRouterActive: boolean
}
 
export class ConversationActor extends DurableObject<Env> {
  private state!: ConversationActorState
 
  constructor(ctx: DurableObjectState, env: Env) {
    super(ctx, env)
    // Load state from SQLite storage on wake
    this.ctx.blockConcurrencyWhile(async () => {
      const stored = await this.ctx.storage.get<ConversationActorState>("state")
      if (stored) {
        this.state = stored
      }
      // If no state, this DO was just created — initialise() will be called
    })
  }
 
  // Called when a new conversation starts
  async initialise(params: {
    conversationId: string
    tenantPhone: string
    organisationId: string
    propertyId: string | null
    tenantId: string | null
    identityStatus: ConversationActorState['identityStatus']
  }): Promise<void> {
    this.state = {
      ...params,
      issues: [],
      messageCount: 0,
      lastMessageAt: Date.now(),
      aiRouterActive: true,
    }
    await this.persist()
 
    // Set 24h inactivity timeout
    await this.ctx.storage.setAlarm(Date.now() + 24 * 60 * 60 * 1000)
  }
 
  // Main entry point for every inbound message
  async handleMessage(message: {
    content: string
    mediaUrl?: string
    externalId: string
  }): Promise<{ reply: string; issueCreated?: string }> {
    this.state = {
      ...this.state,
      messageCount: this.state.messageCount + 1,
      lastMessageAt: Date.now(),
    }
 
    // Reset inactivity timer
    await this.ctx.storage.setAlarm(Date.now() + 24 * 60 * 60 * 1000)
 
    // Identity gating (in-memory — no DB hit)
    if (this.state.identityStatus === 'UNIDENTIFIED') {
      return this.handleUnidentified(message)
    }
    if (this.state.identityStatus === 'IDENTIFIED') {
      return this.handleIdentified(message)
    }
 
    // Classify intent against open issues (in-memory list)
    const intent = await classifyIntent({
      message: message.content,
      openIssues: this.state.issues,
    })
 
    // Route based on classification
    // ... (same routing logic as multi-issue-conversations plan)
 
    await this.persist()
    return { reply: "..." }
  }
 
  // Update issue state (called by workflow or dashboard)
  async updateIssue(issueId: string, update: Partial<ConversationActorState['issues'][0]>): Promise<void> {
    this.state = {
      ...this.state,
      issues: this.state.issues.map(i =>
        i.issueId === issueId ? { ...i, ...update } : i
      ),
    }
    await this.persist()
  }
 
  // Inactivity timeout
  async alarm() {
    const inactiveMs = Date.now() - this.state.lastMessageAt
    if (inactiveMs >= 24 * 60 * 60 * 1000) {
      // Mark conversation as resolved after 24h inactivity
      await this.markResolved()
    }
  }
 
  async getState(): Promise<ConversationActorState> {
    return this.state
  }
 
  private async persist(): Promise<void> {
    await this.ctx.storage.put("state", this.state)
  }
 
  private async markResolved(): Promise<void> {
    // Write back to Postgres (source of truth for dashboard/analytics)
    await updateConversationInDB(this.env, this.state.conversationId, {
      status: 'RESOLVED',
      endedAt: new Date(),
    })
  }
}

Routing: Webhook → DO

// In envo-dashboard webhook handler (or envo-orchestrator fetch handler)
 
async function handleInboundSMS(request: Request, env: Env) {
  const body = await parseFormData(request)
  const phone = body.From
 
  // Deterministic ID from phone + channel — same phone always hits same DO
  const actorId = env.CONVERSATION_ACTOR.idFromName(`sms:${phone}`)
  const actor = env.CONVERSATION_ACTOR.get(actorId)
 
  const result = await actor.handleMessage({
    content: body.Body,
    mediaUrl: body.MediaUrl0,
    externalId: body.MessageSid,
  })
 
  return new Response(twimlResponse(result.reply))
}

DO ↔ Workflow interaction

Inbound SMS
    ↓
ConversationActor (DO)
    ├── Classifies intent
    ├── Collects issue details (in-memory)
    ├── Creates issue in Postgres
    └── Triggers IssueLifecycleWorkflow ──→ Workflow
                                               ├── SLA monitoring
                                               ├── Vendor assignment
                                               ├── Escalation chains
                                               └── Calls actor.updateIssue()
                                                    to sync state back

When to build this

Not now. The Conversation Actor is a performance optimisation and architectural refinement. The current pattern (Postgres read/write per message) works fine at current scale and will work to ~3,000 properties.

Build this when:

  • Message volume causes DB read latency issues (> 100ms per conversation load)
  • Race conditions become a real problem (concurrent webhooks for same conversation)
  • You want real-time WebSocket chat in the dashboard (DOs natively support WebSockets)
  • You’re building the vendor-facing chat interface (vendors need real-time comms too)

Estimated trigger point: 1,000+ active conversations/day or when live chat becomes a requirement.

Trade-offs

| Pro | Con |
| --- | --- |
| Zero-latency state access (in-memory) | State lives in two places (DO + Postgres) — sync complexity |
| Guaranteed message ordering (single-threaded) | If DO evicted, must reload from Postgres — cold start latency |
| Native alarm-based timeouts | Only one alarm per DO — need manual scheduling for multiple timers |
| WebSocket support for live chat | New operational surface to monitor and debug |
| Natural fit for actor model (one actor per conversation) | 10 GB storage limit per DO (plenty, but finite) |
| Composable with Workflows via env bindings | Cannot join/query across DOs (no cross-DO queries) |

Implementation Phases

Phase 0: Scaffolding (prerequisite)

  1. Create envo-orchestrator/ directory with wrangler.jsonc, tsconfig.json, package.json
  2. Set up Hyperdrive binding (reuse existing connection pool ID)
  3. Add cross-script workflow binding to envo-dashboard/wrangler.jsonc
  4. Deploy empty worker to verify binding works

Phase 1: Core workflow + vendor tables

  1. Run 0XX_workflow_tracking.sql migration
  2. Implement IssueLifecycleWorkflow with steps 1 (tenant ack) and 4-5 (SLA monitoring + resolution check)
  3. Skip vendor assignment for now (step 2-3) — just SLA enforcement
  4. Wire up triggerIssueLifecycle() in issue creation flow
  5. Wire up onIssueCancelled() in status change mutation
  6. Feature gate: Only trigger for orgs with workflow_coordination feature enabled

Phase 2: Vendor management

  1. Build vendor CRUD (GraphQL mutations, dashboard UI)
  2. Implement auto-assignment logic
  3. Add vendor notification (SMS/email)
  4. Add vendor acceptance webhook endpoint
  5. Wire up waitForEvent + sendEvent for vendor acceptance
  6. Feature gate: Vendor steps only run if contractor_coordination enabled

Phase 3: Dashboard integration

  1. Show workflow status on issue detail page (timeline view)
  2. Show SLA countdown badge on issue cards
  3. Manual vendor assignment UI (with workflow event trigger)
  4. Workflow audit log (issue events timeline)

Phase 4: Conversation Actor (future)

  1. Implement ConversationActor DO in envo-orchestrator
  2. Migrate webhook handlers to route through DO
  3. Add Postgres write-back for dashboard consistency
  4. Add WebSocket support for live chat
  5. Only build when scale demands it or live chat is required

Testing Plan

Unit tests

  • tryAutoAssignVendor() — returns null when no vendors, picks highest rated
  • SLA deadline calculation — correct for all urgency levels
  • Workflow step naming — deterministic, no dynamic values in names

Integration tests

  • Workflow triggers on issue creation
  • Workflow terminates on issue cancellation
  • sendEvent unblocks waitForEvent for vendor acceptance
  • Vendor timeout triggers landlord notification
  • SLA warning fires at 75% of deadline
  • SLA breach fires at 100% of deadline
  • Resolution check stops polling after COMPLETED

E2E tests

  • Full flow: tenant reports issue via SMS → workflow starts → SLA monitored → resolved → tenant notified
  • Vendor flow: issue created → vendor assigned → vendor accepts → visit completed → issue resolved
  • Breach flow: issue created → SLA passes → landlord notified → tenant apologised to
  • Cancel flow: issue created → workflow running → issue cancelled → workflow terminated

Risks & Mitigations

| Risk | Impact | Mitigation |
| --- | --- | --- |
| OpenNext bundle + workflow bindings conflict | Can’t deploy | Separate worker (already planned) |
| Workflow step failures cascade | Tenant gets no updates | Retries with exponential backoff on all external calls |
| SMS costs increase with automated updates | Higher Twilio bill | Cap SMS notifications per issue (e.g. max 5); email for non-urgent |
| Vendor doesn’t have a phone/email | Can’t notify | Fall back to landlord notification: “Please contact vendor manually” |
| Workflow left running after manual resolution | Stale notifications | Resolution check polls DB — will catch manual resolution and exit |
| Multiple workflows for same issue | Duplicate notifications | Deterministic instance ID (issue-${id}) prevents duplicates |
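The "cap SMS notifications per issue" mitigation could be a small guard consulted before each tenant SMS, falling back to email once the cap is hit. A sketch under assumptions: the cap of 5 and all names are illustrative, and in the workflow the counter would live in step state or the DB rather than process memory.

```typescript
type Channel = "sms" | "email"

// Hypothetical per-issue notification budget. channelFor() returns the
// channel to use for the next notification and records the send if SMS.
class NotificationBudget {
  private sent = new Map<string, number>()

  constructor(private smsCapPerIssue = 5) {}

  channelFor(issueId: string): Channel {
    const used = this.sent.get(issueId) ?? 0
    if (used >= this.smsCapPerIssue) return "email"
    this.sent.set(issueId, used + 1)
    return "sms"
  }
}
```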

Open Questions

  1. Vendor notification channel — SMS, email, or both? SMS costs money per message. WhatsApp Business API as an option?
  2. Tenant satisfaction survey — after resolution, should we ask for a rating? (1-5 via SMS reply)
  3. Workflow visibility — should tenants be able to text “status” and get a workflow-aware response? (Ties into multi-issue plan’s STATUS_CHECK intent)
  4. Emergency workflow — should EMERGENCY issues have a completely different workflow with shorter intervals and more aggressive escalation?
  5. Multi-vendor bidding — future: notify multiple vendors, first to accept wins?