Plan: Issue Lifecycle Workflows & Conversation Actors
Status: Draft
Created: 2026-04-15
Depends on: multi-issue-conversations.md (Phase 3+)
Feature flags: automated_reminders (future), workflow_coordination (future)
Problem Statement
Issue lifecycle management is entirely manual or cron-based today. Once an issue is created, there’s no automated orchestration — no SLA enforcement, no vendor follow-up, no tenant status updates, no escalation chains. The landlord sees metrics after the fact (`slaBreachCount` in analytics), but nothing proactively prevents breaches.
Current state:
- Issue created → email notification → nothing automated happens
- SLA targets defined (EMERGENCY: 1d, HIGH: 3d, MEDIUM: 7d, LOW: 14d) but only measured, never enforced
- Vendor management is a stub (no tables, no assignment logic, no notifications)
- Three existing crons (rent chase, doc expiry, weekly digest) but none for issues
- No follow-up reminders to tenants
- No vendor acceptance tracking
- No automated escalation on SLA breach
What this plan adds:
- Durable, multi-step issue lifecycle orchestration via Cloudflare Workflows
- SLA enforcement with automated escalation
- Vendor assignment → acceptance → completion flow
- Tenant follow-up notifications (proactive status updates)
- Conversation Actor concept for real-time per-conversation state (future)
Architecture Decision: Why Cloudflare Workflows
| Option | Fit | Verdict |
|---|---|---|
| Temporal | Needs persistent server + workers. We’re on CF Workers (stateless edge). Would require a second platform. | No |
| Vercel Workflow (AI SDK) | We’re not on Vercel. Solves LLM chaining, not multi-day orchestration. | No |
| BullMQ / pg-boss | Needs a persistent Node.js process or Redis. Not native to CF Workers. | No |
| More crons | Would work but brittle — polling every X minutes, no event-driven triggers, no durable state. Already have 3 crons. | Fallback only |
| CF Workflows | Native to our platform. Durable execution, sleep up to 365 days (free — no CPU cost), waitForEvent for vendor webhooks, automatic retries. No infra to manage. | Yes |
Deployment: Separate Worker (envo-orchestrator)
The dashboard runs on OpenNext with a 10 MB bundle limit. Workflows and Durable Objects should live in a separate Worker to avoid bundle pressure and maintain separation of concerns.
envo-dashboard (CF Worker)             envo-orchestrator (CF Worker)
├── Next.js App Router                 ├── IssueLifecycleWorkflow
├── GraphQL API                        ├── ConversationActor (future)
├── Webhook handlers                   └── Shared Hyperdrive binding
└── Cross-script bindings ──────────►
The dashboard triggers workflows via cross-script binding. The orchestrator reads/writes to the same Supabase database via its own Hyperdrive connection.
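The dashboard-side trigger depends only on the shape of the binding, so it can be exercised locally with a stub before the orchestrator exists. A minimal sketch — the interface and names here are illustrative, not the real generated binding types:

```typescript
// Illustrative shape of the cross-script Workflow binding — not the real
// generated types. The trigger depends only on `create`, so a stub suffices.
interface WorkflowBindingLike<P> {
  create(opts: { id: string; params: P }): Promise<{ id: string }>
}

interface IssueParams {
  issueId: string
  urgency: string
}

// Deterministic instance ID: one workflow per issue
async function startIssueLifecycle(
  binding: WorkflowBindingLike<IssueParams>,
  params: IssueParams,
): Promise<string> {
  const instance = await binding.create({ id: `issue-${params.issueId}`, params })
  return instance.id
}

// Exercised with an in-memory stub:
const created: string[] = []
const stub: WorkflowBindingLike<IssueParams> = {
  async create({ id }) {
    created.push(id)
    return { id }
  },
}
const id = await startIssueLifecycle(stub, { issueId: "abc123", urgency: "HIGH" })
console.log(id) // issue-abc123
```

This keeps the dashboard code testable without deploying envo-orchestrator, which matters for Phase 0's "deploy empty worker to verify binding" step.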
Part 1: Issue Lifecycle Workflow
Workflow Definition
// envo-orchestrator/src/workflows/issue-lifecycle.ts
import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from "cloudflare:workers"
// Helpers used below (sendSMS, notifyVendor, notifyLandlord, recordIssueEvent,
// getIssueStatus, tryAutoAssignVendor) are assumed to live elsewhere in envo-orchestrator/src.
interface IssueWorkflowParams {
issueId: string
organisationId: string
propertyId: string
tenantId: string | null
tenantPhone: string | null
category: string
urgency: 'LOW' | 'MEDIUM' | 'HIGH' | 'EMERGENCY'
description: string
conversationId: string | null
}
interface Env {
ISSUE_WORKFLOW: Workflow<IssueWorkflowParams>
HYPERDRIVE: Hyperdrive
}
// SLA deadlines in milliseconds
const SLA_DEADLINES: Record<string, number> = {
EMERGENCY: 1 * 24 * 60 * 60 * 1000, // 1 day
HIGH: 3 * 24 * 60 * 60 * 1000, // 3 days
MEDIUM: 7 * 24 * 60 * 60 * 1000, // 7 days
LOW: 14 * 24 * 60 * 60 * 1000, // 14 days
}
export class IssueLifecycleWorkflow extends WorkflowEntrypoint<Env, IssueWorkflowParams> {
async run(event: WorkflowEvent<IssueWorkflowParams>, step: WorkflowStep) {
const { issueId, urgency, tenantPhone, organisationId } = event.payload
const slaDeadlineMs = SLA_DEADLINES[urgency]
const createdAt = event.timestamp.getTime()
const slaDeadline = new Date(createdAt + slaDeadlineMs)
// ─── Step 1: Acknowledge to tenant (if via conversation) ───
if (tenantPhone) {
await step.do("tenant-ack", {
retries: { limit: 3, delay: "5 seconds", backoff: "exponential" },
timeout: "30 seconds",
}, async () => {
await sendSMS(tenantPhone, `Your issue has been logged (ref: ${issueId.slice(0,8)}). We'll keep you updated.`)
})
}
// ─── Step 2: Auto-assign vendor (if vendor management enabled) ───
const vendor = await step.do("auto-assign-vendor", async () => {
return await tryAutoAssignVendor(this.env, event.payload)
})
if (vendor) {
// ─── Step 3a: Notify vendor, wait for acceptance ───
await step.do("notify-vendor", {
retries: { limit: 3, delay: "10 seconds", backoff: "exponential" },
}, async () => {
await notifyVendor(vendor, event.payload)
})
// Wait for vendor to accept (or timeout)
const acceptanceTimeout = urgency === 'EMERGENCY' ? "2 hours" : "24 hours"
try {
const acceptance = await step.waitForEvent("vendor-acceptance", {
type: "vendor-accepted",
timeout: acceptanceTimeout,
})
await step.do("record-acceptance", async () => {
await recordIssueEvent(this.env, issueId, 'VENDOR_ACCEPTED', {
vendorId: vendor.id,
eta: acceptance.payload?.eta,
})
})
// Notify tenant of vendor assignment
if (tenantPhone) {
await step.do("tenant-vendor-update", async () => {
await sendSMS(tenantPhone,
`Good news — a ${vendor.specialty} has been assigned to your issue and will be in touch.`)
})
}
} catch {
// Vendor didn't respond — escalate to landlord
await step.do("vendor-timeout-escalate", async () => {
await notifyLandlord(this.env, organisationId, {
type: 'vendor_no_response',
issueId,
vendorName: vendor.name,
message: `${vendor.name} did not respond within ${acceptanceTimeout}. Please assign manually.`,
})
await recordIssueEvent(this.env, issueId, 'NOTIFICATION_SENT', {
type: 'vendor_timeout',
vendorId: vendor.id,
})
})
}
}
// ─── Step 4: SLA monitoring ───
// Sleep until 75% of SLA deadline, then check
const warningMs = Math.floor(slaDeadlineMs * 0.75)
await step.sleep("sla-warning-wait", `${Math.floor(warningMs / 1000)} seconds`)
const issueAtWarning = await step.do("check-at-sla-warning", async () => {
return await getIssueStatus(this.env, issueId)
})
if (issueAtWarning.status !== 'COMPLETED' && issueAtWarning.status !== 'CANCELLED') {
// SLA warning — notify landlord
await step.do("sla-warning-notify", async () => {
const remaining = slaDeadline.getTime() - Date.now()
const hoursLeft = Math.floor(remaining / (60 * 60 * 1000))
await notifyLandlord(this.env, organisationId, {
type: 'sla_warning',
issueId,
message: `Issue approaching SLA deadline — ${hoursLeft} hours remaining. Urgency: ${urgency}.`,
})
})
// Sleep until SLA deadline
const remainingMs = slaDeadlineMs - warningMs
await step.sleep("sla-deadline-wait", `${Math.floor(remainingMs / 1000)} seconds`)
const issueAtDeadline = await step.do("check-at-sla-deadline", async () => {
return await getIssueStatus(this.env, issueId)
})
if (issueAtDeadline.status !== 'COMPLETED' && issueAtDeadline.status !== 'CANCELLED') {
// SLA BREACHED
await step.do("sla-breach", async () => {
await notifyLandlord(this.env, organisationId, {
type: 'sla_breach',
issueId,
message: `SLA BREACHED — ${urgency} issue unresolved after ${SLA_DEADLINES[urgency] / (24*60*60*1000)} days.`,
})
await recordIssueEvent(this.env, issueId, 'NOTIFICATION_SENT', {
type: 'sla_breach',
urgency,
deadlineExceededAt: new Date().toISOString(),
})
})
// Notify tenant if they're waiting
if (tenantPhone) {
await step.do("tenant-delay-apology", async () => {
await sendSMS(tenantPhone,
`We're sorry your issue is taking longer than expected. Our team has been alerted and will prioritise this.`)
})
}
}
}
// ─── Step 5: Post-resolution follow-up ───
// Poll daily until resolved (up to 30 days)
for (let day = 0; day < 30; day++) {
const current = await step.do(`resolution-check-day-${day}`, async () => {
return await getIssueStatus(this.env, issueId)
})
if (current.status === 'COMPLETED' || current.status === 'CANCELLED') {
// Issue resolved — send satisfaction check
if (tenantPhone && current.status === 'COMPLETED') {
await step.do("tenant-resolution-notify", async () => {
await sendSMS(tenantPhone,
`Your ${event.payload.category.toLowerCase()} issue has been resolved. If you have any further problems, just reply to this number.`)
})
}
// Workflow complete
return { status: 'completed', resolvedAt: current.resolvedAt }
}
await step.sleep(`daily-wait-${day}`, "24 hours")
}
// 30 days without resolution — final escalation
await step.do("stale-issue-escalation", async () => {
await notifyLandlord(this.env, organisationId, {
type: 'stale_issue',
issueId,
message: `Issue has been open for 30 days without resolution. Please review and close or reassign.`,
})
})
return { status: 'stale', daysOpen: 30 }
}
}
Triggering the Workflow
From the dashboard’s issue creation flow:
// In lib/tenant-engine/issue-creation.ts (after createIssueFromConversation)
// OR in lib/graphql/mutations/issue.ts (after manual issue creation)
async function triggerIssueLifecycle(issue: CreatedIssue, conversation?: ConversationContext) {
// env.ISSUE_WORKFLOW is the cross-script binding to envo-orchestrator
const instance = await env.ISSUE_WORKFLOW.create({
id: `issue-${issue.id}`,
params: {
issueId: issue.id,
organisationId: issue.organisationId,
propertyId: issue.propertyId,
tenantId: issue.tenantId,
tenantPhone: conversation?.contactPhone ?? null,
category: issue.category,
urgency: issue.urgency,
description: issue.description,
conversationId: conversation?.id ?? null,
},
})
// Store workflow instance ID on the issue for status queries
await prisma.issue.update({
where: { id: issue.id },
data: { workflowInstanceId: instance.id },
})
}
Sending Events to a Running Workflow
When a vendor accepts via webhook or dashboard action:
// In a vendor acceptance webhook handler or GraphQL mutation
async function onVendorAccepted(issueId: string, vendorId: string, eta?: string) {
const issue = await prisma.issue.findUnique({ where: { id: issueId } })
if (!issue?.workflowInstanceId) return
const instance = await env.ISSUE_WORKFLOW.get(issue.workflowInstanceId)
await instance.sendEvent({
type: "vendor-accepted",
payload: { vendorId, eta },
})
}
Querying Workflow Status
From the dashboard (issue detail page):
async function getIssueWorkflowStatus(issueId: string) {
const issue = await prisma.issue.findUnique({ where: { id: issueId } })
if (!issue?.workflowInstanceId) return null
const instance = await env.ISSUE_WORKFLOW.get(issue.workflowInstanceId)
const status = await instance.status()
// status.status: "queued" | "running" | "paused" | "waiting" | "complete" | "errored" | "terminated"
return status
}
Cancelling a Workflow
When an issue is cancelled or manually resolved:
async function onIssueCancelled(issueId: string) {
const issue = await prisma.issue.findUnique({ where: { id: issueId } })
if (!issue?.workflowInstanceId) return
const instance = await env.ISSUE_WORKFLOW.get(issue.workflowInstanceId)
await instance.terminate()
}
Part 2: Wrangler Configuration
New Worker: envo-orchestrator/wrangler.jsonc
{
"$schema": "node_modules/wrangler/config-schema.json",
"name": "envo-orchestrator",
"main": "src/index.ts",
"compatibility_date": "2025-05-05",
"compatibility_flags": ["nodejs_compat"],
"workflows": [
{
"name": "issue-lifecycle",
"binding": "ISSUE_WORKFLOW",
"class_name": "IssueLifecycleWorkflow",
"limits": { "steps": 25000 }
}
],
"hyperdrive": [
{
"binding": "HYPERDRIVE",
"id": "d6e9fe00b05b44ef99eae5ecacf07892"
}
],
"vars": {
"ENVIRONMENT": "production"
}
}
Dashboard bindings: envo-dashboard/wrangler.jsonc additions
{
// ... existing config ...
"workflows": [
{
"name": "issue-lifecycle",
"binding": "ISSUE_WORKFLOW",
"class_name": "IssueLifecycleWorkflow",
"script_name": "envo-orchestrator"
}
]
}
Part 3: Database Changes
Migration: 0XX_workflow_tracking.sql
-- ============================================================
-- Workflow tracking for issue lifecycle orchestration
-- ============================================================
-- 1. Add workflow instance ID to issues
ALTER TABLE issues
ADD COLUMN workflow_instance_id TEXT;
CREATE INDEX idx_issues_workflow_instance
ON issues(workflow_instance_id)
WHERE workflow_instance_id IS NOT NULL;
-- 2. Extend IssueEventType with workflow events
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'SLA_WARNING';
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'SLA_BREACH';
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'VENDOR_TIMEOUT';
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'TENANT_NOTIFIED';
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'WORKFLOW_STARTED';
ALTER TYPE "IssueEventType" ADD VALUE IF NOT EXISTS 'WORKFLOW_COMPLETED';
-- 3. Vendor tables (prerequisite for auto-assignment)
CREATE TABLE vendors (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
organisation_id UUID NOT NULL REFERENCES organisations(id) ON DELETE CASCADE,
name TEXT NOT NULL,
email TEXT,
phone VARCHAR(20),
specialties TEXT[] NOT NULL DEFAULT '{}',
-- Specialties map to IssueCategory values:
-- PLUMBING, ELECTRICAL, HEATING, STRUCTURAL, etc.
is_active BOOLEAN NOT NULL DEFAULT true,
rating DECIMAL(2,1), -- 1.0-5.0
notes TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_vendors_org ON vendors(organisation_id);
CREATE INDEX idx_vendors_specialties ON vendors USING GIN(specialties);
CREATE INDEX idx_vendors_active ON vendors(organisation_id) WHERE is_active = true;
ALTER TABLE vendors ENABLE ROW LEVEL SECURITY;
CREATE POLICY vendors_org_access ON vendors
FOR ALL
USING (organisation_id::text = auth.jwt() ->> 'organisation_id');
-- 4. Vendor assignments (issue → vendor relationship)
CREATE TABLE vendor_assignments (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
issue_id UUID NOT NULL REFERENCES issues(id) ON DELETE CASCADE,
vendor_id UUID NOT NULL REFERENCES vendors(id) ON DELETE CASCADE,
status TEXT NOT NULL DEFAULT 'PENDING'
CHECK (status IN ('PENDING', 'ACCEPTED', 'DECLINED', 'COMPLETED', 'CANCELLED')),
assigned_at TIMESTAMPTZ NOT NULL DEFAULT now(),
accepted_at TIMESTAMPTZ,
completed_at TIMESTAMPTZ,
eta TEXT, -- Vendor-provided ETA
notes TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE (issue_id, vendor_id)
);
CREATE INDEX idx_vendor_assignments_issue ON vendor_assignments(issue_id);
CREATE INDEX idx_vendor_assignments_vendor ON vendor_assignments(vendor_id);
CREATE INDEX idx_vendor_assignments_status ON vendor_assignments(status);
ALTER TABLE vendor_assignments ENABLE ROW LEVEL SECURITY;
CREATE POLICY vendor_assignments_org_access ON vendor_assignments
FOR ALL
USING (
EXISTS (
SELECT 1 FROM issues i
WHERE i.id = vendor_assignments.issue_id
AND i.organisation_id::text = auth.jwt() ->> 'organisation_id'
)
);
Prisma additions (after pnpm db:pull)
model Vendor {
id String @id @default(uuid()) @db.Uuid
organisationId String @map("organisation_id") @db.Uuid
name String
email String?
phone String? @db.VarChar(20)
specialties String[]
isActive Boolean @default(true) @map("is_active")
rating Decimal? @db.Decimal(2, 1)
notes String?
createdAt DateTime @default(now()) @map("created_at")
updatedAt DateTime @default(now()) @updatedAt @map("updated_at")
organisation Organisation @relation(fields: [organisationId], references: [id], onDelete: Cascade)
assignments VendorAssignment[]
@@index([organisationId])
@@map("vendors")
}
model VendorAssignment {
id String @id @default(uuid()) @db.Uuid
issueId String @map("issue_id") @db.Uuid
vendorId String @map("vendor_id") @db.Uuid
status String @default("PENDING")
assignedAt DateTime @default(now()) @map("assigned_at")
acceptedAt DateTime? @map("accepted_at")
completedAt DateTime? @map("completed_at")
eta String?
notes String?
createdAt DateTime @default(now()) @map("created_at")
updatedAt DateTime @default(now()) @updatedAt @map("updated_at")
issue Issue @relation(fields: [issueId], references: [id], onDelete: Cascade)
vendor Vendor @relation(fields: [vendorId], references: [id], onDelete: Cascade)
@@unique([issueId, vendorId])
@@map("vendor_assignments")
}
Part 4: Workflow Steps Breakdown
Step-by-step timeline for a MEDIUM urgency issue
Day 0, 00:00 ─── Issue created ───────────────────────────────
│ Step: tenant-ack → SMS: "Issue logged, ref: abc12345"
│ Step: auto-assign-vendor → Find vendor by category + location
│ Step: notify-vendor → SMS/email to vendor
│ Step: waitForEvent → Waiting for "vendor-accepted"
│ (timeout: 24 hours)
│
Day 0, ~04:00 ─── Vendor accepts ─────────────────────────────
│ Step: record-acceptance → Issue event: VENDOR_ACCEPTED
│ Step: tenant-vendor-update → SMS: "A plumber has been assigned"
│ Step: sla-warning-wait → sleep 5.25 days (75% of 7-day SLA)
│
Day 5, 06:00 ─── SLA warning ─────────────────────────────────
│ Step: check-at-sla-warning → Query issue status
│ Step: sla-warning-notify → Email/SMS to landlord:
│ "42 hours remaining on SLA"
│ Step: sla-deadline-wait → sleep 1.75 days
│
Day 7, 00:00 ─── SLA deadline ────────────────────────────────
│ Step: check-at-sla-deadline → Query issue status
│ IF still open:
│ Step: sla-breach → Email/SMS to landlord: "SLA BREACHED"
│ Step: tenant-delay-apology → SMS: "Sorry for the delay"
│
Day 7+ ─── Daily resolution checks ──────────────────────────
│ Step: resolution-check-day-N → Query status
│ Step: daily-wait-N → sleep 24 hours
│ ... repeat until COMPLETED/CANCELLED or day 30
│
Day 30 ─── Stale issue escalation ────────────────────────────
│ Step: stale-issue-escalation → "Open 30 days, please review"
│ Workflow complete.
Cost for this workflow
- Steps: ~15-40 depending on resolution time (well within 25,000 limit)
- CPU time: < 100ms total (all steps are lightweight DB queries + API calls)
- Sleep time: Free (no CPU charged during hibernation)
- Estimated cost per issue: < $0.001
At 3,000 properties with ~2 issues/property/month = 6,000 workflows/month → < $6/month total.
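The sleep durations in the timeline above follow directly from the SLA constants; a quick arithmetic check for MEDIUM (pure arithmetic, no CF APIs involved):

```typescript
// Arithmetic check of the MEDIUM timeline above.
const DAY = 24 * 60 * 60 * 1000
const slaMs = 7 * DAY                       // MEDIUM SLA: 7 days
const warningMs = Math.floor(slaMs * 0.75)  // first sleep, to the 75% check
const remainingMs = slaMs - warningMs       // second sleep, warning → deadline

console.log(warningMs / DAY)                 // 5.25 → "Day 5, 06:00"
console.log(remainingMs / (60 * 60 * 1000))  // 42   → "42 hours remaining"
```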
Part 5: Auto-Assignment Logic
// envo-orchestrator/src/vendor-assignment.ts
interface AssignmentResult {
id: string
name: string
phone: string | null
email: string | null
specialty: string
}
async function tryAutoAssignVendor(
env: Env,
issue: IssueWorkflowParams
): Promise<AssignmentResult | null> {
// 1. Check if org has vendor management enabled
const orgFeatures = await getOrgFeatures(env, issue.organisationId)
if (!orgFeatures.includes('contractor_coordination')) return null
// 2. Find active vendors matching this category
const vendors = await queryVendors(env, {
organisationId: issue.organisationId,
specialty: issue.category,
isActive: true,
})
if (vendors.length === 0) return null
// 3. Pick best vendor (round-robin, or by rating, or by availability)
// Start simple: highest rated active vendor for this category
const sorted = [...vendors].sort((a, b) => (b.rating ?? 0) - (a.rating ?? 0))
const chosen = sorted[0]
// 4. Create assignment record
await createVendorAssignment(env, {
issueId: issue.issueId,
vendorId: chosen.id,
status: 'PENDING',
})
// 5. Update issue status
await updateIssueStatus(env, issue.issueId, 'VENDOR_ASSIGNED')
return chosen
}
Part 6: Conversation Actor (Concept)
What it is
A Durable Object that acts as a stateful, single-threaded actor for each conversation. Instead of the current pattern (load state from Postgres on every webhook → process → write back), the Conversation Actor holds state in memory between messages, with SQLite-backed persistence for durability.
Why consider it
| Current pattern | Conversation Actor |
|---|---|
| Every message: query Postgres for conversation + messages + issues | State in memory — zero DB reads for hot-path decisions |
| Identity gating: DB read + write per step | State transitions happen in-memory, batched to storage |
| Race conditions: two webhooks for same conversation can interleave | Single-threaded — messages processed strictly in order |
| Conversation timeout: cron-based or checked on next message | Alarm-based — DO wakes itself up after inactivity |
| No real-time presence | DO can hold WebSocket connections for live chat |
How it would work
// envo-orchestrator/src/actors/conversation-actor.ts
import { DurableObject } from "cloudflare:workers"
interface ConversationActorState {
conversationId: string
tenantPhone: string
organisationId: string
propertyId: string | null
tenantId: string | null
identityStatus: 'UNIDENTIFIED' | 'IDENTIFIED' | 'CONFIRMED' | 'ACTIVE'
issues: Array<{
issueId: string
category: string
gatheringState: string
handledBy: 'AI' | 'HUMAN'
isActive: boolean
}>
messageCount: number
lastMessageAt: number
aiRouterActive: boolean
}
export class ConversationActor extends DurableObject<Env> {
private state!: ConversationActorState
constructor(ctx: DurableObjectState, env: Env) {
super(ctx, env)
// Load state from SQLite storage on wake
this.ctx.blockConcurrencyWhile(async () => {
const stored = await this.ctx.storage.get<ConversationActorState>("state")
if (stored) {
this.state = stored
}
// If no state, this DO was just created — initialise() will be called
})
}
// Called when a new conversation starts
async initialise(params: {
conversationId: string
tenantPhone: string
organisationId: string
propertyId: string | null
tenantId: string | null
identityStatus: ConversationActorState['identityStatus']
}): Promise<void> {
this.state = {
...params,
issues: [],
messageCount: 0,
lastMessageAt: Date.now(),
aiRouterActive: true,
}
await this.persist()
// Set 24h inactivity timeout
await this.ctx.storage.setAlarm(Date.now() + 24 * 60 * 60 * 1000)
}
// Main entry point for every inbound message
async handleMessage(message: {
content: string
mediaUrl?: string
externalId: string
}): Promise<{ reply: string; issueCreated?: string }> {
this.state = {
...this.state,
messageCount: this.state.messageCount + 1,
lastMessageAt: Date.now(),
}
// Reset inactivity timer
await this.ctx.storage.setAlarm(Date.now() + 24 * 60 * 60 * 1000)
// Identity gating (in-memory — no DB hit)
if (this.state.identityStatus === 'UNIDENTIFIED') {
return this.handleUnidentified(message)
}
if (this.state.identityStatus === 'IDENTIFIED') {
return this.handleIdentified(message)
}
// Classify intent against open issues (in-memory list)
const intent = await classifyIntent({
message: message.content,
openIssues: this.state.issues,
})
// Route based on classification
// ... (same routing logic as multi-issue-conversations plan)
await this.persist()
return { reply: "..." }
}
// Update issue state (called by workflow or dashboard)
async updateIssue(issueId: string, update: Partial<ConversationActorState['issues'][0]>): Promise<void> {
this.state = {
...this.state,
issues: this.state.issues.map(i =>
i.issueId === issueId ? { ...i, ...update } : i
),
}
await this.persist()
}
// Inactivity timeout
async alarm() {
const inactiveMs = Date.now() - this.state.lastMessageAt
if (inactiveMs >= 24 * 60 * 60 * 1000) {
// Mark conversation as resolved after 24h inactivity
await this.markResolved()
}
}
async getState(): Promise<ConversationActorState> {
return this.state
}
private async persist(): Promise<void> {
await this.ctx.storage.put("state", this.state)
}
private async markResolved(): Promise<void> {
// Write back to Postgres (source of truth for dashboard/analytics)
await updateConversationInDB(this.env, this.state.conversationId, {
status: 'RESOLVED',
endedAt: new Date(),
})
}
}
Routing: Webhook → DO
// In envo-dashboard webhook handler (or envo-orchestrator fetch handler)
async function handleInboundSMS(request: Request, env: Env) {
const body = await parseFormData(request)
const phone = body.From
// Deterministic ID from phone + channel — same phone always hits same DO
const actorId = env.CONVERSATION_ACTOR.idFromName(`sms:${phone}`)
const actor = env.CONVERSATION_ACTOR.get(actorId)
const result = await actor.handleMessage({
content: body.Body,
mediaUrl: body.MediaUrl0,
externalId: body.MessageSid,
})
return new Response(twimlResponse(result.reply))
}
DO ↔ Workflow interaction
Inbound SMS
↓
ConversationActor (DO)
├── Classifies intent
├── Collects issue details (in-memory)
├── Creates issue in Postgres
└── Triggers IssueLifecycleWorkflow ──→ Workflow
├── SLA monitoring
├── Vendor assignment
├── Escalation chains
└── Calls actor.updateIssue()
to sync state back
When to build this
Not now. The Conversation Actor is a performance optimisation and architectural refinement. The current pattern (Postgres read/write per message) works fine at current scale and will work to ~3,000 properties.
Build this when:
- Message volume causes DB read latency issues (> 100ms per conversation load)
- Race conditions become a real problem (concurrent webhooks for same conversation)
- You want real-time WebSocket chat in the dashboard (DOs natively support WebSockets)
- You’re building the vendor-facing chat interface (vendors need real-time comms too)
Estimated trigger point: 1,000+ active conversations/day or when live chat becomes a requirement.
Trade-offs
| Pro | Con |
|---|---|
| Zero-latency state access (in-memory) | State lives in two places (DO + Postgres) — sync complexity |
| Guaranteed message ordering (single-threaded) | If DO evicted, must reload from Postgres — cold start latency |
| Native alarm-based timeouts | Only one alarm per DO — need manual scheduling for multiple timers |
| WebSocket support for live chat | New operational surface to monitor and debug |
| Natural fit for actor model (one actor per conversation) | 10 GB storage limit per DO (plenty, but finite) |
| Composable with Workflows via env bindings | Cannot join/query across DOs (no cross-DO queries) |
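On the single-alarm limitation: several logical timers (inactivity timeout, follow-up reminder, etc.) can be multiplexed onto one alarm by storing a name → fire-at map and always arming the alarm for the earliest entry. A sketch of the pure bookkeeping — names are illustrative, and the DO's `alarm()` would call these and re-arm with `setAlarm`:

```typescript
// Multiplexing several logical timers onto a DO's single alarm.
// The DO stores timerName → fire-at timestamp; the alarm is always armed
// for the earliest one, and alarm() dispatches whatever is due.
type Timers = Record<string, number>

// Earliest pending timestamp, or null if no timers are set
function nextAlarmAt(timers: Timers): number | null {
  const times = Object.values(timers)
  return times.length ? Math.min(...times) : null
}

// Split timers into due (fire now) and still-pending at time `now`
function dueTimers(timers: Timers, now: number): { due: string[]; pending: Timers } {
  const due: string[] = []
  const pending: Timers = {}
  for (const [name, at] of Object.entries(timers)) {
    if (at <= now) due.push(name)
    else pending[name] = at
  }
  return { due, pending }
}

// Example: inactivity timeout plus a follow-up reminder
const timers: Timers = { inactivity: 1_000, "follow-up": 5_000 }
console.log(nextAlarmAt(timers)) // 1000
const { due, pending } = dueTimers(timers, 1_000)
console.log(due)                  // ["inactivity"]
console.log(nextAlarmAt(pending)) // 5000 — re-arm the alarm for this
```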
Implementation Phases
Phase 0: Scaffolding (prerequisite)
- Create `envo-orchestrator/` directory with `wrangler.jsonc`, `tsconfig.json`, `package.json`
- Set up Hyperdrive binding (reuse existing connection pool ID)
- Add cross-script workflow binding to `envo-dashboard/wrangler.jsonc`
- Deploy empty worker to verify binding works
Phase 1: Core workflow + vendor tables
- Run `0XX_workflow_tracking.sql` migration
- Implement `IssueLifecycleWorkflow` with steps 1 (tenant ack) and 4-5 (SLA monitoring + resolution check)
- Skip vendor assignment for now (steps 2-3) — just SLA enforcement
- Wire up `triggerIssueLifecycle()` in issue creation flow
- Wire up `onIssueCancelled()` in status change mutation
- Feature gate: only trigger for orgs with `workflow_coordination` feature enabled
Phase 2: Vendor management
- Build vendor CRUD (GraphQL mutations, dashboard UI)
- Implement auto-assignment logic
- Add vendor notification (SMS/email)
- Add vendor acceptance webhook endpoint
- Wire up `waitForEvent` + `sendEvent` for vendor acceptance
- Feature gate: vendor steps only run if `contractor_coordination` enabled
Phase 3: Dashboard integration
- Show workflow status on issue detail page (timeline view)
- Show SLA countdown badge on issue cards
- Manual vendor assignment UI (with workflow event trigger)
- Workflow audit log (issue events timeline)
Phase 4: Conversation Actor (future)
- Implement `ConversationActor` DO in `envo-orchestrator`
- Migrate webhook handlers to route through DO
- Add Postgres write-back for dashboard consistency
- Add WebSocket support for live chat
- Only build when scale demands it or live chat is required
Testing Plan
Unit tests
- `tryAutoAssignVendor()` — returns null when no vendors, picks highest rated
- SLA deadline calculation — correct for all urgency levels
- Workflow step naming — deterministic, no dynamic values in names
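The SLA-deadline unit test can be sketched as plain assertions. Constants are inlined here so the snippet is self-contained; in the real test they would be imported from the workflow module:

```typescript
// SLA deadline calculation — deadlines land exactly N days after creation.
const DAY = 24 * 60 * 60 * 1000
const SLA_DEADLINES: Record<string, number> = {
  EMERGENCY: 1 * DAY,
  HIGH: 3 * DAY,
  MEDIUM: 7 * DAY,
  LOW: 14 * DAY,
}

function slaDeadline(createdAt: number, urgency: string): Date {
  return new Date(createdAt + SLA_DEADLINES[urgency])
}

const created = Date.UTC(2026, 0, 1) // 2026-01-01T00:00:00Z
console.log(slaDeadline(created, "EMERGENCY").toISOString()) // 2026-01-02T00:00:00.000Z
console.log(slaDeadline(created, "LOW").toISOString())       // 2026-01-15T00:00:00.000Z
```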
Integration tests
- Workflow triggers on issue creation
- Workflow terminates on issue cancellation
- `sendEvent` unblocks `waitForEvent` for vendor acceptance
- Vendor timeout triggers landlord notification
- SLA warning fires at 75% of deadline
- SLA breach fires at 100% of deadline
- Resolution check stops polling after COMPLETED
E2E tests
- Full flow: tenant reports issue via SMS → workflow starts → SLA monitored → resolved → tenant notified
- Vendor flow: issue created → vendor assigned → vendor accepts → visit completed → issue resolved
- Breach flow: issue created → SLA passes → landlord notified → tenant apologised to
- Cancel flow: issue created → workflow running → issue cancelled → workflow terminated
Risks & Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| OpenNext bundle + workflow bindings conflict | Can’t deploy | Separate worker (already planned) |
| Workflow step failures cascade | Tenant gets no updates | Retries with exponential backoff on all external calls |
| SMS costs increase with automated updates | Higher Twilio bill | Cap SMS notifications per issue (e.g. max 5); email for non-urgent |
| Vendor doesn’t have a phone/email | Can’t notify | Fall back to landlord notification: “Please contact vendor manually” |
| Workflow left running after manual resolution | Stale notifications | Resolution check polls DB — will catch manual resolution and exit |
| Multiple workflows for same issue | Duplicate notifications | Deterministic instance ID (issue-${id}) prevents duplicates |
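The last row assumes `create` rejects a duplicate instance ID. If so, the trigger can treat the rejection as "already running" rather than an error — a sketch with an illustrative stub binding (not the real generated types):

```typescript
// Illustrative binding shape — assumes `create` throws on a duplicate ID.
interface BindingLike {
  create(opts: { id: string }): Promise<{ id: string }>
}

// Treat "instance already exists" as success: the workflow is already running.
async function triggerOnce(binding: BindingLike, issueId: string): Promise<boolean> {
  try {
    await binding.create({ id: `issue-${issueId}` })
    return true // newly created
  } catch {
    return false // duplicate — nothing to do
  }
}

// In-memory stub that enforces uniqueness:
const seen = new Set<string>()
const stub: BindingLike = {
  async create({ id }) {
    if (seen.has(id)) throw new Error("instance already exists")
    seen.add(id)
    return { id }
  },
}
const first = await triggerOnce(stub, "abc")
const second = await triggerOnce(stub, "abc")
console.log(first, second) // true false
```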
Open Questions
- Vendor notification channel — SMS, email, or both? SMS costs money per message. WhatsApp Business API as an option?
- Tenant satisfaction survey — after resolution, should we ask for a rating? (1-5 via SMS reply)
- Workflow visibility — should tenants be able to text “status” and get a workflow-aware response? (Ties into multi-issue plan’s STATUS_CHECK intent)
- Emergency workflow — should EMERGENCY issues have a completely different workflow with shorter intervals and more aggressive escalation?
- Multi-vendor bidding — future: notify multiple vendors, first to accept wins?
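If the satisfaction survey goes ahead, parsing the tenant's reply is the fiddly bit. One possible sketch (hypothetical helper, not part of the plan): accept a bare 1-5 digit, possibly with whitespace or trailing text, and return null for anything else so the message falls through to normal routing.

```typescript
// Hypothetical parser for a "reply 1-5" satisfaction SMS.
// Returns the rating, or null so the message falls through to normal routing.
function parseRatingReply(body: string): number | null {
  const match = body.trim().match(/^([1-5])\b/)
  return match ? Number(match[1]) : null
}

console.log(parseRatingReply("4"))        // 4
console.log(parseRatingReply(" 5 stars")) // 5
console.log(parseRatingReply("great!"))   // null
console.log(parseRatingReply("10"))       // null (not a 1-5 rating)
```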