Deploying a CRM tenant

CRM ships as a per-tenant Kubernetes deployment. Each tenant gets its own pod set, its own Postgres database, its own Redis namespace, and its own RabbitMQ queue binding. The deployment scripts live in devops/gitops/tenants/{tenant}/.

This page covers the bits a CRM-specific change has to flow through — the broader infra layer (Helm charts, ArgoCD apps) is owned by the DevOps doc set.

Required env vars

Full list in environment-reference.md. The deploy-critical subset:

| Variable | Purpose | Source |
| --- | --- | --- |
| TENANT_ID | UUID; single source of truth for the bound tenant | Provisioning |
| TENANT_NAME | Permission FQN segment (tenant.{name}.crm.*) | Provisioning |
| SERVICE_NAME | Always crm; used in permission FQN building | Static |
| APP_KEY | Laravel encryption key | artisan key:generate once at provision |
| DB_* | Postgres connection | Provisioning |
| REDIS_HOST / REDIS_PORT / REDIS_PASSWORD | Cache + queue backend | Shared cluster |
| RABBITMQ_* | Identity event consumer binding | Shared cluster |
| IDENTITY_URL | OAuth + policy fetch | Static (cluster-internal URL) |
| OAUTH_CLIENT_ID | OAuth client id (per-tenant) | Identity admin |
| OAUTH_CLIENT_SECRET | OAuth client secret | Identity admin (rotate via PassportSeeder) |
| SYSTEM_API_TOKEN | Shared system-to-system token | Identity admin |

The tenant-specific values come from devops/gitops/tenants/{tenant}/tenant.config.yaml (see DevOps docs for the format).
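
The deploy-critical variables can be checked before a rollout with a small preflight script. The following is a minimal sketch, assuming a POSIX shell. The function names missing_vars and permission_prefix are illustrative and not part of the CRM codebase, and the wildcard rows (DB_*, RABBITMQ_*) are omitted because their concrete keys are not listed on this page.

```shell
# Deploy-critical variables with concrete names in the table above.
REQUIRED_VARS="TENANT_ID TENANT_NAME SERVICE_NAME APP_KEY
REDIS_HOST REDIS_PORT REDIS_PASSWORD
IDENTITY_URL OAUTH_CLIENT_ID OAUTH_CLIENT_SECRET SYSTEM_API_TOKEN"

# Print the name of every required variable that is unset or empty.
missing_vars() {
  for name in $REQUIRED_VARS; do
    # Indirect lookup; names come from the fixed list above, so eval is safe.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Build the permission FQN prefix following the tenant.{name}.crm.* pattern.
permission_prefix() {
  printf 'tenant.%s.%s\n' "$TENANT_NAME" "$SERVICE_NAME"
}
```

Running missing_vars inside the pod and failing the rollout when it prints anything would catch an incomplete tenant.config.yaml before the workers start.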

Required Postgres extensions

CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS unaccent; -- App\Support\Search (MAGAS #61)

The migration 2026_05_12_000003_enable_unaccent_extension.php attempts this idempotently, but it only succeeds if the migration role has the privilege to create extensions. Grant that privilege to the role at cluster-provision time.
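
To confirm that both extensions are present before running migrations, the installed list can be compared against the required one. This is a sketch only: missing_extensions is a hypothetical helper, and the psql call in the comment assumes connection settings are supplied through the standard PG* environment variables.

```shell
# Read installed extension names (one per line) on stdin and print the
# required ones that are absent.
missing_extensions() {
  installed=$(cat)
  for ext in uuid-ossp unaccent; do
    if ! printf '%s\n' "$installed" | grep -qxF -- "$ext"; then
      printf '%s\n' "$ext"
    fi
  done
}

# Usage against a live database:
#   psql -At -c 'SELECT extname FROM pg_extension' | missing_extensions
```

Empty output means both extensions are installed; any printed name still has to be created by a role with sufficient privilege.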

Required workers

Per tenant deployment:

| Worker | Command | Restart policy |
| --- | --- | --- |
| HTTP | php artisan serve (dev) / fpm-nginx (prod) | always |
| Queue | php artisan queue:work redis --tries=3 | always |
| Identity events | php artisan crm:consume-identity-events | always |
| Scheduled | php artisan schedule:work | always |

The identity events consumer is mandatory: without it, the local TenantUser cache, the policy cache, and the tasks/assignments cleanup don't react to identity-side changes. The Helm chart that ArgoCD deploys defines all four workers.

First-time provisioning

# 1. Create the database + extensions (typically done by infra ops).
# 2. From the CRM pod:
php artisan migrate --force
php artisan db:seed --force # base lookup data only
php artisan optimize # config + route + view cache

There's no tenant-side seeding — CRM is data-driven, not seed-driven. Identity provisions the first admin user; CRM picks them up on first login via IdentityService::getUserFromToken → lazy TenantUser create.

Upgrade flow

  1. PR merges → CI runs php artisan test, npm run test, npm run api:check, composer normalize.
  2. ArgoCD picks up the new image tag → rolling deployment per tenant.
  3. Migrations run in a pre-deploy job (Helm hook pre-upgrade), so the schema is updated before the new pods start.
  4. Workers restart automatically (deployment annotation triggers a rollout).

Migration safety

CRM migrations are forward-only by convention (the down() methods exist but are not used in production rollback flows). The risk profile per migration:

| Migration | Safe to run live? | Notes |
| --- | --- | --- |
| Adding nullable columns | Yes | zero-downtime |
| Dropping columns | Caution | rolling deploy needs the columnless app to ship first |
| Modifying ENUM types | Caution | per-driver behavior; see widen_project_member_role for a varchar fallback |
| Adding partial unique indexes | Yes | fast on small tables; a plain CREATE INDEX blocks writes while it builds, so large tables need CREATE INDEX CONCURRENTLY |
| Backfilling data | Yes (in a job) | never inline in a migration on a hot table |

The tenant_id drop on 2026-05-12 was a forward-only schema migration that touched 46 tables; a coordinated deploy was needed because both code and schema changed together.

Cache flush on deploy

php artisan cache:clear is not part of the upgrade flow — it would invalidate every tenant's policy cache at once. Instead:

  • Policy cache is per-tenant tagged (crm.policies, tenant:{tid}) and flushed by IdentityEventHandler::onPolicyUpdated when identity emits the matching event.
  • User payload cache (identity.user:{sha256(token)}) TTLs in 60s — let it drain.
  • Route + config cache: regenerated by php artisan optimize in the post-migrate hook.
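
When debugging a stale user payload, it can help to compute the cache key for a given token by hand. The sketch below assumes the digest is the lowercase hex form of SHA-256 and that sha256sum (GNU coreutils) is available. user_cache_key is a hypothetical helper, and any cache prefix that Laravel adds in front of the key is not accounted for.

```shell
# Derive the user payload cache key for a bearer token, following the
# identity.user:{sha256(token)} pattern.
user_cache_key() {
  # printf avoids the trailing newline that echo would add to the input.
  digest=$(printf '%s' "$1" | sha256sum | cut -d' ' -f1)
  printf 'identity.user:%s\n' "$digest"
}
```

The resulting key can be looked up in the tenant's Redis namespace, keeping in mind that the entry expires on its own after 60s.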

Health check

The pod's Kubernetes readiness probe should hit /api/health (the route is public). The probe should return 200 within 10s of pod start; until it does, the old pod keeps serving. If readiness fails, check:

  1. DB connection (most common: secret rotation hasn't propagated)
  2. Identity reachability (/api/health doesn't validate this, but a smoke test should)
  3. RabbitMQ broker reachability — the worker container fails to start without it
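
The 10s readiness expectation can be reproduced by hand when a pod refuses to become ready. This is a minimal sketch under stated assumptions: BASE_URL is a hypothetical variable pointing at the pod or service, and probe_health and wait_ready are illustrative names rather than existing tooling.

```shell
# Print the HTTP status code returned by the health route.
probe_health() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 2 \
    "${BASE_URL:-http://localhost:8000}/api/health"
}

# Run a probe command once per second until it prints 200 or the timeout
# (seconds, default 10) elapses. Returns 0 on success, 1 on timeout.
wait_ready() {
  probe=$1
  timeout=${2:-10}
  deadline=$(( $(date +%s) + timeout ))
  while :; do
    # A failing probe (for example, connection refused) counts as not ready.
    status=$("$probe" 2>/dev/null) || status=
    if [ "$status" = "200" ]; then
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep 1
  done
}
```

Calling wait_ready probe_health 10 from a shell with network access to the pod mirrors the probe's budget. Passing the probe as an argument keeps the polling loop reusable for an identity smoke test.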

Rollback

Forward-only. If you need to revert:

  1. Deploy the older image tag.
  2. Migration regression: requires a hand-crafted down-migration. Prefer rolling forward with a new fix migration; rolling back across a column drop is hostile to data integrity.
  3. Cache state: harmless to wipe (cache:clear); will re-warm within the next minute of traffic.

Tenant deactivation

To suspend a tenant without deleting data:

  1. Identity-side: deactivate the OAuth client. CRM returns 401 on every call once the cached token payload expires (~60s).
  2. Scale CRM deployment to 0 replicas.
  3. Postgres database stays — re-activation is "scale back to N".
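
The scale-down and scale-up steps can be sketched as a command builder. The one-namespace-per-tenant layout and the tenant-<name> naming are assumptions made for illustration; substitute whatever the Helm chart actually creates. In a GitOps setup, ArgoCD may revert a manual scale, so the replica count would normally be changed in the tenant's config instead.

```shell
# Emit the kubectl command that sets the replica count for every
# deployment in a tenant's namespace (hypothetical naming: tenant-<name>).
scale_cmd() {
  tenant=$1
  replicas=$2
  printf 'kubectl scale deployment --all --namespace=tenant-%s --replicas=%s\n' \
    "$tenant" "$replicas"
}

# Suspend:    eval "$(scale_cmd acme 0)"
# Reactivate: eval "$(scale_cmd acme 2)"
```

Printing the command first, rather than running it directly, leaves room for a review step before a tenant goes dark.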

Deleting a tenant requires a manual review (data retention, backups). Leave the tenants/{tenant}/ GitOps directory in place even after teardown, with a deactivated_at timestamp in the config; it serves as the audit record.

Where to learn more

  • intro — service surface
  • architecture — diagrams
  • multi-tenant — why per-deployment vs per-row
  • DevOps doc set — documentations/docs/devops/intro.md