Production-ready app handoff checklist for self-hosting
Use this production-ready app handoff checklist to package environments, secrets, monitoring, backups, and runbooks so operations can deploy and own your app.

What “production-ready handoff” means in practice
A production-ready handoff means ops can run the app without guessing. They can deploy a known version, confirm it’s healthy, respond to alerts, and recover from a bad release or outage. If any of that depends on one developer’s memory, the handoff isn’t done.
Treat the handoff as a package that answers one question: if the original builders disappear for a week, can ops still keep the system safe and available?
A solid package usually covers what the app does, what “healthy” looks like, how releases work (deploy, verify, roll back), where configuration lives, how secrets are handled, and how to monitor, back up, and respond to incidents.
Just as important is what it doesn’t cover. A handoff is not a promise to add features, refactor, redesign screens, or “clean things up later.” Those are separate projects with their own scope.
Before you call it complete, agree on ownership and response times. For example: ops owns uptime and deploys; the product team owns roadmap changes; the dev team provides a defined window of post-handoff support for fixes and questions.
Create a simple system inventory (what runs where)
Ops can only own what they can see. A simple one-page inventory prevents guesswork during deploys, incidents, and audits. Keep it plain English and specific.
List every running part of the system and where it lives: the backend API, web app, background workers, scheduled jobs, and how mobile apps connect. Even if iOS/Android are distributed through stores, they still depend on the same backend.
Include external services the app can’t run without. If you use PostgreSQL, a queue, object storage, or third-party APIs (payments like Stripe, messaging, email/SMS, Telegram), write down the exact service name and what it’s used for.
Capture network requirements so hosting doesn’t turn into trial and error: required domains (app, api, admin), ports and protocols, who renews TLS certificates, where DNS is managed, and any inbound/outbound allowlists.
Finally, write down expected load in real numbers: peak requests per minute, active users, typical payload sizes, current database size, and expected growth. Even rough ranges help ops set limits and alerts.
If you built with AppMaster, inventory the generated backend, web app, and integrations so ops knows what must be deployed together.
Package environment configuration (without exposing secrets)
Production setups usually fail at the boring part: config that only lives in someone’s head. Treat configuration as a deliverable. Ops should be able to see what settings exist, what differs by environment, and how to change them safely.
Start by naming every environment that exists today, even if you think it’s temporary. Most teams have dev, staging, and production, plus copies like “production-eu” or “staging-us.” Note which environment is used for release testing, data migrations, and incident drills.
Provide a single config reference that lists variable names and safe example values (never real credentials). Make placeholders obvious.
Your handoff package should include:
- A list of environments and what each one is for
- A reference of config keys (env vars or config file keys), expected type, and a non-secret example value
- Known differences between environments (feature flags, rate limits, cache sizes, email mode, logging level)
- Defaults and what happens if a key is missing
- Where config is stored and how it’s applied during deploy
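To make the last two items concrete, here is a minimal sketch of a startup config check that applies explicit defaults and fails fast on missing keys. The key names (DATABASE_URL, SMTP_HOST, LOG_LEVEL, RATE_LIMIT_PER_MIN) are placeholders, not your app's real keys; swap in whatever your app actually reads.

```python
import os
import sys

# Required keys: the app should refuse to start without them.
REQUIRED = ["DATABASE_URL", "SMTP_HOST"]

# Optional keys with documented, non-secret defaults.
DEFAULTS = {
    "LOG_LEVEL": "info",          # dev/staging often override to "debug"
    "RATE_LIMIT_PER_MIN": "600",
}

def load_config() -> dict:
    missing = [key for key in REQUIRED if not os.environ.get(key)]
    if missing:
        # Fail fast with a clear message instead of crashing later at runtime.
        sys.exit(f"Missing required config keys: {', '.join(missing)}")
    config = {key: os.environ[key] for key in REQUIRED}
    for key, default in DEFAULTS.items():
        config[key] = os.environ.get(key, default)
    return config

if __name__ == "__main__":
    print(load_config())
```

A check like this doubles as living documentation: the required keys, the optional keys, and their defaults are all in one place ops can read.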
Add a simple change process. For example: request in a ticket, review by the service owner, apply in staging first, then promote to production in a scheduled window with a rollback plan if error rates rise.
If you’re exporting and self-hosting an AppMaster app, keep the same rule: ship a clean, documented set of config keys alongside the generated source so ops can run it consistently across environments.
Secrets and credentials: storage, rotation, and access
Secrets are the fastest way a clean handoff turns into a security incident. The goal is straightforward: ops should know every secret the app needs, where it’s stored, who can read it, and how to change it without downtime.
Start with a short secrets list that ops can scan in a minute. For each item, note what it unlocks (database, SMTP, Stripe, JWT signing key), where it lives (vault, cloud secret store, Kubernetes Secret, encrypted file), and who owns rotation.
Write rotation steps like a recipe, not a policy. Include the exact order, how long the old secret must stay valid, and the one check that proves it worked.
Rotation checklist (example)
Use this pattern for each secret:
- Create the new secret value and store it in the approved secret manager.
- Deploy the config change so the app uses the new value.
- Verify: logins, payments, or API calls succeed and error rates stay normal.
- Revoke the old secret and confirm it no longer works.
- Record the rotation date, who did it, and the next due date.
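The verify step is where rotations usually go wrong, so it helps to have it scripted rather than remembered. Here is a minimal sketch for a rotated PostgreSQL password, assuming psql is installed and the new credentials are supplied through hypothetical NEW_DB_* environment variables:

```python
import os
import subprocess
import sys

# Hypothetical env vars holding the NEW credentials (never hard-code them).
host = os.environ["NEW_DB_HOST"]
user = os.environ["NEW_DB_USER"]
password = os.environ["NEW_DB_PASSWORD"]
dbname = os.environ.get("NEW_DB_NAME", "app")

# Run a trivial query with the new credentials before revoking the old ones.
result = subprocess.run(
    ["psql", "-h", host, "-U", user, "-d", dbname, "-c", "SELECT 1;"],
    env={**os.environ, "PGPASSWORD": password},
    capture_output=True,
    text=True,
)

if result.returncode != 0:
    sys.exit(f"New credentials failed; do NOT revoke the old secret yet:\n{result.stderr}")
print("New credentials work; safe to revoke the old secret.")
```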
Be explicit about encryption expectations. Secrets should be encrypted at rest in the secret store and protected in transit (TLS) between the app and its dependencies. Never put secrets in source control, build artifacts, or shared docs.
Define break-glass access. If an outage blocks normal access, specify who can approve emergency access, how long it lasts, and what must be audited afterward.
Deployment package: artifacts, versions, and rollback
Ops can only own what they can reproduce. A good deployment package makes three questions easy to answer: what exactly are we running, how do we deploy it again, and how do we get back quickly if something breaks?
Include a clear “bill of materials” for the build. Name the artifact type and how to verify it, not just where it lives:
- Artifact details: container image name/tag (or binary/package name), app version, build date, checksum
- Source reference: release tag or commit hash used to build, plus any build flags that matter
- Supported targets: VM, containers (Docker), or Kubernetes, and which one is the recommended default
- Deployment steps: prerequisites (runtime, database, storage), exact order, and typical deploy time
- Database migrations: how they run (auto on startup or manual), where logs are, and how to confirm success
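Verifying the artifact itself can be a tiny script ops runs before every deploy. A minimal sketch, assuming a hypothetical artifact filename and that the recorded SHA-256 value comes from the release record:

```python
import hashlib
import sys

# Hypothetical names; the recorded checksum should come from the release record.
ARTIFACT = "app-backend-v1.8.2.tar.gz"
RECORDED_SHA256 = "<value from the release record>"

digest = hashlib.sha256()
with open(ARTIFACT, "rb") as f:
    # Read in chunks so large artifacts don't need to fit in memory.
    for chunk in iter(lambda: f.read(1024 * 1024), b""):
        digest.update(chunk)

if digest.hexdigest() != RECORDED_SHA256:
    sys.exit(f"Checksum mismatch for {ARTIFACT}; do not deploy this artifact.")
print(f"{ARTIFACT} matches the recorded checksum.")
```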
Add one small, concrete example. For instance: “Deploy v1.8.2 by updating the image tag, running migrations, then restarting web workers. If health checks fail within 10 minutes, revert to v1.8.1 and stop the migration job.”
Rollback, without guesswork
A rollback plan should read like instructions you can follow at 2 a.m. It should state:
- The signal that triggers rollback (error rate, failed health check, broken login)
- The last known good version and where it’s stored
- Whether database changes are reversible, and what to do if they aren’t
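The trigger is easier to follow at 2 a.m. when it is automated. A minimal sketch of a post-deploy gate that watches a hypothetical /health endpoint and exits non-zero as the agreed rollback signal:

```python
import sys
import time
import urllib.error
import urllib.request

HEALTH_URL = "https://app.example.com/health"  # hypothetical endpoint
DEADLINE_SECONDS = 10 * 60                     # e.g. the 10-minute window above
CHECK_INTERVAL = 15                            # seconds between probes

deadline = time.monotonic() + DEADLINE_SECONDS
healthy_in_a_row = 0

while time.monotonic() < deadline:
    try:
        with urllib.request.urlopen(HEALTH_URL, timeout=5) as resp:
            healthy_in_a_row = healthy_in_a_row + 1 if resp.status == 200 else 0
    except (urllib.error.URLError, TimeoutError):
        healthy_in_a_row = 0
    if healthy_in_a_row >= 4:  # roughly one healthy minute in a row
        print("New version looks healthy; no rollback needed.")
        sys.exit(0)
    time.sleep(CHECK_INTERVAL)

# A non-zero exit is the agreed signal to roll back to the last known good version.
sys.exit("Health checks never stabilized; roll back now.")
```

In a pipeline, the non-zero exit can trigger the rollback job automatically; run by hand, it simply tells the on-call person what to do next.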
If the app is built with AppMaster and exported as source code for self-hosting, include the generated code version, build instructions, and the runtime expectations so ops can rebuild the same release later.
Monitoring and alerting: what to measure and when to page
A handoff isn’t complete until ops can see what the app is doing and gets warned before users complain.
Agree on what logs must exist and where they land (file, syslog, log platform). Make sure logs are time-synced and include a request or correlation ID so incidents are traceable end to end.
You typically want application logs (key events and failures), error logs (stack traces and failed jobs), access logs (requests and status codes), audit logs (admin actions and exports), and infrastructure logs (restarts, node pressure, disk issues).
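A correlation ID only helps if every log line carries it in a machine-readable form. Here is a minimal sketch of structured JSON logging with hypothetical field names; the point is that ops can search one ID across services:

```python
import json
import logging
import uuid

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("app")

def handle_request(path: str) -> None:
    # In a real service the ID would come from an incoming header
    # (for example X-Request-ID) and be passed to every downstream call.
    request_id = str(uuid.uuid4())
    log.info(json.dumps({
        "event": "request_started",
        "request_id": request_id,
        "path": path,
    }))
    # ... handle the request ...
    log.info(json.dumps({
        "event": "request_finished",
        "request_id": request_id,
        "status": 200,
    }))

handle_request("/login")
```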
Next, define a small set of metrics that reflect user impact and system health. If you only pick five: latency (p95/p99), error rate, saturation (CPU/memory/disk), queue depth, and availability checks from outside your network.
Alert rules should be unambiguous: what triggers, severity (page vs ticket), who is on-call, and when to escalate. Add a “known good” dashboard snapshot and a short note describing what normal looks like (typical latency range, expected error rate, usual queue depth). That context prevents noisy alerts and helps new responders act quickly.
Backups and recovery: make restores repeatable
Backups aren’t something you “have.” They’re something you can restore from, on demand.
Write down the exact scope: database, file storage (uploads, reports, invoices), and the pieces people forget, like config that isn’t in code and the encryption keys needed to read protected data.
Keep targets in plain terms. RPO (recovery point objective) is how much data you can lose (for example, 15 minutes). RTO (recovery time objective) is how long you can be down (for example, 1 hour). Pick numbers the business agrees to, because they drive cost and effort.
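A cheap way to keep the RPO honest is a scheduled check that fails when the newest backup is older than the agreed window. A minimal sketch, assuming backups land as *.dump files in a hypothetical local directory (adapt the path for object storage):

```python
import sys
import time
from pathlib import Path

BACKUP_DIR = Path("/var/backups/app")  # hypothetical location
RPO_SECONDS = 15 * 60                  # the agreed 15-minute example

dumps = sorted(BACKUP_DIR.glob("*.dump"), key=lambda p: p.stat().st_mtime)
if not dumps:
    sys.exit("No backups found at all; page someone.")

age = time.time() - dumps[-1].stat().st_mtime
if age > RPO_SECONDS:
    sys.exit(f"Newest backup is {age / 60:.0f} minutes old; RPO is breached.")
print(f"Newest backup is {age / 60:.0f} minutes old; within RPO.")
```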
Include:
- What is backed up, where it’s stored, and retention
- Who can run backups and restores, and how access is approved
- A step-by-step restore procedure with verification checks
- Where restore logs live, and what “success” looks like
- Common failure modes (wrong key, missing bucket, schema mismatch) and the fix
If you export and self-host an AppMaster-built app, include PostgreSQL restore steps plus any external storage buckets and the keys used for encrypted fields.
Schedule a restore drill. Record the time, what broke, and what you changed so the next restore is faster and less stressful.
Runbooks and on-call: how ops handles real incidents
A handoff isn’t real until someone can get paged at 2 a.m. and fix the problem without guessing. Runbooks turn tribal knowledge into steps an on-call person can follow.
Start with the incidents you expect to happen first: total outage, slow requests, and a deployment that breaks something. Keep each runbook short. Put the fastest checks at the top so responders get signal in minutes.
What a good runbook contains
Keep the structure consistent so it’s scannable under pressure:
- What users see and how to confirm it (example: error rate above X%, checkout failing)
- First checks (service status, recent deploy, dependency health, disk/CPU, database connections)
- Next checks (which logs to open, key dashboards, recent config changes, queue depth)
- Decision points (when to roll back, when to scale, when to disable a feature)
- Escalation (who owns the app, who owns infra, and when to page each)
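The first checks can even be packaged as a small script the on-call person runs before opening anything else. A minimal sketch, assuming a systemd host and hypothetical endpoint, service, and mount names:

```python
import shutil
import subprocess
import urllib.request

# Hypothetical values; replace with your real endpoint, service, and data mount.
HEALTH_URL = "https://app.example.com/health"
SERVICE = "app-backend"
DATA_MOUNT = "/var/lib/app"

# 1. Is the app answering at all?
try:
    with urllib.request.urlopen(HEALTH_URL, timeout=5) as resp:
        print(f"health check: HTTP {resp.status}")
except Exception as exc:  # broad on purpose: this is a quick triage script
    print(f"health check: FAILED ({exc})")

# 2. Is the service process running (systemd hosts)?
state = subprocess.run(["systemctl", "is-active", SERVICE],
                       capture_output=True, text=True)
print(f"service {SERVICE}: {state.stdout.strip() or state.stderr.strip()}")

# 3. Is the disk about to fill up?
usage = shutil.disk_usage(DATA_MOUNT)
print(f"disk {DATA_MOUNT}: {usage.used / usage.total:.0%} used")
```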
If the app was exported or self-hosted from AppMaster, include where the generated services run, how to restart them safely, and which config values are expected per environment.
After the incident: capture the right facts
Keep a short post-incident checklist. Record the timeline, what changed last, the exact error messages, affected users, and what action fixed it. Then update the runbook while the details are still fresh.
Access control and permissions: who can do what
Ops can only own a system if it’s clear who can touch what, and how access is tracked.
Write down the roles you actually use. For many teams, these are enough:
- Deployer: deploy approved versions and trigger rollback
- DB admin: run schema changes and restore backups
- Read-only: view dashboards, logs, and configs without editing
- Incident commander: approve emergency actions during an outage
Document the “door policy” in plain steps: who grants access, where it’s granted (SSO, cloud IAM, database users, CI/CD, admin panels), who can revoke it, and how you confirm it’s removed during offboarding.
Don’t forget non-human access. List every service account and token used by jobs, integrations, and monitoring, with a least-privilege note for each (for example, “can read from bucket X only”). If you export AppMaster source code for self-hosting, include which env vars or config files define these identities, but never paste secret values into the handoff doc.
Also set audit log expectations: what must be logged (login, deploy, config change, DB admin actions), who can read logs, retention, where logs are stored, and how to request logs during an incident or review.
Security and compliance basics (plain English)
Security notes should be readable by non-specialists, but specific enough that ops can act. Add a one-page summary that answers: what data do we store, where does it live, and who can access it?
Start with data types: customer profiles, support tickets, payment metadata, files. Call out sensitive categories like PII (names, emails, phone numbers), credentials, and any regulated data your company cares about. If you exported source code for self-hosting (including from AppMaster), note where that data ends up in the database and which services can read it.
Then write retention and deletion rules in practical terms. Say what you keep, for how long, and how deletion works (soft delete vs hard delete, delayed purge, backups). If you have legal holds or audit needs, note who approves exceptions.
Logs often leak more than databases. Be clear about where PII can appear (access logs, error logs, analytics events) and how you reduce or mask it. If a field must never be logged, state that rule.
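A "never log this field" rule is easier to enforce as code than as a sentence in a doc. Here is a minimal sketch that masks obvious email addresses and phone numbers before a message reaches the logs; the patterns are deliberately rough and only an illustration:

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s()-]{7,}\d")

def redact(text: str) -> str:
    """Replace obvious PII with placeholders before the value is logged."""
    text = EMAIL.sub("[email]", text)
    text = PHONE.sub("[phone]", text)
    return text

print(redact("Password reset requested by jane.doe@example.com from +1 (555) 010-2368"))
# -> Password reset requested by [email] from [phone]
```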
Keep approvals explicit:
- Authentication changes need a named approver.
- Payment-related changes (Stripe keys, webhook endpoints, refund logic) need a named approver.
- Role and permission model changes need a named approver.
- Security patching windows and emergency change rules are written down.
If you can add only one extra thing, add an evidence note: where audit logs are kept and how to export them when someone asks for proof.
Example handoff scenario: ops takes ownership in one week
Ops is taking over a customer portal built by a small product team and moving it onto new self-hosted infrastructure. The goal isn’t just “it runs,” but “ops can run it without calling the builders.”
What the week looks like
Day 1: Ops does a clean first deploy in a new environment using only the handoff package. The app comes up, but login fails because an environment variable for the email provider is missing. That gets added to the env template, and the deploy is repeated until it works from scratch.
Day 2: The first alert fires on purpose. Ops triggers a controlled failure (stop one service or block outbound email) and confirms: metrics show the issue, alerts reach the right channel, and the message says what to do next.
Day 3: A token expires in the payment sandbox. Because the credentials location and rotation steps are documented, ops replaces it without guessing or exposing secrets.
Day 4: DNS cutover. A bad record points to the old IP, and the portal seems down for some users. Ops uses the runbook to verify DNS, TLS, and health checks in the right order.
Day 5: First backup restore test. Ops restores to a fresh database and proves the portal can load real data.
What “done” looks like after 1 week
The app has run for 7 days with no mystery fixes, one successful restore, clear alerts, and a repeatable deploy that ops can do alone.
Common handoff mistakes that cause late-night incidents
The fastest way to turn a calm handoff into a 2 a.m. fire is to assume “we told ops everything” is the same as “ops can run it without us.”
Common failure patterns after a self-hosting handoff include secrets shared in spreadsheets or chat, rollbacks that depend on a developer, backups that exist but restores were never tested, alerts that fire all day because thresholds were never tuned, and environment details that only live in someone’s head (ports, DNS names, cron schedules, cloud permissions).
Example: you export source code from AppMaster for self-hosting, and the first deploy works. Two weeks later a config change breaks logins. If secrets were passed around in chat and rollback needs the original builder, ops loses hours just to get back to “working yesterday.”
Quick checks before you say “handoff complete”
Before you close the ticket, run a short fresh-start drill. Give the handoff package to one ops engineer and a clean environment (new VM, new Kubernetes namespace, or a blank cloud project). If they can deploy, observe, and recover the app within a set time (for example, 2 hours), you’re close.
Use these checks:
- Rebuild and deploy from scratch using only the packaged artifacts, config docs, and runbooks (including a rollback).
- Verify every secret lives in the agreed place, and that rotation steps are written and tested.
- Open dashboards and confirm they answer basic questions: is it up, is it slow, is it erroring, is it running out of resources?
- Trigger one safe test alert to confirm paging routes, owners, and quiet hours behave as expected.
- Perform a real restore test into a separate environment, then document the exact steps and expected result.
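For the test-alert check, a throwaway script keeps the drill safe and repeatable. A minimal sketch that posts to a hypothetical alerting webhook; point it at a test route, never the real paging channel:

```python
import json
import urllib.request

# Hypothetical webhook URL for a test alert route.
WEBHOOK_URL = "https://alerts.example.com/hooks/ops-test"

payload = json.dumps({
    "severity": "test",
    "summary": "Handoff drill: please acknowledge in the ops channel",
    "source": "handoff-checklist",
}).encode()

req = urllib.request.Request(
    WEBHOOK_URL,
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req, timeout=10) as resp:
    print(f"alert webhook answered with HTTP {resp.status}")
```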
If you’re exporting generated source code for self-hosting, also confirm ops knows where build inputs, versions, and release tags are recorded so future releases stay repeatable.
Next steps: finalize ownership and keep the package current
Run one final walkthrough with the people who will carry the pager. Treat it like a rehearsal. Prove deploy, rollback, restore, and alerting all work with the exact package you’re handing over.
A final walkthrough usually covers: deploy to a test environment and then production using the same steps, roll back to the previous version and verify the app still works, restore from backup into a clean environment and validate a simple real check (login, create a record, send a message), trigger one safe test alert, and confirm where to find logs and dashboards during an incident.
Make ownership explicit. Assign a named owner for each runbook (deploy, incident, restore) and for each alert route (primary on-call, backup, after-hours behavior). If nobody owns an alert, it will either be ignored or wake up the wrong person.
Write a short Day 2 plan so ops knows what to improve after the first week: tuning thresholds, checking costs, cleaning up old artifacts, and reviewing access. Keep it small and time-boxed.
If you built with AppMaster (appmaster.io), include the exported source code or the exact deployment target details (cloud, regions, build settings, required services) so ops can reproduce the app without relying on the original project workspace. Set a simple cadence to update the package whenever requirements change, so runbooks don’t drift from reality.
FAQ
What does a production-ready handoff actually mean?
A production-ready handoff means ops can deploy a known version, confirm it’s healthy, respond to alerts, and recover from failures without relying on a specific developer’s memory. If a week without the builders would put uptime at risk, the handoff isn’t finished.
What belongs in the system inventory?
Start with a one-page system inventory that lists every running component and where it lives: API, web app, workers, scheduled jobs, database, storage, and required third-party services. Add the domains, ports, DNS/TLS ownership, and rough expected load so ops can operate without guessing.
How should environment configuration be documented?
Provide a single config reference that lists every config key, its type, and a safe example value, plus what differs between dev/staging/prod. Keep real credentials out of it, and document where config is stored and how it’s applied during deploys so changes are repeatable.
How should secrets and credentials be handed over?
Create a short secrets list that states what each secret is for, where it is stored, who can read it, and who owns rotation. Write rotation steps like a checklist with one clear verification step, and include a break-glass process for emergencies with audit expectations.
What does ops need in the deployment package?
Ops needs to know exactly what is running and how to reproduce it: artifact name/tag, version, build date, checksum, and the source reference used to build it. Include the recommended deployment target, the deploy order, expected deploy time, and how database migrations run and are verified.
What makes a rollback plan usable?
Define the trigger signals (like failing health checks or elevated error rate), the last known good version, and the exact steps to revert quickly. Call out whether database changes are reversible, and what the safe fallback is if they aren’t, so rollback doesn’t turn into improvisation.
What should monitoring and alerting cover?
Pick a small set of metrics that reflect user impact: latency, error rate, resource saturation, queue depth, and an external uptime check. Make alerts explicit about severity, who gets paged, and what “normal” looks like, and ensure logs are time-synced and traceable with a correlation ID.
How should backups and restores be documented?
Document what is backed up, where it’s stored, retention, and who can run restores. Include a step-by-step restore procedure with a verification check, and schedule a restore drill; backups only matter if a restore can be performed on demand in a clean environment.
What makes a good runbook?
Keep runbooks short and consistent: symptoms, first checks, next checks, decision points, and escalation. Focus on the likely first incidents (outage, slowness, bad deploy), and update the runbook right after incidents while details are fresh so knowledge doesn’t drift.
How should access control and permissions be documented?
Write down the roles you actually use (deployer, DB admin, read-only, incident commander), how access is granted and revoked, and what must be logged for auditing. Don’t forget service accounts and tokens; list what they can access and where their identities are configured without including secret values.


