Implementation guide · Six gates · Telephony, compliance, CRM, cost
AI Receptionist Implementation Checklist (2026): Six Gates to Production
Last verified: June 12, 2026. This article is not legal or compliance advice; consult qualified counsel for your specific workflows.
The Six Gates You Must Pass
Telephony gate
Inbound transport works, audio is stable, and disconnect/retry behavior is defined.
Voice orchestration gate
The call flow is explicit and testable, not 'prompt magic.' An engineer, QA tester, or compliance reviewer can trace it.
Booking gate
The system only says 'booked' after a successful tool action — availability check passed, API call succeeded, appointment ID verified.
Integration gate
CRM/calendar field mapping, timezone handling, and idempotency are verified. Failure modes are defined and handled.
Compliance and safety gate
Recordings and transcripts are classified, secured, and retained appropriately. HIPAA applicability is decided. Breach response runbook exists.
Cost and acceptance gate
The full bill is modeled — telephony, inference, transcription, storage, human fallback — and actual performance is measured before rollout.
Requirements and Scope: The Part Teams Skip
Most implementations that fail in production do so because the team never defined four things up front.
Call scope
In-hours, after-hours, overflow, voicemail fallback, emergency handling. Define all of it before selecting a vendor.
Booking policy
What counts as confirmed versus tentative? A booking is confirmed only when the availability check passed, the API call succeeded, the appointment ID was verified, the timezone is correct, and the spoken confirmation matched the real record.
Data scope
What do you store: recordings, transcripts, summaries? How long do you keep them? Who can access them? Are logs redacted? Where is the data stored? Is it encrypted at rest and in transit?
Escalation policy
What forces a human handoff? List every trigger: low confidence, missing fields, ambiguous request, calendar API failure, scheduling conflict, policy exception, fraud suspicion.
Telephony and Voice Transport Checklist
Your AI receptionist starts at the audio edge. If this layer is weak, everything above it looks worse than it is.
Verify session behavior
- Call connect time and media start time
- Turn-taking and barge-in behavior (can caller interrupt mid-response?)
- Silence handling and timeout behavior
- Disconnect reason mapping
- Retry policy for transient failures
Run audio quality tests with real-world conditions
- Caller interrupting the agent mid-response
- Background TV or radio noise
- Speakerphone audio
- Accents and fast speech
- Long pauses and repeated “hello?” prompts
Measure: barge-in success rate, whether the agent loops during silence, whether it recovers from bad audio, whether the caller gets stuck repeating themselves.
Define hard session limits in writing
- Maximum call duration
- Maximum silence window before abandonment
- Maximum failed tool-call retries
- When to abandon and transfer, with what fallback message
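One way to put these limits in writing is to keep them in a single config object that the call loop checks on every turn. The sketch below is illustrative only: the class name, field names, and default values are assumptions, not recommendations from any vendor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionLimits:
    # All defaults are placeholder values; set them from your own call data.
    max_call_seconds: int = 600
    max_silence_seconds: int = 8
    max_tool_retries: int = 2
    fallback_message: str = "Let me connect you with a team member."


def abandon_reason(limits: SessionLimits, call_seconds: float,
                   silence_seconds: float, failed_retries: int) -> Optional[str]:
    """Return why the call should be transferred, or None to keep going."""
    if call_seconds >= limits.max_call_seconds:
        return "max_call_duration"
    if silence_seconds >= limits.max_silence_seconds:
        return "max_silence"
    if failed_retries >= limits.max_tool_retries:
        return "max_tool_retries"
    return None
```

When `abandon_reason` returns a value, the agent would speak `fallback_message`, transfer the call, and log the reason alongside the disconnect mapping.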
Voice + LLM Orchestration: Use an Explicit, Testable Flow
Do not rely on “the model will figure it out.” The flow should be obvious enough that an engineer, QA tester, or compliance reviewer can trace it without asking you to explain it.
A solid receptionist flow:
- Greet the caller
- Identify the reason for the call
- Collect only required fields
- Verify identity when needed
- Check availability
- Create or update the appointment (tool action — must confirm success)
- Confirm details back to the caller (date, time, timezone, location, provider)
- Offer next steps
- Escalate if anything is ambiguous or any step fails
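The flow above can be written down as a small state machine so that a reviewer can read the transitions directly. This is a minimal sketch, not a production orchestrator: the step names are assumptions, and optional steps such as identity verification are modeled as steps that simply succeed when they are not needed.

```python
from enum import Enum


class Step(Enum):
    GREET = "greet"
    IDENTIFY_REASON = "identify_reason"
    COLLECT_FIELDS = "collect_fields"
    VERIFY_IDENTITY = "verify_identity"
    CHECK_AVAILABILITY = "check_availability"
    WRITE_APPOINTMENT = "write_appointment"
    CONFIRM_DETAILS = "confirm_details"
    NEXT_STEPS = "next_steps"
    DONE = "done"
    ESCALATE = "escalate"


# The happy path mirrors the order of the list above.
HAPPY_PATH = [
    Step.GREET, Step.IDENTIFY_REASON, Step.COLLECT_FIELDS,
    Step.VERIFY_IDENTITY, Step.CHECK_AVAILABILITY, Step.WRITE_APPOINTMENT,
    Step.CONFIRM_DETAILS, Step.NEXT_STEPS, Step.DONE,
]


def next_step(current: Step, ok: bool) -> Step:
    """Advance one step on success; any failure or ambiguity escalates."""
    if current in (Step.DONE, Step.ESCALATE):
        return current  # terminal states do not transition
    if not ok:
        return Step.ESCALATE
    return HAPPY_PATH[HAPPY_PATH.index(current) + 1]
```

Because every transition goes through one function, QA can enumerate all (step, outcome) pairs and assert the expected destination for each.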
Define explicit tool actions
- lookup_customer()
- check_availability()
- create_appointment()
- reschedule_appointment()
- cancel_appointment()
- escalate_to_human()
Each tool needs input validation and deterministic return values. The agent should not say “you’re booked” until the tool confirms success.
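A hedged sketch of what "deterministic return values" can look like: every tool returns the same result shape, and the spoken reply is derived from that result rather than from the model's guess. The `ToolResult` type, the error codes, and the stand-in `create_appointment` body are all hypothetical; a real implementation would call your calendar API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolResult:
    ok: bool
    appointment_id: Optional[str] = None
    error_code: Optional[str] = None


def create_appointment(customer_phone: str, slot_open: bool) -> ToolResult:
    """Hypothetical stand-in for a real calendar write."""
    # Input validation happens before any external call.
    if not customer_phone.strip():
        return ToolResult(ok=False, error_code="missing_phone")
    if not slot_open:
        return ToolResult(ok=False, error_code="slot_taken")
    return ToolResult(ok=True, appointment_id="appt_123")


def agent_reply(result: ToolResult) -> str:
    """Only a successful result with an appointment ID unlocks 'booked'."""
    if result.ok and result.appointment_id:
        return "You're booked."
    return "I couldn't confirm that booking, so I'm connecting you with a team member."
```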
Add traceability — log the full chain
- call_id
- session_id
- tool_call_id
- booking_id
- error_code
- escalation_reason
Also define what gets redacted, how long logs are retained, who can access them, and how transcripts are exported for QA.
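As an illustration, a trace event can be emitted as one JSON line per tool call, with free-text fields redacted before they are written. The field names follow the list above; the `note` field, the phone-number pattern, and the redaction token are assumptions, and a real deployment would need a broader redaction policy than a single regex.

```python
import json
import re
from typing import Optional

# Rough pattern for phone-like digit runs; illustrative, not exhaustive.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Mask phone-like sequences before the text reaches the log."""
    return PHONE_RE.sub("[REDACTED_PHONE]", text)


def trace_event(call_id: str, session_id: str, tool_call_id: str,
                booking_id: Optional[str] = None,
                error_code: Optional[str] = None,
                escalation_reason: Optional[str] = None,
                note: str = "") -> str:
    """Serialize one step of the chain as a single JSON log line."""
    record = {
        "call_id": call_id,
        "session_id": session_id,
        "tool_call_id": tool_call_id,
        "booking_id": booking_id,
        "error_code": error_code,
        "escalation_reason": escalation_reason,
        "note": redact(note),
    }
    return json.dumps(record, sort_keys=True)
```

Keeping every identifier in every record, even when it is null, makes it possible to join a booking back to the exact call and tool invocation that produced it.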
Booking Correctness: No “Close Enough”
The biggest production failure mode is a receptionist that sounds confident while being wrong. Your checklist has to block that.
| Requirement | What to verify |
|---|---|
| Slot availability | The slot is still open at the moment of booking, not just when availability was first checked |
| API success | The create/update call succeeded and returned a valid appointment ID |
| Record match | The final returned record matches what was spoken to the caller |
| Timezone correctness | Appointment is stored with the correct timezone; daylight saving is handled |
| Idempotency | Duplicate retries cannot create a second appointment in the same slot |
| Confirmation script | Always confirms: date, time, timezone, location, provider, service type. Asks for caller confirmation. Resolves or escalates any dispute. |
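The record-match and timezone rows can be checked mechanically by comparing what was spoken against what the calendar returned. The sketch below assumes both sides are plain dictionaries with an ISO 8601 `start` that carries a UTC offset; the field names are hypothetical. Comparing timezone-aware datetimes compares the actual instant, so 09:00 at -04:00 matches 13:00 at +00:00.

```python
from datetime import datetime

MATCH_FIELDS = ("location", "provider", "service_type")


def record_mismatches(spoken: dict, record: dict) -> list:
    """Return names of fields that differ; an empty list means a match."""
    mismatches = [f for f in MATCH_FIELDS if spoken.get(f) != record.get(f)]

    spoken_start = datetime.fromisoformat(spoken["start"])
    record_start = datetime.fromisoformat(record["start"])
    if spoken_start.tzinfo is None or record_start.tzinfo is None:
        # A time without an offset cannot be verified, so treat it as a failure.
        mismatches.append("start_missing_offset")
    elif spoken_start != record_start:
        mismatches.append("start")
    return mismatches
```

Any non-empty result should block the "you're booked" confirmation and route the call to resolution or escalation.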
Handoff triggers — escalate to human when:
- Intent confidence is low
- Required fields are missing
- Calendar API fails
- Scheduling conflict cannot be resolved
- Caller is asking for a policy exception
- Caller’s request is outside supported scope
What the human must receive at handoff:
- Call summary and caller details
- Extracted intent
- Availability search result
- Failed tool call data
- Reason for escalation
- Conversation ID for lookup
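The triggers and the handoff payload can be encoded together so that no escalation happens without a reason and a complete package for the human. This is a sketch under stated assumptions: the `CallState` fields, the reason strings, and the 0.7 confidence threshold are all hypothetical and should be tuned against real calls.

```python
from dataclasses import dataclass, field
from typing import Optional

MIN_INTENT_CONFIDENCE = 0.7  # assumed threshold, not a recommended value


@dataclass
class CallState:
    conversation_id: str
    caller: dict
    summary: str
    intent: Optional[str] = None
    intent_confidence: float = 0.0
    missing_fields: list = field(default_factory=list)
    availability_result: Optional[dict] = None
    failed_tool_call: Optional[dict] = None
    unresolved_conflict: bool = False
    policy_exception: bool = False
    in_scope: bool = True


def escalation_reason(state: CallState) -> Optional[str]:
    """Return the first trigger that fires, or None if the agent may continue."""
    if not state.in_scope:
        return "out_of_scope"
    if state.policy_exception:
        return "policy_exception"
    if state.failed_tool_call is not None:
        return "calendar_api_failure"
    if state.unresolved_conflict:
        return "scheduling_conflict"
    if state.missing_fields:
        return "missing_fields"
    if state.intent_confidence < MIN_INTENT_CONFIDENCE:
        return "low_intent_confidence"
    return None


def build_handoff(state: CallState) -> Optional[dict]:
    """Assemble everything the human needs; None means no handoff."""
    reason = escalation_reason(state)
    if reason is None:
        return None
    return {
        "conversation_id": state.conversation_id,
        "summary": state.summary,
        "caller": state.caller,
        "intent": state.intent,
        "availability_result": state.availability_result,
        "failed_tool_call": state.failed_tool_call,
        "reason": reason,
    }
```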
CRM and Calendar Integration Checklist
Field mapping table (build this for every integration)
| Caller data | Destination field | What happens if missing |
|---|---|---|
| Caller name | Contact name | Verify default behavior |
| Caller phone | Phone field | Verify default behavior |
| Appointment type | Service code / event type | Verify default behavior |
| Preferred time | Normalized start/end time | Verify timezone handling |
| Location | Location ID | Verify routing logic |
| Provider | Provider ID | Verify escalation if unresolvable |
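A mapping table like this translates directly into code that refuses to guess when a field is missing. In the sketch below, the destination field names are hypothetical, and location and provider are assumed to have already been resolved to IDs; time normalization is left out to keep the example short.

```python
# Source field -> destination field. Destination names are placeholders
# for whatever your CRM or calendar actually expects.
FIELD_MAP = {
    "caller_name": "contact_name",
    "caller_phone": "phone",
    "appointment_type": "service_code",
    "location": "location_id",
    "provider": "provider_id",
}


def map_caller_data(caller: dict):
    """Return (payload, missing) so the caller of this function decides
    whether to ask again or escalate, instead of writing defaults silently."""
    payload, missing = {}, []
    for source, dest in FIELD_MAP.items():
        value = caller.get(source)
        if value is None or str(value).strip() == "":
            missing.append(source)
        else:
            payload[dest] = value
    return payload, missing
```

Returning the missing list explicitly makes the "what happens if missing" column testable: each entry should lead to a re-prompt or an escalation, never to a silent default.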
Test these failure modes explicitly
- CRM is down — what does the caller hear?
- Calendar API times out — does it retry or escalate?
- Credentials expire mid-session
- The slot is already booked by the time the API call fires
- Appointment write partially succeeds
A good receptionist does not improvise in failure. It escalates clearly and leaves a record.
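One possible shape for "escalate clearly and leave a record" is a wrapper that retries transient errors a bounded number of times and otherwise returns an explicit escalation result. The exception classes and the result dictionary are assumptions for illustration. Retrying is only safe as written for reads such as availability checks; retried writes also need an idempotency key.

```python
import time


class TransientError(Exception):
    """Timeouts and similar failures that may succeed on retry."""


class PermanentError(Exception):
    """Failures that retrying will not fix, such as expired credentials."""


def call_with_retry(operation, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run operation(); retry transient errors with backoff, then escalate."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return {"status": "ok", "value": operation(), "attempts": attempt}
        except TransientError as exc:
            last_error = str(exc)
            if attempt < max_attempts:
                sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
        except PermanentError as exc:
            return {"status": "escalate", "error": str(exc), "attempts": attempt}
    return {"status": "escalate", "error": last_error, "attempts": max_attempts}
```

The returned dictionary doubles as the record: it can be logged as-is and attached to the handoff so the human sees what failed and how many times it was tried.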
Compliance and Safety Controls
HIPAA applicability decision
HIPAA may apply if you are a covered entity or business associate and the data includes protected health information (PHI). If HIPAA applies, the HHS HIPAA Breach Notification Rule requires notification without unreasonable delay and in no case later than 60 calendar days after discovery of a breach of unsecured PHI. Your checklist should answer: Is PHI present? Are transcripts and recordings stored? Are they secured? Who is responsible for breach response?
Recording and transcript governance
- Encryption in transit and at rest
- Retention settings defined and enforced
- PHI redaction or masking options configured
- Access control: who can see recordings and transcripts?
- Audit logs: who accessed what and when?
- Deletion verification process
FCC voice cloning and outbound risk
The FCC’s February 8, 2024 ruling confirmed that AI-generated and voice-cloned voices in covered robocalls are treated as artificial or prerecorded voices under the TCPA. If your AI receptionist has any outbound layer, review that leg for consent requirements. Even inbound-only deployments should configure impersonation detection and disclosure behavior.
Cost and Acceptance Gate
The acceptance gate is where you prove the system works before full rollout. Measure actual performance against defined acceptance criteria with a full cost model.
| Cost component | What to model |
|---|---|
| Telephony minutes | Per-minute or per-call rate, overage, concurrency limits |
| Speech inference | GPT-Realtime-2: $32/1M audio input tokens, $64/1M audio output tokens (OpenAI, 2026) |
| Transcription | GPT-Realtime-Whisper: $0.017/minute (OpenAI, 2026) — billed separately |
| Storage | Recordings, transcripts, analytics data |
| Human fallback | Staff time for escalated calls; escalation rate × average handle time × hourly cost |
| Integration and QA | Setup, ongoing QA, retraining, compliance review time |
Source: OpenAI API pricing page, accessed June 12, 2026. Verify current rates at OpenAI before budgeting.
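The table can be turned into a rough monthly estimate. The sketch below uses the rates quoted in the table as defaults; everything else is an input you must measure yourself. In particular, the tokens-per-minute parameters are placeholders with no default because they are not published figures here, and storage and integration costs are omitted for brevity.

```python
def monthly_cost(calls, avg_minutes, input_tokens_per_min, output_tokens_per_min,
                 telephony_per_min, escalation_rate, handle_minutes, hourly_cost,
                 input_rate_per_m=32.0, output_rate_per_m=64.0,
                 transcription_per_min=0.017):
    """Estimate monthly spend in dollars, broken down by component."""
    minutes = calls * avg_minutes
    inference = (minutes * input_tokens_per_min / 1_000_000 * input_rate_per_m
                 + minutes * output_tokens_per_min / 1_000_000 * output_rate_per_m)
    transcription = minutes * transcription_per_min
    telephony = minutes * telephony_per_min
    # Escalated calls x average handle time (hours) x hourly staff cost.
    human = calls * escalation_rate * (handle_minutes / 60) * hourly_cost
    return {
        "inference": inference,
        "transcription": transcription,
        "telephony": telephony,
        "human_fallback": human,
        "total": inference + transcription + telephony + human,
    }
```

Running this with your own measured call volume, token usage, and escalation rate usually shows which component dominates, which is often human fallback rather than inference.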
See also: AI Receptionist ROI Calculator for a full framework for modeling returns against these costs.
FAQ
- What are the six gates an AI receptionist must pass before going live?
- Telephony gate (inbound transport works, audio is stable, disconnect/retry behavior is defined), voice orchestration gate (the call flow is explicit and testable), booking gate (the system only says 'booked' after a successful tool action), integration gate (CRM/calendar field mapping, timezone handling, and idempotency are verified), compliance and safety gate (recordings/transcripts are classified, secured, and retained appropriately), and cost and acceptance gate (you model the full bill and measure actual performance before rollout).
- What does 'booking is atomic' mean in an AI receptionist context?
- A booking is only valid if: the availability check passed, the appointment create/update call succeeded, the returned appointment ID was verified, the timezone is correct, and the confirmation message matched the actual record. If any step fails, the call is not a success — it is a handoff or a tentative hold. The agent should never say 'you're booked' until the tool confirms success.
- How does OpenAI Realtime pricing affect AI receptionist cost modeling?
- As published in OpenAI's 2026 voice-model pricing update, GPT-Realtime-2 is $32 per 1M audio input tokens and $64 per 1M audio output tokens, plus GPT-Realtime-Whisper at $0.017 per minute. OpenAI also documents that Realtime costs accrue when a Response is created, and that transcription uses a different model and billing path. This is why you cannot compare AI receptionist tools only on their headline monthly price — underlying model costs are different for voice vs text.
- Why does timezone handling get its own checklist item?
- Timezone bugs cause real-world misbookings. Your integration must infer timezone only if policy allows it, store timezone explicitly, convert consistently before booking, verify daylight saving time behavior, and read back the final time in the caller's local context. Treat timezone handling as a release blocker, not a detail.
- When does HIPAA apply to an AI receptionist deployment?
- HIPAA may apply if you are a covered entity or business associate and the call data includes protected health information (PHI). If the system stores transcripts or recordings that contain PHI, the HHS HIPAA Breach Notification Rule says notification must be made without unreasonable delay and in no case later than 60 calendar days after discovery of a breach of unsecured PHI. If HIPAA applies, get a business associate agreement (BAA) from your vendor before any PHI touches the system.
- What is an idempotency key and why do AI receptionist bookings need one?
- An idempotency key is a unique identifier attached to a write operation that ensures repeated calls with the same key produce only one result. AI receptionist bookings need idempotency keys so that retry logic — when a confirmation fails and the system retries the booking — does not create a second appointment in the same slot. Define the key format, where keys are stored, how long they live, and what happens if the first write succeeded but confirmation failed.
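A minimal sketch of the idea, assuming an in-memory store and a key derived from the call, provider, and start time. The key format, the `BookingStore` class, and the ID scheme are hypothetical; a real system would persist keys with an expiry and enforce uniqueness in the database or through the calendar API.

```python
import hashlib


def idempotency_key(call_id: str, provider_id: str, start_iso: str) -> str:
    """Derive a stable key so every retry of the same booking reuses it."""
    raw = f"{call_id}|{provider_id}|{start_iso}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class BookingStore:
    """In-memory stand-in for a calendar backend that honors idempotency keys."""

    def __init__(self):
        self._by_key = {}
        self._next_id = 1

    def create(self, key: str, details: dict) -> dict:
        # A repeated key returns the original record instead of writing again.
        if key in self._by_key:
            return self._by_key[key]
        record = {"appointment_id": f"appt_{self._next_id}", **details}
        self._next_id += 1
        self._by_key[key] = record
        return record

    def count(self) -> int:
        return len(self._by_key)
```

If the first write succeeded but the confirmation was lost, retrying with the same key returns the existing appointment, which the agent can then read back to the caller.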