Telecom · 2,500 employees · 8M subscribers · 10-week pilot + 6-month rollout · Outcome Partnership

A voice agent that resolves 63% of first-level calls before they ever reach a human operator

Grew per-operator monthly call capacity 2.3× while raising CSAT from 4.1/5 to 4.3/5.

Industry
Telecommunications — Call center automation
Engagement model
Outcome Partnership
Company size
2,500 employees · 8M subscribers
Engagement length
10-week pilot + 6-month rollout
// 01

Starting point: Challenge

One of Turkey's top 4 telecom operators: 3.2M calls per month, 2,500 operators, and an average wait time of 4 min 12 sec. A reported 71% of calls were repetitive queries (plan info, billing lookup, basic tech support), low-value work that still landed on human operators.

Prior consultancy attempts: two separate IVR upgrade projects and a chatbot built by another firm. All three shipped; none produced a measurable CSAT improvement. The leadership team had reached the point of 'AI projects die' fatigue.

The client's only ask when starting with us was: 'This time, no slides — we want a working system. If you fail, you refund the engagement fee.'

// 02

Approach

  1. step 01

    Week 1-2 — Call taxonomy

    We clustered 12,000 transcripts from the previous 90 days. Finding: the share of repetitive queries was 58%, not the claimed 71%; the discrepancy had stayed hidden because operators weren't selecting call categories. This single insight redefined the pilot scope.

  2. step 02

    Week 3 — Voice stack selection

    Three pipelines were tested (OpenAI Realtime, Deepgram + GPT-4o + ElevenLabs, AWS-only), with latency, hallucination, and cost evaluated together. Selected: Deepgram Nova-3 + Anthropic Claude (function calling) + ElevenLabs Turbo. Latency budget: <800 ms end-to-end.

  3. step 03

    Week 4-7 — Production pilot

    The agent was built for the first 4 scenarios (billing query, plan change, modem reset, quota display). The pilot started at 50,000 calls/day (15% of total). Additional context (call history, billing state) was fed in via RAG from the client's data lake.

  4. step 04

    Week 8-10 — KPI tuning

    In the first 2 weeks, the agent escalated to humans at a 41% rate, higher than expected. Based on error analysis, we improved the user sentiment (anger detection) model and added prompt-level guardrails. The escalation rate dropped to 22%.

  5. step 05

    Week 11+ — Rollout

    The pilot succeeded: 60% of all incoming calls were routed to the agent. Human operators focused on the remaining 40% plus the agent's escalations. Internal training and an on-call rotation were set up, and we trained the client's 3-person AI team during handover.
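
The escalation tuning described in step 04 can be pictured with a small sketch. This is not the production logic: the `CallSignals` shape, the `shouldEscalate` rule, and the 0.7 anger threshold are hypothetical, chosen only to show how a sentiment score and a guardrail flag might combine into a hand-off decision, and how an escalation rate is computed from call outcomes.

```typescript
// Hypothetical per-call signals; field names are illustrative, not the client's schema.
interface CallSignals {
  angerScore: number;         // 0..1, from a sentiment (anger detection) model
  guardrailViolated: boolean; // true if a draft reply failed a policy check
  inScope: boolean;           // true if the request matches a supported scenario
}

// Assumed threshold, for illustration only.
const ANGER_THRESHOLD = 0.7;

// Hand off to a human when the caller is angry, a reply was out of policy,
// or the request falls outside the supported scenarios.
function shouldEscalate(s: CallSignals): boolean {
  return s.angerScore >= ANGER_THRESHOLD || s.guardrailViolated || !s.inScope;
}

// Escalation rate = escalated calls / total calls (0 for an empty sample).
function escalationRate(calls: CallSignals[]): number {
  if (calls.length === 0) return 0;
  const escalated = calls.filter(shouldEscalate).length;
  return escalated / calls.length;
}

// Made-up sample: two of the four calls trigger a hand-off.
const sampleCalls: CallSignals[] = [
  { angerScore: 0.1, guardrailViolated: false, inScope: true },
  { angerScore: 0.9, guardrailViolated: false, inScope: true },
  { angerScore: 0.2, guardrailViolated: true, inScope: true },
  { angerScore: 0.3, guardrailViolated: false, inScope: true },
];
const sampleEscalationRate = escalationRate(sampleCalls);
```

Tracking a rate like this per scenario, rather than only in aggregate, is one way an error analysis could locate where hand-offs cluster.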

// 03

Results

63%
First-level resolution rate
calls the agent resolved without escalation

2.3×
Per-operator call capacity
operators now only handle complex cases

4.1 → 4.3
CSAT score
out of 5, measured at t+90 days

8.4 mo
Payback period
(engagement + cloud + LLM cost) / monthly operational savings

0.4%
Agent hallucination rate
out-of-policy information, measured with production guardrails

770 ms
Average end-to-end latency
user finishes speaking → agent starts responding
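
The payback figure can be reproduced with a few lines of arithmetic. The sketch below reads the definition as total cost (engagement, cloud, and LLM) divided by monthly operational savings. The `PaybackInputs` shape and every amount in `placeholderInputs` are assumptions: they are not the client's actual figures and were picked only so the result lands on the reported 8.4 months.

```typescript
// Hypothetical input shape; currency units are arbitrary but must be consistent.
interface PaybackInputs {
  engagementCost: number;  // consulting engagement fee
  cloudCost: number;       // cloud infrastructure cost over the period
  llmCost: number;         // LLM usage cost over the period
  monthlySavings: number;  // monthly operational savings
}

// Payback period in months = total cost / monthly savings.
function paybackMonths(p: PaybackInputs): number {
  if (p.monthlySavings <= 0) {
    throw new Error("monthlySavings must be positive");
  }
  return (p.engagementCost + p.cloudCost + p.llmCost) / p.monthlySavings;
}

// Placeholder amounts, NOT taken from the case study.
const placeholderInputs: PaybackInputs = {
  engagementCost: 600000,
  cloudCost: 150000,
  llmCost: 90000,
  monthlySavings: 100000,
};
const placeholderPayback = paybackMonths(placeholderInputs);
```

With these placeholder inputs the total cost is 840,000 and the payback works out to 8.4 months; halving the monthly savings would double the payback period.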

"For the first time, we didn't have to ask 'is this AI thing actually working?' — the numbers were in front of us every day."

Client side — Director of Operations

// 04

Technology stack

  • Deepgram Nova-3
  • Anthropic Claude (Sonnet 4.5)
  • ElevenLabs Turbo
  • LiveKit SFU
  • PostgreSQL + pgvector
  • GCP Vertex AI Search
  • Datadog APM
  • Custom orchestrator (TypeScript)
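
To make the <800 ms end-to-end budget concrete against a three-stage stack like this one (speech-to-text, LLM, text-to-speech), here is a minimal sketch of a per-turn budget check. The `StageTiming` and `BudgetReport` types, the `checkLatencyBudget` function, and the per-stage split in `exampleTurn` are all hypothetical; only the 800 ms budget comes from the case study. Summing stage times is also a simplification, since streaming pipelines usually overlap stages.

```typescript
// Stages of one conversational turn in a cascaded voice pipeline.
type Stage = "stt" | "llm" | "tts";

interface StageTiming {
  stage: Stage;
  ms: number; // delay this stage adds before the caller hears a reply
}

interface BudgetReport {
  totalMs: number;
  withinBudget: boolean;
  slowest: Stage | null; // null when no timings were recorded
}

// Budget stated in the case study: under 800 ms end-to-end.
const LATENCY_BUDGET_MS = 800;

// Sum the stage delays, compare against the budget, and report the slowest stage.
function checkLatencyBudget(
  timings: StageTiming[],
  budgetMs: number = LATENCY_BUDGET_MS
): BudgetReport {
  let totalMs = 0;
  let slowest: StageTiming | null = null;
  for (const t of timings) {
    totalMs += t.ms;
    if (slowest === null || t.ms > slowest.ms) {
      slowest = t;
    }
  }
  return {
    totalMs,
    withinBudget: totalMs < budgetMs,
    slowest: slowest ? slowest.stage : null,
  };
}

// Assumed split for illustration; the case study reports only the overall average.
const exampleTurn: StageTiming[] = [
  { stage: "stt", ms: 180 },
  { stage: "llm", ms: 420 },
  { stage: "tts", ms: 170 },
];
const exampleReport = checkLatencyBudget(exampleTurn);
```

For this made-up turn the total is 770 ms, which fits the budget, and the LLM stage is the slowest. A report like this could be emitted per turn to an APM tool to watch for budget regressions.
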
// 05

What came next

Outbound calls + 4 new scenarios

After the pilot's success, the engagement converted to an Outcome Partnership. The same stack now powers outbound sales calls and campaign notifications. The engagement remains active on a yearly basis.
