AI ARCHITECT · VOICE AI / Practice est. 2012 / Hyderabad · 17.385°N · 78.486°E
Advisory open · Q3 2026
AI Architect · Voice AI Specialist

Santosh Varma.

I design the voice gateways, retrieval pipelines, and agent runtimes underneath the AI products Fortune 500s actually deploy.

14+ Years shipping
F500 Production deployments
$10M Programme supervised
<800 ms Voice round-trip budget
Trusted by
Kore.ai · Storable · Savor Brands · Skillsoft · OPSTECH · TGQuest
BFSI · Healthcare · Retail · Telecom

AI Architecture is the work between the model and the user. Everything demos easily and almost nothing ships.

The model is the easy part. Retrieval that actually retrieves. Voice that hears the user over a bad line. Agents that recover when a tool returns garbage. Infrastructure that survives Tuesday. That is the work — and it is invariably the part the team didn't plan for.

I've been an architect for fourteen years, eight of them around AI. I design the layer that decides whether a Fortune 500 voice agent answers in 600 ms or three seconds; whether a RAG pipeline cites the right paragraph or invents one; whether an autonomous agent can be trusted with a transaction.

Santosh Varma · Hyderabad · 2026

The Specialty · Voice AI

Voice is the unforgiving layer.

Text gives you a second to think. Voice does not. The user notices every dropped frame, every missed barge-in, every 200 ms of latency. Voice is where AI architecture stops being theory.

I design real-time voice gateways for enterprise scale — ASR/TTS over SIP and WebRTC, turn-taking tuned to a call centre's cadence, LLM routing across hosted and private endpoints, and the eval rigs that catch regressions before they reach a real conversation.

Round-trip: <800 ms target
Transport: SIP · WebRTC
Stack: ASR · TTS · barge-in · NLU
Scale: Fortune 500 concurrency
02 The rest of the stack
i.

RAG & Hybrid Retrieval

Retrieval that actually retrieves. Semantic + lexical, re-rankers, grounded-response eval at every step.

Q → Hybrid → Rerank → LLM → Answer
Milvus · Weaviate · Vespa
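
One common way to merge semantic and lexical result lists before the re-ranker is reciprocal rank fusion (RRF). The following is a minimal sketch under stated assumptions, not the pipeline described on this page: the function name, the sample document IDs, and the constant k=60 are all illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list, best first.

    Each document scores 1 / (k + rank) for every list it appears in;
    k=60 is a conventional default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and a keyword search.
semantic = ["doc_a", "doc_b", "doc_c"]
lexical = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic, lexical])
print(fused)
```

Documents that appear in both lists rise to the top, and the fused list is what a re-ranker would then reorder before the LLM sees it.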
ii.

Agentic Systems

Autonomous agents that get real work done — tool-use, policy auth, recoverable state, runtime observability.

Agent: Plan → Act → Verify → Recover
Tool-use · Policy · Replay
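
The plan, act, verify, recover cycle can be sketched as a small loop. This is a minimal illustration, assuming recovery means a bounded retry; `run_agent`, `StepResult`, `flaky_tool`, and `looks_valid` are hypothetical names, and a production runtime would add policy checks, persisted state, and tracing.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    step: str
    output: str
    attempts: int


def run_agent(plan, act, verify, max_retries=2):
    """Run each planned step: act, verify the output, retry on failure."""
    results = []
    for step in plan:
        for attempt in range(1, max_retries + 2):
            output = act(step, attempt)
            if verify(step, output):
                results.append(StepResult(step, output, attempt))
                break
            # Recover: in this sketch, recovery is simply another attempt.
        else:
            # Every attempt failed verification; stop instead of continuing blind.
            raise RuntimeError(f"step {step!r} failed after {max_retries + 1} attempts")
    return results


def flaky_tool(step, attempt):
    """Hypothetical tool that returns garbage on the first 'lookup' call."""
    if step == "lookup" and attempt == 1:
        return ""
    return f"{step}:ok"


def looks_valid(step, output):
    """Hypothetical verifier: accept only outputs ending in ':ok'."""
    return output.endswith(":ok")


results = run_agent(["lookup", "confirm"], flaky_tool, looks_valid)
for r in results:
    print(r.step, r.attempts, r.output)
```

Keeping verification separate from the tool call is what lets the loop notice a garbage response and retry before any later step depends on it.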
iii.

AI Platform Engineering

The unsexy infra that lets the interesting work survive: routing, isolation, cost/latency budgets, eval pipelines.

Product · Agents · RAG · Model Routing · Eval & Obs
Multi-tenant · SLO · Eval-CI
03 Selected work
N° 001
2026 — present
AI Architect

Kore.ai

Reference architectures for voice AI, RAG, and agentic workflows across the XO Platform — AgentAssist, SmartAssist, SearchAssist, Voice AI.

Voice Gateways · RAG · Agents
N° 002
2024 — 2026
Staff Software Engineer

Storable Auctions

Modernised a legacy monolith into event-driven microservices. Redesigned the Sitelink (FMS) integration end-to-end.

Event-driven · Microservices · FMS
N° 003
2021 — 2024
Senior Fullstack

Savor Brands

Multi-party white-label commerce platform & a video-based patient-doctor telehealth product. Honolulu.

eCommerce · WebRTC · $10M scope
N° 004
Side projects
Architect & Builder

OPSTECH · TGQuest

Autonomous micro-agents for workflow automation. Hybrid retrieval, task orchestration, context-driven execution.

Agents · RAG · Automation
04 Ask the agent

Interrogate the work.
In your own words.

An agent grounded on my résumé, capabilities, and case studies. Recruiters: poke at the experience. Founders: ask how I'd approach the problem you're stuck on. Replies stay short and concrete.

  • Right architect for enterprise voice AI?
  • RAG over 10M proprietary docs — how?
  • Three bullets — why hire you?
  • Most under-rated production lesson?
sv.agent · v1.0 · llama-3.3-70b

Ask anything about the work — voice AI, retrieval, agents, the boring infra. I'll keep it short and concrete.

05 Five principles
  1. Honesty over hype.

    If the model can't answer it well, the system says so. Eval is a first-class part of the architecture, not a metric you add later.

  2. Latency is a feature.

    A 200 ms answer that's "good enough" beats a perfect answer at three seconds, especially over voice. I design to the budget first.

  3. The boring infra wins.

    Routing, isolation, observability, recovery. The interesting work only stays interesting because the boring parts are bulletproof.

  4. Build for replacement.

    Every model, every provider, every chunking strategy gets swapped within 12 months. I architect for the swap, not the commitment.

  5. Mentor the next layer.

    A platform that depends on one person knowing how it works is a platform with a single point of failure. I leave teams stronger.

06 Receipts

Architecting AI-enabled voice gateways on the Kore.ai platform — ASR, TTS, SIP/WebRTC telephony, real-time streaming, and LLM-driven intent resolution powering enterprise voice agents.

— Current role · Kore.ai · 2026

Implemented process improvements resulting in a 20% reduction in issue resolution time.

— Mahathi Software

Led evaluation and enhancement of software/hardware interfaces, resulting in a 30% improvement in system performance and reliability.