Founded in August 2023 by two researchers who left IIT Madras to build AI for India's 1.4 billion people, Sarvam AI has in 30 months become the government's chosen partner for India's first sovereign foundational LLM — unveiled at the India AI Impact Summit in February 2026 to an audience that included Sundar Pichai and Sam Altman. The question is no longer whether India can build its own AI. The question is whether Sarvam can make it matter.
**Company:** Sarvam AI (Sarvamai)
**Focus:** Foundational AI, large language models, multimodal AI, speech technology
**Founded:** August 2023, Bengaluru, by Vivek Raghavan and Pratyush Kumar (ex-AI4Bharat, IIT Madras)
**Funding:** $41M+ Series A from Lightspeed India, Peak XV Partners, and Khosla Ventures (Dec 2023), plus government compute support
**Milestone:** Selected by the Indian government (April 2025) to build India's first sovereign LLM under the IndiaAI Mission
**Products:** Sarvam-30B, Sarvam-105B (flagship "Indus"), Saaras V3 (speech recognition), Bulbul V3 (text-to-speech), Sarvam Vision, Kaze smart glasses
**Team:** ~200–300 (small, research-first team by design)
**Compute:** 4,096 NVIDIA H100 SXM GPUs via Yotta Data Services, roughly ₹99 crore in GPU subsidies under the IndiaAI Mission
India has 1.4 billion people speaking 22 scheduled languages. The global LLM giants — OpenAI, Google, Anthropic, Meta — are excellent at English and adequate at a handful of European languages. For a farmer in Maharashtra asking a government AI about crop subsidies in Marathi, or a patient in Tamil Nadu trying to understand their prescription in Tamil, these models are essentially useless. Sarvam AI is building the AI infrastructure layer for a country that the existing large models weren't built for. The February 2026 unveiling of Sarvam-30B and Sarvam-105B at India's AI Impact Summit — trained from scratch, not fine-tuned from Western models, on sovereign Indian compute — is the moment India formally entered the foundational LLM race.
Sarvam AI is India's most significant AI bet — not in terms of funding or valuation, but in terms of strategic ambition. In two and a half years, a team of researchers who left India's premier AI research lab has built a full-stack AI company: foundational language models trained on Indian data, a speech technology platform that supports 22 Indian languages, a vision-language model that outperforms Google and OpenAI on multilingual document understanding, and a physical product — the Kaze smart glasses — that makes AI accessible in audio form for users who don't type. The company received India's highest honour in AI when the government selected it from 67 applicants to build India's sovereign LLM in April 2025.
The foundational models — Sarvam-30B and Sarvam-105B — were unveiled at Bharat Mandapam in New Delhi at the India AI Impact Summit in February 2026. Both were trained from scratch on Indian language data using government-provided compute infrastructure. "Indus," the beta consumer version of Sarvam-105B, launched simultaneously on iOS, Android, and web. It supports reasoning in Indian languages at a standard that, until this moment, no domestic company had achieved.
Vivek Raghavan and Pratyush Kumar came from AI4Bharat — India's premier open-source AI research initiative, based at IIT Madras — where they spent years building multilingual speech and text datasets for Indian languages. The research was excellent. The impact was limited. Academic outputs and papers don't reach the farmer in Maharashtra who needs AI to understand the government scheme he just read about.
In August 2023 they founded Sarvam AI with the conviction that building a company — with commercial incentives, a product focus, and VC capital — was the only way to take what AI4Bharat had learned and deploy it at population scale. Within five months of founding, they had raised $41 million — one of the fastest Series A raises in Indian AI history — from Lightspeed India, Peak XV Partners, and Khosla Ventures. The speed of the raise reflected both the founders' credibility and the timing: the post-ChatGPT world had just made everyone understand that LLMs were real, and India had nobody building one.
"Sovereignty matters much more in AI than building the biggest models. India needs AI that understands India — not AI that translates English for India."
— Vivek Raghavan, Co-founder, Sarvam AI (India AI Impact Summit, February 2026)

The global AI revolution is, in practice, an English-language revolution with multilingual features bolted on. ChatGPT, Gemini, Claude, and Llama are trained predominantly on internet text, which skews heavily toward English, European languages, and Chinese. Indian languages — Hindi, Bengali, Tamil, Telugu, Marathi, Kannada, Gujarati, and 15 more scheduled languages — are represented marginally in these training datasets. The result is that AI assistants in Indian languages make factual errors, miss cultural context, misunderstand idiomatic expressions, and fail at the document-understanding tasks (reading Aadhaar forms, PAN cards, land records, crop insurance applications) that represent the practical AI use cases for hundreds of millions of Indians.
Sarvam's insight is that building AI for India is not a translation problem — it's a training data problem. You need large amounts of high-quality Indian language text and speech data, and you need models trained on that data from the start, not fine-tuned from English models. That is what the sovereign LLM programme, backed by government compute and Sarvam's own data collection work, is designed to produce.
**Sarvam-30B:** 30-billion-parameter model with a Mixture-of-Experts architecture, activating ~1B parameters per token. Context window: 32,000 tokens; trained on 16 trillion tokens. Designed for real-time conversational use: lower latency, cost-efficient. Benchmarks competitively against Gemma 27B, Mistral Small 24B, and Qwen 30B on reasoning and coding.
**Sarvam-105B:** 105-billion-parameter flagship, activating ~9B parameters per token, with a 128K context window. Built for enterprise-grade complex reasoning. The consumer beta, "Indus", was released simultaneously on app stores. Targets agentic workflows and multi-step reasoning in Indian languages.
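The "active parameters per token" figures come from Mixture-of-Experts routing: a small gating network picks only a few expert sub-networks for each token, so per-token compute scales with the activated experts rather than the full parameter count. Below is a minimal, illustrative sketch of top-k gating — not Sarvam's actual architecture; the expert count and top-k values are made up:

```python
import math
import random

random.seed(0)

NUM_EXPERTS = 8  # total experts in one MoE layer (illustrative)
TOP_K = 2        # experts actually run for each token

def softmax(xs):
    """Numerically stable softmax over a list of gating logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(gate_logits):
    """Pick the top-k experts for one token and renormalise their weights."""
    probs = softmax(gate_logits)
    top = sorted(range(NUM_EXPERTS), key=lambda i: probs[i], reverse=True)[:TOP_K]
    total = sum(probs[i] for i in top)
    return [(i, probs[i] / total) for i in top]

# Fake gating scores for one token: only TOP_K of NUM_EXPERTS experts run,
# which is why a "30B total" model can cost roughly "1B active" per token.
gate_logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
chosen = route(gate_logits)
assert len(chosen) == TOP_K
assert abs(sum(w for _, w in chosen) - 1.0) < 1e-9
print(f"experts run per token: {TOP_K}/{NUM_EXPERTS}")
```

In a real MoE layer the chosen experts' outputs are combined with these renormalised weights; the sketch only shows the selection step that makes active parameters a small fraction of total parameters.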
**Sarvam Vision:** 3-billion-parameter model for document understanding that reads mixed-script text, scanned forms, and handwriting across 22 Indian languages. Scored 84.3% on olmOCR-Bench, beating Gemini 3 Pro (80.2%) and ChatGPT (69.8%), with 93.28% real-world accuracy on OmniDocBench. These benchmark results are not incremental improvements — they are decisive leadership in the specific use case that matters most for Indian government services.
**Bulbul V3:** Advanced text-to-speech with 35+ voices across 11 Indian languages. **Saaras V3:** Automatic speech recognition for 22 Indian languages — the widest language coverage of any production ASR system in India. Together they form a complete voice layer that enables AI interfaces for India's non-typing, voice-first users.
**Kaze smart glasses:** Unveiled in early 2026, Kaze is the first made-in-India AI wearable — it listens, understands, and captures what users see in real time, and supports 10+ Indian languages for voice-based interaction and real-time translation. Launch is planned for May 2026. Kaze positions Sarvam as a full-stack AI company, not just an API provider: think of it as India's answer to Meta's Ray-Ban smart glasses, built for India's linguistic complexity.
In April 2025, India's Ministry of Electronics and Information Technology selected Sarvam AI from a pool of 67 applicants to build India's sovereign foundational LLM. The selection gave Sarvam access to 4,096 NVIDIA H100 SXM GPUs via Yotta Data Services — approximately ₹99 crore in compute subsidies. In exchange, the government takes an equity stake in Sarvam, and the sovereign LLM remains managed and governed within India's borders.
The IndiaAI Mission's ₹10,371 crore budget — potentially doubling to ₹20,000 crore following the Summit — makes it the largest government AI investment in any developing country. India has gone from almost no AI infrastructure in 2023 to nearly 40,000 government-accessible GPUs in 2025–26.
Sarvam's monetisation has three channels. The first is API access — developers and enterprises pay for access to Sarvam models through Sarvam's cloud. The second is enterprise contracts — government integrations (UIDAI/Aadhaar collaboration, government services in Indian languages), financial services, healthcare providers, and enterprises that need Indian language AI. The third, longer-term channel is hardware: the Kaze smart glasses and future devices, sold with an attached software subscription.
The company is still essentially pre-revenue at scale: it is spending most of its capital on model training and infrastructure, using the government compute subsidy to extend its R&D runway without burning VC capital on GPUs. The startup programme announced in March 2026 — offering early-stage companies 6–12 months of free API credits — is a community-building move to create the developer ecosystem that makes Sarvam models the default layer for Indian language AI applications.
| Company | Country | Indian Language Focus | Scale | Status |
|---|---|---|---|---|
| Sarvam AI | India | 22 Indian languages — native training | Up to 105B params | Sovereign LLM |
| Krutrim (Bhavish Aggarwal) | India | 13 Indian languages — multilingual focus | 12B (Krutrim-2) | Consumer AI |
| Google Gemini | USA | Adequate — trained on limited Indian data | Ultra/Pro/Nano | Global model, Indian gaps |
| OpenAI GPT-4 / GPT-4o | USA | Limited — English-dominant training | Frontier | Global, not India-first |
| BharatGen (IIT Bombay) | India | 22 Indian languages — Param2 17B MoE | 17B MoE | Govt-funded academic |
Not everyone is excited about India's sovereign AI model. The critical argument: in a world where DeepSeek reportedly trained a frontier-class model for around $6 million in compute by building on open research and open-source tooling, why should India spend ₹10,000 crore building models from scratch instead of fine-tuning existing open-source models for Indian languages at a fraction of the cost?
The counter-argument — which the Indian government has clearly found persuasive — is that data sovereignty and infrastructure sovereignty matter independently of the cost efficiency question. A government deploying AI for Aadhaar, citizen services, judicial systems, and national security cannot run that AI on American infrastructure controlled by American companies subject to American law. The sovereign LLM is not primarily a technology decision — it is a geopolitical infrastructure decision, similar to how India runs its own payment network (UPI) rather than operating purely on Visa and Mastercard.
The open-sourcing of Sarvam-30B and Sarvam-105B under Apache 2.0 signals that Sarvam understands it needs a developer ecosystem, not just government contracts, to become the default Indian AI infrastructure layer. Open models that developers can build on create the adoption flywheel that closed models never achieve at the developer level.
Vivek Raghavan and Pratyush Kumar's academic background at IIT Madras and AI4Bharat gave them two things that most startup founders don't have: a decade of relevant dataset work and credibility with both the government and the global AI research community. When they said they could build a sovereign Indian LLM, people believed them because they had already built the datasets that a sovereign Indian LLM requires. The lesson for deep tech: domain expertise from serious research institutions is not just a credential — it is the starting capital.
India's AI policy approach — subsidising compute, taking equity, mandating deployment through government services — creates an unusual public-private partnership that Western AI startups don't operate within. Sarvam's government-as-compute-provider model has allowed it to train 105B parameter models on a $41M funding base that would have been impossible in a purely commercial environment. For startups in strategic sectors, understanding how to partner with governments as capital and infrastructure providers is an underrated capability.
Sarvam's roadmap through 2026 is clear: ship Kaze in May 2026, grow the developer ecosystem through the startup API credit programme, deepen the UIDAI and government service integrations, and pursue enterprise revenue in financial services, healthcare, and education. The longer-term question is whether Sarvam can achieve commercial sustainability before the government compute subsidy ends and before the global AI companies close the Indian language performance gap that Sarvam currently holds.
The IndiaAI Mission GPU allocation is finite — 4,096 H100s for a defined period. After that period, Sarvam must fund its own compute. Given that Sarvam-105B-scale training runs cost millions of dollars, the commercial revenue must be substantial well before the compute subsidy expires. If the developer ecosystem builds strongly on Sarvam's open models, API revenue at scale is achievable. If the ecosystem prefers global APIs that are cheaper or better, Sarvam's path gets much harder. The open-source release was the right strategic move to build the ecosystem — the execution on that ecosystem over the next 18 months will determine whether the commercial model closes.
Sarvam AI has done, in 30 months, what most AI observers assumed would take India a decade: built a 105-billion parameter foundational LLM from scratch, trained on Indian language data, deployed on sovereign Indian compute, open-sourced under Apache 2.0, and launched a consumer product at India's highest-profile AI event. The models are competitive on the benchmarks that matter for Indian language AI. The government partnership gives distribution channels no commercial company could achieve independently. The remaining questions are not about capability — they are about commercialisation. $41M is modest for frontier AI, the government subsidy is temporary, and the global competition is accelerating Indian language support faster than expected. Whether Sarvam can translate genuine technical leadership into a sustainable business model before those windows close is the story of the next three years.