Key Takeaways
Sarvam AI models — including the 24B-parameter Sarvam-M and lightweight Sarvam 2B — are India’s first sovereign LLMs, built for 11 languages (22 for translation). Sarvam Vision hit 84.3% OCR accuracy, outperforming ChatGPT and Gemini. The Sarvam-M chat model is free via API. Government-backed with 4,096 NVIDIA H100 GPUs, these models are reshaping multilingual AI worldwide.
You’ve probably tried asking ChatGPT something in Hindi. Or Tamil. Or Bengali. And you probably got a response that felt like a confused tourist reading a phrasebook — technically correct, culturally clueless.
That’s the exact problem Sarvam AI models were built to fix. And they’re not just fixing it. They’re beating the biggest names in AI at their own game.
In February 2026, Sarvam Vision scored 84.3% on the olmOCR-Bench — higher than ChatGPT, Gemini 3 Pro, and DeepSeek OCR v2. A Bengaluru startup, outperforming trillion-dollar companies. On reading documents. In Indian languages.
If you build with AI, work with multilingual users, or just want to understand where global AI is actually headed, this is the story you can’t afford to miss.
What Are Sarvam AI Models, Really?
Sarvam AI is an Indian startup founded in 2023 by Vivek Raghavan and Pratyush Kumar (co-founder of AI4Bharat at IIT Madras). Their mission is blunt: build AI that actually works for 1.4 billion people who don’t primarily think in English.
The Government of India selected Sarvam in April 2025 under the IndiaAI Mission to build the country’s sovereign Large Language Model. They received access to 4,096 NVIDIA H100 GPUs and close to ₹99 crore in subsidies. That’s not a side project. That’s a national bet.
Here’s the model lineup:
| Model | Parameters | Type | Best For | Open Source? |
|---|---|---|---|---|
| Sarvam-M | 24B | Hybrid reasoning LLM | Chat, math, coding, Indic QA | Open weights |
| Sarvam 2B | 2B | Small language model | Translation, summarization | Yes |
| Sarvam-1 (7B) | 7B | Base LLM | 10 Indic languages | Yes |
| Sarvam Vision | Multimodal | Document intelligence | OCR, visual understanding | API access |
| Bulbul V3 | TTS model | Text-to-speech | 11 languages, 35+ voices | API access |
The interesting part? Sarvam-M, their flagship chat model, carries no per-token charge on their API. We’ll get to pricing later.
What makes this different from yet another LLM announcement? These models don’t treat Indian languages as an afterthought. They’re trained on India-specific datasets with cultural context, idiomatic expressions, and code-switching baked in from day one.
Sarvam-M: The 24B Model That Punches Above Its Weight
Let’s talk specifics. Sarvam-M is a 24-billion-parameter model built on top of Mistral Small. It went through a three-phase training pipeline that’s genuinely clever:
1. Supervised Fine-Tuning (SFT) — High-quality, culturally debiased prompt-response pairs across English and 11 Indian languages.
2. Reinforcement Learning with Verifiable Rewards (RLVR) — The model was rewarded for actually solving math problems correctly, following instructions precisely, and generating working code. Not just sounding smart. Being correct.
3. Inference Optimization — FP8 quantization for efficiency without meaningful accuracy loss.
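Because the tuned weights are published openly, you can also pull the model down and run it locally with standard tooling. Here is a minimal sketch using Hugging Face transformers; the repository ID and generation settings are assumptions to verify on the Hugging Face hub before use:

```python
# Minimal local-inference sketch with Hugging Face transformers.
# "sarvamai/sarvam-m" is an assumed hub ID; confirm it on huggingface.co first.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sarvamai/sarvam-m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 24B model needs roughly 48 GB in bf16; quantize for smaller GPUs
    device_map="auto",
)

messages = [{"role": "user", "content": "भारत की राजधानी क्या है?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```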
The Benchmark Numbers That Matter
Here’s where Sarvam-M gets genuinely impressive:
| Benchmark | Sarvam-M Score | What It Measures |
|---|---|---|
| Indian language benchmarks | +20% over base | Indic QA, comprehension |
| Math tasks (GSM-8K) | +21.6% over base | Grade-school math reasoning |
| Romanized Indic GSM-8K | +86% over base | Hindi/Tamil math in Roman script |
| Programming tasks | +17.6% over base | Code generation and debugging |
| MILU-IN (Indic knowledge) | 0.75 | Indian language understanding |
| IndicGenBench | 0.49 | Cross-lingual generation |
According to Sarvam’s own technical blog, the model outperforms Llama-4 Scout and competes with much larger models like Llama 3.3 70B (roughly three times its size), while holding its own against the similarly sized Gemma 3 27B.
The one honest caveat? English-centric benchmarks like MMLU show roughly a 1% performance drop compared to the base Mistral model. That’s the tradeoff for excelling at multilingual tasks.
Sarvam Hybrid Reasoning: Think Mode vs. Non-Think Mode
Here’s a feature most global LLMs don’t offer. Sarvam-M has a dual-mode interface:
- Think Mode — Exposes chain-of-thought reasoning. Best for complex math, coding, and logical problems.
- Non-Think Mode — Low-latency responses for general chat and quick tasks.
You toggle this in the API with a single parameter. One model, two brains.
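As a minimal sketch of what that looks like in a request payload (the `reasoning_effort` parameter comes from the chat-completions example later in this article; treat the exact accepted values as an assumption to confirm against the docs):

```python
# Same model, two modes. "reasoning_effort" mirrors the parameter used in the
# chat-completions example later in this article; exact accepted values are an
# assumption to verify against Sarvam's API docs.
think_payload = {
    "model": "sarvam-m",
    "messages": [{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    "reasoning_effort": "high",  # Think Mode: expose chain-of-thought reasoning
}

non_think_payload = {
    "model": "sarvam-m",
    "messages": [{"role": "user", "content": "Give me a one-line summary of the Ramayana."}],
    # Omitting reasoning_effort (assumption) keeps the low-latency Non-Think Mode
}
```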
What Sarvam AI Gets Wrong (Limitations)
No useful review skips the rough edges. Here’s what you should know:
English knowledge gaps. Sarvam-M trails global leaders on pure English benchmarks by about 1%. If your use case is entirely English-only, GPT-4 or Claude still have the edge.
Context window. Sarvam-M supports 32,768 tokens with a sliding window of 4,096. That’s solid but not frontier-level for extremely long documents.
Verification bottlenecks. During RLVR training, coding tasks required batched verification, which wasn’t always feasible when datasets were mixed. As a result, function-calling ability improved only about 1% in some tests.
No vision in the core LLM yet. Sarvam Vision is a separate model. The flagship Sarvam-M is text-only for now, though multimodal models are reportedly in development.
Being upfront about this matters. A perfectly glowing review helps nobody.
Supported Languages and Real-World Use Cases
Sarvam AI models support 11 languages: Hindi, Bengali, Tamil, Telugu, Kannada, Malayalam, Marathi, Gujarati, Oriya, and Punjabi, plus English. Sarvam Translate extends this to 22 languages.
Can Sarvam AI Handle Code-Switching?
Yes — and this is a bigger deal than it sounds.
Over 350 million Indians regularly mix Hindi and English in conversation (so-called “Hinglish”). Most global LLMs stumble on this. Sarvam-M was specifically trained with romanized Indian language data and code-switching patterns.
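To see the difference yourself, here is a minimal sketch of a Hinglish prompt sent to the chat endpoint used in the quick-start later in this article (the API key placeholder and exact response shape are assumptions):

```python
import requests

# Hindi-English code-switched ("Hinglish") prompt sent to the sarvam-m chat endpoint.
# Endpoint and auth header mirror the quick-start example later in this article.
resp = requests.post(
    "https://api.sarvam.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "sarvam-m",
        "messages": [{
            "role": "user",
            "content": "Mujhe ek quick dinner recipe batao jo 15 minute mein ready ho jaye, ingredients bilkul simple hon.",
        }],
    },
)
print(resp.json())
```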
Real use cases already in production:
- UIDAI (Aadhaar): AI-powered multilingual voice support for 1.4 billion Aadhaar holders
- Government of Tamil Nadu: ₹10,000 crore MoU for India’s first Sovereign AI Park
- Government of Odisha: Partnership for state-level AI deployment
- Enterprise chatbots: WhatsApp and phone-based agents in regional languages via Sarvam Samvaad
How to Use Sarvam AI Models: API Access in 5 Steps
Here’s your implementation roadmap:
Step 1: Sign up at dashboard.sarvam.ai and get ₹1,000 free credits.
Step 2: Get your API key from the dashboard.
Step 3: Make your first chat completion call:
```python
import requests

url = "https://api.sarvam.ai/v1/chat/completions"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
payload = {
    "model": "sarvam-m",
    "messages": [{"role": "user", "content": "Explain quantum computing in Hindi"}],
    "reasoning_effort": "high"  # Enables think mode
}
response = requests.post(url, json=payload, headers=headers)
print(response.json())
```
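The endpoint path suggests an OpenAI-compatible response shape; assuming that holds (check the API reference to be sure), the reply text can be pulled out like this:

```python
# Assumes an OpenAI-style body: {"choices": [{"message": {"content": ...}}]}
data = response.json()
reply = data["choices"][0]["message"]["content"]
print(reply)
```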
Step 4: Explore additional APIs — STT, TTS, Translation, Document Intelligence.
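As a purely hypothetical sketch of what a translation call might look like (the endpoint path, header name, and field names below are assumptions; take the real signatures from sarvam.ai/docs):

```python
import requests

# HYPOTHETICAL sketch: endpoint path, auth header, and field names are assumptions.
# Check sarvam.ai/docs for the actual translation API signature before using this.
resp = requests.post(
    "https://api.sarvam.ai/translate",                 # assumed path
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # auth style mirrors the chat example above
    json={
        "input": "The meeting has been moved to Monday morning.",
        "source_language_code": "en-IN",               # assumed language codes
        "target_language_code": "hi-IN",
    },
)
print(resp.json())
```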
Step 5: Scale to production with the Business plan (₹50,000 for 57,500 credits + 1,000 req/min).
Python and Node.js SDKs are available at sarvam.ai/docs.
Sarvam-M vs. Llama vs. Mistral: Honest Comparison
| Feature | Sarvam-M (24B) | Llama 3.3 (70B) | Mistral Small (24B) |
|---|---|---|---|
| Indic Language Support | 11 languages, native | Limited, translation-style | European focus |
| Math (GSM-8K Indic) | +86% romanized | Not optimized | Not optimized |
| Think/Non-Think Modes | Yes | No | No |
| Code-Switching | Trained specifically | Weak | Minimal |
| Model Size | 24B | 70B (3x larger) | 24B |
| Open Weights | Yes | Yes | Yes |
| Cost (API) | Free per token | Varies by provider | Varies |
| Cultural Context | India-first | Generic | Euro-centric |
Sarvam-M won’t replace GPT-4 for English-only enterprise tasks. But for anything touching Indian languages, multilingual users, or Indic document processing? It’s currently the strongest option per parameter.
Pricing and Availability: What It Actually Costs
This is surprisingly transparent:
| Plan | Price | Credits | Rate Limit | Best For |
|---|---|---|---|---|
| Starter | Free | ₹1,000 free | 60 req/min | Testing |
| Pro | ₹10,000 (~$120) | 11,000 | 200 req/min | Startups |
| Business | ₹50,000 (~$600) | 57,500 | 1,000 req/min | Production |
| Enterprise | Custom | Custom | Custom | Scale |
Key detail: Sarvam-M chat completions are free per token. You pay for speech, translation, and other APIs. Credits never expire. No monthly subscription.
For context, that’s dramatically cheaper than equivalent API calls on OpenAI or Anthropic for multilingual tasks.
Field Notes: What a Generic AI Wouldn’t Tell You
After reviewing Sarvam’s technical blog, community feedback, and benchmark data, here are the gotchas:
- The May 2025 launch of Sarvam-M received mixed reactions. Some Indian developers questioned whether fine-tuning Mistral Small truly counts as “sovereign AI.” The backlash was real but ultimately pushed Sarvam to accelerate their from-scratch 120B model (expected early 2026).
- Bulbul V3 TTS is genuinely impressive. It handles numerics, technical terms, and named entities with the lowest error rates across Indian languages — something most TTS systems butcher.
- The Sarvam Vision OCR score of 84.3% is notable because multilingual document OCR, with its mix of scripts, layouts, and degraded scans, is one of the harder tasks for vision-language models. This isn’t a cherry-picked metric.
- Integration with Mem0 adds persistent memory to Sarvam agents — a feature that makes chatbots feel significantly more human over multi-turn conversations.
FAQs About Sarvam AI Models
What is Sarvam AI and its main models? Sarvam AI is a Bengaluru-based startup building India’s sovereign AI stack. Core models include Sarvam-M (24B chat LLM), Sarvam 2B (lightweight tasks), Sarvam Vision (OCR/documents), and Bulbul V3 (text-to-speech).
Is Sarvam-M open source and free to use? Sarvam-M is open-weights (downloadable from Hugging Face) and free per token via the Sarvam API. Other APIs like STT and TTS have pay-per-use pricing.
How does Sarvam-M compare to Llama or Mistral? Sarvam-M outperforms Llama-4 Scout on Indic benchmarks and matches Llama 3.3 70B on several tasks despite being 3x smaller. It trails slightly on English-only benchmarks.
What Indian languages does Sarvam AI support? 11 languages for the LLM (Hindi, Bengali, Tamil, Telugu, Kannada, Malayalam, Marathi, Gujarati, Oriya, Punjabi, English) and 22 languages for translation.
How good is Sarvam-M at math and coding tasks? Very strong: +21.6% improvement on math benchmarks, +17.6% on programming, and an outstanding +86% on romanized Indic GSM-8K over the base model.
What is the pricing for Sarvam AI models? Sarvam-M chat is free. Other APIs start at ₹3.50/10K characters (language detection) up to ₹45/hour (STT with diarization). Plans range from free to ₹50,000.
What Changes Next: Your Clear Takeaway
Sarvam AI isn’t just building models. They’re building infrastructure. Government partnerships in Tamil Nadu, Odisha, and nationally through IndiaAI Mission. A 120-billion-parameter sovereign model in the pipeline. Free API access for their flagship LLM.
If you’re a developer working with Indian languages, the action step is simple: sign up at dashboard.sarvam.ai today, use your ₹1,000 free credits, and test Sarvam-M against whatever you’re currently using. The comparison might surprise you.
If you’re watching the global AI landscape, the takeaway is bigger: the next wave of AI isn’t coming from Silicon Valley alone. India just entered the conversation with models that beat the incumbents where it matters most — for its own 1.4 billion people.
Your challenge: Try Sarvam-M with a real Hindi-English code-switching prompt. Compare the output to ChatGPT. Drop your results in the comments. I’d bet the difference is more dramatic than you expect.