Patram: India’s first vision-language AI model for documents

Patram-7B-Instruct, India’s first vision-language AI model for complex document understanding, launched by IIIT-H and IIT-B under BharatGen, outperforms global models and is open-source.

DQINDIA Online

The BharatGen team from IIIT-Hyderabad and IIT-Bombay has launched Patram-7B-Instruct, the country’s first vision-language foundation model built specifically for document understanding. The announcement was made at the BharatGen National Summit held in New Delhi on 2 June 2025.

What is Patram?

Patram-7B-Instruct is a 7-billion-parameter multimodal AI model capable of reading, interpreting, and responding to queries about complex documents, including scanned papers, photographed forms, and handwritten notes. Unlike traditional models trained primarily on Western datasets, Patram is tailored for real-world Indian documents with varied formats, languages, and layouts.

This model is part of the larger BharatGen initiative, which is supported by the Department of Science and Technology (DST) to build open-source, India-focused AI models spanning text, speech, and vision.

Developed in just five months, Patram is the product of a collaboration between engineers, researchers, and student interns at IIIT-Hyderabad and IIT-Bombay, with financial backing from DST and support from IIIT-H and TiH-IoT at IIT-B. It was officially unveiled by Dr Jitendra Singh, Minister of State for Science and Technology, alongside other top officials and dignitaries.

Document intelligence, the ability of machines to read and process paperwork, is essential across sectors such as governance, legal affairs, education, and business. With a model like Patram, India now has a homegrown AI tool that can handle these tasks with greater relevance and accuracy.

The model has shown superior performance on global benchmarks like DocVQA and VisualMRC, even outperforming larger international models such as DeepSeek-VL2. It also excels on Patram-Bench, a custom benchmark designed around Indian document scenarios.

Patram-7B-Instruct is available as a free, open-source release on Hugging Face and IndiaAI's AIKosh portal.

The BharatGen initiative also unveiled DocBodh, a generative AI suite built for Indic document intelligence. Together, Patram and DocBodh are expected to strengthen India’s push for digital governance, AI in public services, and self-reliant technology infrastructure under flagship national programs like Digital India and Atmanirbhar Bharat.

India is not just catching up with global AI leaders; it is charting its own course with models built for the country, by the country.