New Delhi, Nov 25, 2025: Union Minister of State (Independent Charge) for Science and Technology Jitendra Singh has hailed "BharatGen" as India’s first sovereign multilingual and multimodal AI-driven Large Language Model.

During his visit to IIT Bombay, Singh stated that BharatGen will shape the future of governance and innovation in India.

BharatGen is India’s first sovereign effort to create a Large Language Model that truly reflects the linguistic, cultural, and social diversity of the nation.

Built to support over twenty-two Indian languages, BharatGen integrates three major modalities (text, speech, and document vision) so that it can understand, generate, and interpret information the way Indian citizens naturally communicate.

Singh appreciated the scale, ambition, and technical depth of the BharatGen initiative, describing it as a turning point in India’s journey toward technological self-reliance.

"BharatGen is not just a technological project but a national effort to ensure that the future of AI reflects the aspirations, languages and lived experiences of 1.4 billion Indians," he said.

The minister also emphasised that initiatives like BharatGen embody the Prime Minister’s vision of empowering every citizen through science and technology, building systems that are inclusive, trustworthy, and locally grounded, and ensuring that India’s digital narrative is written by Indians themselves.

BharatGen is supported under the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS) of the Department of Science and Technology, with Rs 235 crore being channelled through the Technology Innovation Hub at IIT Bombay.

The consortium, led by IIT Bombay, includes leading institutions such as IIT Madras, IIT Kanpur, IIIT Hyderabad, IIT Mandi, IIT Hyderabad, IIM Indore, IIT Kharagpur and IIIT Delhi.

A key component of BharatGen is the Bharat Data Sagar, one of the most ambitious data initiatives undertaken in the country.

The Bharat Data Sagar is being developed to ensure India’s complete ownership and control over its digital knowledge resources.

The minister also reviewed the BharatGen models released so far.

The team presented Param-1, a foundational text model of 2.9 billion parameters trained on 7.5 trillion tokens, with over one-third of the training data representing Indian content.

BharatGen has also built speech models such as Shrutam, a 30-million-parameter Automatic Speech Recognition system, and Sooktam, a 150-million-parameter Text-to-Speech model available in nine Indic languages.

Additionally, the project has delivered Patram, India’s first document-vision model with seven billion parameters, trained on 2.5 billion tokens, designed to understand and interpret complex documents in Indian formats.

"These models together create a complete AI stack for India - text, speech and vision - capable of supporting governance, industry, education, agriculture, healthcare and digital inclusion," Singh said.

