
Two Guwahati-based NGOs and BharatGen, IIT Bombay's AI initiative, have joined hands to include the Assamese language in its artificial intelligence database, an official said on Wednesday.
The three organisations signed an agreement on Tuesday to incorporate two million pages of Assamese content into BharatGen, Assam Jatiya Bidyalay Educational and Socio-Economic Trust secretary Narayan Sharma said in a statement.
"For Assamese, long considered a 'low-resource' language in the digital ecosystem, this partnership is historic. With the inclusion of two million Assamese pages into BharatGen, the language has reached this scale of AI readiness for the first time," he said.
The association is the outcome of 'Digitising Assam', a community-driven project spearheaded by the Nanda Talukdar Foundation (NTF), which digitised and preserved more than two million pages of Assamese books, journals, manuscripts and ancient Sachipats in 40 months.
What is BharatGen?
BharatGen is the Centre's flagship AI initiative, led by IIT Bombay, which aims to build a sovereign, indigenous large model for Indian languages.
Its mission is to develop AI agents fluent in all of India's 22 scheduled languages, grounded in Indian cultural and linguistic data, and available as open-source resources.
BharatGen currently supports nine Indian languages: Hindi, Marathi, Tamil, Malayalam, Bengali, Punjabi, Gujarati, Telugu and Kannada.