IISc readies tech to leapfrog digital divide
In its extended run, SYSPIN could enable TTS in multiple Indian languages, covering more sectors including education, e-mobility, and healthcare
R Krishnakumar
DHNS
IISc. Credit: DH Photo

Researchers at the Indian Institute of Science (IISc) plan to release, by the end of 2023, a text-to-speech (TTS) synthesis system aimed at making India’s digital ambitions more inclusive.

SYSPIN (Synthesising Speech in Indian Languages) converts text in nine Indian languages to corresponding voices and publishes these datasets. The datasets double as an open resource for innovators, encouraging them to develop Artificial Intelligence-driven voice-based services in finance and agriculture.

Implemented in partnership with the IISc-promoted ARTPARK, SYSPIN could be a game changer, bringing technology-enabled services closer to around 602 million people who speak these languages.

The datasets and models in Bengali, Hindi, Kannada, Marathi and Telugu have been compiled and the project is set for completion by the end of the year, Prasanta Kumar Ghosh, Project Investigator, said.

Researchers said the unavailability of technological expertise and open voice data had hindered possibilities of AI-powered speech technologies in low-resourced Indian languages. Bhojpuri, Chhattisgarhi, Magadhi, and Maithili are the other languages in the programme.

In its extended run, SYSPIN could enable TTS in multiple Indian languages, covering more sectors including education, e-mobility, and healthcare.

Socio-economic backwardness and low literacy have left large sections of India’s population with limited access to digital innovations.

A promotional video for SYSPIN illustrates how a TTS-based phone application helps an illiterate man find information about schools for his daughter, read out to him in Maithili, one of the languages spoken in Bihar.

“The goal is to enable voice-based interfaces that also help people who cannot read. About 80 hours of TTS data per language, in a male and a female voice, and the relevant computer programmes are being developed. The completed work is being put to use in challenges (contests where participants build TTS systems based on the published voice data),” Ghosh, an Associate Professor at IISc’s Department of Electrical Engineering, told DH.

The team designs text relevant to the two domains, finance and agriculture, identifies the subjects and speakers, collects and validates the data, and then open-sources it.

Text normalisation, which involves the conversion of symbols, numbers, and abbreviations to context-specific speech, is critical to TTS systems.
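
To give a rough sense of what text normalisation involves, here is a minimal Python sketch. It is not SYSPIN's method: the lookup tables, the function names (`number_to_words`, `normalise`) and the use of English are all assumptions made for illustration. A production system for Indian languages would need language-specific rules and would use context to choose between readings, which this simple lookup does not do.

```python
import re

# Hypothetical lookup tables, for illustration only (not SYSPIN's rules).
ABBREVIATIONS = {"Dr.": "Doctor", "kg": "kilograms", "km": "kilometres"}
SYMBOLS = {"%": "percent", "&": "and"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999 in English."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" and " + number_to_words(rest) if rest else "")


def _spell_match(match: re.Match) -> str:
    """Spell out a matched number; leave values above 999 unchanged."""
    value = int(match.group())
    return number_to_words(value) if value < 1000 else match.group()


def normalise(text: str) -> str:
    """Expand abbreviations, symbols and small numbers into spoken words."""
    # 1. Expand abbreviations that appear as whole whitespace-separated tokens.
    text = " ".join(ABBREVIATIONS.get(tok, tok) for tok in text.split())
    # 2. Replace symbols with words, padding with spaces so "40%" splits cleanly.
    for symbol, word in SYMBOLS.items():
        text = text.replace(symbol, f" {word} ")
    # 3. Spell out runs of digits.
    text = re.sub(r"\d+", _spell_match, text)
    # 4. Collapse any extra whitespace introduced above.
    return " ".join(text.split())


print(normalise("Dr. Rao sold 25 kg at 40% discount"))
```

The sketch applies the same expansion everywhere. The "context-specific" part described above, for example reading "2023" as a year in one sentence and as a quantity in another, is what makes normalisation hard in practice and is outside its scope.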

The size of the TTS corpus is expected to be “several times larger” than any existing corpus in Indian languages. The open-sourcing of the TTS data makes it accessible to researchers, technology innovators, social entrepreneurs and startups to develop application-specific models, Ghosh said.

(Published 01 May 2023, 00:24 IST)