Google sheds light on new AI-based model with support for 1,000 languages

Google's state-of-the-art Universal Speech Model (USM) has two billion parameters and is trained on 12 million hours of speech and 28 billion sentences of text

DH Web Desk

Last Updated 08 March 2023, 10:19 IST

In November 2022, Google announced that it was working on an advanced Universal Speech Model (USM) that would benefit billions of users in areas such as translation.

YouTube already makes good use of Google's USM to generate closed captions on millions of videos in several languages on its platform. Currently, however, this is limited to around 100 languages. Now, Google has revealed that its USM is almost ready to support the 1,000 most-spoken languages around the world.

Google's state-of-the-art USM now has two billion parameters and has been trained on 12 million hours of speech and 28 billion sentences of text, spanning more than 300 languages.

It performs automatic speech recognition (ASR) and is trained to transcribe not only widely spoken languages such as English and Mandarin, but also under-resourced languages such as Amharic, Cebuano, Assamese, and Azerbaijani.

Even with limited supervised data, the model achieves a word error rate (WER; the lower, the better) of less than 30 per cent on average across 73 languages. On US English, USM has a 6 per cent lower relative WER than Google's current internal state-of-the-art model.
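For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the model's output, divided by the number of words in the reference. The Python sketch below illustrates that standard definition and what a "relative" reduction means. The function names and the sample figures (a 10 per cent baseline falling to 9.4 per cent) are illustrative assumptions, not numbers or code published by Google.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the fraction by which new_wer improves on baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))

# Hypothetical figures: a baseline WER of 10% falling to 9.4%
# is a relative reduction of about 6%.
print(relative_wer_reduction(0.10, 0.094))
```

Note that a relative reduction compares the two error rates with each other, so a 6 per cent relative improvement is much smaller than a drop of 6 percentage points.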

A sample of the languages that USM supports. Credit: Google

Scaling up to 1,000 languages, however, requires a lot more effort to obtain enough audio and text data to train high-quality models. Google has shared a flow chart of its training pipeline and explained how it is progressing towards that goal.

"Our training pipeline starts with the first step of self-supervised learning on speech audio covering hundreds of languages. In the second optional step, the model’s quality and language coverage can be improved through an additional pre-training step with text data. The decision to incorporate the second step depends on whether text data is available. USM performs best with this second optional step. The last step of the training pipeline is to fine-tune on downstream tasks (e.g., ASR or automatic speech translation) with a small amount of supervised data," Google said.
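The order of the steps Google describes, with the second one applied only when text data is available, can be sketched in a few lines of Python. This is purely an illustration of the sequence in the quote above. The function name and stage labels are hypothetical and do not correspond to any code Google has released.

```python
def plan_training_stages(text_data_available: bool) -> list[str]:
    """List the training stages in order, per the description quoted above."""
    stages = ["self-supervised learning on multilingual speech audio"]
    if text_data_available:
        # Optional second step: included only when text data exists.
        stages.append("additional pre-training with text data")
    stages.append("fine-tuning on a downstream task with a small amount of supervised data")
    return stages


for step, name in enumerate(plan_training_stages(text_data_available=True), start=1):
    print(step, name)
```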

Though Google's AI chatbot Bard may have lost the battle of perception against ChatGPT-integrated Microsoft Bing, there is a much bigger war to win. And, if things go as planned, the search engine giant's improvements with USM could help it leap ahead of its competitors.

Google is also reportedly working on around 20 generative AI products, such as 'Shopping Try-on' for YouTube (to help users try on new clothes), Maya (a 3D image generator), a video summarization tool, a Pixel-exclusive wallpaper generator, an AI tool for enterprise customers to develop their own ChatGPT-like bots, Colab + Android Studio (to help software programmers detect and fix bugs) and more, which it may showcase at the upcoming annual Google I/O 2023 developer conference.

(Published 08 March 2023, 10:13 IST)
