Google Maps gets transliteration for 10 native languages in India

India is among the most linguistically diverse countries in the world, with more than 21 primary languages used across a wide geography. While this is a shining example of the harmonious diversity of the subcontinent, it also presents a complex obstacle to delivering services. A case in point is Google Maps.

In India, more than 75% of people with internet connectivity use Google Maps in their regional language rather than English, and this share is expected to reach 90% in the next five years.

Currently, however, the navigation service faces translation issues with the names of points of interest (POIs). Most of these names are not available in native languages, which prevents millions of users, particularly in rural areas, from making good use of Google Maps.

For instance, when people open Google Maps and type the name of a place such as Kempegowda Institute of Medical Sciences (also known as KIMS Hospital) in their native language, Kannada, Google's search engine has trouble identifying the desired place and instead returns a list of other hospitals near the user.

To overcome this issue, Google has improved the Maps service. It has built an ensemble of learned models to transliterate the names of Latin-script POIs into 10 languages prominent in India: Hindi, Bangla, Marathi, Telugu, Tamil, Gujarati, Kannada, Malayalam, Punjabi, and Odia.


Google Maps transliteration feature. Credit: Google

"Using this ensemble, we have added names in these languages to millions of POIs in India, increasing the coverage nearly twenty-fold in some languages. This will immediately benefit millions of existing Indian users who don't speak English, enabling them to find doctors, hospitals, grocery stores, banks, bus stops, train stations, and other essential services in their own language," Google said.

The company is using an ensemble of models to automatically transliterate from the reference Latin script name (such as NIT Garden or Chandramani Garden) into the scripts and orthographies native to the above-mentioned languages. Candidate transliterations are derived from a pair of sequence-to-sequence (seq2seq) models. One is a finite-state model for general text transliteration, trained in a manner similar to models used by Gboard on-device for transliteration keyboards. The other is a neural long short-term memory (LSTM) model trained, in part, on the publicly released Dakshina dataset, the company said.
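To make the idea of combining candidates from two models concrete, here is a minimal Python sketch. It is not Google's implementation: the article does not say how the candidates are merged, so the weighted score sum is an assumption, and the two model functions, their candidate lists, and their scores are hypothetical stand-ins invented for illustration.

```python
from collections import defaultdict
from typing import Callable, List, Optional, Sequence, Tuple

Candidate = Tuple[str, float]  # (transliteration, model score)


def toy_finite_state_model(name: str) -> List[Candidate]:
    """Hypothetical stand-in for the finite-state model; scores are made up."""
    table = {"NIT Garden": [("निट गार्डन", 0.6), ("एनआईटी गार्डन", 0.4)]}
    return table.get(name, [])


def toy_lstm_model(name: str) -> List[Candidate]:
    """Hypothetical stand-in for the LSTM model; scores are made up."""
    table = {"NIT Garden": [("एनआईटी गार्डन", 0.7), ("निट गार्डन", 0.3)]}
    return table.get(name, [])


def ensemble_transliterate(
    name: str,
    models: Sequence[Callable[[str], List[Candidate]]],
    weights: Sequence[float],
) -> Optional[str]:
    """Pick the candidate with the highest weighted score summed over models.

    Assumption: a weighted sum is only one simple way to combine candidates.
    """
    totals = defaultdict(float)
    for model, weight in zip(models, weights):
        for candidate, score in model(name):
            totals[candidate] += weight * score
    if not totals:
        return None  # no model produced a candidate for this name
    return max(totals, key=totals.get)


best = ensemble_transliterate(
    "NIT Garden", [toy_finite_state_model, toy_lstm_model], [0.5, 0.5]
)
print(best)
```

With these invented scores, the spelled-out acronym form wins because both models give it support, even though one model ranks it second.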


Transliteration quality improvements for Indian languages in Google Maps. Credit: Google

Google also noted that the transliterated POI names are not translations. Transliteration is only concerned with writing the same words in a different script, much as an English-language newspaper might write the name Горбачёв from the Cyrillic script as “Gorbachev” for readers who do not read Cyrillic.
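The Gorbachev example can be sketched as a simple character mapping in Python. This is only a toy illustration of writing the same word in another script: the table below is an assumption that covers just the letters in this one name, whereas the system described in the article uses learned models rather than a fixed table.

```python
import unicodedata

# Toy mapping covering only the letters in "Горбачёв" (illustrative assumption).
CYRILLIC_TO_LATIN = {
    "Г": "G", "о": "o", "р": "r", "б": "b",
    "а": "a", "ч": "ch", "ё": "e", "в": "v",
}


def transliterate(text: str) -> str:
    """Rewrite each known Cyrillic letter in Latin script; leave the rest as-is."""
    # Normalize so that "ё" is a single code point and matches the table key.
    text = unicodedata.normalize("NFC", text)
    return "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in text)


print(transliterate("Горбачёв"))  # Gorbachev
```

Nothing in the function looks at meaning; it only swaps symbols, which is the difference between transliteration and translation.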
