Google adds world’s oldest Indian language to Google Translate

“Sanskrit is the number one, most requested language at Google Translate, and we are finally adding it.”

Google announced at its I/O developer conference that it has added 24 new languages to Google Translate.

Sundar Pichai, the CEO of Alphabet, said in his speech: “There is a long tail of languages that are underrepresented on the web today and translating them is a hard technical problem since translation models are usually trained with bilingual text. However, there is not enough publicly available bilingual text for every language.”

Image source: Sundar Pichai, the CEO of Alphabet.

Of the 24 new additions, eight are languages from India:

  • Assamese, which is used by about 25 million people in Northeast India
  • Bhojpuri, which is used by about 50 million people in northern India, Nepal and Fiji
  • Dogri, which is used by about 3 million people in northern India
  • Konkani, which is used by about 2 million people in Central India
  • Maithili, which is used by about 34 million people in northern India
  • Meiteilon or Manipuri, which is used by about 2 million people in Northeast India
  • Mizo, which is used by about 830,000 people in Northeast India
  • Sanskrit, which is used by about 20,000 people in India

Isaac Caswell, a Google Translate Research Scientist, told ET: “Sanskrit is the number one, most requested language at Google Translate, and we are finally adding it.”

Image source: Google.

As part of this update, indigenous languages of the Americas (Quechua, Guarani and Aymara) and an English dialect (Sierra Leonean Krio) have also been added to Translate for the first time.

Caswell added: “This ranges from smaller languages, like Mizo spoken by people in the northeast of India — by about 800,000 people — up to very large world languages like Lingala spoken by around 45 million people across Central Africa.”

Other languages that are now part of Google Translate include:

  • Aymara (used by about two million people across some Latin American countries)
  • Bambara (used by about 14 million people in Mali)
  • Dhivehi (used by about 300,000 people in the Maldives)
  • Ewe (used by about seven million people in Ghana and Togo)
  • Guarani (used by about seven million people in several South American countries)
  • Ilocano (used by about 10 million people in the northern Philippines)
  • Krio (used by about four million people in Sierra Leone)
  • Kurdish (Sorani) (used by about eight million people, mostly in Iraq and parts of Turkey)
  • Lingala (used by about 45 million people in central and eastern Africa)
  • Luganda (used by about 20 million people in Uganda and Rwanda)
  • Oromo (used by about 37 million people in Ethiopia and Kenya)
  • Quechua (used by about 10 million people in Peru, Bolivia, Ecuador and surrounding countries)
  • Sepedi (used by about 14 million people in South Africa)
  • Tigrinya (used by about eight million people in Eritrea and Ethiopia)
  • Tsonga (used by about seven million people in southern Africa)
  • Twi (used by about 11 million people in Ghana)

Google also announced that it has made many key improvements to its Google Translate service.

Caswell observed: “Up until a couple of years ago, it simply was not technologically possible to add languages like these, which are what we call a low resource — meaning that there are not very many text resources out there for them.”

Google Translate now supports 133 languages. It can be used in a web browser, or users can install the app from the Google Play Store or the Apple App Store.
