Lazy transliteration/translation with Python

This is not really transliteration. But if you need some sort of Latin letters from a bunch of symbols you cannot even begin to read, it is a nice intermediate solution.

1. I will be using googletrans and in particular:

pip3 install googletrans==4.0.0-rc1

Because the version installed by default gives many people this error:

AttributeError: 'NoneType' object has no attribute 'group'
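If you are not sure which version you ended up with, a quick check along these lines may help. This is only a minimal sketch using the standard library's importlib.metadata (Python 3.8 or newer); the helper name installed_version is my own and not part of googletrans.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints whatever pip actually installed, or None if the package is absent.
    print("googletrans:", installed_version("googletrans"))
```

If this prints None, the package is not installed in the interpreter you are running.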

2. The tests run on this dictionary, which mostly holds translations of the word vodka. The exception is Arabic, where I use the name of Suleiman the First (the word vodka in Arabic did not work, so I changed it).

choices = {
'text_amhar': "ቮድካ",
'text_arab': "سليمان اول",
'text_arm': "Լօրեմ իպսում դօլօր սիտ ամետ",
'text_belorus': "гарэлка",
'text_bengal': "ভদকা",
'text_birma': "ဗော့ဒ်ကာအရက်",
'text_vietnam': "rượu vodka",
'text_greek': "Λορεμ ιψθμ δολορ σιτ αμετ",
'text_georg': "ლორემ იპსუმ დოლორ სით ამეთ",
'text_gudgarati': "વોડકા",
'text_ivrit': "וודקה", 
'text_idish': "מאַשקע",
'text_kazah': "арақ",
'text_kannada': "ವೋಡ್ಕಾ",
'text_chin_trad': "伏特加",
'text_chin': "伏特加",
'text_kor': "보드카",
'text_khmer': "វ៉ូដាកា",
'text_latysh': "degvīns",
'text_litov': "degtinė",
'text_malayalam': "വോഡ്ക",
'text_marathi': "राय धान्यापासून तयार केलेले मद्य",
'text_deu': "Tröster",
'text_nepal': "भोड्का",
'text_oriya': "ଭୋଡା",
'text_pandzhabi': "ਵਾਡਕਾ",
'text_pers': "ودکا",
'text_polish': "wódka",
'text_pushtu': "ودك",
'text_rumyn': "Vodcă",
'text_samoa': "Faʻafetai",
'text_singal': "වොඩ්කා",
'text_sindhi': "ووڊڪا",
'text_tadzh': "Арақ",
'text_tai': "วอดก้า",
'text_tamil': "ஓட்கா",
'text_telugu': "వోడ్కా",
'text_urdu': "ووڈکا",
'text_hindi': "वोडका",
'text_nihongo': "ウォッカ"
}

In order to get any sort of Latin letters from the name, I simply translate the text into Latin, the language (dest='la'). Most names end up transliterated and just a few get translated. Overall, it helps with tasks of low importance.

from googletrans import Translator

translator = Translator()

def test_translit():
	for key, value in choices.items():
		try:
			text = translator.translate(value, dest='la').text
			print('SUCCESS: ', value, ' --> ', text)
		except Exception as e:
			print('\n', key)
			print('FAIL: ', value, '[', str(e), ']')

test_translit()
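For real use, as opposed to a test loop, it may be handy to wrap the call so that a failed request gives back the original text instead of raising. The following is a rough sketch under my own assumptions: to_latin and its translate and fallback parameters are hypothetical names, and the translate function is passed in so the wrapper can be tried without any network access.

```python
def to_latin(text, translate, fallback=None):
    """Return translate(text); on any error return fallback, or the input itself."""
    try:
        result = translate(text)
    except Exception:
        # The translation call goes over the network, so it can fail at any time.
        return text if fallback is None else fallback
    # An empty result is treated like a failure and the input is returned.
    return result or text


# Hypothetical wiring with the translator object from this post:
# latin = to_latin("гарэлка", lambda s: translator.translate(s, dest='la').text)
```

Passing the translate function as an argument is just one design choice; it keeps the wrapper independent of googletrans itself.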

The output of test_translit() is:

SUCCESS:  ቮድካ  -->  vodka
SUCCESS:  سليمان اول  -->  Confortatus est ergo Salomon ante
SUCCESS:  Լօրեմ իպսում դօլօր սիտ ամետ  -->  Praesent aliquam, justo convallis luctus
SUCCESS:  гарэлка  -->  vodka
SUCCESS:  ভদকা  -->  vodka
SUCCESS:  ဗော့ဒ်ကာအရက်  -->  vodka
SUCCESS:  rượu vodka  -->  vodka
SUCCESS:  Λορεμ ιψθμ δολορ σιτ αμετ  -->  Praesent aliquam, justo convallis luctus
SUCCESS:  ლორემ იპსუმ დოლორ სით ამეთ  -->  Morbi lacinia interdum nulla penatibus amet nibh adipiscing semper ligula
SUCCESS:  વોડકા  -->  vodka
SUCCESS:  וודקה  -->  vodka
SUCCESS:  מאַשקע  -->  vodka
SUCCESS:  арақ  -->  vodka
SUCCESS:  ವೋಡ್ಕಾ  -->  vodka
SUCCESS:  伏特加  -->  vodka
SUCCESS:  伏特加  -->  vodka
SUCCESS:  보드카  -->  vodka
SUCCESS:  វ៉ូដាកា  -->  Vodaka
SUCCESS:  degvīns  -->  vodka
SUCCESS:  degtinė  -->  vodka
SUCCESS:  വോഡ്ക  -->  vodka
SUCCESS:  राय धान्यापासून तयार केलेले मद्य  -->  vodka
SUCCESS:  Tröster  -->  Tröster
SUCCESS:  भोड्का  -->  vodka
SUCCESS:  ଭୋଡା  -->  Voda
SUCCESS:  ਵਾਡਕਾ  -->  vodka
SUCCESS:  ودکا  -->  vodka
SUCCESS:  wódka  -->  vodka
SUCCESS:  ودك  -->  WWD
SUCCESS:  Vodcă  -->  vodka
SUCCESS:  Faʻafetai  -->  Gratias tibi
SUCCESS:  වොඩ්කා  -->  vodka
SUCCESS:  ووڊڪا  -->  vodka
SUCCESS:  Арақ  -->  vodka
SUCCESS:  วอดก้า  -->  vodka
SUCCESS:  ஓட்கா  -->  vodka
SUCCESS:  వోడ్కా  -->  vodka
SUCCESS:  ووڈکا  -->  vodka
SUCCESS:  वोडका  -->  vodka
SUCCESS:  ウォッカ  -->  vodka

All done in 20 seconds.
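Since the whole point is to end up with Latin letters, it can be worth checking the result afterwards. Here is a small sketch with the standard unicodedata module: is_latin tests whether every letter is a Latin-script letter, and strip_diacritics drops accents, for cases like Tröster where plain ASCII is wanted. Both function names are my own, and checking for the word LATIN in the Unicode character name is a simple heuristic, not an official script test.

```python
import unicodedata


def is_latin(text):
    """True if every alphabetic character in text is a Latin-script letter."""
    for ch in text:
        # Heuristic: Latin letters have "LATIN" in their Unicode character name.
        if ch.isalpha() and "LATIN" not in unicodedata.name(ch, ""):
            return False
    return True


def strip_diacritics(text):
    """Decompose characters and drop combining marks, e.g. 'ö' becomes 'o'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

A result that fails is_latin could be sent through the translator again or flagged for a manual look.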
