[Metalab] Google translate language detection

Amir Hassan amir at viel-zu.org
Mon Jan 5 14:35:01 CET 2015


Did you ever wonder how Google Translate's language detection works?
Well, I think I have an idea.
I wrote a tool (https://github/kallaballa/UniLingus) that learns to
generate gibberish resembling a specific language, using Markov chains on
bigrams.
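
For anyone who wants to play with the idea, here is a minimal sketch of a
character-level Markov chain. It is not the UniLingus code. It assumes that
"Markov chains on bigrams" means the current state is the last two characters
and the next character is drawn from what followed that pair in the training
text. The names train and generate and the sample sentence are made up for
illustration.

```python
import random
from collections import defaultdict


def train(text):
    """Map each character bigram to the list of characters seen right after it."""
    model = defaultdict(list)
    for i in range(len(text) - 2):
        model[text[i:i + 2]].append(text[i + 2])
    return model


def generate(model, length, rng):
    """Walk the chain and return exactly `length` characters of gibberish."""
    states = sorted(model)          # sorted so a seeded rng gives repeatable picks
    state = rng.choice(states)      # start from a random known bigram
    out = state
    while len(out) < length:
        followers = model.get(state)
        if not followers:
            # Dead end (bigram only seen at the very end of the training text):
            # jump to a random known bigram and keep going.
            state = rng.choice(states)
            out += state
            continue
        nxt = rng.choice(followers)  # frequent followers appear more often in the list
        out += nxt
        state = state[1] + nxt       # slide the two-character window forward
    return out[:length]


# Hypothetical training text; a real run would use a much larger corpus.
sample = "the quick brown fox jumps over the lazy dog and then the dog jumps over the fox"
print(generate(train(sample), 60, random.Random(42)))
```

With a training text this small the output mostly reshuffles fragments of the
input. A larger corpus in one language should give gibberish that looks more
like that language.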

English example:
arsesintior minen is tri traritiolin merat ol tri reseat or antinersea ic  
tri eatratr unener sserist in is diniten

German example:
mauft wert ven einer denelisstrtin aandinendeng wertert eine  
gserenenstalll mauns einer fartin oral deneneng ver

And guess what happens when you feed the text to Google Translate:

https://translate.google.com/#auto/de/arsesintior%20minen%20is%20tri%20traritiolin%20merat%20ol%20tri%20reseat%20or%20antinersea%20ic%20tri%20eatratr%20unener%20sserist%20in%20is%20diniten

https://translate.google.com/#auto/de/mauft%20wert%20ven%20einer%20denelisstrtin%20aandinendeng%20wertert%20eine%20gserenenstalll%20mauns%20einer%20fartin%20oral%20deneneng%20ver

hihi,
amir



