Hacker News

This becomes even more awful when the target language is not English, because English is used as a pivot language: there are now two rounds of errors. I’ve heard from a researcher specializing in machine translation that a rule-based Japanese-to-French system exists and is quite good, but I’ve never seen the system itself.
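
To put rough numbers on the "two rounds of errors" point, here is a minimal sketch. It assumes, purely for illustration, that each translation hop gets a sentence right independently with a fixed probability. The function name `pivot_accuracy` and the 0.9 figures are made up for the example, not measurements of any real system.

```python
def pivot_accuracy(*hop_accuracies: float) -> float:
    """Chance that a sentence survives every hop, assuming independent errors."""
    result = 1.0
    for acc in hop_accuracies:
        if not 0.0 <= acc <= 1.0:
            raise ValueError("accuracy must be between 0 and 1")
        result *= acc  # each extra hop multiplies in another chance to fail
    return result


# Illustrative (assumed) per-hop accuracies, not real benchmark numbers.
direct = pivot_accuracy(0.9)         # Japanese -> French in one hop
pivoted = pivot_accuracy(0.9, 0.9)   # Japanese -> English -> French

print(f"direct:  {direct:.2f}")
print(f"pivoted: {pivoted:.2f}")
```

Under that independence assumption the pivoted route can only match or lose to a direct system of the same per-hop quality. Real errors are correlated, so treat this as intuition rather than a model.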


I wonder, would another language serve better as a "pivot", all else equal (like, let's ignore corpus size)? English is a wonderfully expressive but also frustrating language because of its vocabulary's flexibility, ambiguity, and adaptability.


French used to be the language in which international laws and treaties were written because it’s supposed to be one of the most precise.

later, German was the language of choice of philosophers because of its rigorous structure and the ability to make new words just by combining existing ones.

now it’s English, because it’s the easiest to learn.

but the real reason is that the economic power of the time imposes its language. English was probably chosen only because the software developers were English speakers.

I think a Slavic language would be the best though, because every single word takes a different form depending on whether it is about a man or a woman, etc. So there’s probably the least amount of ambiguity.

But probably an invented intermediate language would be the best. Except that Google-style translators don’t try to infer any meaning, which makes the translation hard to reason about.


Funny, English sounds like the PHP programming language when it's used broadly: relatively easy to understand, but sometimes the rules are too loose. Some other languages are stricter, but not everyone likes the strictness.


A paper or webpage I read long ago about Bing Translate internals mentioned the use of an artificial intermediate language.


Esperanto?


why have one pivot at all? why can't French be the pivot between Spanish and English?

if we must have one and you believe in euroversals, then German and/or French might be better than English…


Resource availability. English/X or X/English pairs probably exist for a lot of X and may be well resourced. But for any other X/Y pair, resources may not exist at all or may be too small to use. That being said, it is still interesting to develop this kind of resource (for example dictionaries) because they can be used for other NLP tasks.
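
One way to picture this: treat the available language pairs as edges in a graph and look for the shortest chain between two languages. A rough sketch follows, where the `AVAILABLE` inventory is hypothetical (not a survey of real resources) and `translation_path` is just an illustrative name.

```python
from collections import deque


def translation_path(pairs, source, target):
    """Shortest chain of languages from source to target, using only
    the directed pairs we have resources for (breadth-first search)."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)

    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        # sorted() only makes the visiting order deterministic
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of resources connects the two languages


# Hypothetical inventory: every language pairs with English,
# and Spanish/French also happen to have a direct pair.
AVAILABLE = [
    ("ja", "en"), ("en", "ja"),
    ("en", "fr"), ("fr", "en"),
    ("es", "en"), ("en", "es"),
    ("es", "fr"), ("fr", "es"),
]

print(translation_path(AVAILABLE, "ja", "fr"))  # goes through English
print(translation_path(AVAILABLE, "es", "fr"))  # direct pair, no pivot needed
```

With an inventory shaped like this, English ends up as the pivot simply because it is the only node connected to everything else, not because of any linguistic property.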



