Martin Benjamin is a multilingual lexicographer and language technologist who trained as neither a computer scientist nor a linguist. While learning Swahili for his Ph.D. in Anthropology from Yale, which examined the relationship between poverty and aid in rural Tanzania, he launched the Kamusi Project in 1994, using what would later be called crowdsourcing to compile lexicographic data in the Internet Living Swahili Dictionary. Requests to include other African languages in Kamusi led to the development of a multilingual data model that can, in principle, interlink an unlimited number of languages.
In this podcast, Martin shares his thoughts about cars and the issue of “underserved languages,” and explains why he believes that Google Translate and other machine translation engines have “gotten it wrong.” He also talks about the work that Kamusi is doing to help make information from all languages more accessible.
The Pirate Professor