Automated Texting Services for Low-Resource Languages

Following our class period with Robert Munro, I browsed through his Twitter feed and found an article describing his PhD topic, linked in an August 9th Tweet. In the article, he elaborates on some of the concepts discussed in class. As he explains, many of the world's 5,000+ languages are being written for the first time with the proliferation of mobile telephony, but the technology to process these languages cannot keep up. Compounding the problem, phone users have varied literacy levels, which leads to inconsistent spelling from one user to the next. However, he concludes that automated information systems can pull out the words that are least likely to vary in spelling (i.e., names of people, places, and organizations) and examine subword variation by identifying affixes within words, as well as accounting for phonological or orthographic variation (e.g., recognize vs. recognise). The article goes on to provide more technical prescriptions for automated text response services, and in a separate Tweet he links to another article describing Powerset, a natural language search system that ultimately failed but made use of a few valuable processes.
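
To make the idea of modeling spelling variation a bit more concrete, here is a minimal sketch of my own, not Dr. Munro's actual method. It assumes a tiny, hand-written set of rewrite rules for English (the `RULES` list and the function names `normalize` and `group_variants` are hypothetical), whereas a real system for a low-resource language would presumably have to learn such patterns from the messages themselves.

```python
import re
from collections import defaultdict

# Hypothetical, hand-written rewrite rules for English spelling variants.
# These are illustrative only and deliberately crude: they can over-merge
# unrelated words, and the resulting keys need not be real words.
RULES = [
    (re.compile(r"ise\b"), "ize"),  # recognise -> recognize
    (re.compile(r"our\b"), "or"),   # colour -> color
]

def normalize(word):
    """Map a word to a shared key so that spelling variants collide."""
    key = word.lower()
    for pattern, replacement in RULES:
        key = pattern.sub(replacement, key)
    return key

def group_variants(words):
    """Group the observed spellings under their normalized key."""
    groups = defaultdict(set)
    for word in words:
        groups[normalize(word)].add(word)
    return dict(groups)

if __name__ == "__main__":
    messages = ["recognise", "recognize", "colour", "color", "clinic"]
    for key, variants in group_variants(messages).items():
        print(key, sorted(variants))
```

Once variants share a key, an automated responder could treat "recognise" and "recognize" as the same word when deciding how to answer a message.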

Ultimately, Dr. Munro implies that automated text services in “low-resource languages” are well within reach, particularly because the messages are generally just one to two sentences long. Because spelling variations are predictable, they can be modeled, and the messages can then, hopefully, be answered reliably by automated systems. However, these systems will not see real use until they become more reliable and efficient than human responders, who, as he explained in class, can be extremely effective.