Machine Transliteration Based on Error-driven Learning
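- Affiliation: School of Information Science and Technology
- Published in: Proceedings of IALP, 2012
- Funding source: National Social Science Fund of China project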
- Abstract: Transliteration is a common translation method when named entities are introduced into another language. The direct orthographical mapping (DOM) approach has been applied successfully to machine transliteration: it segments a word into syllables and maps them directly into the target language without considering pronunciation. This paper studies the performance of two-stage machine transliteration based on Conditional Random Fields. To reduce the amount of computation in model training, we propose error-driven learning: the training data is divided into several groups, and the transliteration model is trained step by step on the incorrectly predicted data until performance stops improving or the computer's resource limit is reached. Experiments on the NEWS2011 data show that error-driven model training reduces computational complexity and shortens model training time. Compared with the combining transliteration model, our transliteration system increases top-1 accuracy by 0.06, reaching 0.652.
- Paper type: Conference proceedings
- Translated work: No
- Publication date: 2012-10-20
- First author: 秦颖 (Qin Ying)
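
The error-driven training loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses Conditional Random Fields, while the `train` function here is a hypothetical toy stand-in (a most-frequent-mapping table), and the function names, the dev-set stopping rule, and the placeholder data are all assumptions made for illustration. The sketch only shows the control flow: train on the first group, then add from each later group only the examples the current model gets wrong, and stop once accuracy no longer improves.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Toy stand-in for the CRF (assumption): map each source syllable
    to the target unit it is most often aligned with."""
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        for s, t in zip(src, tgt):
            counts[s][t] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def predict(model, src):
    """Transliterate a segmented word; unseen syllables become '?'."""
    return tuple(model.get(s, "?") for s in src)


def accuracy(model, pairs):
    """Top-1 word accuracy of the model on a list of (source, target) pairs."""
    if not pairs:
        return 0.0
    return sum(predict(model, s) == t for s, t in pairs) / len(pairs)


def error_driven_train(groups, dev):
    """Train on the first group, then grow the training set only with
    examples from later groups that the current model mispredicts."""
    train_set = list(groups[0])
    model = train(train_set)
    best = accuracy(model, dev)
    for group in groups[1:]:
        errors = [(s, t) for s, t in group if predict(model, s) != t]
        if not errors:
            continue  # nothing new to learn from this group
        candidate = train(train_set + errors)
        score = accuracy(candidate, dev)
        if score <= best:
            break  # performance no longer increases: stop
        train_set += errors
        model, best = candidate, score
    return model, train_set, best


# Placeholder data (hypothetical): syllable tuples mapped to dummy target units.
groups = [
    [(("ma", "ri"), ("MA", "LI"))],
    [(("ma", "ri"), ("MA", "LI")), (("to", "ma"), ("TUO", "MA"))],
    [(("ri", "to"), ("LI", "TUO"))],
]
dev = [(("to", "ri"), ("TUO", "LI")), (("ma", "to"), ("MA", "TUO"))]

model, train_set, best = error_driven_train(groups, dev)
print(len(train_set), best)
```

In this toy run only two of the four available examples end up in the training set, which is the source of the computational saving the abstract refers to: correctly predicted examples are never fed back into training.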