Domain Adaptation in Machine Translation
Three main topics can be identified, depending on the availability of domain-specific data.
Domain adaptation (DA) techniques in machine translation (MT) have been widely studied. Domain adaptation is the main subject of 143 publications. Often there is a mismatch between the domain for which training data are available and the target domain of a machine translation system. Domain Adaptation of Neural Machine Translation by Lexicon Induction. Junjie Hu, Mengzhou Xia, Graham Neubig, Jaime Carbonell. Language Technologies Institute, School of Computer Science, Carnegie Mellon University. Abstract: It has been previously noted that neural machine translation …
Domain adaptation in statistical machine translation: 28 are discussed here. Previous unsupervised domain adaptation strategies include training the model with in-domain copied monolingual data or back-translated data.
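The two unsupervised strategies named above, copied monolingual data and back-translated data, can be sketched as follows. This is a minimal illustration rather than any cited paper's implementation. The function names and the `reverse_translate` callable (a stand-in for a target-to-source MT system) are hypothetical and are not part of any particular toolkit.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# A training pair is (source_side, target_side).
Pair = Tuple[str, str]


def copied_pairs(target_sentences: Iterable[str]) -> List[Pair]:
    """Copied monolingual data: each in-domain target sentence is reused
    unchanged as its own source side."""
    return [(t, t) for t in target_sentences]


def back_translated_pairs(
    target_sentences: Iterable[str],
    reverse_translate: Callable[[str], str],
) -> List[Pair]:
    """Back-translated data: a target-to-source system (hypothetical
    callable here) produces a synthetic source for each target sentence."""
    return [(reverse_translate(t), t) for t in target_sentences]


def build_adaptation_corpus(
    generic_pairs: Iterable[Pair],
    in_domain_target: Iterable[str],
    reverse_translate: Optional[Callable[[str], str]] = None,
) -> List[Pair]:
    """Concatenate generic parallel data with pseudo-parallel in-domain data.

    Copied pairs are always added. Back-translated pairs are added only
    when a reverse translation function is supplied.
    """
    targets = list(in_domain_target)  # materialize so it can be reused
    corpus = list(generic_pairs)
    corpus.extend(copied_pairs(targets))
    if reverse_translate is not None:
        corpus.extend(back_translated_pairs(targets, reverse_translate))
    return corpus
```

In practice `reverse_translate` would wrap a trained target-to-source model, and the resulting corpus would be used to continue training the source-to-target model. How the generic and pseudo-parallel portions are weighted is left open here.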
PhD thesis, Dublin City University. Although high-quality, domain-specific translation is crucial in the real world, domain-specific corpora are usually scarce or nonexistent, and thus vanilla NMT performs poorly. Domain adaptation is useful for specializing current generic machine translation models, mainly when the specialized corpus is too limited to train a separate model. For statistical machine translation (SMT), several DA methods have been proposed to overcome the lack of domain-specific data.
Furthermore, domain adaptation techniques can be handy for low-resource languages that share vocabulary and structure with resource-rich languages of the same family. (1) If any in-domain data is available, it can be directly used to improve the MT system, e.g. by combining the … However, these methods use generic representations for text. Domain adaptation is a very active research topic within the area of SMT.
Domain adaptation is required when domain-specific data is scarce or nonexistent. Neural machine translation (NMT) is a deep-learning-based approach to machine translation that yields state-of-the-art translation performance in scenarios where large-scale parallel corpora are available. ORCID 0000-0001-5659-5865 (2017). Domain adaptation for statistical machine translation and neural machine translation.