2017/2018
On the Design of Web Crawlers for Constructing an Efficient Chinese-Portuguese Bilingual Corpus System
2018 International Conference on Electronics, Information, and Communication (ICEIC) 2018:9-12, IEEE
Author(s) | Sio Tai Cheong / Jiabo Xu / Yue Liu |
---|---|
Summary | Machine Translation has been a popular and important topic in Natural Language Processing (NLP) over the last few decades. This paper focuses on the design of web crawlers for constructing a Chinese-Portuguese bilingual corpus, which would be used in corresponding Machine Translation systems. It implements a bilingual corpus construction process, beginning with corpus collection by web crawlers based on different sources. By this means, the system can be considered an innovative and reasonable attempt at setting up Chinese-Portuguese bilingual corpora, and it has solved some practical problems at the initial stage of corpus construction. |