22 languages, 231 bitexts
total number of files: 71518
total number of tokens: 3303964
total number of sentence fragments: 1381582

PHP manuals and their translations were downloaded from http://www.php.net/download-docs.php. The original documents are written in English and have been partly translated into 21 languages. The original manuals contain about 500,000 words. The amount of text actually translated varies by language, from 50,000 to 380,000 words. The corpus is rather noisy, and some translations may include parts of the English original. The corpus is tokenized, and each language pair has been sentence-aligned.
cs, de, en, es, fi, fr, he, hu, it, ja, ko, nl, pl, pt, ro, ru, sk, sl, sv, tr, zh
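
The figure of 231 bitexts is consistent with aligning every unordered pair of the 22 languages exactly once. A minimal sketch of that arithmetic, assuming this pairwise reading (the function name `bitext_count` is hypothetical and not part of the corpus tooling):

```python
from math import comb


def bitext_count(num_languages: int) -> int:
    """Number of unordered language pairs, assuming one bitext per pair."""
    return comb(num_languages, 2)


# 22 languages: English plus the 21 translation languages stated above.
print(bitext_count(22))  # prints 231
```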