Machine Translation Marathon 2008
May 12-17, 2008, Berlin, Germany
- Spring School class on current methods in statistical MT
- Research showcase
- Open source convention on resources for machine translation
- Workshop on evaluation of European language translation
- Translingual Europe conference
Where & When:
Wandlitz (near Berlin), SeePark Kurhotel am Wandlitzsee (May 12 - May 17, 2008)
Berlin (May 19 & 20, 2008)
How much: Attendance is free of charge, but limited.
We recommend booking the hotel's package, which costs €372.25 for 5 nights and includes dinner and breakfast as well as 6 lunches.
The Spring School takes place from May 12 to 16. In daily lectures and labs, students and beginning researchers in the field of statistical machine translation are introduced to recent methods and to the use of currently available open research tools. Invited talks report on recent research in the field. At the end of the school, participating students will know how to build state-of-the-art machine translation systems from parallel corpora for any language pair and will understand current research problems in machine translation.
Day 1: Introduction to MT and MT Evaluation
Day 2: Word-based models and training
Day 3: Phrase-based models and decoding
Day 4: Factored translation models and discriminative training
Day 5: Tree-based models
In addition to the lectures of the Spring School, leading researchers in the field will give presentations on new methods in statistical machine translation.
Tue 11:00 "Translation by Pattern Matching", Adam Lopez, University of Edinburgh
Tue 11:45 "Improved Word Alignment in Statistical MT", Alex Fraser, University of Stuttgart
Wed 11:00 "Problems of Deep Syntactic English-to-Czech MT", Ondrej Bojar, Charles University
Wed 11:45 "Smoothing and Data Selection in Large SMT Systems", Holger Schwenk, LIUM
Thu 11:00 "Architecture of the Lucy Translation System", Petra Gieselmann, Lucy Software
Thu 11:45 "Hybrid Architectures for Machine Translation", Andreas Eisele, University of Saarbruecken
Fri 11:00 "Re-Usable Hybrid Machine Translation", Stephan Oepen, Norwegian Institute for Science and Technology
Fri 11:45 "A Data-Driven Approach to Deep Machine Translation", Michael Jellinghaus, University of Saarbruecken
Open Source Convention
The EuroMatrix project is dedicated to the development of open source tools to foster research and development of machine translation systems. One focus is the development of the Moses decoder, which is widely used in academic research as a baseline and benchmark. The open source convention brings together developers of the Moses decoder and other open source efforts for intensive collaboration over a 5-day period (May 12-16). The sixth day of the MT Marathon (May 17) is dedicated to open source (Open Source Day). In a public event, recent developments and available tools are presented to a wider audience.
09:00 "The IRST Language Model Library", Nicola Bertoldi, FBK
09:45 "Statistical MT Model Estimation with MapReduce", Chris Dyer, University of Maryland
11:00 "Software Engineering in Moses", Hieu Hoang, University of Edinburgh
11:30 "Roadmap for Moses", Philipp Koehn, University of Edinburgh
14:00 Proposals for Projects
16:00 Reports from Activities at the Open Source Convention
Hands-on development: Developers who already have some experience in machine translation methods and good programming skills are invited to collaborate with the original Moses developers and other experts in the field on short but intense 5-day projects.
Call for proposals: The EuroMatrix project calls for proposals from developers of machine translation systems to extend and augment current open source tools for statistical machine translation. Such projects, which should involve roughly 6 months of work and may be carried out as student projects at universities, development activities at companies, or hobbyist projects, must add to the freely available open source tool set for machine translation. Some proposers will be selected to present their ideas at the Open Source Day, and the winning proposal will be awarded a €1000 grant. Send proposals (1-2 pages) by April 18 to firstname.lastname@example.org.
During the MT Marathon, participants will have a chance to evaluate state-of-the-art machine translation quality for a number of European languages. Participants will judge translations between English, German, Spanish, French, Czech and Hungarian. We will use this as an opportunity to discuss the types of errors that statistical translation systems currently make, and how these systems might be improved. The workshop will also examine current research into automatic evaluation metrics for translation.
If you would like your machine translation system's output to be judged in the evaluation workshop, you can submit an entry to the shared task of the EuroMatrix-sponsored ACL workshop on statistical machine translation (see http://www.statmt.org/wmt08/).
The international conference Translingual Europe aims to inform invited representatives of industry, commerce, research and administration about recent progress in translation technology. In order to determine the requirements and the state of the art with respect to the European languages, the EU-funded project EuroMatrix has organized an open technology contest and a survey of available products and resources. At the Berlin conference, the results of these endeavors will be presented and discussed. A third theme of the conference is the discussion of opportunities and challenges for European research, development and technology transfer in this important application area of information technology.