Valeria Filippello is a Principal Computational Linguist for SDL, a leading translation technology company. Her presentation, an Introduction to Statistical Machine Translation, Post-Editing and AdaptiveMT, was given at an NSS Careers and Networking Event organised jointly by the School of Modern Languages (MLANG), Cardiff University, and ITI Cymru Wales.
Valeria has spent 10 years in her role, her primary task being to test and develop machine translation (MT) and also to train translators in the use of MT. She is a trained translator and interpreter.
Why the need for MT?
- Ability to handle content explosion: for example, a new mobile phone is launched very quickly, and every product release needs its content translated just as quickly
- Reduced production costs
- Faster throughput
- Greater industry acceptance
There is also greater consumer acceptance of MT. An estimated 75% of web users use free MT tools, owing to the greater accessibility and integration of MT solutions.
Of these MT users, 93% use it to understand English, and over 90% are non-English speakers.
Valeria went on to describe the different types of MT technologies…
There is RBMT – Rules-based Machine Translation – where the engine consists of a set of rules, each written by a linguist.
RBMT is time-consuming to build and its application is limited.
There is SMT – Statistical Machine Translation – on the basis of a large set of examples, the engine learns translation rules for itself.
There is also Hybrid MT, which is any combination of SMT and RBMT technology.
SMT involves a large database where ‘the system “learns” how to translate by analysing statistical relationships between source and target data.’
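To make the idea of "learning" from statistical relationships concrete, here is a minimal sketch in Python. It is not SDL's engine or anything shown in the talk: it assumes a simple word-level model (the classic IBM Model 1 approach, trained with expectation-maximisation) and a tiny invented German-English corpus. The names `CORPUS`, `train_ibm1` and `best_translation` are hypothetical.

```python
from collections import defaultdict

# Toy parallel corpus (an assumption for illustration): (source, target) word lists.
CORPUS = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]


def train_ibm1(pairs, iterations=10):
    """Estimate word translation probabilities p(target | source) with EM."""
    src_vocab = {s for src, _ in pairs for s in src}
    tgt_vocab = {t for _, tgt in pairs for t in tgt}
    # Start with no knowledge: every target word is equally likely.
    prob = {(t, s): 1.0 / len(tgt_vocab) for s in src_vocab for t in tgt_vocab}

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of links per source word
        for src, tgt in pairs:
            for t in tgt:
                # Share each target word among the source words in the
                # same sentence, in proportion to the current estimates.
                norm = sum(prob[(t, s)] for s in src)
                for s in src:
                    share = prob[(t, s)] / norm
                    count[(t, s)] += share
                    total[s] += share
        # Re-estimate the probabilities from the expected counts.
        for (t, s) in prob:
            prob[(t, s)] = count[(t, s)] / total[s]
    return prob


def best_translation(prob, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {t: p for (t, s), p in prob.items() if s == source_word}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    model = train_ibm1(CORPUS)
    for word in ["das", "haus", "buch", "ein"]:
        print(word, "->", best_translation(model, word))
```

Nobody writes a rule saying that "haus" means "house". The pairing emerges because the two words keep appearing in the same sentence pairs, which is also why SMT needs large databases of good quality.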
The Pros and Cons of SMT
Pros:
- Once the learning system is in place, developing new engines is a quick process
- Translations are relatively fluent and show some context-sensitivity
Cons:
- Needs large databases of good quality to be feasible
- Engine cannot be influenced directly
- Little control over terminology
SMT provides solutions for Post-Editing (PE), which is currently an ‘en vogue’ category of employment for translators.
PE is increasingly accepted by translators.
PE needs to be sustainable in the long term.
You need the right solution and a good PE process.
There is a challenge in devising SMT solutions for Post-Editing.
- Train translators to become efficient post-editors
- Retrain MT engines taking into account post-editors’ feedback
Here are some types of SMT solutions:
- BASELINES, e.g. Google Translate
- VERTICALS – trained engines, exclusive for a domain-specific terminology
- CUSTOMISED ENGINES – a combination
Post-Editors’ feedback is essential:
- Important part of PE integration into a workflow
- Feedback from qualified post-editors is invaluable for improving output quality
There are key benefits of implementing a feedback process.
Various theorists define PE as:
- Human intervention to edit the output of a machine translation system.
Post-editing / Review
The review stage follows the PE phase. There are two types of PE:
- Post-editing to publishable quality (full PE)
- PE to understandable quality (light PE)
MT is a tool, and it is important to understand when it is useful and when it is not.
What makes a good Post-Editor?
- Positive attitude to MT
- Decent language skills as a translator
- Knowledge of expected MT behaviour
- PE practice to achieve proficiency
To post-edit effectively, one needs to understand that the aim of MT is to speed up work.
SDL, Valeria’s company, has a Post-Editing Machine Translation Certification that is free for students.
Adaptive MT is innovative machine learning.
SMT systems are static. An Adaptive MT engine, by contrast, learns interactively from the post-editors’ comments. Engines are personal. Google has Neural MT.
Adaptive MT can be very useful for freelancers.
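The difference between a static engine and an adaptive one can be illustrated with a toy sketch in Python. This is only an illustration under stated assumptions, not how SDL's AdaptiveMT works: the class `ToyAdaptiveEngine`, its whole-segment lookup, and the extra weight given to post-edits are all hypothetical.

```python
from collections import Counter, defaultdict


class ToyAdaptiveEngine:
    """Toy engine that updates its preferences from post-edits (illustrative only)."""

    # Assumption: a post-editor's correction counts for more than a baseline entry.
    POST_EDIT_WEIGHT = 2

    def __init__(self, baseline):
        # baseline: dict mapping a source segment to the engine's initial translation.
        self.counts = defaultdict(Counter)
        for source, target in baseline.items():
            self.counts[source][target] += 1

    def translate(self, segment):
        options = self.counts.get(segment)
        if not options:
            return segment  # unknown segment: pass it through unchanged
        return options.most_common(1)[0][0]

    def learn(self, segment, post_edit):
        # Record the post-editor's version so it is preferred next time.
        self.counts[segment][post_edit] += self.POST_EDIT_WEIGHT


if __name__ == "__main__":
    engine = ToyAdaptiveEngine({"power button": "bouton de puissance"})
    print(engine.translate("power button"))  # baseline suggestion
    engine.learn("power button", "bouton d'alimentation")
    print(engine.translate("power button"))  # suggestion after the post-edit
```

A static system would keep offering the baseline suggestion until it was retrained. Here the correction takes effect on the very next segment, and because each engine accumulates one person's corrections, it becomes personal to that post-editor.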
Valeria’s talk was most interesting and covered an area of the translation industry in MT that I find most appealing. The event was well supported with plenty of staff and students in attendance from the university plus a large contingent of ITI Cymru Wales members.