Practical Q&A | How do you use Memsource machine translation glossaries?
The following article comes from the Yantai Yibo Yuntian company; author: Memsource

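Memsource is an online computer-assisted translation tool that is easy to use and fully featured, covering the basics such as an editor, translation memories, term bases, and QA. Memsource also integrates multiple MT engines and lets you manage them. In addition, with Memsource machine translation glossaries, translators can manually steer the MT output to match their needs, which can greatly improve post-editing efficiency.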
Machine translation glossaries: why they matter and how to use them
Machine translation glossaries are one of the simplest ways to customize MT. Learn what they are, why they matter, and how to leverage them to improve MT output in the long run.
With machine translation (MT), precision and recall are critical to success. Every translation counts. The more curated and accurate the information you provide for your MT engines, the better they’ll perform.
Translator’s note: In machine translation tasks, BLEU and ROUGE are two commonly used evaluation metrics. BLEU measures translation quality based on precision, while ROUGE measures it based on recall.
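To make the precision and recall distinction concrete, here is a minimal sketch that scores a candidate translation against a reference at the single-word level. It is a deliberate simplification: real BLEU combines several n-gram orders with a brevity penalty, and ROUGE has several variants. The function name `unigram_precision_recall` is made up for this illustration.

```python
from collections import Counter
from typing import Tuple


def unigram_precision_recall(candidate: str, reference: str) -> Tuple[float, float]:
    """Word-level precision and recall of a candidate against one reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per word ("clipped" matches).
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    return precision, recall


# Every candidate word is in the reference, but half the reference is missing.
precision, recall = unigram_precision_recall("the cat sat", "the cat sat on the mat")
print(precision, recall)  # 1.0 0.5
```

Precision asks how much of the output is correct; recall asks how much of the reference was covered. A short but accurate output can score high on the first and low on the second.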
What are machine translation glossaries?
Glossaries, in the context of machine translation, are collections of words and phrases, each with a preferred translation. They’re sometimes referred to as:
Custom terminology
Custom vocabulary
Custom dictionaries, etc.
MT glossaries are similar to term bases, but instead of being used by linguists, they are designed to be used by machine translation software.
When attached to MT engines, glossaries help improve the quality of the MT output by ensuring that the MT engines correctly apply pre-determined terminology.
Before translating a source text, the MT engine compares it against the attached glossary file to identify terms that have a preferred translation, and then applies those translations.
It’s important to note that an MT glossary doesn’t re-train an engine; it simply overrides any matching term with a predetermined translation.
Why are MT glossaries important?
MT engines have dramatically improved in output quality over the past few years. Nevertheless, they still lack the contextual understanding of a human translator.
This means they can make some very basic errors, especially when handling an ambiguous word or a term that has a specific meaning in a given context.
Since glossaries are adapted to a domain’s or company’s specific terminology, they make machine translation output far more accurate than if the engine drew only on general-purpose data sets.
How do MT glossaries work?
The steps that an MT engine usually follows are:
Receive a source text
Translate the source text
Display the output translation
With an MT glossary included, MT engines add an intermediate step to the process:
Receive the source text
Translate the source text
Search the translation and replace matches with your preferred terminology
Display the output translation
To put it another way, with the help of glossaries, the MT engine searches for matches and automatically applies them while translating.
For example, suppose you have a brand of Bluetooth speaker called “Connected,” and you want to translate the following sentence into Spanish: “Your Connected device was not detected.”
Without an MT glossary, your MT engine would produce something like the following result: “No se ha detectado tu dispositivo conectado” (literal back-translation into English: “Your connected device was not detected”). As you can see, the brand name “Connected” has been translated as “conectado,” which would be incorrect in this case.
If you add the brand name “Connected” to your MT glossary, you can enforce the non-translatability of the term. In that case, the MT engine will produce this result: “No se ha detectado tu dispositivo Connected.” This is spot on: using an MT glossary significantly improves accuracy by automatically providing the desired translation.
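The “Connected” example can be sketched in a few lines. The code below uses placeholder protection, one common way to enforce a glossary term: the term is swapped for a token before translation and restored afterwards. This is an illustrative assumption, not a description of how Memsource or any particular engine implements glossaries. `toy_mt` is a hypothetical stand-in for a real engine and only knows the two sentences needed here.

```python
import re
from typing import Callable, Dict


def toy_mt(text: str) -> str:
    """Hypothetical stand-in for an MT engine: a two-entry lookup table."""
    table = {
        "Your Connected device was not detected.":
            "No se ha detectado tu dispositivo conectado.",
        "Your __TERM0__ device was not detected.":
            "No se ha detectado tu dispositivo __TERM0__.",
    }
    return table[text]  # raises KeyError for any other sentence


def translate_with_glossary(text: str, glossary: Dict[str, str],
                            mt: Callable[[str], str] = toy_mt) -> str:
    """Protect glossary terms with placeholders, translate, then restore."""
    protected = text
    slots: Dict[str, str] = {}
    for i, (source_term, target_term) in enumerate(glossary.items()):
        # Case-sensitive whole-word match, so "connected" is left alone.
        pattern = re.compile(rf"\b{re.escape(source_term)}\b")
        token = f"__TERM{i}__"
        if pattern.search(protected):
            protected = pattern.sub(token, protected)
            slots[token] = target_term
    output = mt(protected)
    for token, target_term in slots.items():
        output = output.replace(token, target_term)
    return output


sentence = "Your Connected device was not detected."
print(translate_with_glossary(sentence, {}))
# No se ha detectado tu dispositivo conectado.
print(translate_with_glossary(sentence, {"Connected": "Connected"}))
# No se ha detectado tu dispositivo Connected.
```

Mapping “Connected” to itself is what makes the term non-translatable. The same mechanism handles a real translation by putting a different target term in the glossary.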
Best practices for using MT glossaries
To ensure MT glossaries remain reliable and always up to date, here are a few best practices to follow:
Keep it simple: Small glossaries that focus only on the most essential terms tend to be more effective; massive glossaries could even harm your translation output.
Limit customizations to words that you want translated in only one way: The translation suggested by the MT engine should match exactly what you want.
Ensure glossaries are free of errors: Keep your terms free of spelling mistakes, formatting errors, or incorrect translations.
Avoid having duplicate terms: MT engines can struggle to apply the correct term if multiple instances are found.
Post-edit essential translations: While glossaries can enhance translation quality, don’t trust them blindly. High-quality human checks on your MT output are always the best guarantee of accuracy. This process is called “post-editing.”
Be mindful of your language pair: In morphologically complex languages, like Finnish, Arabic, or Turkish, words may change shape depending on the context, so customizations for these languages may not always produce the best results.
Review documentation: Although the basic glossary functionality is similar across MT engines, the specifics might differ; it may be helpful to read the available documentation to find out how to best work with a given engine.
Not all kinds of terms are appropriate for glossaries: For the best results, focus on compound nouns; examples often include product names, like “Postmates,” or other specific terms like “WeWork.”
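Several of these practices (no errors, no duplicates) can be checked mechanically before a glossary is uploaded. The following is a minimal sketch, assuming the glossary is held as a list of (source, target) pairs; the function name `lint_glossary` and the sample entries are invented for illustration, and a real glossary file format may need different checks.

```python
from collections import Counter
from typing import List, Tuple


def lint_glossary(entries: List[Tuple[str, str]]) -> List[str]:
    """Return human-readable problems found in (source, target) pairs."""
    problems: List[str] = []
    seen: Counter = Counter()
    for row, (source, target) in enumerate(entries, start=1):
        if not source.strip() or not target.strip():
            problems.append(f"row {row}: empty source or target")
            continue
        if source != source.strip() or target != target.strip():
            problems.append(f"row {row}: leading or trailing whitespace")
        seen[source.strip()] += 1
    for source, count in seen.items():
        if count > 1:
            problems.append(f"'{source}' is listed {count} times")
    return problems


sample = [
    ("Connected", "Connected"),
    ("TMS", "TMS"),
    ("TMS", "sistema de gestión de traducción"),  # duplicate source term
    ("crane ", "grúa"),                           # stray trailing space
    ("lead", ""),                                 # missing translation
]
for problem in lint_glossary(sample):
    print(problem)
```

A check like this catches formatting slips only; whether a translation is actually correct still needs a human reviewer.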
What terms are suitable for MT glossaries?
To maximize the impact and accuracy of MT glossaries, it’s important to use them for specific types of terms:
Product names like “Ford Mondeo,” “Samsung Galaxy Note 5,” etc.
Company names like “Apple,” “Microsoft,” etc.
Ambiguous words, e.g., homonyms (multiple-meaning words) like “crane” (a machine vs. an animal) or “lead” (the metal vs. a potential client)
Abbreviations: A shortened form of a word or phrase that’s frequently used in the industry or domain of interest, e.g., TMS for “translation management system”
Borrowed words: Foreign words that the MT engine will likely keep in the original language, like the French dish “côte de boeuf,” but which you nevertheless want to translate (in this case, as “rib eye”).
What terms are less suitable for MT glossaries?
At the same time, some morphological categories are less suitable to be documented and used in a machine translation glossary:
Verbs: MT glossaries can’t conjugate them correctly in grammatical person, number, gender, tense, aspect, mood, voice, degree of formality, clusivity, transitivity, or valency.
Inflected languages with many cases and grammatical genders: MT glossaries can’t currently change the form or ending of a word when its role in the sentence changes.
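A small sketch shows why exact-match glossaries struggle with inflection. It looks up a glossary term as a whole word in Finnish source text, where “talo” (house) becomes “talossa” (in a house) in the inessive case. The helper `glossary_hits` is hypothetical, and the example covers only the matching side of the problem; producing a correctly inflected target term is a separate difficulty.

```python
import re
from typing import Dict, List


def glossary_hits(text: str, glossary: Dict[str, str]) -> List[str]:
    """Return the glossary source terms that occur in text as whole words."""
    return [
        term for term in glossary
        if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE)
    ]


glossary = {"talo": "house"}

# Base form: the entry matches.
print(glossary_hits("Tämä talo on uusi.", glossary))  # ['talo']

# Inflected form "talossa": the whole-word lookup finds nothing.
print(glossary_hits("Asun talossa.", glossary))  # []
```

Listing every inflected form as its own entry would work around this, but it conflicts with the advice to keep glossaries small.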
Managing MT glossaries for all engines directly within a TMS
Translation management systems (TMS) allow localization managers not only to centralize and automate the localization workflow but also to make full use of well-established translation technology like translation memories and glossaries.
Modern TMS solutions, like Memsource, enable the use and management of glossaries without the need to upload and manage them with each individual MT provider.
In Memsource, you can directly upload, edit, and use MT glossaries for all supported engines, which can significantly reduce deployment and management time.
How does glossary support work with each MT engine in Memsource?
MT glossaries are available as a part of Memsource Translate, the platform’s MT management hub. Besides MT glossaries, Memsource Translate subscribers can take advantage of a number of fully managed machine translation engines and advanced AI-powered features like MT Quality Estimation and MT Autoselect.

Memsource MT glossaries
Through Memsource Translate, users can also add their own MT glossaries, which they can apply to fully managed MT engines:
Google Translate
Amazon Translate
DeepL
Microsoft Translator
Rozetta Translate
Tencent TranSmart

Memsource fully managed MT engines
Once you have created a custom glossary, you need to attach it to an existing MT profile. You can create multiple MT glossaries and use them for different translation projects.
Looking to the future
MT glossaries are a simple and effective way to increase machine translation output quality. This is especially true for:
Domains with low-frequency terms, or with translation memories that aren’t very large or well curated
Small to mid-sized companies without big enough datasets to use custom MT
Bigger companies that have compiled substantial amounts of terminology data over several years or decades, where the data isn’t consistent or where language and style best practices have evolved or changed
Nevertheless, MT glossaries come with limitations as well. At some point, an MT glossary can get so large that it becomes a burden for the localization managers who maintain it: regular updates may become a headache and carry a higher risk of accidentally introducing errors.
Equally important, most MT glossaries available on the market still rely on search-and-replace functionality. With the continuous improvement in MT technology, engines are expected to get even better and to let everyone use glossary terms with morphologically correct inflections.
To make the most of their machine translation efforts, localization managers should always weigh their needs and available resources before deciding whether custom machine translation glossaries are right for their use case.
What are MT Glossaries? (See the original post for the video.)
Links:
1. https://www.memsource.com/machine-translation/
2. https://www.memsource.com/blog/post-editing-machine-translation-best-practices/
3. https://www.memsource.com/translation-management-system/
4. https://www.memsource.com/features/machine-translation/
5. https://help.memsource.com/hc/en-us/articles/4409263455762-MT-Glossaries
Reposted from: the Yantai Yibo Yuntian company WeChat official account
Repost editor: Ding Yuxiang
The translation is for reference only; if you spot anything inaccurate, please leave us a message.

This article comes from the WeChat official accounts “翻译技术教育与研究” (Translation Technology Education and Research) and “语言服务行业” (Language Service Industry).