YiSi software - Semantic machine translation evaluation metric

Overview of the tool

YiSi is open-source software that evaluates the accuracy of meaning in output sentences produced by machine translation systems. It uses datasets that contain word embeddings to estimate the relationships of meaning between words, and assigns an accuracy score from 0 to 100 to each translated sentence. The software was developed by the National Research Council of Canada's Digital Technologies Research Centre.

Target users

  • Developers of machine translation systems
  • Computational linguists

Benefits to users

  • YiSi can pinpoint problems in machine translation output, which helps developers identify areas that require improvement
  • YiSi scores correlate highly with human scoring of the accuracy of meaning in translated sentences, which helps developers evaluate and compare machine translation systems

System requirements

  • YiSi was developed to run on Linux.
  • YiSi is written in C++ and requires a version of g++ that supports C++11; we're using GCC 4.9.3.
  • YiSi requires make; we're using GNU make version 3.81.
  • YiSi requires bash; we're using GNU bash version 4.1.2.

Technical tool description

YiSi is a family of semantic machine translation (MT) evaluation metrics with a flexible architecture for evaluating MT output in languages with differing amounts of training resources. Inspired by the MEANT 2.0 software (Lo, 2017), YiSi-1 measures the similarity between the human reference and the MT output by aggregating weighted distributional lexical semantic similarity and, optionally, shallow semantic structures. YiSi-0 is a degenerate, resource-free version that replaces distributional semantics with longest common character substring accuracy when evaluating lexical similarity between the human reference and the MT output. YiSi-2 is the bilingual, reference-less version that uses bilingual word embeddings to evaluate cross-lingual lexical semantic similarity between the input and the MT output.
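
The YiSi-0 idea described above can be illustrated with a minimal Python sketch. This is not the YiSi implementation (which is written in C++), and it leaves out the optional shallow semantic structures. The function names, the normalization 2 x LCS / (len(a) + len(b)), the uniform default word weight, the whitespace tokenization and the alpha parameter are all assumptions made for illustration only.

```python
def longest_common_substring_len(a: str, b: str) -> int:
    """Length of the longest contiguous character run shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def lexical_similarity(a: str, b: str) -> float:
    """Assumed normalization: 2 * LCS / (len(a) + len(b)), a value in [0, 1]."""
    if not a or not b:
        return 0.0
    return 2.0 * longest_common_substring_len(a, b) / (len(a) + len(b))


def weighted_side(src, tgt, weight):
    """Weighted average, over src words, of the best similarity to any tgt word."""
    total = sum(weight(w) for w in src)
    if total == 0 or not tgt:
        return 0.0
    best_matches = (
        weight(w) * max(lexical_similarity(w, t) for t in tgt) for w in src
    )
    return sum(best_matches) / total


def sentence_score(mt: str, ref: str, alpha: float = 0.5,
                   weight=lambda w: 1.0) -> float:
    """Combine precision (MT vs. reference) and recall (reference vs. MT)."""
    mt_words, ref_words = mt.split(), ref.split()
    precision = weighted_side(mt_words, ref_words, weight)
    recall = weighted_side(ref_words, mt_words, weight)
    denom = alpha * precision + (1 - alpha) * recall
    return 0.0 if denom == 0 else precision * recall / denom


if __name__ == "__main__":
    print(sentence_score("the cats sat", "the cat sat"))
```

In this sketch, swapping `lexical_similarity` for a cosine similarity between word embeddings, and `weight` for a corpus-derived word weight, would move it toward the embedding-based approach described for YiSi-1. The sketch returns a value between 0 and 1.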

YiSi-1 achieved the highest average correlation with human direct assessment (DA) judgment across all language pairs at the system level, and the highest median correlation with DA relative ranking across all language pairs at the segment level, in the metrics task of the 2018 Third Conference on Machine Translation (WMT2018) (Ma et al., 2018). YiSi-1 was also used successfully in the WMT2018 parallel corpus filtering task, and YiSi-2 showed comparable accuracy in the same task.

YiSi-0 is readily available for evaluating all languages. YiSi-1 requires a monolingual corpus in the output language to train the distributional lexical semantics model. YiSi-1_srl is designed for resource-rich languages that are equipped with an automatic semantic role labeler in the output language. YiSi-2 requires bilingual word embeddings, and YiSi-2_srl additionally requires an automatic semantic role labeler for both the input and output languages.
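
The resource requirements above can be summarized as a small decision helper. The Python sketch below is a hypothetical illustration and not part of the YiSi distribution. The function name and its flags are invented here, and it assumes that YiSi-1_srl needs the same output-language embeddings as YiSi-1 in addition to the semantic role labeler.

```python
def available_variants(has_output_embeddings=False, has_output_srl=False,
                       has_bilingual_embeddings=False, has_input_srl=False):
    """Return the YiSi variants usable with the given resources (hypothetical helper)."""
    # YiSi-0 is resource-free, so it is always available.
    variants = ["YiSi-0"]
    if has_output_embeddings:
        # Embeddings trained on a monolingual corpus in the output language.
        variants.append("YiSi-1")
        if has_output_srl:
            variants.append("YiSi-1_srl")
    if has_bilingual_embeddings:
        variants.append("YiSi-2")
        # The srl variant needs a labeler for both input and output languages.
        if has_input_srl and has_output_srl:
            variants.append("YiSi-2_srl")
    return variants


if __name__ == "__main__":
    print(available_variants(has_output_embeddings=True, has_output_srl=True))
```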


YiSi is available free of charge for research and commercial purposes. Contact us to find out more.


Download YiSi and its word embeddings

Master code used to run sentence evaluation:

Pretrained word embeddings, accessible in the NRC Digital Repository:

  • Chinese, tokenized by Stanford Chinese segmenter
  • Czech
  • English
  • Estonian
  • Finnish
  • French
  • German
  • Hindi
  • Latvian
  • Polish
  • Romanian
  • Russian
  • Spanish
  • Turkish

Contact us

Technical enquiries
Jackie Lo, Research Officer
Telephone: 613-993-0620
Email: Jackie.Lo@nrc-cnrc.gc.ca

Business enquiries
Pierre Charron, Business Development Officer
Telephone: 613-990-0336
Email: Pierre.Charron@nrc-cnrc.gc.ca