Tatiana Lesnikova, Jérôme David, Jérôme Euzenat, Interlinking English and Chinese RDF data sets using machine translation, in: Johanna Völker, Heiko Paulheim, Jens Lehmann, Harald Sack, Vojtech Svátek (eds), Proc. 3rd ESWC workshop on Knowledge discovery and data mining meets linked open data (Know@LOD), Hersounisos (GR), 2014
Data interlinking is a difficult task, particularly in a multilingual environment like the Web. In this paper, we evaluate the suitability of a machine translation approach for interlinking RDF resources described in English and Chinese. We represent resources as text documents, and the similarity between documents is taken as the similarity between the resources they describe. Documents are represented as vectors using two weighting schemes, and cosine similarity is then computed between them. The experiment demonstrates that TF*IDF with a minimal amount of preprocessing can yield good results.
Semantic web, Cross-lingual link discovery, Cross-lingual instance linking, owl:sameAs
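
The comparison step described in the abstract can be illustrated with a minimal sketch. It assumes that the Chinese resource descriptions have already been machine-translated into English and tokenized. The function names, the toy documents, and the particular TF*IDF formula (raw term frequency times log(N/df)) are assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF*IDF vectors (dicts) for a list of tokenized documents.

    Assumed weighting: raw term frequency * log(N / document frequency).
    """
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return [
        {term: count * math.log(n / df[term]) for term, count in Counter(tokens).items()}
        for tokens in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def link_by_similarity(source_docs, target_docs):
    """For each source document, return the index of the most similar target."""
    vectors = tfidf_vectors(source_docs + target_docs)
    src = vectors[: len(source_docs)]
    tgt = vectors[len(source_docs):]
    return [
        max(range(len(tgt)), key=lambda j: cosine(s, tgt[j]))
        for s in src
    ]


# Hypothetical toy data: English descriptions, and Chinese descriptions
# assumed to be already translated into English.
english = [["paris", "capital", "france"], ["berlin", "capital", "germany"]]
translated = [
    ["berlin", "city", "germany", "capital"],
    ["paris", "france", "capital", "city"],
]

links = link_by_similarity(english, translated)
print(links)  # [1, 0]
```

In this toy example the term "capital" occurs in every document, so its IDF weight is zero and the match is decided by the more distinctive terms. Each resulting pair is a candidate for an owl:sameAs link.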