ARTICLE
TITLE

Word Sense Disambiguation Using Semantic Web for Tamil to English Statistical Machine Translation

Santosh Kumar T.S.    

Abstract

Machine Translation (MT) has been an area of linguistic research for more than two decades, yet devising an automated system that delivers accurate translations of natural languages remains very challenging. Great strides have nevertheless been made in this field, owing largely to the development of web technologies, and of late there has been renewed interest in this area of research.

Technological advancements over the preceding two decades have influenced Machine Translation considerably. Several MT approaches, including Statistical Machine Translation (SMT), have benefited greatly from these advancements, chiefly through the availability of extensive corpora. Web 3.0 builds on semantic web technology, which represents any object or resource on the web both syntactically and semantically. This representation helps computing systems search content on the internet, much as lexical search does, and improves internet-based translation, making it more effective and efficient.

In this paper we propose a technique to improve existing statistical Machine Translation methods by making use of semantic web technology. Our focus is on Tamil and on Tamil-to-English MT. The proposed method integrates a semantic web technique into word sense disambiguation (WSD), which forms part of the MT system. The integration is accomplished by bringing the capabilities of RDFS and OWL into the WSD component of the MT model. The contribution of this work lies in showing that integrating a semantic web technique into the WSD system significantly improves the performance of a statistical MT system translating from Tamil to English.

We assume the availability of large corpora in Tamil and of domain-specific ontologies built with Tamil semantic web technology using Web 3.0. We are optimistic about the expansion and development of the Tamil semantic web and expect that it will greatly improve disambiguation in Tamil-to-English MT, along with other related benefits. The method could enhance translation quality by improving the word sense disambiguation process when text is translated from Tamil to English. It can also be extended to other languages, such as Hindi and other Indian languages.
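The abstract does not say how RDFS and OWL are wired into the WSD component, so the following is only a minimal sketch of one plausible reading, not the paper's implementation. It assumes a toy ontology held as plain triples, a hypothetical `ex:denotes` property linking a Tamil word to the concepts it can name, and transliterated words ("aaru" can mean "river" or "six"). A sense is chosen by counting the `rdfs:subClassOf` ancestors it shares with the concepts of the surrounding words.

```python
# Toy ontology as (subject, predicate, object) triples. Every IRI, the
# ex:denotes property, and the transliterated word list are hypothetical.
TRIPLES = [
    ("ex:River", "rdfs:subClassOf", "ex:WaterBody"),
    ("ex:Lake", "rdfs:subClassOf", "ex:WaterBody"),
    ("ex:Sea", "rdfs:subClassOf", "ex:WaterBody"),
    ("ex:Five", "rdfs:subClassOf", "ex:Number"),
    ("ex:Six", "rdfs:subClassOf", "ex:Number"),
    ("ta:aaru", "ex:denotes", "ex:River"),   # "aaru" can mean river ...
    ("ta:aaru", "ex:denotes", "ex:Six"),     # ... or the number six
    ("ta:eri", "ex:denotes", "ex:Lake"),     # lake
    ("ta:kadal", "ex:denotes", "ex:Sea"),    # sea
    ("ta:ainthu", "ex:denotes", "ex:Five"),  # five
]

# English rendering of each candidate sense (illustrative only).
ENGLISH = {"ex:River": "river", "ex:Six": "six"}


def objects(subject, predicate):
    """Return every object o such that (subject, predicate, o) is a triple."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def ancestors(concept):
    """Return all superclasses reachable through rdfs:subClassOf."""
    seen = set()
    stack = [concept]
    while stack:
        for parent in objects(stack.pop(), "rdfs:subClassOf"):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def disambiguate(word, context):
    """Pick the sense of `word` sharing the most superclasses with `context`."""
    senses = objects(word, "ex:denotes")
    if not senses:
        return None
    context_ancestors = set()
    for other in context:
        for concept in objects(other, "ex:denotes"):
            context_ancestors |= ancestors(concept)
    # Ties (including "no overlap at all") fall back to the first listed sense.
    return max(senses, key=lambda s: len(ancestors(s) & context_ancestors))


sense_near_water = disambiguate("ta:aaru", ["ta:eri", "ta:kadal"])
sense_near_number = disambiguate("ta:aaru", ["ta:ainthu"])
print(ENGLISH[sense_near_water], ENGLISH[sense_near_number])
```

In a fuller system the triples would presumably come from RDFS/OWL files loaded with an RDF library, and the chosen sense would steer lexical choice in the statistical MT model. Both steps lie outside what the abstract specifies.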

Similar articles

       
 
Muhammad Bilal, Zeng Jianqiu, Suad Dukhaykh, Mingyue Fan and Aleš Trunk
Drawing on social identity theory, this study aims to examine the impact of antecedents of eWOM on the online purchase intention (OPI) of fashion-related products. In addition, social media usage moderates the relationship between eWOM and OPI. A structu...
Journal: Information

 
Ping Pan, Junzhi Ye, Yun Pan, Lize Gu and Licheng Wang
Commitment schemes are important tools in cryptography and used as building blocks in many cryptographic protocols. We propose two commitment schemes by using Rubik's groups. Our proposals do not lay the security on the taken-for-granted hardness of the ...
Journal: Information

 
Fridolin Wild, Lawrence Marshall, Jay Bernard, Eric White and John Twycross
The integration of augmented reality (AR) technology into personal computing is happening fast, and augmented workplaces for professionals in areas such as Industry 4.0 or digital health can reasonably be expected to form liminal zones that push the boun...
Journal: Information

 
Attaporn Wangpoonsarp, Kazuya Shimura and Fumiyo Fukumoto
This paper focuses on the domain-specific senses of words and proposes a method for detecting predominant sense depending on each domain. Our Domain-Specific Senses (DSS) model is an unsupervised manner and detects predominant senses in each domain. We a...
Journal: Applied Sciences

 
Christos Makris, Georgios Pispirigos and Michael Angelos Simos
Text annotation is the process of identifying the sense of a textual segment within a given context to a corresponding entity on a concept ontology. As the bag of words paradigm's limitations become increasingly discernible in modern applications, severa...
Journal: Algorithms