Resolving Ambiguity/Uncertainty in Fact Extraction

 

Hejab Ma’azer Al Fawareh1, Shaidah Jusoh2, Norita Md. Norwawi3

 

Graduate Dept. of Computer Science, College of Arts & Sciences, FTM Building, Universiti Utara Malaysia, Sintok 06010 Kedah, Malaysia,

e-mail: 1alfawareh@gmail.com, 2shaidah@uum.edu.my, 3nmn@uum.edu.my

 

 

   Most valuable and crucial information is stored in texts. Extracting information from texts requires a person to read them, which is very time consuming, and it becomes a challenging task if the person lacks sufficient background knowledge related to the texts. An automated system that can extract the required information from texts is therefore becoming an urgent need. Information extraction is one of the application areas of research in the field of knowledge mining. In information extraction, there are two levels of extraction: entity extraction and fact extraction. Fact extraction is the process of drawing out facts from entities and topics. The major challenge in extracting facts from texts is that natural language words and structures are often ambiguous. In an automated information extraction system, the extracted facts should be correct and relevant to a user’s needs. Let us consider the sentence “The robber shot a police in the Giant mall”. The sentence can be parsed using the grammar rule [Sentence -> Noun Phrase, Verb Phrase] or the rule [Sentence -> Noun Phrase, Verb Phrase, Prepositional Phrase]. Thus, the sentence can be interpreted as “The robber who is inside the Giant mall shot the police” or “The robber shot a police who is inside the Giant mall”. Up to now, not much research has been conducted on resolving ambiguity and uncertainty problems in fact extraction. The ambiguity problem occurs when a sentence structure can be interpreted in more than one way, and the uncertainty problem occurs when more than one fact can be extracted from the same text. In this paper, we propose a new technique to resolve ambiguity and uncertainty in fact extraction. The approach is developed by utilizing natural language processing, fuzzy sets and a context knowledge base.
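The structural ambiguity in the example sentence can be illustrated with a small sketch. The code below is not the technique proposed in this paper; it is a minimal, exhaustive parser over a toy grammar, written only to show that the two sentence rules quoted above yield two distinct parse trees. The grammar rules beyond the two sentence rules, the lexicon, and the names `GRAMMAR`, `LEXICON`, `parse` and `attachment` are all assumptions made for illustration.

```python
from functools import lru_cache

# Toy grammar (assumed). Only the two S rules come from the text above.
GRAMMAR = {
    "S": [("NP", "VP"), ("NP", "VP", "PP")],
    "NP": [("Det", "N"), ("Det", "Adj", "N"), ("NP", "PP")],
    "VP": [("V", "NP")],
    "PP": [("P", "NP")],
}

# Toy lexicon (assumed): one part-of-speech tag per lowercased word.
LEXICON = {
    "the": "Det", "a": "Det",
    "robber": "N", "police": "N", "mall": "N",
    "giant": "Adj", "shot": "V", "in": "P",
}


def parse(tokens):
    """Return every parse tree of `tokens` rooted at S, as nested tuples."""
    tokens = tuple(t.lower() for t in tokens)

    @lru_cache(maxsize=None)
    def trees(symbol, start, end):
        # All trees for `symbol` that cover exactly tokens[start:end].
        found = []
        if end - start == 1 and LEXICON.get(tokens[start]) == symbol:
            found.append((symbol, tokens[start]))
        for rhs in GRAMMAR.get(symbol, []):
            for children in expand(rhs, start, end):
                found.append((symbol,) + children)
        return tuple(found)

    def expand(rhs, start, end):
        # Yield tuples of child trees, one per symbol in `rhs`,
        # that together cover tokens[start:end] with no gaps.
        if len(rhs) == 1:
            for tree in trees(rhs[0], start, end):
                yield (tree,)
            return
        # Leave at least one token for each remaining symbol.
        for mid in range(start + 1, end - len(rhs) + 2):
            for first in trees(rhs[0], start, mid):
                for rest in expand(rhs[1:], mid, end):
                    yield (first,) + rest

    return list(trees("S", 0, len(tokens)))


def attachment(tree):
    """Say where the prepositional phrase attaches in an S tree."""
    if len(tree) - 1 == 3:  # S -> NP VP PP
        return "PP attached at sentence level"
    return "PP attached inside the object noun phrase"


if __name__ == "__main__":
    sentence = "The robber shot a police in the Giant mall".split()
    for tree in parse(sentence):
        print(attachment(tree))
        print(tree)
```

Under these assumptions the parser returns two trees for the sentence. In one, “in the Giant mall” sits inside the object noun phrase and so describes the police; in the other, it is a sibling of the verb phrase and so describes where the event took place. A parser alone cannot choose between them, which is why additional knowledge is needed to select the intended fact.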