DocumentCode :
2727352
Title :
Determining Bias to Search Engines from Robots.txt
Author :
Ghoula, Nizar ; Khelif, K. ; Dieng-Kuntz, R.
Author_Institution :
INRIA, Sophia Antipolis
fYear :
2007
fDate :
2-5 Nov. 2007
Firstpage :
149
Lastpage :
155
Abstract :
A Semantic Web approach seems promising for supporting content mining of the millions of patents accessible through the Web. In this paper, we describe our approach for generating semantic annotations on patents by relying on the structure and on a semantic representation of patent documents. We use both the structure of the patent documents and their textual content, processed by Natural Language Processing (NLP) tools. This method, primarily aimed at helping biologists use patent information, can be generalized to other domains and to other kinds of structured documents.
Keywords :
data mining; ontologies (artificial intelligence); patents; semantic Web; text analysis; natural language processing tools; ontology-based semantic annotations; patent content mining; patent documents; patent information; patent mining; semantic representation; structured documents; textual contents; Access protocols; Crawlers; Educational robots; File servers; Government; Intelligent robots; Robotics and automation; Search engines; Sun; USA Councils
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence, IEEE/WIC/ACM International Conference on
Conference_Location :
Fremont, CA
Print_ISBN :
978-0-7695-3026-0
Type :
conf
DOI :
10.1109/WI.2007.98
Filename :
4427081