Aspects of the P-Norm model of information retrieval: syntactic query generation, efficiency, and theoretical properties
Wrapper generation for semi-structured Internet sources
ACM SIGMOD Record
Information retrieval on the web
ACM Computing Surveys (CSUR)
Machine Learning
Introduction to Modern Information Retrieval
Effective Retrieval of Information in Tables on the Internet
IEA/AIE '02 Proceedings of the 15th international conference on Industrial and engineering applications of artificial intelligence and expert systems: developments in applied artificial intelligence
Extending the boolean and vector space models of information retrieval with p-norm queries and multiple concept types
Mining table information on the internet
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
Information retrieval systems currently in use do not consider the structural information of documents; they rely instead on indexes extracted from the documents. Structural information such as font face, font size, indentation, and tables conveys the author's meaning and is a primary means of documentation. This paper pays special attention to tables because they are commonly used within documents to make meaning clear, and because they are easy to recognize in web documents, which mark them with tags. On the Internet, tables are used both to structure knowledge and to design the layout of documents. This paper proposes a method that extracts meaningful tables using a decision tree and constructs a dictionary of table indexes, which is then applied to an information retrieval system to enhance its accuracy.
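The pipeline described in the abstract (separate meaningful tables from layout tables, then build a dictionary of table indexes) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' method: the class `TableCollector`, the functions `is_meaningful` and `build_index`, the chosen features (nesting, row count, `<th>` count, column regularity), and the toy HTML are all hypothetical. In particular, the hand-written rules stand in for a decision tree, which in the paper would presumably be learned from labelled tables.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect rows of cell text for every <table>, tracking nesting with a stack."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables, in the order they were closed
        self._stack = []   # tables that are currently open
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self._stack:
                self._stack[-1]["has_nested"] = True
            self._stack.append({"rows": [], "th_count": 0, "has_nested": False})
            self._cell = None
        elif not self._stack:
            return
        elif tag == "tr":
            self._stack[-1]["rows"].append([])
        elif tag in ("td", "th"):
            table = self._stack[-1]
            if not table["rows"]:          # cell without an explicit <tr>
                table["rows"].append([])
            if tag == "th":
                table["th_count"] += 1
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if not self._stack:
            return
        if tag in ("td", "th") and self._cell is not None:
            text = " ".join("".join(self._cell).split())  # normalise whitespace
            self._stack[-1]["rows"][-1].append(text)
            self._cell = None
        elif tag == "table":
            self.tables.append(self._stack.pop())


def is_meaningful(table):
    """Hand-written rules standing in for a learned decision tree (assumption)."""
    rows = [r for r in table["rows"] if r]
    if table["has_nested"]:      # tables wrapping other tables are treated as layout
        return False
    if len(rows) < 2:            # a single row carries no header/value structure
        return False
    if table["th_count"] > 0:    # explicit header cells suggest a data table
        return True
    widths = {len(r) for r in rows}
    return len(widths) == 1 and len(rows[0]) >= 2  # regular grid, 2+ columns


def build_index(tables):
    """Map each header of a meaningful table to the values in its column."""
    index = {}
    for table in tables:
        if not is_meaningful(table):
            continue
        rows = [r for r in table["rows"] if r]
        header, body = rows[0], rows[1:]
        for col, name in enumerate(header):
            values = [r[col] for r in body if col < len(r)]
            index.setdefault(name.lower(), []).extend(values)
    return index


# Toy input: one nested layout table and one data table.
html = """
<table><tr><td><table><tr><td>menu</td></tr></table></td></tr></table>
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>pen</td><td>2</td></tr>
  <tr><td>ink</td><td>5</td></tr>
</table>
"""
collector = TableCollector()
collector.feed(html)
index = build_index(collector.tables)
```

In this sketch the first row of a meaningful table is assumed to be its header, which is a simplification; a retrieval system could then match query terms against the keys of `index` in addition to its ordinary term index.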