The Web contains a wealth of information, and a key challenge is to make this information machine processable. In this paper, we study how to "understand" HTML tables on the Web, which goes one step beyond finding the schemas of tables. From 0.3 billion Web documents, we obtain 1.95 billion tables, of which 0.5-1% contain information about various entities and their properties. First, we argue that in order for computers to understand these tables, they must have a brain: a general-purpose knowledge taxonomy that is comprehensive enough to cover the concepts (of worldly facts) in a human mind. Second, we argue that the process of understanding a table is the process of finding the right position for the table in the knowledge taxonomy. Once a table is associated with a concept in the knowledge taxonomy, it is automatically linked to all other tables associated with the same concept, as well as to tables associated with related concepts. In other words, computers understand the semantics of tables through the interconnections of concepts in the knowledge base. We illustrate a two-phase process. Our experimental results show that the approach is feasible and that it may benefit many useful applications, such as web search.
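To make the idea of "finding the right position for the table in the knowledge taxonomy" concrete, the following is a minimal sketch under stated assumptions, not the two-phase process of the paper. The toy `TAXONOMY` dictionary, the majority-vote rule, and the function names `concept_for_column` and `link_tables` are all hypothetical and chosen only for illustration. The sketch assigns a concept to a table's entity column and then groups tables that share that concept.

```python
from collections import Counter, defaultdict

# Hypothetical toy taxonomy: instance -> concepts it belongs to (isA).
# These entries are made up for illustration; a real taxonomy is far larger.
TAXONOMY = {
    "china": ["country", "asian country"],
    "india": ["country", "asian country"],
    "brazil": ["country"],
    "python": ["programming language", "snake"],
    "java": ["programming language", "island"],
    "haskell": ["programming language"],
}


def concept_for_column(cells):
    """Return the concept that covers the most cells, or None if none match.

    Assumption: a simple majority vote stands in for whatever scoring
    a real system would use.
    """
    votes = Counter()
    for cell in cells:
        for concept in TAXONOMY.get(cell.strip().lower(), []):
            votes[concept] += 1
    if not votes:
        return None
    # Break ties alphabetically so the result is deterministic.
    best = max(votes.values())
    return min(c for c, n in votes.items() if n == best)


def link_tables(tables):
    """Group table names by the concept assigned to their entity column."""
    by_concept = defaultdict(list)
    for name, entity_column in tables.items():
        concept = concept_for_column(entity_column)
        if concept is not None:
            by_concept[concept].append(name)
    return dict(by_concept)


# Hypothetical tables, each reduced to its entity column.
tables = {
    "gdp": ["China", "India", "Brazil"],
    "population": ["India", "Brazil"],
    "languages": ["Python", "Java", "Haskell"],
}
links = link_tables(tables)
print(links)
```

In this toy setup the two tables whose entity columns vote for "country" end up in the same group, which is the kind of automatic linking through a shared concept that the abstract describes. Ambiguous cells such as "Python" or "Java" are resolved by the other cells in the same column.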