Tables are a universal idiom for presenting relational data. Billions of tables on Web pages express entity references, attributes and relationships. This representation of relational world knowledge is usually considerably better than completely unstructured, free-format text. At the same time, unlike manually created knowledge bases, relational information mined from "organic" Web tables need not be constrained by the availability of precious editorial time. Unfortunately, because no formal, uniform schema is imposed on Web tables, Web search cannot take advantage of these high-quality sources of relational information. In this paper we propose new machine learning techniques to annotate table cells with the entities they likely mention, table columns with the types from which the entities in their cells are drawn, and pairs of table columns with the relations they seek to express. We propose a new graphical model that makes all these labeling decisions for each table simultaneously, rather than making separate local decisions for entities, types and relations. Experiments using the YAGO catalog, DBpedia, tables from Wikipedia, and over 25 million HTML tables from a 500 million page Web crawl uniformly show the superiority of our approach. We also evaluate the impact of better annotations on a prototype relational Web search tool, and demonstrate clear benefits of our annotations over indexing tables in a purely textual manner.
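
To illustrate why labeling a table jointly can differ from making separate local decisions, here is a minimal sketch in Python. It is not the model proposed in the paper: the catalog, the candidate scores, the `type_bonus` weight and both function names are hypothetical, relation labels are left out, and inference is done by brute-force enumeration over one small column, which would not scale to real tables.

```python
from itertools import product

# Hypothetical toy catalog mapping each entity to its types (an assumption).
CATALOG = {
    "Paris_(city)": {"city"},
    "Paris_Hilton": {"person"},
    "Berlin_(city)": {"city"},
    "Irving_Berlin": {"person"},
}

# Hypothetical local compatibility scores between cell text and entities.
CELL_CANDIDATES = {
    "Paris": {"Paris_(city)": 0.4, "Paris_Hilton": 0.6},
    "Berlin": {"Berlin_(city)": 0.7, "Irving_Berlin": 0.3},
}


def local_annotation(cells):
    """Pick the best-scoring entity for each cell independently."""
    return [max(CELL_CANDIDATES[c], key=CELL_CANDIDATES[c].get) for c in cells]


def joint_annotation(cells, type_bonus=0.5):
    """Choose one column type and all cell entities together.

    The score of an assignment is the sum of the local cell scores plus
    type_bonus for every chosen entity that belongs to the column type.
    """
    all_types = sorted({t for types in CATALOG.values() for t in types})
    best = None
    for col_type in all_types:
        # Enumerate every combination of candidate entities for the column.
        for entities in product(*(CELL_CANDIDATES[c] for c in cells)):
            score = sum(CELL_CANDIDATES[c][e] for c, e in zip(cells, entities))
            score += type_bonus * sum(col_type in CATALOG[e] for e in entities)
            if best is None or score > best[0]:
                best = (score, col_type, list(entities))
    return best[1], best[2]


column = ["Paris", "Berlin"]
print(local_annotation(column))
print(joint_annotation(column))
```

With these made-up scores, the local labeler assigns the cell "Paris" to the person, while the joint labeler prefers the city reading because it agrees with a single column type shared by both cells.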