A class of applications exists where the information to be stored is partially structured: that is, it consists partly of structured data sources, each conforming to a schema, and partly of information left as free text. While investigating the requirements for querying partially structured data, we encountered several limitations in the currently available approaches. We describe here three new techniques that combine aspects of Information Extraction with data integration in order to better exploit the data in these applications.