Today, valuable business information is increasingly stored as unstructured data (documents, emails, etc.). For example, documents exchanged between business partners capture information on transactions between them, such as purchases or invoices. A major challenge is to correctly recognize real-world entities in unstructured data (e.g., documents) and associate them with those stored in structured data (e.g., enterprise databases). To address this, we propose a robust process methodology consisting of three phases: entity extraction from documents, generation of mappings between the recognized entities and the structured data, and disambiguation of those mappings by exploiting relationships in the enterprise data and the documents' structure.
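The three phases can be illustrated with a minimal sketch. It is not the paper's implementation: the tables `CUSTOMERS` and `ORDERS`, the regular expressions, and all function names are hypothetical, and exact name matching stands in for whatever mapping technique the methodology actually uses. The sketch only shows how a relationship in the structured data (an order belonging to a customer) can resolve a mention that matches several records.

```python
import re

# Hypothetical enterprise data: two customers share a name,
# and each order is related to exactly one customer.
CUSTOMERS = {"C1": "Acme Corp", "C2": "Acme Corp", "C3": "Globex Inc"}
ORDERS = {"PO-1001": "C2", "PO-2002": "C3"}  # order id -> customer id


def extract_entities(text):
    """Phase 1: extract candidate mentions from the document text."""
    return {
        "orders": re.findall(r"\bPO-\d+\b", text),
        "names": re.findall(r"\b[A-Z][a-z]+ (?:Corp|Inc)\b", text),
    }


def generate_mappings(mentions):
    """Phase 2: map each name mention to every customer record with that name."""
    return {
        name: sorted(
            cid for cid, cname in CUSTOMERS.items()
            if cname.lower() == name.lower()
        )
        for name in mentions["names"]
    }


def disambiguate(mappings, mentions):
    """Phase 3: prefer candidates related to an order mentioned in the same document."""
    related = {ORDERS[o] for o in mentions["orders"] if o in ORDERS}
    resolved = {}
    for name, candidates in mappings.items():
        supported = [c for c in candidates if c in related]
        if len(supported) == 1:
            resolved[name] = supported[0]   # relationship evidence picks one record
        elif len(candidates) == 1:
            resolved[name] = candidates[0]  # mapping was unambiguous to begin with
        else:
            resolved[name] = None           # still ambiguous, leave unresolved
    return resolved


def link_document(text):
    """Run the three phases in order on one document."""
    mentions = extract_entities(text)
    return disambiguate(generate_mappings(mentions), mentions)
```

In this toy setup, a document that mentions both "Acme Corp" and the order "PO-1001" resolves the name to customer `C2`, whereas a document that mentions only "Acme Corp" leaves it unresolved, because nothing in the document distinguishes `C1` from `C2`.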