Reference reconciliation is the problem of identifying when different references (i.e., sets of attribute values) in a dataset correspond to the same real-world entity. Most previous work assumes that references belong to a single class with a fair number of attributes (e.g., research publications). We consider complex information spaces: our references belong to multiple related classes, and each reference may have very few attribute values. A prime example of such a space is Personal Information Management, where the goal is to provide a coherent view of all the information on one's desktop.

Our reconciliation algorithm has three principal features. First, we exploit the associations between references to design new methods for reference comparison. Second, we propagate information between reconciliation decisions to accumulate positive and negative evidence. Third, we gradually enrich references by merging attribute values. Our experiments show that (1) we considerably improve precision and recall over standard methods on a diverse set of personal information datasets, and (2) there are advantages to using our algorithm even on a standard citation dataset benchmark.
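To make the three features concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm evaluated in the paper: the function name `reconcile`, the scoring formula, the `threshold` and `boost` parameters, and the sample data are all hypothetical. Negative evidence is not modeled beyond mismatching attribute values lowering a score.

```python
from itertools import combinations


def reconcile(refs, links, threshold=0.6, boost=0.3):
    """Toy reconciliation loop (hypothetical, for illustration only).

    refs:  {ref_id: {attribute: set of values}}
    links: {ref_id: set of associated ref_ids}, e.g. paper <-> author
    threshold, boost: made-up parameters, not taken from the paper.
    """
    parent = {r: r for r in refs}  # union-find forest over reference ids

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]  # path halving
            r = parent[r]
        return r

    # Each cluster root keeps the union of its members' attribute values.
    merged = {r: {a: set(v) for a, v in attrs.items()}
              for r, attrs in refs.items()}
    members = {r: {r} for r in refs}

    def attr_sim(x, y):
        # Fraction of shared attributes with at least one common value.
        shared = [a for a in merged[x] if a in merged[y]]
        if not shared:
            return 0.0
        hits = sum(1 for a in shared if merged[x][a] & merged[y][a])
        return hits / len(shared)

    def assoc_sim(x, y):
        # Feature 1: compare references through their associations,
        # counting associates already reconciled into the same cluster.
        nx = {find(n) for m in members[x] for n in links.get(m, ())}
        ny = {find(n) for m in members[y] for n in links.get(m, ())}
        if not nx or not ny:
            return 0.0
        return len(nx & ny) / min(len(nx), len(ny))

    changed = True
    while changed:  # Feature 2: earlier decisions feed later comparisons
        changed = False
        roots = sorted({find(r) for r in refs})
        for x, y in combinations(roots, 2):
            x, y = find(x), find(y)
            if x == y:
                continue
            if attr_sim(x, y) + boost * assoc_sim(x, y) >= threshold:
                parent[y] = x
                # Feature 3: enrich the surviving reference with y's values.
                for a, v in merged.pop(y).items():
                    merged[x].setdefault(a, set()).update(v)
                members[x] |= members.pop(y)
                changed = True

    clusters = {}
    for r in refs:
        clusters.setdefault(find(r), set()).add(r)
    return sorted(sorted(c) for c in clusters.values())


# Made-up sample data: two paper references and three person references.
sample_refs = {
    "paper1": {"title": {"entity matching"}},
    "paper2": {"title": {"entity matching"}, "year": {"2005"}},
    "person1": {"name": {"a. smith"}, "org": {"uw"}},
    "person2": {"name": {"a. smith"}, "org": {"u. washington"}},
    "person3": {"name": {"b. jones"}, "org": {"uw"}},
}
sample_links = {
    "paper1": {"person1"}, "paper2": {"person2"},
    "person1": {"paper1"}, "person2": {"paper2"},
}

if __name__ == "__main__":
    print(reconcile(sample_refs, sample_links))
```

In this toy data, `person1` and `person2` agree on only one of their two shared attributes, so attribute similarity alone stays below the threshold. They are merged only because their papers were reconciled first, which is the kind of propagation the abstract describes. Setting `boost=0` keeps them apart.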