Identifying approximately duplicate database records that refer to the same entity is essential for information integration. The authors review traditional approaches to solving this problem and present their recent experimental results on comparing, combining, and learning textual similarity measures for name matching.
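To make "comparing and combining textual similarity measures" concrete, here is a minimal Python sketch. It is not the authors' method: the choice of measures (a normalized edit-distance similarity and a token-level Jaccard similarity), the function names, and the equal-weight combination are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insert, delete, substitute; unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def jaccard_tokens(a: str, b: str) -> float:
    """Overlap of lowercase whitespace-separated tokens; ignores token order."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def combined_similarity(a: str, b: str, weight: float = 0.5) -> float:
    """Hypothetical combination: weighted average of the two measures above."""
    char_score = edit_similarity(a.lower(), b.lower())
    token_score = jaccard_tokens(a, b)
    return weight * char_score + (1.0 - weight) * token_score


if __name__ == "__main__":
    for left, right in [("John Smith", "John Smyth"),
                        ("William Cohen", "Cohen William")]:
        print(left, "|", right,
              round(edit_similarity(left.lower(), right.lower()), 3),
              round(jaccard_tokens(left, right), 3),
              round(combined_similarity(left, right), 3))
```

The two example pairs show why measures are worth combining: a character-level measure tolerates a one-letter misspelling but penalizes reordered name parts, while a token-level measure does the opposite. In a learned setting, the fixed `weight` would be replaced by parameters fitted to labeled matching and non-matching pairs.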