Term-weighting approaches in automatic text retrieval
Information Processing and Management: an International Journal
The merge/purge problem for large databases
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
CiteSeer: an automatic citation indexing system
Proceedings of the third ACM conference on Digital libraries
Very fast EM-based mixture model clustering using multiresolution kd-trees
Proceedings of the 1998 conference on Advances in neural information processing systems II
An Algorithm for Finding Best Matches in Logarithmic Expected Time
ACM Transactions on Mathematical Software (TOMS)
Automating the Construction of Internet Portals with Machine Learning
Information Retrieval
Fast ordering of large categorical datasets for better visualization
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Induction of semantic classes from natural language text
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
DIRT - discovery of inference rules from text
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Accelerating EM for Large Databases
Machine Learning
Interactive deduplication using active learning
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Learning domain-independent string transformation weights for high accuracy object identification
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Learning to match and cluster large high-dimensional data sets for data integration
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Learning from Cluster Examples
Machine Learning
Adaptive duplicate detection using learnable string similarity measures
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
GraphZip: a fast and automatic compression method for spatial data clustering
Proceedings of the 2004 ACM symposium on Applied computing
Discovery of inference rules for question-answering
Natural Language Engineering
Two supervised learning approaches for name disambiguation in author citations
Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries
Subspace clustering for high dimensional data: a review
ACM SIGKDD Explorations Newsletter - Special issue on learning from imbalanced datasets
Iterative record linkage for cleaning and integration
Proceedings of the 9th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
Probabilistic author-topic models for information discovery
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Methods for evaluating and creating data quality
Information Systems - Special issue: Data quality in cooperative information systems
Name disambiguation in author citations using a K-way spectral clustering method
Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Comparative study of name disambiguation problem using a scalable blocking-based framework
Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Reference reconciliation in complex information spaces
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
A hierarchical naive Bayes mixture model for name disambiguation in author citations
Proceedings of the 2005 ACM symposium on Applied computing
Exploiting relationships for object consolidation
Proceedings of the 2nd international workshop on Information quality in information systems
Effective and scalable solutions for mixed and split citation problems in digital libraries
Proceedings of the 2nd international workshop on Information quality in information systems
Relational clustering for multi-type entity resolution
MRDM '05 Proceedings of the 4th international workshop on Multi-relational mining
Semantic-integration research in the database community
AI Magazine - Special issue on semantic integration
Establishing value mappings using statistical models and user feedback
Proceedings of the 14th ACM international conference on Information and knowledge management
Person resolution in person search results: WebHawk
Proceedings of the 14th ACM international conference on Information and knowledge management
Automated cleansing for spend analytics
Proceedings of the 14th ACM international conference on Information and knowledge management
Supervised clustering with support vector machines
ICML '05 Proceedings of the 22nd international conference on Machine learning
Adaptive Product Normalization: Using Online Learning for Record Linkage in Comparison Shopping
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Adaptive Name Matching in Information Integration
IEEE Intelligent Systems
Profile-Based Object Matching for Information Integration
IEEE Intelligent Systems
Domain-independent data cleaning via analysis of entity-relationship graph
ACM Transactions on Database Systems (TODS)
Learning metadata from the evidence in an on-line citation matching scheme
Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
A scaleable document clustering approach for large document corpora
Information Processing and Management: an International Journal
Duplicate Record Detection: A Survey
IEEE Transactions on Knowledge and Data Engineering
A Bayesian Model for Supervised Clustering with the Dirichlet Process Prior
The Journal of Machine Learning Research
Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more
Collective entity resolution in relational data
ACM Transactions on Knowledge Discovery from Data (TKDD)
Data quality awareness: a case study for cost optimal association rule mining
Knowledge and Information Systems - Special Issue on Mining Low-Quality Data
Leveraging aggregate constraints for deduplication
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Integration of Ontology Data through Learning Instance Matching
WI '06 Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence
SlideSeer: a digital library of aligned document and presentation pairs
Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
Adaptive sorted neighborhood methods for efficient record linkage
Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
Adaptive graphical approach to entity resolution
Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
Towards automated record linkage
AusDM '06 Proceedings of the fifth Australasian conference on Data mining and analytics - Volume 61
Fast ordering of large categorical datasets for visualization
Intelligent Data Analysis
Top-Down Parameter-Free Clustering of High-Dimensional Categorical Data
IEEE Transactions on Knowledge and Data Engineering
A novel approach to clustering merchandise records
Journal of Computer Science and Technology
Proceedings of the 9th annual ACM international workshop on Web information and data management
Probabilistic correlation-based similarity measure of unstructured records
Proceedings of the sixteenth ACM conference on information and knowledge management
Example-driven design of efficient record matching queries
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Survey on test collections and techniques for personal name matching
International Journal of Metadata, Semantics and Ontologies
A hierarchical model-based approach to co-clustering high-dimensional data
Proceedings of the 2008 ACM symposium on Applied computing
Efficient concept clustering for ontology learning using an event life cycle on the web
Proceedings of the 2008 ACM symposium on Applied computing
Floatcascade learning for fast imbalanced web mining
Proceedings of the 17th international conference on World Wide Web
A two-step classification approach to unsupervised record linkage
AusDM '07 Proceedings of the sixth Australasian conference on Data mining and analytics - Volume 70
Data utility and privacy protection trade-off in k-anonymisation
PAIS '08 Proceedings of the 2008 international workshop on Privacy and anonymity in information society
Video linkage: group based copied video detection
CIVR '08 Proceedings of the 2008 international conference on Content-based image and video retrieval
Lexicon randomization for near-duplicate detection with I-Match
The Journal of Supercomputing
Multimedia Tools and Applications
Extracting Semantic Networks from Text Via Relational Clustering
ECML PKDD '08 Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
A Graph Partitioning Approach to Entity Disambiguation Using Uncertain Information
GoTAL '08 Proceedings of the 6th international conference on Advances in Natural Language Processing
Multidimensional content eXploration
Proceedings of the VLDB Endowment
Data weaving: scaling up the state-of-the-art in data clustering
Proceedings of the 17th ACM conference on Information and knowledge management
On co-authorship for author disambiguation
Information Processing and Management: an International Journal
An ontology data matching method for web information integration
Proceedings of the 10th International Conference on Information Integration and Web-based Applications & Services
Efficient top-k count queries over imprecise duplicates
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Teaching large scale data processing: the five-week course and two years' experiences
SCE '08 Proceedings of the 1st ACM Summit on Computing Education in China
A Term-Based Driven Clustering Approach for Name Disambiguation
APWeb/WAIM '09 Proceedings of the Joint International Conferences on Advances in Data and Web Management
Swoosh: a generic approach to entity resolution
The VLDB Journal — The International Journal on Very Large Data Bases
ACM Computing Surveys (CSUR)
Disambiguating authors in academic publications using random forests
Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries
Exploiting context analysis for combining multiple entity resolution systems
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Entity resolution with iterative blocking
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
A strategy for allowing meaningful and comparable scores in approximate matching
Information Systems
Learning blocking schemes for record linkage
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Sound and efficient inference with probabilistic and deterministic dependencies
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Memory-efficient inference in relational domains
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
The Normalized Compression Distance as a Distance Measure in Entity Identification
ICDM '09 Proceedings of the 9th Industrial Conference on Advances in Data Mining. Applications and Theoretical Aspects
Efficient Clustering of Web-Derived Data Sets
MLDM '09 Proceedings of the 6th International Conference on Machine Learning and Data Mining in Pattern Recognition
Approximate minimum spanning tree clustering in high-dimensional space
Intelligent Data Analysis
Conference Mining via Generalized Topic Modeling
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Constraint-based entity matching
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
An unsupervised approach for product record normalization across different web sites
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Journal of Artificial Intelligence Research
Creating relational data from unstructured and ungrammatical data sources
Journal of Artificial Intelligence Research
Unsupervised methods for determining object and relation synonyms on the web
Journal of Artificial Intelligence Research
Practical Markov logic containing first-order quantifiers with application to identity uncertainty
CHSLP '06 Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing
Creating probabilistic databases from duplicated data
The VLDB Journal — The International Journal on Very Large Data Bases
Robust record linkage blocking using suffix arrays
Proceedings of the 18th ACM conference on Information and knowledge management
Discovering matching dependencies
Proceedings of the 18th ACM conference on Information and knowledge management
An efficient clustering algorithm for large-scale topical web pages
Proceedings of the 18th ACM conference on Information and knowledge management
A possibilistic approach to string comparison
IEEE Transactions on Fuzzy Systems
Learning author-topic models from text corpora
ACM Transactions on Information Systems (TOIS)
Automated generation of model cases for help-desk applications
IBM Systems Journal
Generic entity resolution with negative rules
The VLDB Journal — The International Journal on Very Large Data Bases
Frameworks for entity matching: A comparison
Data & Knowledge Engineering
Anchor text extraction for academic search
NLPIR4DL '09 Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries
Ranking and semi-supervised classification on large scale graphs using map-reduce
TextGraphs-4 Proceedings of the 2009 Workshop on Graph-based Methods for Natural Language Processing
An incremental clustering scheme for data de-duplication
Data Mining and Knowledge Discovery
Learning similarity metrics for event identification in social media
Proceedings of the third ACM international conference on Web search and data mining
HARRA: fast iterative hashed record linkage for large-scale data collections
Proceedings of the 13th International Conference on Extending Database Technology
A constrained clustering approach to duplicate detection among relational data
PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
Speeding up clustering-based k-anonymisation algorithms with pre-partitioning
BNCOD'07 Proceedings of the 24th British national conference on Databases
Self-tuning in graph-based reference disambiguation
DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Scaling record linkage to non-uniform distributed class sizes
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
On active learning of record matching packages
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Properties of possibilistic string comparison
IEEE Transactions on Fuzzy Systems
Temporal expert finding through generalized time topic modeling
Knowledge-Based Systems
Detecting data misuse by applying context-based data linkage
Proceedings of the 2010 ACM workshop on Insider threats
Maximum normalized spacing for efficient visual clustering
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Entity disambiguation for knowledge base population
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
An efficient duplicate record detection using q-grams array inverted index
DaWaK'10 Proceedings of the 12th international conference on Data warehousing and knowledge discovery
An efficient approach to clustering real-estate listings
IDEAL'10 Proceedings of the 11th international conference on Intelligent data engineering and automated learning
On Graph-Based Name Disambiguation
Journal of Data and Information Quality (JDIQ)
Evaluating entity resolution results
Proceedings of the VLDB Endowment
Proceedings of the VLDB Endowment
Evaluation of entity resolution approaches on real-world match problems
Proceedings of the VLDB Endowment
Entity resolution with evolving rules
Proceedings of the VLDB Endowment
Robust Record Linkage Blocking Using Suffix Arrays and Bloom Filters
ACM Transactions on Knowledge Discovery from Data (TKDD)
Efficient entity resolution for large heterogeneous information spaces
Proceedings of the fourth ACM international conference on Web search and data mining
A probabilistic approach for learning folksonomies from structured data
Proceedings of the fourth ACM international conference on Web search and data mining
VIRaL: Visual Image Retrieval and Localization
Multimedia Tools and Applications
Large-scale collective entity matching
Proceedings of the VLDB Endowment
Kernel based K-medoids for clustering data with uncertainty
ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
Construction of a large-scale test set for author disambiguation
Information Processing and Management: an International Journal
A fast approach for parallel deduplication on multicore processors
Proceedings of the 2011 ACM Symposium on Applied Computing
Proceedings of the second international workshop on MapReduce and its applications
Eliminating the redundancy in blocking-based entity resolution methods
Proceedings of the 11th annual international ACM/IEEE joint conference on Digital libraries
To compare or not to compare: making entity resolution more efficient
Proceedings of the International Workshop on Semantic Web Information Management
Which noun phrases denote which concepts?
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Learning phenotype mapping for integrating large genetic data
BioNLP '11 Proceedings of BioNLP 2011 Workshop
A supervised machine learning approach for duplicate detection over gazetteer records
GeoS'11 Proceedings of the 4th international conference on GeoSpatial semantics
Public record aggregation using semi-supervised entity resolution
Proceedings of the 13th International Conference on Artificial Intelligence and Law
Collective graph identification
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Matching unstructured product offers to structured product specifications
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Entity matching: how similar is similar
Proceedings of the VLDB Endowment
Spam or ham?: characterizing and detecting fraudulent "not spam" reports in web mail systems
Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference
Event log mining tool for large scale HPC systems
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part I
Efficient name disambiguation in digital libraries
WAIM'11 Proceedings of the 12th international conference on Web-age information management
Applied Intelligence
Extending functional dependency to detect abnormal data in RDF graphs
ISWC'11 Proceedings of the 10th international conference on The semantic web - Volume Part I
Legal document clustering with built-in topic segmentation
Proceedings of the 20th ACM international conference on Information and knowledge management
Efficient name disambiguation for large-scale databases
PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
KES'06 Proceedings of the 10th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part I
Object identification with attribute-mediated dependences
PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases
Probabilistic data generation for deduplication and data linkage
IDEAL'05 Proceedings of the 6th international conference on Intelligent Data Engineering and Automated Learning
Identifying value mappings for data integration: an unsupervised approach
WISE'05 Proceedings of the 6th international conference on Web Information Systems Engineering
Beyond 100 million entities: large-scale blocking-based resolution for heterogeneous data
Proceedings of the fifth ACM international conference on Web search and data mining
A precise blocking method for record linkage
DaWaK'05 Proceedings of the 7th international conference on Data Warehousing and Knowledge Discovery
Dynamic pattern mining: an incremental data clustering approach
Journal on Data Semantics II
An efficient clustering approach for large document collections
ADMA'05 Proceedings of the First international conference on Advanced Data Mining and Applications
Collaborative filtering using associative neural memory
ITWP'03 Proceedings of the 2003 international conference on Intelligent Techniques for Web Personalization
Entity matching for semistructured data in the Cloud
Proceedings of the 27th Annual ACM Symposium on Applied Computing
Generation of tag-based user profiles for clustering users in a social music site
ACIIDS'12 Proceedings of the 4th Asian conference on Intelligent Information and Database Systems - Volume Part II
Panacea: towards holistic optimization of MapReduce applications
Proceedings of the Tenth International Symposium on Code Generation and Optimization
Leveraging unlabeled data to scale blocking for record linkage
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Efficient and Practical Approach for Private Record Linkage
Journal of Data and Information Quality (JDIQ)
Group topic modeling for academic knowledge discovery
Applied Intelligence
Active sampling for entity matching
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
MapReduce algorithms for big data analysis
Proceedings of the VLDB Endowment
Entity resolution: theory, practice & open challenges
Proceedings of the VLDB Endowment
A discriminative hierarchical model for fast coreference at large scale
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Ensemble semantics for large-scale unsupervised relation extraction
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
An automatic blocking mechanism for large-scale de-duplication tasks
Proceedings of the 21st ACM international conference on Information and knowledge management
Fast and accurate incremental entity resolution relative to an entity knowledge base
Proceedings of the 21st ACM international conference on Information and knowledge management
Evaluating Entity Linking with Wikipedia
Artificial Intelligence
Online unsupervised coreference resolution for semi-structured heterogeneous data
ISWC'12 Proceedings of the 11th international conference on The Semantic Web - Volume Part II
Clustering based on rank distance with applications on DNA
ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part V
p-PIC: Parallel power iteration clustering for big data
Journal of Parallel and Distributed Computing
Adaptive Connection Strength Models for Relationship-Based Entity Resolution
Journal of Data and Information Quality (JDIQ) - Special Issue on Entity Resolution
On the performance of high dimensional data clustering and classification algorithms
Future Generation Computer Systems
ACM Transactions on Database Systems (TODS)
MFIBlocks: An effective blocking algorithm for entity resolution
Information Systems
A taxonomy of privacy-preserving record linkage techniques
Information Systems
Efficient XML duplicate detection using an adaptive two-level optimization
Proceedings of the 28th Annual ACM Symposium on Applied Computing
Sumblr: continuous summarization of evolving tweet streams
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Better cross company defect prediction
Proceedings of the 10th Working Conference on Mining Software Repositories
Optimal hashing schemes for entity matching
Proceedings of the 22nd international conference on World Wide Web
Active Sampling for Entity Matching with Guarantees
ACM Transactions on Knowledge Discovery from Data (TKDD) - Special Issue on ACM SIGKDD 2012
An automatic blocking strategy for XML duplicate detection
ACM SIGAPP Applied Computing Review
Efficient hierarchical clustering of large high dimensional datasets
Proceedings of the 22nd ACM international conference on information & knowledge management
Programming with personalized pagerank: a locally groundable first-order probabilistic logic
Proceedings of the 22nd ACM international conference on information & knowledge management
A joint model for discovering and linking entities
Proceedings of the 2013 workshop on Automated knowledge base construction
Editorial: Efficient discovery of similarity constraints for matching dependencies
Data & Knowledge Engineering
Type Extension Trees for feature construction and learning in relational domains
Artificial Intelligence
Query-driven approach to entity resolution
Proceedings of the VLDB Endowment
Linkage of compound objects for supporting maintenance of large-scale web sites
Proceedings of the 8th International Conference on Ubiquitous Information Management and Communication
Efficient entity matching using materialized lists
Information Sciences: an International Journal
Information Sciences: an International Journal
Incremental entity resolution on rules and data
The VLDB Journal — The International Journal on Very Large Data Bases
Sampling from repairs of conditional functional dependency violations
The VLDB Journal — The International Journal on Very Large Data Bases