Concepts are sequences of words that represent real or imaginary entities or ideas that users are interested in. As a first step towards building a web of concepts that will form the backbone of the next generation of search technology, we develop a novel technique to extract concepts from large datasets. We approach concept extraction from corpora as a market-basket problem, adapting the statistical measures of support and confidence. We evaluate our concept extraction algorithm on datasets drawn from a large number of users (e.g., the AOL query log dataset), and we show that a high-precision concept set can be extracted.
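
The market-basket framing can be illustrated with a minimal sketch. The abstract does not give the exact definitions it adapts, so everything below is an assumption made for illustration: each query is treated as a basket, the support of a word sequence is the number of queries that contain it, and its confidence is the smaller of two ratios, support of the sequence over support of its prefix and over support of its suffix. The function name `extract_concepts`, its parameters, the thresholds, and the sample queries are all hypothetical, and only multi-word candidates are considered.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous word sequences of length n."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_concepts(queries, max_len=3, min_support=2, min_confidence=0.6):
    """Illustrative support/confidence filter over word n-grams.

    Assumed definitions (not taken from the paper):
      support(g)    = number of queries containing n-gram g
      confidence(g) = min(support(g) / support(prefix of g),
                          support(g) / support(suffix of g))
    """
    support = Counter()
    for query in queries:
        tokens = query.lower().split()
        seen = set()
        for n in range(1, max_len + 1):
            seen.update(ngrams(tokens, n))
        support.update(seen)  # count each n-gram at most once per query

    concepts = {}
    for gram, count in support.items():
        if len(gram) < 2 or count < min_support:
            continue
        prefix, suffix = gram[:-1], gram[1:]
        confidence = min(count / support[prefix], count / support[suffix])
        if confidence >= min_confidence:
            concepts[" ".join(gram)] = (count, confidence)
    return concepts


# Hypothetical toy query log.
queries = [
    "new york hotels",
    "new york weather",
    "cheap flights new york",
    "new cars",
    "york university",
]
concepts = extract_concepts(queries)
print(concepts)
```

On this toy log only "new york" passes both thresholds: it occurs in three queries, while "new" and "york" each occur in four, giving a confidence of 0.75. Raising `min_confidence` trades recall for precision, which is the kind of trade-off a high-precision concept set implies.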