We present a statistical model that infers hierarchical term relationships for a topic from only a small set of example web pages on that topic, with no prior knowledge of any hierarchical information. The model can use either the full text of the pages in the cluster or the context of links to those pages. To support the model, we use "ground truth" data taken from the category labels in the Open Directory. We show that the model accurately separates terms into three classes: self terms, which describe the cluster; parent terms, which describe more general concepts; and child terms, which describe specializations of the cluster. For example, for a set of biology pages, sample parent, self, and child terms are science, biology, and genetics, respectively. We create an algorithm that predicts parent, self, and child terms using the new model and compare its predictions to the ground truth data. The algorithm ranks a majority of the ground truth terms highly and identifies additional complementary terms that are missing from the Open Directory.
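The abstract does not spell out the form of the model, so the following is only a minimal sketch of one way a parent/self/child split could work. It assumes that each term is compared by its document frequency inside the example cluster against its document frequency in a larger background collection. The function names, the threshold values, and the toy documents are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def doc_frequency(docs):
    """Return, for each term, the fraction of documents containing it."""
    counts = Counter()
    for doc in docs:
        # Count each term at most once per document.
        counts.update(set(doc.lower().split()))
    total = len(docs)
    return {term: count / total for term, count in counts.items()}


def classify_terms(cluster_docs, collection_docs,
                   common_in_cluster=0.6,      # assumed threshold
                   common_in_collection=0.3,   # assumed threshold
                   minimum_in_cluster=0.2):    # assumed threshold
    """Label cluster terms as 'parent', 'self', or 'child'.

    Assumed heuristic (not the paper's stated method):
      parent: frequent in the cluster and frequent in the collection
      self:   frequent in the cluster, rare in the collection
      child:  moderately frequent in the cluster, rare in the collection
    """
    cluster_df = doc_frequency(cluster_docs)
    collection_df = doc_frequency(collection_docs)
    labels = {}
    for term, f_cluster in cluster_df.items():
        f_collection = collection_df.get(term, 0.0)
        if f_cluster >= common_in_cluster:
            if f_collection >= common_in_collection:
                labels[term] = "parent"
            else:
                labels[term] = "self"
        elif (f_cluster >= minimum_in_cluster
              and f_collection < common_in_collection):
            labels[term] = "child"
    return labels


# Toy data, invented for illustration only.
cluster = [
    "science biology genetics",
    "science biology cells",
    "biology genetics",
    "science biology",
]
collection = [
    "science physics", "science chemistry", "sports football",
    "music guitar", "science biology genetics", "biology cells",
    "news politics", "science astronomy", "travel hotels",
    "cooking recipes",
]

labels = classify_terms(cluster, collection)
for term in ("science", "biology", "genetics"):
    print(term, labels[term])
# prints:
# science parent
# biology self
# genetics child
```

On this toy data the sketch reproduces the abstract's example, with science as a parent term, biology as a self term, and genetics as a child term. Real pages would need tokenization, stop-word handling, and thresholds tuned against labeled data such as the Open Directory categories.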