Many information systems need to organize short text segments, such as keywords in documents and queries from users, into a well-formed taxonomy. In this article, we address the problem of taxonomy generation for diverse text segments with a general and practical approach that uses the Web as an additional knowledge source. Unlike long documents, short text segments typically do not contain enough information for reliable feature extraction. This work investigates the use of highly ranked search-result snippets to enrich the representation of text segments. A hierarchical clustering algorithm is then designed to create the hierarchical topic structure of the text segments. Text segments with similar concepts are grouped together in a cluster, and related clusters are linked at the same or nearby levels. Unlike traditional clustering algorithms, which tend to produce cluster hierarchies with a very unnatural shape, the algorithm tries to produce a more natural and comprehensive tree hierarchy. Extensive experiments were conducted on different domains of text segments, including subject terms, people names, paper titles, and natural language questions. The experimental results show the potential of the proposed approach, which provides a basis for the in-depth analysis of text segments on a larger scale and is expected to benefit many information systems.
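The two-stage idea described above (enrich each short segment with search-result snippets, then cluster the enriched representations hierarchically) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the snippets are hard-coded, hypothetical stand-ins for real search results, the features are plain term frequencies compared by cosine similarity, and the clustering is ordinary average-link agglomerative clustering, which yields a binary tree. It is not the article's own algorithm, which is designed to produce a more natural hierarchy shape.

```python
import math
from collections import Counter

# Hypothetical snippets standing in for top-ranked search results per segment.
SNIPPETS = {
    "jaguar speed": ["the jaguar is a fast big cat",
                     "big cat running speed in the wild"],
    "leopard habitat": ["the leopard is a big cat living in forest",
                        "wild cat habitat and range"],
    "python tutorial": ["learn python programming language basics",
                        "python code examples for beginners"],
    "java compiler": ["java programming language compiler",
                      "compile java code from source"],
}

def vectorize(snippets):
    """Build a bag-of-words term-frequency vector from a segment's snippets."""
    return Counter(word for s in snippets for word in s.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def leaves(tree):
    """Return the segments under a tree node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])

def average_link(t1, t2, vecs):
    """Mean pairwise similarity between the segments of two subtrees."""
    pairs = [(x, y) for x in leaves(t1) for y in leaves(t2)]
    return sum(cosine(vecs[x], vecs[y]) for x, y in pairs) / len(pairs)

def cluster(snippets_by_segment):
    """Average-link agglomerative clustering; returns a nested-tuple binary tree."""
    vecs = {seg: vectorize(s) for seg, s in snippets_by_segment.items()}
    trees = list(vecs)  # every segment starts as its own leaf
    while len(trees) > 1:
        # Pick the most similar pair of current subtrees and merge them.
        i, j = max(
            ((i, j) for i in range(len(trees)) for j in range(i + 1, len(trees))),
            key=lambda p: average_link(trees[p[0]], trees[p[1]], vecs),
        )
        merged = (trees[i], trees[j])
        trees = [t for k, t in enumerate(trees) if k not in (i, j)] + [merged]
    return trees[0]

if __name__ == "__main__":
    print(cluster(SNIPPETS))
```

With these toy snippets, the segments themselves share no words ("jaguar speed" and "leopard habitat" have no term in common), yet their snippet vectors overlap, so the two animal segments end up in one subtree and the two programming segments in the other. That is the effect the snippet enrichment is meant to provide for short text.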