Detecting relationships among categories using text classification

Authors:
Saket S. R. Mengle; Nazli Goharian
Affiliations:
Information Retrieval Lab, Illinois Institute of Technology, Chicago, IL; Computer Science Department, Georgetown University, Washington, DC
Venue:
Journal of the American Society for Information Science and Technology
Year:
2010

Abstract

Discovering relationships among concepts and categories is crucial in various information systems. The authors' objective was to discover such relationships among document categories. Traditionally, such relationships are represented as a concept hierarchy that groups some categories under the same parent category. Although a hierarchy identifies categories that share a parent, not all of these sibling categories are related to each other beyond sharing that parent. Conversely, some “non-sibling” categories are related to each other but are not identified as such by the hierarchy. The authors identify these relationships and build a relationship network (relationship-net) with categories as the vertices and relationships as the edges. They demonstrate that a relationship-net detects some nonobvious category relationships. Their approach capitalizes on the misclassification information generated during text classification to identify potential relationships among categories and to generate relationship-nets automatically. Their results demonstrate a statistically significant improvement over the current approach of up to 73% on 20 Newsgroups (20NG), up to 68% on 17 categories in the Open Directory Project (ODP17), and by more than a factor of two on the ODP46 and Special Interest Group on Information Retrieval (SIGIR) data sets. Their results also indicate that using misclassification information from passage classification rather than document classification yields statistically significant improvements in F1 measure on 20NG (8%), ODP17 (5%), ODP46 (73%), and SIGIR (117%). Assigning weights to relationships and performing feature selection further improve the results. © 2010 Wiley Periodicals, Inc.
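
The core idea of the abstract, turning a classifier's misclassifications into weighted edges between categories, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `build_relationship_net`, the per-category confusion rate used as the edge weight, and the `threshold` parameter are all assumptions made here for illustration, since the abstract does not specify how edges are selected or weighted.

```python
from collections import Counter


def build_relationship_net(true_labels, predicted_labels, threshold=0.1):
    """Sketch: derive an undirected, weighted category graph from misclassifications.

    Assumption (not from the paper): two categories are linked when the share of
    documents from one that are misclassified as the other meets `threshold`.
    Returns a dict mapping a sorted category pair to its edge weight.
    """
    totals = Counter()     # number of documents per true category
    confusion = Counter()  # (true, predicted) counts for misclassified documents only

    for true_cat, predicted_cat in zip(true_labels, predicted_labels):
        totals[true_cat] += 1
        if true_cat != predicted_cat:
            confusion[(true_cat, predicted_cat)] += 1

    edges = {}
    for (true_cat, predicted_cat), count in confusion.items():
        rate = count / totals[true_cat]
        if rate >= threshold:
            # Treat the relationship as undirected; keep the stronger direction.
            pair = tuple(sorted((true_cat, predicted_cat)))
            edges[pair] = max(edges.get(pair, 0.0), rate)
    return edges


# Toy labels, invented for this example: hockey and baseball are confused
# with each other, while space is always classified correctly.
true_labels = ["hockey"] * 4 + ["baseball"] * 4 + ["space"] * 4
predicted_labels = (
    ["hockey", "hockey", "baseball", "hockey"]
    + ["baseball", "baseball", "hockey", "baseball"]
    + ["space"] * 4
)
net = build_relationship_net(true_labels, predicted_labels, threshold=0.1)
```

In this toy run the only edge links `baseball` and `hockey`, while `space` stays isolated. Feeding the function passage-level rather than document-level predictions would mirror the comparison the abstract reports, though the way passages are produced is outside this sketch.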