Models for retrieval with probabilistic indexing
Information Processing and Management: an International Journal - Modeling data, information and knowledge
Modern Information Retrieval
Classification of Web Documents Using a Graph Model
ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1
Text mining: generating hypotheses from MEDLINE
Journal of the American Society for Information Science and Technology
Construction of conceptual graph representation of texts
HLT-SRWS '04 Proceedings of the Student Research Workshop at HLT-NAACL 2004
Multi-document summarization by graph search and matching
AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
Mining concept associations for knowledge discovery through concept chain queries
PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
IEEE Transactions on Fuzzy Systems
Hi-index | 0.00 |
For information retrieval and text-mining, a robust scalable framework is required to represent the information extracted from documents and enable visualization and query of such information. One very widely used model is the vector space model which is based on the bag-of-words approach. However, it suffers from the fact that it loses important information about the original text, such as information about the order of the terms in the text or about the frontiers between sentences or paragraphs. In this paper, we propose a graph-based text representation, which is capable of capturing (i) Term order (ii) Term frequency (iii) Term co-occurrence (iv) Term context in documents. We also apply the graph model into our text mining task, which is to discover unapparent associations between two and more concepts (e.g. individuals) from a large text corpus. Counterterrorism corpus is used to evaluate the performance of various retrieval models, which demonstrates feasibility and effectiveness of graphic text representation in information retrieval and text mining.