While classic information retrieval methods return whole documents as the result of a query, many information needs would be better satisfied by fine-grained access inside the documents. One way to support this goal is to make the semantics of small document regions explicit, e.g. as XML labels, so that query engines can exploit them. To this end, the topics of the small document regions must be discovered from the texts; unlike in document labelling applications, fine-grained topics cannot be listed in advance for arbitrary collections. Text-understanding approaches can derive the topic of a document region but are less appropriate for constructing a small set of topics that can be used in queries. To address this challenge, we propose coupling text mining, prior knowledge made explicit in ontologies, and human expertise, and we present the system RELFIN, which is designed to assist the human expert in discovering topics appropriate for (i) ontology enhancement with additional concepts or relationships and (ii) semantic characterization and tagging of document regions. RELFIN performs data mining on linguistically preprocessed corpora to group document regions by topic and to construct topic labels for them, so that the labels are characteristic of the regions and thus helpful in ontology-based search. We show our first results of applying RELFIN to a case study of text analysis and retrieval.