To tackle the issue of information overload, we present KPSpotter, an Information Gain-based keyphrase extraction system. KPSpotter is a flexible, web-enabled system that can process various input formats, including web data, and that generates both the extraction model and the list of keyphrases in XML. Two features were selected for training and for extracting keyphrases: 1) TF*IDF and 2) Distance from First Occurrence. Input training and testing collections were processed in three stages: 1) Data Cleaning, 2) Data Tokenizing, and 3) Data Discretizing. To measure system performance, the keyphrases extracted by KPSpotter were compared with the ones assigned by the authors of the documents. Our experiments show that the performance of KPSpotter is equivalent to that of KEA, a well-known keyphrase extraction system. KPSpotter, however, differs from other extraction systems in the following ways. First, KPSpotter employs a new keyphrase extraction technique that combines the Information Gain data mining measure with several Natural Language Processing techniques such as stemming and case-folding. Second, KPSpotter can process various types of input data, such as XML, HTML, and unstructured text, and generate XML output. Third, the user can provide input data and execute KPSpotter through the Internet. Fourth, for efficiency and performance, KPSpotter stores candidate keyphrases and their related information, such as frequency and stemmed form, in an embedded database management system.
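The two features and the Information Gain measure named above can be illustrated with a minimal Python sketch. It is not KPSpotter's implementation: the abstract does not give the exact formulas, so the sketch assumes the commonly used definitions (term frequency normalized by document length, times log2 of the inverse document frequency; position of the first occurrence divided by document length; entropy-based Information Gain over already discretized feature values). All function names are hypothetical, candidates are restricted to single words, and stemming is omitted.

```python
import math
from collections import Counter


def tokenize(text):
    # Case-fold and split on whitespace (stemming omitted in this sketch).
    return text.lower().split()


def tf_idf(term, doc_tokens, corpus_tokens):
    # Assumed definition: (count in doc / doc length) * log2(N / document frequency).
    tf = doc_tokens.count(term) / len(doc_tokens)
    df = sum(1 for tokens in corpus_tokens if term in tokens)
    if df == 0:
        return 0.0
    return tf * math.log2(len(corpus_tokens) / df)


def first_occurrence_distance(term, doc_tokens):
    # Assumed definition: fraction of the document preceding the first occurrence.
    # Returns 1.0 when the term does not occur at all.
    if term not in doc_tokens:
        return 1.0
    return doc_tokens.index(term) / len(doc_tokens)


def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    # Entropy of the labels minus the weighted entropy after splitting
    # on each distinct (discretized) feature value.
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


if __name__ == "__main__":
    # Toy corpus, invented purely for illustration.
    docs = [
        "keyphrase extraction helps retrieval",
        "retrieval of web data",
        "web data mining",
    ]
    corpus = [tokenize(d) for d in docs]
    print(tf_idf("keyphrase", corpus[0], corpus))
    print(first_occurrence_distance("helps", corpus[0]))
    # Discretized feature bins vs. "is an author-assigned keyphrase" labels.
    print(information_gain(["low", "low", "high", "high"], [False, False, True, True]))
```

In this sketch, a feature whose discretized bins separate author-assigned keyphrases from other candidates well receives a high Information Gain, which is the intuition behind using the measure to build an extraction model from the two features.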