Considerable research effort has gone into extracting interesting information from text documents to improve the performance of information-based services. Such information is extracted after identifying the features of each document. In this paper, we propose the notion of a 'special feature', a new kind of knowledge that can be used to improve the performance of information-based services. A feature is a special feature if only very few documents in the dataset possess it. Given a dataset of text documents, we propose a methodology for extracting special features. Using this notion, we also propose frameworks to improve product selection in the e-commerce environment and the process of resume selection. Experimental results on real datasets show that the proposed approach can improve the efficiency of these applications.
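The definition of a special feature can be illustrated with a minimal sketch. It is not the paper's methodology, which the abstract does not spell out. It assumes each document has already been reduced to a set of features, and it treats "very few" as a document-count cutoff. The function name `special_features`, the parameter `max_docs`, and the toy resume data are all hypothetical.

```python
from collections import Counter


def special_features(docs, max_docs=1):
    """Return the features possessed by at most `max_docs` documents.

    `docs` maps a document id to the set of features found in it.
    `max_docs` is an assumed stand-in for "very few"; the abstract
    gives no concrete threshold.
    """
    doc_freq = Counter()
    for features in docs.values():
        # Count each feature once per document (document frequency).
        doc_freq.update(set(features))
    return {feature for feature, n in doc_freq.items() if n <= max_docs}


# Toy data (hypothetical): skills listed in three resumes.
resumes = {
    "r1": {"java", "sql", "python"},
    "r2": {"java", "sql"},
    "r3": {"java", "python", "cuda"},
}

print(sorted(special_features(resumes)))  # ['cuda']
```

In this toy dataset only "cuda" appears in a single resume, so it is the one feature flagged as special. Raising `max_docs` to 2 would also flag "sql" and "python".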