Case acquisition and labelling are important bottlenecks for predictive data mining. In the web context, a cascade of supporting techniques can be used, ranging from general ones such as user interfaces, through filtering based on keyword frequency, to web-specific techniques that exploit public search engines. We show how a synergistic application of multiple techniques can help in obtaining and pre-processing textual data, in particular for ILP-based web mining. The two-fold learning task itself consists of constructing and disambiguating categorisation rules, which are intended to process the results returned by web search engines.
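As a rough illustration of the keyword-frequency filtering step mentioned above, the following minimal sketch keeps only those text snippets in which a chosen set of keywords makes up a sufficient share of the tokens. The function names, the tokenisation rule, and the threshold value are all assumptions made for this example; the abstract does not specify how the filtering is actually implemented.

```python
import re
from collections import Counter


def keyword_score(text, keywords):
    """Return the fraction of tokens in `text` that belong to `keywords`.

    `keywords` is expected to be a collection of lowercase strings.
    The tokenisation (lowercased alphanumeric runs) is an assumption.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hits = sum(counts[k] for k in keywords)
    return hits / len(tokens)


def filter_by_keyword_frequency(snippets, keywords, threshold=0.1):
    """Keep snippets whose keyword share is at least `threshold`.

    The default threshold is hypothetical and would need tuning.
    """
    lowered = {k.lower() for k in keywords}
    return [s for s in snippets if keyword_score(s, lowered) >= threshold]
```

In a setting like the one described, such a filter could be applied to the snippets returned by a search engine before any labelling or rule learning, so that clearly off-topic results are discarded early.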