Adapting Information Extraction Knowledge For Unseen Web Sites

Authors:
Tak-Lam Wong;Wai Lam
Venue:
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Year:
2002

Citing 0
Cited 3

Adapting Web information extraction knowledge via mining site-invariant and site-dependent features
ACM Transactions on Internet Technology (TOIT)

Cross Language Information Extraction Knowledge Adaptation
RSKT '09 Proceedings of the 4th International Conference on Rough Sets and Knowledge Technology

Learning to adapt cross language information extraction wrapper
Applied Intelligence

Abstract

We propose a wrapper adaptation framework which aims at adapting a learned wrapper to an unseen Web site. It significantly reduces human effort in constructing wrappers. Our framework makes use of extraction rules previously discovered from a particular site to seek potential training example candidates for an unseen site. Rule generalization and text categorization are employed for finding suitable example candidates. Another feature of our approach is that it makes use of the previously discovered lexicon to classify good training examples automatically for the new site. We conducted extensive experiments to evaluate the quality of the extraction performance and the adaptability of our approach.
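
As a rough illustration of the pipeline the abstract outlines, the sketch below relaxes a delimiter-style extraction rule learned on one site, uses it to collect candidate strings from an unseen page, and keeps the candidates whose words overlap with a lexicon gathered from the source site. This is not the authors' algorithm: the abstract does not specify the rule format, the generalization procedure, or the classifier, so the delimiter rule, the tag-attribute relaxation, the lexicon-overlap score (a simple stand-in for text categorization), and every function name here are assumptions made for illustration.

```python
import re


def generalize_rule(left: str, right: str) -> re.Pattern:
    """Relax a (left, right) delimiter rule so it ignores tag attributes.

    Hypothetical generalization step: '<b class="title">' becomes a
    pattern that matches any <b ...> tag.
    """
    def relax(delim: str) -> str:
        m = re.fullmatch(r"<(/?)(\w+)[^>]*>", delim)
        if m:
            # Keep only the tag name; allow arbitrary attributes.
            return rf"<{m.group(1)}{m.group(2)}\b[^>]*>"
        # Not an HTML tag: match the delimiter literally.
        return re.escape(delim)

    return re.compile(relax(left) + r"(.*?)" + relax(right), re.S)


def find_candidates(pattern: re.Pattern, page: str) -> list[str]:
    """Return every string the generalized rule captures on the page."""
    return [m.group(1).strip() for m in pattern.finditer(page)]


def lexicon_score(candidate: str, lexicon: set[str]) -> float:
    """Fraction of the candidate's words that appear in the lexicon."""
    tokens = re.findall(r"\w+", candidate.lower())
    if not tokens:
        return 0.0
    return sum(t in lexicon for t in tokens) / len(tokens)


def select_examples(candidates: list[str], lexicon: set[str],
                    threshold: float = 0.5) -> list[str]:
    """Keep candidates whose lexicon overlap reaches the threshold."""
    return [c for c in candidates if lexicon_score(c, lexicon) >= threshold]


# Rule and lexicon assumed to come from a previously wrapped source site.
rule = generalize_rule('<b class="title">', "</b>")
lexicon = {"web", "data", "mining", "learning"}

# Made-up page from an unseen site with different markup details.
new_page = ('<b id="x1">Web Data Mining</b>'
            "<b>Contact us</b>"
            '<b id="x2">Learning Perl</b>')

candidates = find_candidates(rule, new_page)
examples = select_examples(candidates, lexicon)
```

In this toy setting the selected strings would then serve as automatically labeled training examples for learning a wrapper on the new site. The 0.5 threshold is arbitrary.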