We propose a wrapper adaptation framework which aims at adapting a learned wrapper to an unseen Web site. It significantly reduces human effort in constructing wrappers. Our framework makes use of extraction rules previously discovered from a particular site to seek potential training example candidates for an unseen site. Rule generalization and text categorization are employed for finding suitable example candidates. Another feature of our approach is that it makes use of the previously discovered lexicon to classify good training examples automatically for the new site. We conducted extensive experiments to evaluate the quality of the extraction performance and the adaptability of our approach.
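
The candidate-seeking idea described above can be illustrated with a toy sketch. It is not the authors' implementation. It assumes that an extraction rule is a pair of left and right HTML delimiters, that generalization means dropping tag attributes, and that a simple lexicon-overlap score stands in for the text categorization step. All function names, the sample page, and the threshold are hypothetical.

```python
import re

# Matches an HTML tag and captures only its name (with a leading "/" if closing).
TAG = re.compile(r"<(/?\w+)[^>]*>")

def strip_attrs(markup):
    """Drop tag attributes so that site-specific ids and classes are ignored."""
    return TAG.sub(r"<\1>", markup)

def generalize(rule):
    """Relax a (left, right) delimiter rule learned on the source site."""
    left, right = rule
    return strip_attrs(left), strip_attrs(right)

def find_candidates(html, rule):
    """Apply the generalized rule to a page from the unseen site."""
    left, right = generalize(rule)
    pattern = re.escape(left) + r"\s*(.*?)\s*" + re.escape(right)
    return re.findall(pattern, strip_attrs(html), flags=re.S)

def lexicon_score(text, lexicon):
    """Fraction of tokens that appear in the lexicon from the source site."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in lexicon for t in tokens) / len(tokens)

def select_training_examples(html, rule, lexicon, threshold=0.5):
    """Keep candidates whose wording resembles previously extracted items."""
    return [c for c in find_candidates(html, rule)
            if lexicon_score(c, lexicon) >= threshold]

# Hypothetical rule and lexicon learned from a source site.
source_rule = ('<td class="title">', '</td>')
lexicon = {"learning", "python", "guide", "programming"}

# Hypothetical page from an unseen site with different attributes.
new_site_html = (
    '<tr><td id="t1">Learning Python</td><td id="p1">$39.99</td></tr>'
    '<tr><td id="t2">Contact us</td></tr>'
)

examples = select_training_examples(new_site_html, source_rule, lexicon)
```

In this sketch the generalized rule over-generates on purpose (every table cell matches), and the lexicon filter then narrows the candidates down to those that look like items seen on the source site.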