Multiple sections extraction using visual cue
ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part V
Formal concept analysis approach for data extraction from a limited deep web database
Journal of Intelligent Information Systems
Framework for surveillance of instant messages
International Journal of Internet Technology and Secured Transactions
Our survey shows that the techniques used to extract data from the deep web need to be improved before automatic wrappers can be both efficient and accurate. Further investigation indicates that a lightweight ontological technique built on an existing lexical database for English (WordNet) can check the similarity of data records and, by using their semantic properties, detect the correct data region with higher precision. The method can extract three types of data records (single-section, multiple-section, and loosely structured), and it provides options for aligning iterative and disjunctive data items. Experimental results show that our technique is robust and performs better than existing state-of-the-art wrappers. Tests also show that our wrapper can extract data records from multilingual web pages and that it is domain independent.
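The idea of scoring candidate regions by the semantic similarity of their records can be illustrated with a minimal sketch. It is not the wrapper described in the abstract, whose algorithm is not given here. It assumes a tiny hand-written hypernym map (HYPERNYM) in place of WordNet, a simple path-based word similarity, and hypothetical helper names (word_similarity, record_similarity, region_score, pick_data_region); the sample labels and regions are invented for illustration.

```python
from itertools import combinations

# ASSUMPTION: toy hypernym map (child -> parent) standing in for WordNet.
HYPERNYM = {
    "cost": "price",
    "price": "attribute",
    "title": "attribute",
    "author": "person",
    "publisher": "person",
    "home": "location",
    "login": "action",
    "attribute": "entity",
    "person": "entity",
    "location": "entity",
    "action": "entity",
}


def ancestors(word):
    """Return the chain from word up to its root, with word first."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def word_similarity(a, b):
    """Path-based score: 1 / (1 + edges between a and b), or 0.0 if unrelated."""
    chain_b = ancestors(b)
    depth_b = {w: i for i, w in enumerate(chain_b)}
    for i, w in enumerate(ancestors(a)):
        if w in depth_b:
            return 1.0 / (1 + i + depth_b[w])
    return 0.0


def record_similarity(rec_a, rec_b):
    """Symmetric average of each word's best match in the other record."""
    if not rec_a or not rec_b:
        return 0.0

    def directed(src, dst):
        return sum(max(word_similarity(s, d) for d in dst) for s in src) / len(src)

    return (directed(rec_a, rec_b) + directed(rec_b, rec_a)) / 2


def region_score(records):
    """Mean pairwise similarity of the records in one candidate region."""
    pairs = list(combinations(records, 2))
    if not pairs:
        return 0.0
    return sum(record_similarity(a, b) for a, b in pairs) / len(pairs)


def pick_data_region(regions):
    """Return the name of the candidate region whose records are most alike."""
    return max(regions, key=lambda name: region_score(regions[name]))


# Invented example: each record is the list of labels found in one page block.
regions = {
    "results": [["title", "author", "price"], ["title", "publisher", "cost"]],
    "navigation": [["home"], ["login"]],
}
print(pick_data_region(regions))
```

In this sketch the "results" region wins because its records share semantically close labels (price and cost, author and publisher), while the navigation entries are related only through the root of the toy taxonomy. A real implementation would replace the toy map with WordNet lookups and would need a way to derive candidate regions and records from the page structure.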