SESQ: A Model-Driven Method for Building Object Level Vertical Search Engines
ER '08 Proceedings of the 27th International Conference on Conceptual Modeling
Web database schema identification through simple query interface
RED'09 Proceedings of the 2nd international conference on Resource discovery
Data-rich webpages are an increasingly important data source for web applications. While the problem of data object recognition has been studied intensively, it is mostly treated as a process separate from the preceding task of identifying relevant webpages. In this paper, we propose a method that leverages the classification result of data-rich webpages for efficient and scalable data object recognition. We propose a novel form of context information, which can be inferred from the webpage classification and exploited in bottom-up data object recognition. Experimental results show that the context information brings a 19% improvement in the running efficiency of bottom-up data object recognition.