Recently, the Web has been rapidly "deepened" by many searchable online databases, whose data are hidden behind query interfaces. Accessing the invisible contents of the deep Web requires automatic processing of these query interfaces. This entails automatic segmentation, i.e., the task of grouping related components of an interface together. Segmentation is divided into two steps: interface component labeling and interface component grouping. In this paper we present a new approach that performs query interface segmentation using two-phase Conditional Random Fields (CRFs). In the first phase, one CRF model tags each component with a semantic label (attribute-name, operator, operand, or other); in the second phase, another CRF model creates groups of related components. Experiments show that our approach yields high accuracy.
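The two-phase data flow described above can be illustrated with a minimal sketch. It is not the paper's implementation: the scores are hand-set rather than learned, the `(kind, text)` component representation is hypothetical, and the B/I encoding of groups ("B" begins a group, "I" continues one) is an assumption, since the abstract does not say how the second model represents groups. Both phases use the same max-sum (Viterbi) decoder over a linear chain, the standard inference step for a linear-chain CRF.

```python
from typing import Callable, List, Sequence, Tuple

LABELS = ["attribute-name", "operator", "operand", "other"]
TAGS = ["B", "I"]  # assumed encoding: B = begins a group, I = continues it


def viterbi(obs: Sequence, states: List[str],
            emit: Callable, trans: Callable) -> List[str]:
    """Return the highest-scoring state sequence for a linear chain."""
    if not obs:
        return []
    # Each column maps state -> (best score ending here, back pointer).
    best = [{s: (emit(obs[0], s), None) for s in states}]
    for x in obs[1:]:
        prev = best[-1]
        col = {}
        for s in states:
            p = max(states, key=lambda q: prev[q][0] + trans(q, s))
            col[s] = (prev[p][0] + trans(p, s) + emit(x, s), p)
        best.append(col)
    # Backtrack from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for col in reversed(best[1:]):
        state = col[state][1]
        path.append(state)
    return path[::-1]


# Phase 1: hand-set scores standing in for a trained labeling model.
def emit_label(comp: Tuple[str, str], label: str) -> float:
    kind, _text = comp
    if kind == "text":
        return 2.0 if label == "attribute-name" else 0.0
    if kind == "textbox":
        return 2.0 if label == "operand" else 0.0
    if kind == "select":  # ambiguous on its own; context decides
        return 1.0 if label in ("operator", "operand") else 0.0
    return 2.0 if label == "other" else 0.0


def trans_label(prev: str, cur: str) -> float:
    bonus = {("attribute-name", "operator"): 1.0,
             ("operator", "operand"): 1.0,
             ("attribute-name", "operand"): 0.5}
    return bonus.get((prev, cur), 0.0)


# Phase 2: hand-set scores standing in for a trained grouping model.
def emit_group(label: str, tag: str) -> float:
    starts_group = label in ("attribute-name", "other")
    return 2.0 if (tag == "B") == starts_group else 0.0


def trans_group(prev: str, cur: str) -> float:
    return 0.5 if (prev, cur) == ("B", "I") else 0.0


def segment(components: Sequence[Tuple[str, str]]):
    """Label each component, then group component indices."""
    labels = viterbi(components, LABELS, emit_label, trans_label)
    tags = viterbi(labels, TAGS, emit_group, trans_group)
    groups: List[List[int]] = []
    for idx, tag in enumerate(tags):
        if tag == "B" or not groups:
            groups.append([idx])
        else:
            groups[-1].append(idx)
    return labels, groups


if __name__ == "__main__":
    form = [("text", "Title"), ("select", "contains"), ("textbox", ""),
            ("text", "Author"), ("textbox", ""), ("button", "Search")]
    labels, groups = segment(form)
    print(labels)
    print(groups)  # [[0, 1, 2], [3, 4], [5]]
```

In this toy form the drop-down is ambiguous in isolation (it could be an operator or an operand), and the transition scores resolve it as an operator because it follows an attribute name and precedes a text box. A trained CRF would learn such emission and transition weights from annotated interfaces instead of having them written by hand.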