Extracting position relations from the web
Proceedings of the eleventh international workshop on Web information and data management
Position relations, which record the positions that people hold in an organization, are a valuable source of competitive intelligence for enterprises. The rapid growth of data on the Web creates new opportunities to extract position relations of interest. In this paper, we propose a new algorithm for extracting position relations from the Web. The algorithm exploits a structural feature of position relations on the Web: a position relation is usually presented in a Web page as a table or a list. To capture this structural feature, we first introduce a structural coefficient for each Web page, which is then used to generate structural file segments. A structural file segment consists of all candidate position relations that share a similar structure. We then apply a pattern-matching method to extract position relations from the structural file segments. Finally, we conduct experiments on a real data set of 6028 Chinese Web pages gathered with the Baidu search engine and evaluate the precision and recall of our approach. The experimental results show that our algorithm achieves a precision over 96% and a recall over 87%.
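The three stages described above (score a page's structure, collect structurally similar candidates, pattern-match them) can be illustrated with a minimal sketch. The abstract does not define the structural coefficient or the matching patterns, so everything here is an assumption made for illustration: the coefficient is taken to be the share of table and list tags among all tags on the page, a "segment" is simplified to the two-cell rows of a table or list, and `POSITION`, `RowCollector`, `structural_coefficient`, `extract_relations`, and the `min_coefficient` threshold are hypothetical names and values, not the paper's.

```python
import re
from html.parser import HTMLParser

# Assumption: tags that signal table- or list-like layout.
STRUCTURAL_TAGS = {"table", "tr", "td", "th", "ul", "ol", "li"}
ROW_TAGS = {"tr", "li"}

# Assumption: a tiny illustrative vocabulary of position titles.
POSITION = re.compile(r"CEO|CTO|Chairman|Director|Manager")


class RowCollector(HTMLParser):
    """Counts tags and collects the text cells of each table row or list item."""

    def __init__(self):
        super().__init__()
        self.total_tags = 0
        self.structural_tags = 0
        self.rows = []       # finished rows, each a list of text cells
        self._cells = None   # cells of the row being read, or None outside a row

    def handle_starttag(self, tag, attrs):
        self.total_tags += 1
        if tag in STRUCTURAL_TAGS:
            self.structural_tags += 1
        if tag in ROW_TAGS:
            self._cells = []

    def handle_endtag(self, tag):
        if tag in ROW_TAGS and self._cells is not None:
            if self._cells:
                self.rows.append(self._cells)
            self._cells = None

    def handle_data(self, data):
        text = data.strip()
        if text and self._cells is not None:
            self._cells.append(text)


def structural_coefficient(collector):
    """Hypothetical coefficient: fraction of tags that are table/list tags."""
    if collector.total_tags == 0:
        return 0.0
    return collector.structural_tags / collector.total_tags


def extract_relations(html, min_coefficient=0.5):
    """Return (person, position) pairs from pages that look structured enough."""
    collector = RowCollector()
    collector.feed(html)
    collector.close()
    if structural_coefficient(collector) < min_coefficient:
        return []
    relations = []
    for cells in collector.rows:
        # Simplified segment: only rows with exactly two cells are candidates.
        if len(cells) == 2 and POSITION.fullmatch(cells[1]):
            relations.append((cells[0], cells[1]))
    return relations


page = (
    "<table>"
    "<tr><td>Alice Wang</td><td>CEO</td></tr>"
    "<tr><td>Bob Li</td><td>Director</td></tr>"
    "</table><p>About us</p>"
)
relations = extract_relations(page)
```

The threshold on the coefficient acts as a cheap filter, so pattern matching runs only on pages where tables or lists dominate. A faithful implementation would need the paper's own definitions and patterns suited to Chinese text.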