References play a key role in education and research, yet extracting and parsing them remain difficult problems. One concern is that many reference styles exist, so identifying which style produced a given surface form is hard. This is especially true in heterogeneous collections of theses and dissertations, which cover many fields and disciplines and where different styles may appear even within a single publication. We address these problems by drawing on suitable knowledge found on the WWW. In particular, we investigate a two-stage classifier approach that involves multi-class classification with respect to reference styles, and we partially solve the problem of parsing surface representations of references. We present empirical evidence for the effectiveness of our approach and outline plans for improving our methods.
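To make the two-stage idea concrete, the following is a minimal sketch in Python and not the method evaluated in this work. It assumes that stage one can be approximated by hand-written regular-expression hints, where a real system would use a trained multi-class classifier, and that stage two applies one parsing pattern per style. The names `STYLE_HINTS`, `FIELD_PATTERNS`, `classify_style`, and `parse_reference`, the two styles covered, and the sample strings are all hypothetical and chosen only for illustration.

```python
import re

# Stage 1 (assumption): regex hints stand in for a trained multi-class
# style classifier. Only two illustrative styles are covered.
STYLE_HINTS = {
    "ieee": re.compile(r"^\[\d+\]"),      # leading bracketed number, e.g. "[1]"
    "apa": re.compile(r"\(\d{4}\)\."),    # parenthesized year, e.g. "(2001)."
}

# Stage 2 (assumption): one hand-written field pattern per style.
FIELD_PATTERNS = {
    "ieee": re.compile(
        r'^\[(?P<number>\d+)\]\s(?P<authors>.+?),\s"(?P<title>.+?),?"\s'
        r"(?P<venue>.+?),\s(?P<year>\d{4})\.?$"
    ),
    "apa": re.compile(
        r"^(?P<authors>.+?)\s\((?P<year>\d{4})\)\.\s(?P<title>.+?)\.\s"
        r"(?P<venue>.+?)\.?$"
    ),
}


def classify_style(reference: str) -> str:
    """Return the first style whose hint matches, or "unknown"."""
    for style, hint in STYLE_HINTS.items():
        if hint.search(reference):
            return style
    return "unknown"


def parse_reference(reference: str) -> dict:
    """Classify the style, then parse with that style's pattern.

    Returns the predicted style and a dict of fields; the fields are
    empty when the style is unknown or the pattern does not match,
    which reflects that parsing is only partially solved.
    """
    style = classify_style(reference)
    pattern = FIELD_PATTERNS.get(style)
    match = pattern.match(reference) if pattern else None
    return {"style": style, "fields": match.groupdict() if match else {}}


if __name__ == "__main__":
    samples = [
        "Lafferty, J., McCallum, A., & Pereira, F. (2001). Conditional random "
        "fields: Probabilistic models for segmenting and labeling sequence "
        "data. Proceedings of ICML.",
        '[1] J. Lafferty, A. McCallum, and F. Pereira, "Conditional random '
        'fields," in Proc. ICML, 2001.',
        "An unstructured string with no recognizable style",
    ]
    for sample in samples:
        print(parse_reference(sample))
```

Keeping style identification separate from parsing means each style needs only its own pattern, and a reference whose style cannot be identified is returned unparsed instead of being forced through the wrong pattern.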