BooksOnline'11: 4th workshop on online books, complementary social media, and crowdsourcing
Proceedings of the 20th ACM international conference on Information and knowledge management
In this paper, we address the problem of extracting and processing useful information from bibliographic references in Digital Humanities (DH) data. We apply a machine learning technique for sequential data analysis, the Conditional Random Field (CRF), to a corpus extracted from the OpenEdition site, a web platform for journals and book collections in the humanities and social sciences. We present our ongoing project toward this goal, which as a preliminary step includes the construction of a suitable corpus and an efficient CRF model trained on it. This project is supported by a Google Grant for Digital Humanities. We conduct a number of experiments to find one of the best settings for a CRF model on the corpus, and we evaluate these settings both automatically and manually.