In this paper, we define the problem of identifying the author of a Web page as a sub-problem of identifying the information sender configuration of a Web page. We propose a method that extracts author name candidates from a Web page based on linguistic features and ranks the candidates based on local features such as distance from the main content. The evaluation shows that the method achieves more than 75% precision when evaluated on the candidates ranked within the top five.
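The two-stage idea (extract candidates with linguistic cues, then rank them by a local feature) can be illustrated with a minimal sketch. This is not the paper's implementation: the byline cue pattern, the use of character offsets as the distance measure, and the names `Candidate`, `extract_candidates`, `rank_candidates`, and `hit_in_top_k` are all assumptions made for illustration, and the main content span is assumed to be known in advance.

```python
import re
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    offset: int  # character offset of the name in the page text


# Assumed linguistic cue: a byline marker followed by two or more
# capitalized words. The paper's actual features are not specified here.
CUE = re.compile(
    r"\b(?:[Ww]ritten by|[Aa]uthor:|[Bb]y)\s+([A-Z][a-z]+(?: [A-Z][a-z]+)+)"
)


def extract_candidates(text: str) -> List[Candidate]:
    """Collect every name that follows a byline cue."""
    return [Candidate(m.group(1), m.start(1)) for m in CUE.finditer(text)]


def rank_candidates(
    cands: List[Candidate], main_start: int, main_end: int
) -> List[Candidate]:
    """Order candidates by character distance from the main content span.

    Candidates inside the span get distance 0 (an assumed local feature).
    """
    def distance(c: Candidate) -> int:
        if c.offset < main_start:
            return main_start - c.offset
        if c.offset > main_end:
            return c.offset - main_end
        return 0

    return sorted(cands, key=distance)


def hit_in_top_k(ranked: List[Candidate], gold: str, k: int = 5) -> bool:
    """True if the gold author name appears among the top k candidates."""
    return any(c.name == gold for c in ranked[:k])


if __name__ == "__main__":
    page = (
        "Hosted by Carol White. " + "menu " * 40
        + "Main article text. Written by Bob Jones. More text."
    )
    main_start = page.index("Main article")
    ranked = rank_candidates(extract_candidates(page), main_start, len(page))
    for c in ranked:
        print(c.name, c.offset)
```

Averaging `hit_in_top_k` over a set of labeled pages gives one plausible reading of the top-five evaluation mentioned above; the exact precision definition used in the paper is not stated in this abstract.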