Focused named entity recognition using machine learning

Authors:
Li Zhang; Yue Pan; Tong Zhang
Affiliations:
IBM China Research, Beijing, P.R. China; IBM China Research, Beijing, P.R. China; IBM T.J. Watson Research Center, Yorktown Heights, NY
Venue:
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2004

Abstract

In this paper we study the problem of finding the most topical named entities among all entities in a document, a task we refer to as focused named entity recognition. We show that these focused named entities are useful for many natural language processing applications, such as document summarization, search result ranking, and entity detection and tracking. We propose a statistical model for focused named entity recognition by converting it into a classification problem. We then study the impact of various linguistic features and compare a number of classification algorithms. Experiments on an annotated Chinese news corpus demonstrate that the proposed method can achieve near human-level accuracy.
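
To make the "conversion into a classification problem" concrete, the following is a minimal sketch in Python, not the authors' model. It assumes the candidate entities of a document are already known, maps each one to a small feature vector, and trains a binary classifier to label it focused or not. The three features (mention frequency, presence in the title, position of first mention), the perceptron learner, and the toy English document are all illustrative assumptions; the abstract does not specify the paper's features or algorithms.

```python
def entity_features(entity, title, sentences):
    """Map one candidate entity to a numeric feature vector.

    These features are hypothetical stand-ins, not the paper's feature set.
    """
    total = max(1, len(sentences))
    mentions = sum(s.count(entity) for s in sentences)
    # Index of the first sentence mentioning the entity (total if never mentioned).
    first = next((i for i, s in enumerate(sentences) if entity in s), total)
    return [
        1.0,                              # bias term
        mentions / total,                 # mentions per sentence
        1.0 if entity in title else 0.0,  # entity appears in the title
        1.0 - first / total,              # earlier first mention scores higher
    ]


def score(weights, x):
    """Linear score of a feature vector under the current weights."""
    return sum(w * v for w, v in zip(weights, x))


def train_perceptron(examples, epochs=20):
    """Train a plain perceptron on (features, label) pairs with labels 0 or 1."""
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for x, y in examples:
            predicted = 1 if score(weights, x) > 0 else 0
            if predicted != y:
                step = 1.0 if y == 1 else -1.0
                weights = [w + step * v for w, v in zip(weights, x)]
    return weights


def is_focused(weights, x):
    """Classify one entity: True means 'focused'."""
    return score(weights, x) > 0


# Toy document with hand-assigned labels (invented for illustration only).
title = "Acme acquires Globex"
sentences = [
    "Acme said on Monday it will acquire Globex.",
    "Acme shares rose after the news.",
    "Analysts at Initech called the Acme deal expensive.",
]
labels = {"Acme": 1, "Globex": 1, "Initech": 0}

examples = [(entity_features(e, title, sentences), y) for e, y in labels.items()]
weights = train_perceptron(examples)
predictions = {e: is_focused(weights, entity_features(e, title, sentences))
               for e in labels}
```

Here `predictions` only checks that the classifier fits its own three training examples; it says nothing about accuracy on unseen documents. A realistic setup would draw candidates from a named entity recognizer, use a richer feature set, and evaluate on held-out annotated data, as the abstract describes for the Chinese news corpus.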