Personal names are an important class of Web search queries, and they are special in many ways. Strategies for retrieving information on personal names should therefore differ from the strategies used for other types of queries. To improve search quality for personal names, a first step is to detect whether a query is a personal name. Despite the importance of this problem, relatively little research has addressed it. Since Web queries are usually short, conventional supervised machine-learning algorithms cannot be applied directly. An alternative is to apply heuristic rules coupled with name-term dictionaries. However, when the dictionaries are small, this method tends to produce false negatives; when they are large, it tends to produce false positives. A more serious problem is that this method cannot provide a good trade-off between precision and recall. To solve these problems, we propose an approach that constructs probabilistic name-term dictionaries and personal name grammars and uses them to predict the probability that a query is a personal name. In this paper, we develop four different methods for building probabilistic name-term dictionaries, in which each term is assigned the probability that it is a name term. We compared our approach with baseline algorithms, including dictionary-based look-up methods and supervised classification algorithms such as logistic regression and SVM, on manually labeled test sets. The results validate the effectiveness of our approach: its F1 value exceeds 79.8%, outperforming the best baseline by more than 11.3%.
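To make the general idea concrete, the following is a minimal sketch of how a probabilistic name-term dictionary and a small set of name grammars could be combined to score a query. It is not the paper's algorithm: the abstract does not specify the four dictionary-building methods, the grammar formalism, or how probabilities are combined. The dictionaries `GIVEN_NAME_PROB` and `SURNAME_PROB`, their values, the `GRAMMARS` list, the fallback `DEFAULT_PROB`, and the product-of-slots scoring rule are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical probabilistic name-term dictionaries (illustrative values only):
# each term maps to an assumed probability of being a given name or a surname.
GIVEN_NAME_PROB: Dict[str, float] = {"john": 0.95, "mary": 0.90, "rose": 0.40, "paris": 0.30}
SURNAME_PROB: Dict[str, float] = {"smith": 0.90, "hilton": 0.60, "rose": 0.30, "garden": 0.05}

# Assumed fallback probability for terms missing from a dictionary.
DEFAULT_PROB = 0.01

# Hypothetical personal name grammars: each is a sequence of slots.
GRAMMARS: List[Tuple[str, ...]] = [
    ("given", "surname"),
    ("surname", "given"),
]


def slot_prob(term: str, slot: str) -> float:
    """Probability that `term` fills `slot`, looked up case-insensitively."""
    table = GIVEN_NAME_PROB if slot == "given" else SURNAME_PROB
    return table.get(term.lower(), DEFAULT_PROB)


def name_probability(query: str) -> float:
    """Score a query as the best product of slot probabilities over all
    grammars whose length matches the number of query terms."""
    terms = query.split()
    best = 0.0
    for grammar in GRAMMARS:
        if len(grammar) != len(terms):
            continue
        p = 1.0
        for term, slot in zip(terms, grammar):
            p *= slot_prob(term, slot)
        best = max(best, p)
    return best


def is_personal_name(query: str, threshold: float = 0.5) -> bool:
    """Binary decision; moving `threshold` trades precision against recall."""
    return name_probability(query) >= threshold


if __name__ == "__main__":
    for q in ["John Smith", "Paris Hilton", "rose garden"]:
        print(q, round(name_probability(q), 4), is_personal_name(q))
```

Because the output is a score rather than a yes/no dictionary match, the decision threshold can be tuned: in this toy setup, lowering `threshold` from 0.5 to 0.1 accepts "Paris Hilton" while still rejecting "rose garden", which illustrates the kind of precision/recall trade-off that a plain look-up method cannot offer.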