Word association norms, mutual information, and lexicography
Computational Linguistics
A vector space model for automatic indexing
Communications of the ACM
Collaborative Filtering Using Weighted Majority Prediction Algorithms
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Collaborative filtering via gaussian probabilistic latent semantic analysis
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
An extensive empirical study of feature selection metrics for text classification
The Journal of Machine Learning Research
Item-based top-N recommendation algorithms
ACM Transactions on Information Systems (TOIS)
Retrieval evaluation with incomplete information
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
IEEE Transactions on Knowledge and Data Engineering
Automatic recognition of Chinese unknown words based on roles tagging
SIGHAN '02 Proceedings of the first SIGHAN workshop on Chinese language processing - Volume 18
The first international Chinese word segmentation Bakeoff
SIGHAN '03 Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17
A web-based kernel function for measuring the similarity of short text snippets
Proceedings of the 15th international conference on World Wide Web
Generating query substitutions
Proceedings of the 15th international conference on World Wide Web
Querying the web: a multiontology disambiguation method
ICWE '06 Proceedings of the 6th international conference on Web engineering
Paraphrasing with bilingual parallel corpora
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Names and similarities on the web: fact extraction in the fast lane
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Chinese segmentation and new word detection using conditional random fields
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Measuring semantic similarity between words using web search engines
Proceedings of the 16th international conference on World Wide Web
The Google Similarity Distance
IEEE Transactions on Knowledge and Data Engineering
Weakly-supervised discovery of named entities using web search queries
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
"More like these": growing entity classes from seeds
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Language-Independent Set Expansion of Named Entities Using the Web
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining
Iterative Set Expansion of Named Entities Using the Web
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Translating queries into snippets for improved query expansion
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Learning to Rank for Information Retrieval
Foundations and Trends in Information Retrieval
Improving similarity measures for short segments of text
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Unsupervised named-entity extraction from the Web: An experimental study
Artificial Intelligence
Incorporating user behaviors in new word detection
IJCAI'09 Proceedings of the 21st international jont conference on Artifical intelligence
Automatic set instance extraction using the web
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Web-scale distributional similarity and entity set expansion
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Similarity measures for short segments of text
ECIR'07 Proceedings of the 29th European conference on IR research
Tree edit models for recognizing textual entailments, paraphrases, and answers to questions
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Growing related words from seed via user behaviors: a re-ranking based approach
ACLstudent '10 Proceedings of the ACL 2010 Student Research Workshop
Unsupervised Semantic Similarity Computation between Terms Using Web Documents
IEEE Transactions on Knowledge and Data Engineering
Empirical analysis of predictive algorithms for collaborative filtering
UAI'98 Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence
The use of SVM for chinese new word identification
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing
Hi-index | 0.00 |
Nowadays, user behavior analysis and collaborative filtering have drawn a large body of research in the machine learning community. The goal is either to enhance the user experience or discover useful information hidden in the data. In this article, we conduct extensive experiments on a Chinese input method data set, which keeps the word lists that users have used. Then, from the collaborative perspective, we aim to solve two tasks in natural language processing, that is, related word retrieval and new word detection. Motivated by the observation that two words are usually highly related to each other if they co-occur frequently in users’ records, we propose a novel semantic relatedness measure between words that takes both user behaviors and collaborative filtering into consideration. We utilize this measure to perform related word retrieval and new word detection tasks. Experimental results on both tasks indicate the applicability and effectiveness of our method.