User Behaviors in Related Word Retrieval and New Word Detection: A Collaborative Perspective

  • Authors:
  • Zhiyuan Liu;Yabin Zheng;Lixing Xie;Maosong Sun;Liyun Ru;Yang Zhang

  • Affiliations:
  • State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University;State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University;State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University;State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University;State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University;Sohu Inc. R&D Center

  • Venue:
  • ACM Transactions on Asian Language Information Processing (TALIP)
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Nowadays, user behavior analysis and collaborative filtering have drawn a large body of research in the machine learning community. The goal is either to enhance the user experience or discover useful information hidden in the data. In this article, we conduct extensive experiments on a Chinese input method data set, which keeps the word lists that users have used. Then, from the collaborative perspective, we aim to solve two tasks in natural language processing, that is, related word retrieval and new word detection. Motivated by the observation that two words are usually highly related to each other if they co-occur frequently in users’ records, we propose a novel semantic relatedness measure between words that takes both user behaviors and collaborative filtering into consideration. We utilize this measure to perform related word retrieval and new word detection tasks. Experimental results on both tasks indicate the applicability and effectiveness of our method.