Growing related words from seed via user behaviors: a re-ranking based approach

Authors:
Yabin Zheng;Zhiyuan Liu;Lixing Xie
Affiliations:
Tsinghua University, Beijing, China;Tsinghua University, Beijing, China;Tsinghua University, Beijing, China
Venue:
ACLstudent '10 Proceedings of the ACL 2010 Student Research Workshop
Year:
2010

Citing 6
Cited 2

Item-based top-N recommendation algorithms
ACM Transactions on Information Systems (TOIS)

Retrieval evaluation with incomplete information
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval

A web-based kernel function for measuring the similarity of short text snippets
Proceedings of the 15th international conference on World Wide Web

Improving similarity measures for short segments of text
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2

Incorporating user behaviors in new word detection
IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence

Similarity measures for short segments of text
ECIR'07 Proceedings of the 29th European conference on IR research

Why press backspace?: understanding user input behaviors in Chinese Pinyin input method
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2

User Behaviors in Related Word Retrieval and New Word Detection: A Collaborative Perspective
ACM Transactions on Asian Language Information Processing (TALIP)

Quantified Score

Hi-index	0.01

Abstract

Motivated by Google Sets, we study the problem of growing related words from a single seed word by leveraging user behaviors hidden in the user records of a Chinese input method. Our method rests on the observation that the more frequently two words co-occur in user records, the more related they are. First, we use user behaviors to generate candidate words. Then, we use a search engine to enrich the candidate words with adequate semantic features. Finally, we reorder the candidate words according to their semantic relatedness to the seed word. Experimental results on a Chinese input method dataset show that our method achieves better performance.
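
The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each user record is a list of words, that search engine results are supplied as a precomputed dictionary of text snippets per word (no real search API is called), and that relatedness is measured by cosine similarity over bag-of-words vectors. The function names `candidate_words`, `bag_of_words`, `cosine`, and `rerank` are hypothetical.

```python
from collections import Counter
import math


def candidate_words(records, seed, top_k=10):
    """Step 1: rank words by how many user records they share with the seed."""
    counts = Counter()
    for record in records:
        words = set(record)  # count each word at most once per record
        if seed in words:
            counts.update(words - {seed})
    return [word for word, _ in counts.most_common(top_k)]


def bag_of_words(snippets):
    """Step 2 (assumed feature form): token counts over a word's snippets."""
    return Counter(tok for text in snippets for tok in text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(seed, candidates, snippets):
    """Step 3: reorder candidates by similarity of their snippets to the seed's.

    `snippets` maps a word to a list of text snippets; it stands in for
    search engine output and is an assumption of this sketch.
    """
    seed_vec = bag_of_words(snippets.get(seed, []))
    scored = [
        (cosine(seed_vec, bag_of_words(snippets.get(c, []))), c)
        for c in candidates
    ]
    # sorted() is stable, so ties keep the co-occurrence order from step 1
    return [c for _, c in sorted(scored, key=lambda pair: -pair[0])]
```

In this sketch, co-occurrence alone decides which words become candidates, and the snippet-based similarity only changes their order. A candidate that co-occurs often with the seed but shares no snippet vocabulary with it is moved toward the end of the list.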