Real time extraction of related terms by bi-directional lexico-syntactic patterns from the web

  • Authors: Hiroaki Ohshima; Katsumi Tanaka

  • Affiliations: Kyoto University, Yoshida Honmachi, Sakyo, Kyoto, Japan (both authors)

  • Venue: Proceedings of the 3rd International Conference on Ubiquitous Information Management and Communication
  • Year: 2009

Abstract

We propose a method for quickly detecting terms related to a given term using a conventional Web search engine. There are many kinds of related terms. For example, hypernyms and hyponyms are basic related terms that are treated in dictionaries. Synonyms and coordinate terms are also well-defined related terms. Topic terms and description terms represent topics of the given term and are only vaguely defined. There are other related terms as well, such as abbreviations and nicknames. The proposed method can be used to extract all of these kinds of related terms. Its only text resource is the set of Web search results, which consist of titles, snippets, and URLs of Web pages. We use two different kinds of lexico-syntactic patterns to extract related terms from the search results; together they are called bi-directional lexico-syntactic patterns. The proposed method can be applied both to languages in which words are separated by spaces, such as English and Korean, and to languages in which they are not, such as Japanese and Chinese. The proposed method does not need any advanced natural language processing such as morphological analysis or syntactic parsing. It works relatively fast with good precision.
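The abstract does not list the actual patterns, so the following is only a minimal sketch of one plausible reading of "bi-directional" for coordinate terms: a candidate is kept only if it is seen both after the given term ("<term> or Y") and before it ("Y or <term>"). The function name `extract_candidates`, the connector "or", the ranking rule, and the hard-coded snippets are all assumptions made for illustration. A real system would obtain the snippets from a search engine, and the whitespace-based matching here would only suit space-delimited languages such as English.

```python
import re
from collections import Counter


def extract_candidates(term, snippets, connector="or"):
    """Return words found on BOTH sides of `term` around `connector`.

    Hypothetical illustration, not the authors' implementation:
    forward pattern  = "<term> <connector> Y"
    backward pattern = "Y <connector> <term>"
    """
    t = re.escape(term)
    c = re.escape(connector)
    forward = re.compile(rf"\b{t}\s+{c}\s+(\w+)", re.IGNORECASE)
    backward = re.compile(rf"(\w+)\s+{c}\s+{t}\b", re.IGNORECASE)

    fwd, bwd = Counter(), Counter()
    for snippet in snippets:
        fwd.update(w.lower() for w in forward.findall(snippet))
        bwd.update(w.lower() for w in backward.findall(snippet))

    # Keep only candidates supported in both directions, then rank by the
    # weaker of the two counts (an assumed scoring rule), ties alphabetical.
    both = set(fwd) & set(bwd)
    return sorted(both, key=lambda w: (-min(fwd[w], bwd[w]), w))


if __name__ == "__main__":
    # Made-up snippets standing in for search-engine results.
    snippets = [
        "Choose Kyoto or Osaka for your trip",
        "Is Osaka or Kyoto better in autumn?",
        "Kyoto or bust: a travel diary",
        "Nara or Kyoto day trips",
        "Kyoto or Nara temples guide",
    ]
    print(extract_candidates("Kyoto", snippets))
```

In this toy input, "bust" follows "Kyoto or" but never precedes "or Kyoto", so it is dropped. Requiring agreement in both directions is what filters out such one-sided matches without any parsing.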