N semantic classes are harder than two

  • Authors: Ben Carterette; Rosie Jones; Wiley Greiner; Cory Barr

  • Affiliations: University of Massachusetts, Amherst, MA; Yahoo! Research, Burbank, CA; Los Angeles Software Inc., Santa Monica, CA; Yahoo! Research, Burbank, CA

  • Venue: COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
  • Year: 2006

Abstract

We show that we can automatically classify semantically related phrases into 10 classes. Classification robustness is improved by training with multiple sources of evidence, including within-document cooccurrence, HTML markup, syntactic relationships in sentences, substitutability in query logs, and string similarity. Our work provides a benchmark for automatic n-way classification into WordNet's semantic classes, both on a TREC news corpus and on a corpus of substitutable search query phrases.