Mining employment market via text block detection and adaptive cross-domain information extraction

  • Authors:
  • Tak-Lam Wong;Wai Lam;Bo Chen

  • Affiliations:
  • The Chinese University of Hong Kong, Hong Kong, Hong Kong;The Chinese University of Hong Kong, Hong Kong, Hong Kong;The Chinese University of Hong Kong, Hong Kong, Hong Kong

  • Venue:
  • Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

We have developed an approach for analyzing online job advertisements in different domains (industries) from different regions worldwide. Our approach is able to extract precise information from the text content supporting useful employment market analysis locally and globally. A major component in our approach is an information extraction framework which is composed of two challenging tasks. The first task is to detect unformatted text blocks automatically based on an unsupervised learning model. Identifying these useful text blocks through this learning model allows the generation of highly effective features for the next task which is text fragment extraction learning. The task of text fragment extraction learning is formulated as a domain adaptation model for text fragment classification. One advantage of our approach is that it can easily adapt to a large number of online job advertisements in different and new domains. Extensive experiments have been conducted to demonstrate the effectiveness and flexibility of our approach.