A pattern-based voting approach for concept discovery on the web

Authors:
Jing Chen;Zhigang Zhang;Qing Li;Xiaoming Li
Affiliations:
Department of Computer Engineering and Information Technology, City University of Hong Kong, Kowloon, Hong Kong;Department of Computer Science and Technology, School of Electronics Engineering and Computer Science, Peking University, Beijing, China;Department of Computer Engineering and Information Technology, City University of Hong Kong, Kowloon, Hong Kong;Department of Computer Science and Technology, School of Electronics Engineering and Computer Science, Peking University, Beijing, China
Venue:
APWeb'05 Proceedings of the 7th Asia-Pacific web conference on Web Technologies Research and Development
Year:
2005

Abstract

Automatic concept discovery is a fundamental task in knowledge capture and ontology engineering, and a key step in many information retrieval applications. Pattern-based and statistics-based approaches are both widely used for this task, and the former have turned out to be more precise. However, the effective patterns in pattern-based approaches are usually defined manually, which takes considerable time and human labor and covers only a limited set of patterns. In our research, we obtain patterns automatically through frequent sequence mining. We then present a voting approach that determines whether a sentence contains a concept and, if so, accurately identifies it. Our algorithm comprises three steps: pattern mining, pattern refining, and concept discovery. In our experimental study, we evaluate the approach with three traditional measures: precision, recall, and F1. The experimental results not only verify the validity of the approach but also illustrate the relationship between its performance and the parameters of the algorithm.
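
The abstract names three steps (pattern mining, pattern refining, concept discovery) without giving details, so the following is only an illustrative sketch and not the authors' algorithm. It assumes single-token concepts, stands in contiguous n-grams for general frequent sequences, treats a minimum-support filter as the refining step, and lets each matching pattern cast one vote. The names `SLOT`, `mine_patterns`, `discover`, `max_len`, `min_support`, and `min_votes` are all hypothetical.

```python
from collections import Counter

SLOT = "<C>"  # hypothetical placeholder marking the concept position


def mine_patterns(training, max_len=4, min_support=2):
    """Mine and refine patterns from labeled sentences.

    training: list of (tokens, concept_index) pairs.
    Returns {pattern_tuple: support}, keeping only patterns that
    occur in at least min_support sentences (a simple refining step).
    """
    counts = Counter()
    for tokens, idx in training:
        # Replace the known concept with the slot marker.
        masked = tokens[:idx] + [SLOT] + tokens[idx + 1:]
        seen = set()
        for n in range(2, max_len + 1):
            for start in range(len(masked) - n + 1):
                gram = tuple(masked[start:start + n])
                if SLOT in gram:
                    seen.add(gram)
        counts.update(seen)  # count each pattern once per sentence
    return {p: s for p, s in counts.items() if s >= min_support}


def discover(tokens, patterns, min_votes=2):
    """Return the token with the most pattern votes, or None."""
    votes = Counter()
    for pattern in patterns:
        n = len(pattern)
        slot = pattern.index(SLOT)
        for start in range(len(tokens) - n + 1):
            window = tokens[start:start + n]
            # The slot matches any token; other positions must be equal.
            if all(p == SLOT or p == w for p, w in zip(pattern, window)):
                votes[window[slot]] += 1
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


# Toy training data: the concept is the token at index 1.
training = [
    ("a wiki is a website".split(), 1),
    ("a blog is a journal".split(), 1),
    ("a router is a device".split(), 1),
]
patterns = mine_patterns(training)
print(discover("a crawler is a program".split(), patterns))  # prints crawler
```

In this toy run, five patterns survive the support filter and all of them vote for "crawler", while "program" receives a single vote from the short pattern ("a", "<C>"). Raising `min_support` or `min_votes` trades recall for precision, which is the kind of parameter dependence the abstract says the experiments examine.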