Web forums have become an important resource on the Web because of the rich information contributed by millions of Internet users every day. At the same time, they contain thousands of junk or valueless messages. Recognizing high-quality topics is therefore a fundamental task for search engines and Web mining systems. However, quantifying topic quality on a web forum is not trivial, and users face a daunting challenge in identifying the small subset of topics worthy of their attention. In this paper, we present several characteristics for measuring topic quality and, based on these characteristics, propose a novel model for recognizing high-quality topics on web forums. The model consists of three steps. First, time series signals that capture characteristics distinguishing high-quality topics from other topics are extracted from the topics. Second, features are obtained from these signals using the Wavelet Packet Transform (WPT). Third, high-quality topics are recognized from these features using a Back-Propagation Neural Network. In experiments on Tencent Message Boards, comprising 2,710,994 messages from 189,962 authors posted between Jan 1, 2005 and Nov 12, 2007, we demonstrate the efficiency of our model: the average accuracy of high-quality topic recognition is 95%, and nearly 50,000 topics can be recognized in one second.
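The second step, turning a topic's time series into wavelet packet features, can be illustrated with a minimal sketch. The abstract does not specify the wavelet, the decomposition depth, the signals, or how features are derived from the coefficients, so everything below is an assumption made for illustration: a Haar wavelet, a two-level decomposition, normalized sub-band energies as features, and a hypothetical `replies_per_day` signal. The neural network classifier of the third step is not shown.

```python
import math


def haar_step(signal):
    """Split an even-length signal into Haar approximation and detail halves."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_packet_leaves(signal, level):
    """Build a full wavelet packet tree and return its leaf nodes.

    Unlike the plain discrete wavelet transform, which only re-splits the
    approximation branch, every node is split again at each level.
    """
    nodes = [list(signal)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def wpt_energy_features(signal, level=2):
    """Return the share of signal energy in each of the 2**level sub-bands."""
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be a multiple of 2**level")
    leaves = wavelet_packet_leaves(signal, level)
    energies = [sum(c * c for c in leaf) for leaf in leaves]
    total = sum(energies)
    if total == 0:
        return [0.0] * len(energies)
    return [e / total for e in energies]


# Hypothetical signal: daily reply counts for one topic over 8 days.
replies_per_day = [12, 9, 7, 7, 3, 2, 1, 1]
features = wpt_energy_features(replies_per_day, level=2)
```

Under these assumptions, `features` is a fixed-length vector of four values summing to 1, which is the kind of input a feed-forward classifier expects. Normalized energies are only one common choice; other statistics of the packet coefficients could serve equally well.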