Real-time content analysis can be a bottleneck in Web filtering. This work presents a simple but effective early decision algorithm that accelerates filtering by examining only part of the Web content. The algorithm makes the filtering decision, either to block or to pass the content, as soon as it is confident with high probability that the content belongs to a banned or an allowable category. Experiments show that the algorithm needs to examine only around one-fourth of the Web content on average, while accuracy remains fairly good: 89% for banned content and 93% for allowable content. The algorithm can complement other Web filtering approaches to filter Web content with high efficiency.
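The early decision idea can be illustrated with a minimal sketch. The abstract does not specify the underlying classifier or the stopping rule, so everything here is an assumption made for illustration: a naive Bayes style score that sums per-token log-odds, a symmetric confidence threshold, and the hypothetical names `early_decision`, `log_odds`, and `confidence`.

```python
import math


def early_decision(tokens, log_odds, prior_log_odds=0.0, confidence=0.95):
    """Scan tokens in order and stop as soon as the decision is confident.

    Assumed model (not taken from the paper): each token contributes an
    additive log-odds value in favour of the banned category; unknown
    tokens contribute nothing.

    Returns a (decision, tokens_examined) pair, where decision is
    "block" or "pass".
    """
    # A posterior of `confidence` corresponds to this log-odds value.
    threshold = math.log(confidence / (1.0 - confidence))
    score = prior_log_odds
    examined = 0
    for examined, token in enumerate(tokens, start=1):
        score += log_odds.get(token, 0.0)
        if score >= threshold:
            return "block", examined   # confident the content is banned
        if score <= -threshold:
            return "pass", examined    # confident the content is allowable
    # The content ended before either threshold was reached:
    # fall back to the sign of the accumulated score.
    return ("block" if score > 0 else "pass"), examined


# Hypothetical per-token log-odds, for illustration only.
weights = {"bad": 2.0, "good": -2.0}
decision, used = early_decision(["bad", "bad", "x", "x"], weights)
print(decision, used)
```

In this sketch the second token already pushes the score past the threshold, so the remaining tokens are never read. Raising `confidence` trades speed for accuracy: more of the content is examined before a decision is made.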