Communications of the ACM - Special issue on parallelism
Parallel free-text search on the connection machine system
Communications of the ACM - Special issue on parallelism
SCISOR: extracting information from on-line news
Communications of the ACM
TCS: a shell for content-based text categorization
Proceedings of the sixth conference on Artificial intelligence applications
Creating segmented databases from free text for text retrieval
SIGIR '91 Proceedings of the 14th annual international ACM SIGIR conference on Research and development in information retrieval
Trading MIPS and memory for knowledge engineering
Communications of the ACM
Prism: A Case-Based Telex Classifier
IAAI '90 Proceedings of the Second Conference on Innovative Applications of Artificial Intelligence
CONSTRUE/TIS: A System for Content-Based Indexing of a Database of News Stories
IAAI '90 Proceedings of the Second Conference on Innovative Applications of Artificial Intelligence
Automated learning of decision rules for text categorization
ACM Transactions on Information Systems (TOIS)
Text categorization for multiple users based on semantic features from a machine-readable dictionary
ACM Transactions on Information Systems (TOIS)
Towards language independent automated learning of text categorization models
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Using IR techniques for text classification in document analysis
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
ACM Computing Surveys (CSUR)
A comparison of classifiers and document representations for the routing problem
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Noise reduction in a statistical approach to text categorization
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Partial orders for document representation: a new methodology for combining document features
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Cluster-based text categorization: a comparison of category search strategies
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Combining classifiers in text categorization
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Feature selection, perceptron learning, and a usability case study for text categorization
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
Abstracting of legal cases: the SALOMON experience
Proceedings of the 6th international conference on Artificial intelligence and law
On-line new event detection and tracking
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Using a generalized instance set for automatic text categorization
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Automatic essay grading using text categorization techniques
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
A re-examination of text categorization methods
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Scalable association-based text classification
Proceedings of the ninth international conference on Information and knowledge management
Web page classification based on k-nearest neighbor approach
IRAL '00 Proceedings of the fifth international workshop on Information retrieval with Asian languages
Automatic categorization of case law
Proceedings of the 8th international conference on Artificial intelligence and law
Machine learning in automated text categorization
ACM Computing Surveys (CSUR)
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Exploiting Hierarchy in Text Categorization
Information Retrieval
Automatic Text Categorization and Its Application to Text Retrieval
IEEE Transactions on Knowledge and Data Engineering
Learning Approaches for Detecting and Tracking News Events
IEEE Intelligent Systems
Text classification using ESC-based stochastic decision lists
Information Processing and Management: an International Journal
Evaluation and Construction of Training Corpuses for Text Classification: A Preliminary Study
NLDB '02 Proceedings of the 6th International Conference on Applications of Natural Language to Information Systems-Revised Papers
An Approach to Improve Text Classification Efficiency
ADBIS '02 Proceedings of the 6th East European Conference on Advances in Databases and Information Systems
Mining HTML Pages to Support Document Sharing in a Cooperative System
EDBT '02 Proceedings of the Workshops XMLDM, MDDE, and YRWS on XML-Based Data Management and Multimedia Engineering-Revised Papers
A Linear Text Classification Algorithm Based on Category Relevance Factors
ICADL '02 Proceedings of the 5th International Conference on Asian Digital Libraries: Digital Libraries: People, Knowledge, and Technology
Chinese Documents Classification Based on N-Grams
CICLing '02 Proceedings of the Third International Conference on Computational Linguistics and Intelligent Text Processing
A Machine Learning Approach to Web Mining
AI*IA '99 Proceedings of the 6th Congress of the Italian Association for Artificial Intelligence on Advances in Artificial Intelligence
Pruning Training Corpus to Speedup Text Classification
DEXA '02 Proceedings of the 13th International Conference on Database and Expert Systems Applications
Text categorization based on k-nearest neighbor approach for web site classification
Information Processing and Management: an International Journal
Journal of the American Society for Information Science and Technology
A maximal figure-of-merit learning approach to text categorization
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
CBC: Clustering Based Text Classification Requiring Minimal Labeled Data
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Efficient multi-way text categorization via generalized discriminant analysis
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Event detection from online news documents for supporting environmental scanning
Decision Support Systems - Special issue: Knowledge management technique
Document classification using domain specific kanji characters extracted by χ² method
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
An adaptive k-nearest neighbor text categorization strategy
ACM Transactions on Asian Language Information Processing (TALIP)
A text categorization based on summarization technique
RANLPIR '00 Proceedings of the ACL-2000 workshop on Recent advances in natural language processing and information retrieval: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 11
Large-scale text categorization by batch mode active learning
Proceedings of the 15th international conference on World Wide Web
ACM Transactions on Information Systems (TOIS)
A fuzzy clustering approach for finding similar documents using a novel similarity measure
Expert Systems with Applications: An International Journal
Using hypothesis margin to boost centroid text classifier
Proceedings of the 2007 ACM symposium on Applied computing
A machine learning approach to web page filtering using content and structure analysis
Decision Support Systems
A new approach on search for similar documents with multiple categories using fuzzy clustering
Expert Systems with Applications: An International Journal
Designing evolving user profile in e-CRM with dynamic clustering of Web documents
Data & Knowledge Engineering
Index based approach for categorizing online news articles
CEA'08 Proceedings of the 2nd WSEAS International Conference on Computer Engineering and Applications
Effective spam filtering: A single-class learning and ensemble approach
Decision Support Systems
Text categorization via generalized discriminant analysis
Information Processing and Management: an International Journal
Construction of supervised and unsupervised learning systems for multilingual text categorization
Expert Systems with Applications: An International Journal
Scalable Web Mining with Newistic
PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Sales Intelligence Using Web Mining
ICDM '09 Proceedings of the 9th Industrial Conference on Advances in Data Mining. Applications and Theoretical Aspects
Hierarchical Bayesian clustering for automatic text classification
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Random-walk term weighting for improved text classification
TextGraphs-1 Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing
Categorization of news articles using neural text categorizer
FUZZ-IEEE'09 Proceedings of the 18th international conference on Fuzzy Systems
Modelling users' interests and needs for an adaptive online information system
UM'03 Proceedings of the 9th international conference on User modeling
Topic tracking based on keywords dependency profile
AIRS'08 Proceedings of the 4th Asia information retrieval conference on Information retrieval technology
Profile based algorithm to topic spotting in Reuter21578
ICIC'09 Proceedings of the Intelligent computing 5th international conference on Emerging intelligent computing technology and applications
Text and hypertext categorization
Artificial intelligence
Connecting the dots between news articles
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Cross-lingual text categorization: Conquering language boundaries in globalized environments
Information Processing and Management: an International Journal
Connecting Two (or Less) Dots: Discovering Structure in News Articles
ACM Transactions on Knowledge Discovery from Data (TKDD)
An improved kNN algorithm – fuzzy kNN
CIS'05 Proceedings of the 2005 international conference on Computational Intelligence and Security - Volume Part I
An adaptive fuzzy kNN text classifier
ICCS'06 Proceedings of the 6th international conference on Computational Science - Volume Part III
Semantic search in the World News domain using automatically extracted metadata files
Knowledge-Based Systems
AIS'04 Proceedings of the 13th international conference on AI, Simulation, and Planning in High Autonomy Systems
A three-way decision approach to email spam filtering
AI'10 Proceedings of the 23rd Canadian conference on Advances in Artificial Intelligence
Automated text classification using a dynamic artificial neural network model
Expert Systems with Applications: An International Journal
A Semantic Triplet Based Story Classifier
ASONAM '12 Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)
Exploiting poly-lingual documents for improving text categorization effectiveness
Decision Support Systems
Cost-sensitive three-way email spam filtering
Journal of Intelligent Information Systems
We describe a method for classifying news stories using Memory Based Reasoning (MBR), a k-nearest neighbor method that does not require manual topic definitions. Using an already coded training database of about 50,000 stories from the Dow Jones Press Release News Wire, and SEEKER [Stanfill] (a text retrieval system that supports relevance feedback) as the underlying match engine, we assign codes to new, unseen stories with a recall of about 80% and a precision of about 70%. There are about 350 different codes to be assigned. Using a massively parallel supercomputer, we leverage the information already contained in the thousands of coded stories and are able to code a story in about 2 seconds. Given SEEKER, the text retrieval system, we achieved these results in about two person-months. We believe this approach is effective in reducing the development time needed to implement classification systems involving a large number of topics, for purposes such as classification and message routing.
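
The k-nearest neighbor idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how SEEKER scores matches or how neighbor codes are combined, so the bag-of-words cosine similarity, the similarity-weighted voting, and the names `vectorize`, `cosine`, `assign_codes`, `precision_recall`, `k`, and `threshold` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words term counts (assumed stand-in for SEEKER's matching)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_codes(story, training, k=3, threshold=0.5):
    """Assign codes to `story` from its k most similar coded stories.

    `training` is a list of (text, set_of_codes) pairs. Each neighbor votes
    for its codes with a weight equal to its similarity; a code is kept when
    its share of the total neighbor similarity reaches `threshold`.
    (The voting rule is an assumption, not taken from the abstract.)
    """
    query = vectorize(story)
    scored = [(cosine(query, vectorize(text)), codes) for text, codes in training]
    neighbors = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return set()  # nothing in the training set resembles the story

    votes = defaultdict(float)
    for sim, codes in neighbors:
        for code in codes:
            votes[code] += sim
    return {code for code, score in votes.items() if score / total >= threshold}


def precision_recall(predicted, actual):
    """Precision and recall of a predicted code set against the true codes."""
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Toy training data, invented for this example only.
training = [
    ("oil prices rise as crude supply falls", {"energy"}),
    ("crude oil exports and pipeline supply", {"energy", "trade"}),
    ("bank raises interest rates", {"finance"}),
    ("central bank interest rates decision", {"finance"}),
]

codes = assign_codes("crude oil supply shrinks", training, k=3, threshold=0.6)
print(codes)
print(precision_recall(codes, {"energy"}))
```

Raising `threshold` trades recall for precision, which is one plausible way to reach an operating point like the roughly 80% recall and 70% precision reported above. No topic definitions are written by hand; the coded training stories carry all the knowledge.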