Information filtering and information retrieval: two sides of the same coin?
Communications of the ACM - Special issue on information filtering
Agents that reduce work and information overload
Communications of the ACM
GroupLens: applying collaborative filtering to Usenet news
Communications of the ACM
Generating finite-state transducers for semi-structured data extraction from the Web
Information Systems - Special issue on semistructured data
Wrapper induction: efficiency and expressiveness
Artificial Intelligence - Special issue on Intelligent internet systems
Learning to construct knowledge bases from the World Wide Web
Artificial Intelligence - Special issue on Intelligent internet systems
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Expected Error Analysis for Model Selection
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Maximum Entropy Markov Models for Information Extraction and Segmentation
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
A Unifying Approach to HTML Wrapper Representation and Learning
DS '00 Proceedings of the Third International Conference on Discovery Science
Active Hidden Markov Models for Information Extraction
IDA '01 Proceedings of the 4th International Conference on Advances in Intelligent Data Analysis
Message Understanding Conference-6: a brief history
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Syskill & webert: Identifying interesting web sites
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
WISDOM: Web Intrapage Informative Structure Mining Based on Document Object Model
IEEE Transactions on Knowledge and Data Engineering
Ex-ray: Data mining and mental health
Applied Soft Computing
Hi-index | 0.00 |
Generating press clippings for companies manually requires a considerable amount of resources. We describe a system that monitors online newspapers and discussion boards automatically. The system extracts, classifies and analyzes messages and generates press clippings automatically, taking the specific needs of client companies into account. Key components of the system are a spider, an information extraction engine, a text classifier based on the Support Vector Machine that categorizes messages by subject, and a second classifier that analyzes which emotional state the author of a newsgroup posting was likely to be in. By analyzing large amount of messages, the system can summarize the main issues that are being reported on for given business sectors, and can summarize the emotional attitude of customers and shareholders towards companies.