Data-mining techniques that detect trends and patterns in structured data are often ill-suited for analysis of unstructured text. Information critical to business, generated by groups such as employees, customers, and the public, appears in such forms as chats, electronic discussion forums, and blogs. This paper describes techniques developed to detect themes and trends in such informal communication streams. Our approach begins with unsupervised text clustering to create initial categories. A human analyst then refines the categories into easily understandable themes. To facilitate this process, we developed an interactive approach to text category creation and validation that aids the analyst in evaluating each category of a taxonomy and makes it possible to visualize relationships among categories. The resulting analysis can then be communicated to participants in real time. We report on the results of using these techniques in IBM companywide "Jam" events, during which tens of thousands of employees worldwide participated in electronic discussions of key business issues.
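As a rough illustration of the first step (unsupervised clustering to produce initial categories for an analyst to refine), the sketch below groups short posts using TF-IDF term weights and k-means with cosine similarity. The abstract does not say which weighting scheme or clustering algorithm was used, so both choices are assumptions made here for illustration. The stopword list, the function names, and the sample posts are hypothetical as well.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "to", "of", "and", "in", "is", "for", "on", "we", "our"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def normalize(vec):
    """Scale a sparse vector (dict of term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def tfidf_vectors(docs):
    """Build unit-length TF-IDF vectors, with idf = log(N / df) + 1."""
    token_lists = [tokenize(d) for d in docs]
    df = Counter(term for tokens in token_lists for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        vec = {t: c * (math.log(n / df[t]) + 1.0) for t, c in tf.items()}
        vectors.append(normalize(vec))
    return vectors


def cosine(u, v):
    """Dot product of two unit-length sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def kmeans(vectors, k, iterations=10):
    """Cosine k-means, seeded deterministically with the first k documents."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each document to its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors]
        # Recompute each centroid as the normalized sum of its members.
        for c in range(k):
            total = Counter()
            for v, label in zip(vectors, labels):
                if label == c:
                    for t, w in v.items():
                        total[t] += w
            if total:
                centroids[c] = normalize(total)
    return labels, centroids


def top_terms(centroid, n=4):
    """Highest-weighted terms of a centroid: candidate labels for the analyst."""
    return sorted(centroid, key=centroid.get, reverse=True)[:n]


# Hypothetical sample posts, invented for this example.
posts = [
    "Customers want faster support response on billing questions",
    "Employees suggest remote work policy and flexible hours",
    "Billing support response time frustrates customers",
    "Flexible hours and remote work improve employee morale",
]
labels, centroids = kmeans(tfidf_vectors(posts), k=2)
for c, centroid in enumerate(centroids):
    print(c, top_terms(centroid))
```

The top terms of each cluster serve only as a starting point. In the workflow described above, a human analyst would still rename, merge, or split these machine-generated categories to arrive at understandable themes.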