Learning trees and rules with set-valued features

Authors:
William W. Cohen
Affiliations:
AT&T Laboratories, Murray Hill, NJ
Venue:
AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
Year:
1996

Cited by:
86
References:
Generalized subsumption and its applications to induction and redundancy. Artificial Intelligence.
A bootstrapping approach to conceptual clustering. Proceedings of the sixth international workshop on Machine learning.
Detecting and correcting errors in rule-based expert systems: an integration of empirical and explanation-based learning. Knowledge Acquisition.
PAC-learnability of determinate logic programs. COLT '92 Proceedings of the fifth annual workshop on Computational learning theory.
Representation and learning in information retrieval.
Learning Boolean Functions in an Infinite Attribute Space. Machine Learning.
C4.5: programs for machine learning.
Automated learning of decision rules for text categorization. ACM Transactions on Information Systems (TOIS).
Pac-learning nondeterminate clauses. AAAI '94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 1).
An introduction to computational learning theory.
Context-sensitive learning methods for text categorization. SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval.
Creating a Memory of Causal Relationships: An Integration of Empirical and Explanation-Based Learning Methods.
Information Retrieval.
Inductive Logic Programming: Techniques and Applications.
Learning Logical Definitions from Relations. Machine Learning.
The CN2 Induction Algorithm. Machine Learning.
Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm. Machine Learning.
Background Knowledge and Declarative Bias in Inductive Concept Learning. AII '92 Proceedings of the International Workshop on Analogical and Inductive Inference.



Abstract

In most learning systems, examples are represented as fixed-length "feature vectors", the components of which are either real numbers or nominal values. We propose an extension of the feature-vector representation that allows the value of a feature to be a set of strings; for instance, to represent a small white and black dog with the nominal features size and species and the set-valued feature color, one might use a feature vector with size=small, species=canis-familiaris and color={white, black}. Since we make no assumptions about the number of possible set elements, this extension of the traditional feature-vector representation is closely connected to Blum's "infinite attribute" representation. We argue that many decision tree and rule learning algorithms can be easily extended to set-valued features. We also show by example that many real-world learning problems can be efficiently and naturally represented with set-valued features; in particular, text categorization problems and problems that arise in propositionalizing first-order representations lend themselves to set-valued features.
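
As a rough illustration of the representation described in the abstract, the sketch below stores a set-valued feature as a Python frozenset and scores candidate tests for a tree or rule learner. It assumes the learner splits on membership tests of the form "string e is in the set-valued feature", scored by information gain; that choice, the helper names entropy and best_membership_test, and the toy spam/ham data are all assumptions made for illustration and are not taken from the abstract.

```python
from collections import Counter
from math import log2

# The dog example from the abstract: two nominal features, one set-valued feature.
dog = {"size": "small", "species": "canis-familiaris",
       "color": frozenset({"white", "black"})}

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels; 0.0 for an empty list."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_membership_test(examples, labels, feature):
    """Return (element, gain) for the test `element in example[feature]`
    with the highest information gain (hypothetical helper, assumed criterion)."""
    base = entropy(labels)
    # Candidate elements are only those observed in the data, so no fixed
    # vocabulary of possible set elements is needed in advance.
    candidates = set().union(*(ex[feature] for ex in examples))
    best = (None, 0.0)
    for e in sorted(candidates):
        inside = [y for ex, y in zip(examples, labels) if e in ex[feature]]
        outside = [y for ex, y in zip(examples, labels) if e not in ex[feature]]
        remainder = (len(inside) * entropy(inside)
                     + len(outside) * entropy(outside)) / len(labels)
        gain = base - remainder
        if gain > best[1]:
            best = (e, gain)
    return best

# Toy text-categorization data: each document is the set of words it contains.
docs = [{"words": frozenset({"cheap", "pills"})},
        {"words": frozenset({"meeting", "agenda"})},
        {"words": frozenset({"cheap", "offer"})},
        {"words": frozenset({"agenda", "notes"})}]
classes = ["spam", "ham", "spam", "ham"]
element, gain = best_membership_test(docs, classes, "words")
```

Because candidate elements are collected from the training examples themselves, the sketch never enumerates a fixed universe of set elements, which mirrors the abstract's remark that no assumption is made about how many elements are possible.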