Mining knowledge from text using information extraction

Authors:
Raymond J. Mooney; Razvan Bunescu
Affiliations:
University of Texas at Austin, Austin, TX; University of Texas at Austin, Austin, TX
Venue:
ACM SIGKDD Explorations Newsletter - Natural language processing and text mining
Year:
2005

Abstract

An important approach to text mining involves the use of natural-language information extraction. Information extraction (IE) distills structured data or knowledge from unstructured text by identifying references to named entities as well as stated relationships between such entities. IE systems can be used to extract abstract knowledge directly from a text corpus, or to extract concrete data from a set of documents and then analyze that data with traditional data-mining techniques to discover more general patterns. We discuss methods and implemented systems for both of these approaches and summarize results on mining real text corpora of biomedical abstracts, job announcements, and product descriptions. We also discuss challenges that arise when employing current information extraction technology to discover knowledge in text.
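
The second approach described above (extract concrete data, then mine it for general patterns) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper's systems: the slot names, the hand-written regular expressions standing in for a learned extractor, the toy job announcements, and the simple support/confidence thresholds.

```python
import re
from collections import Counter
from itertools import permutations

# Hypothetical slot patterns. A real IE system would typically learn its
# extractors from annotated text rather than rely on fixed word lists.
SLOT_PATTERNS = {
    "language": re.compile(r"\b(Java|Perl|Python)\b"),
    "platform": re.compile(r"\b(Linux|Windows)\b"),
    "area": re.compile(r"\b(Web|Database)\b"),
}

def extract(doc):
    """Step 1 (extraction): fill each slot with the strings found in the text."""
    return {slot: set(p.findall(doc)) for slot, p in SLOT_PATTERNS.items()}

def to_items(record):
    """Flatten a filled template into a set of 'slot=value' items."""
    return {f"{slot}={value}" for slot, values in record.items() for value in values}

def mine_rules(records, min_support=2, min_confidence=0.6):
    """Step 2 (mining): find one-to-one rules 'a -> b' over the extracted items."""
    item_counts = Counter()
    pair_counts = Counter()
    for record in records:
        items = to_items(record)
        item_counts.update(items)
        # Ordered pairs, so both a -> b and b -> a are candidate rules.
        pair_counts.update(permutations(sorted(items), 2))
    rules = []
    for (a, b), support in pair_counts.items():
        confidence = support / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)

# Toy corpus, invented for this example.
DOCS = [
    "Seeking a Java developer for Web applications on Linux.",
    "Perl and Java programmer needed, Linux experience required.",
    "Database administrator, Windows, some Perl scripting.",
]

records = [extract(doc) for doc in DOCS]
rules = mine_rules(records)
for a, b, support, confidence in rules:
    print(f"{a} -> {b} (support={support}, confidence={confidence:.2f})")
```

On this toy corpus the only rules that clear both thresholds relate `language=Java` and `platform=Linux`, in each direction. The point of the sketch is the division of labor: extraction turns free text into database-like records, and a standard mining step then runs over those records without needing to look at the text again.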