This paper presents a new two-phase pattern (2PP) discovery technique for information extraction. 2PP consists of orthographic pattern discovery (OPD) and semantic pattern discovery (SPD): OPD determines the structural features of an identified region of a document, and SPD discovers a dominant semantic pattern for that region via inference, apposition, and analogy. The discovered pattern is then applied back to the region to extract the required data items through pattern matching. We evaluated 2PP on 6,500 data items and obtained effective results.
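To make the discover-then-apply loop concrete, the following is a minimal sketch of the orthographic side only. It is not the authors' algorithm: the token classes (`NUM`, `UPPER`, `CAP`, `LOWER`, `SYM`), the function names, and the choice of "most frequent line shape" as the dominant pattern are all assumptions made for illustration, and the semantic phase (inference, apposition, analogy) is not modeled at all.

```python
import re
from collections import Counter

# Hypothetical tokenizer: runs of letters, runs of digits, or single symbols.
TOKEN_RE = re.compile(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]")


def token_shape(token):
    """Map a token to a coarse orthographic class (assumed classes)."""
    if token.isdigit():
        return "NUM"
    if token.isalpha():
        if token.isupper():
            return "UPPER"
        if token[0].isupper():
            return "CAP"
        return "LOWER"
    return "SYM"


def line_shape(line):
    """Describe a line as the sequence of its token shapes."""
    return tuple(token_shape(t) for t in TOKEN_RE.findall(line))


def dominant_pattern(region):
    """Return the most frequent line shape in the region, or None if empty."""
    counts = Counter(line_shape(line) for line in region if line.strip())
    if not counts:
        return None
    pattern, _ = counts.most_common(1)[0]
    return pattern


def extract(region, pattern):
    """Apply the pattern back to the region: keep tokens of matching lines."""
    return [TOKEN_RE.findall(line) for line in region
            if line_shape(line) == pattern]


if __name__ == "__main__":
    region = [
        "Alice Smith, 42",
        "Bob Jones, 37",
        "Contact us today",
        "Carol White, 29",
    ]
    pattern = dominant_pattern(region)
    for tokens in extract(region, pattern):
        print(tokens)
```

In this sketch the line "Contact us today" is dropped because its shape differs from the dominant one; a fuller treatment would need the semantic phase to decide which matched tokens are the required data items.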