Many real-world texts contain tables. To process these texts correctly and extract the information the tables contain, it is important to identify both the presence and the structure of tables. In this paper, we present a new approach that learns to recognize tables in free text, including the boundary, rows, and columns of each table. When tested on Wall Street Journal news documents, our learning approach outperforms a deterministic table recognition algorithm that identifies tables based on a fixed set of conditions. Our learning approach is also more flexible and adapts easily to texts in different domains with different table characteristics.