Tables are a very common presentation scheme, yet few papers address table extraction in text data mining. This paper focuses on mining tables from large-scale HTML texts and discusses table filtering, recognition, interpretation, and presentation. Heuristic rules and cell similarities are employed to identify tables; table recognition achieves an F-measure of 86.50%. We also propose an algorithm to capture attribute-value relationships among table cells. Finally, more structured data is extracted and presented.
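To illustrate how heuristic rules and cell similarity might combine to separate data tables from layout tables, here is a minimal Python sketch. It is not the paper's method: the abstract does not specify the rules or the similarity measure, so the coarse cell types, the function names (`cell_type`, `column_similarity`, `is_data_table`), the size limits, and the 0.6 threshold are all assumptions made for illustration. The input is assumed to be a table already parsed into rows of cell strings. The `f_measure` helper is the standard harmonic mean of precision and recall, the metric the abstract reports.

```python
import re

# Assumption: a cell counts as numeric if it holds only digits, signs,
# separators, and a few currency/percent symbols, with at least one digit.
NUMERIC = re.compile(r"^[\s$€£%+\-.,\d]+$")


def cell_type(cell: str) -> str:
    """Map a cell string to a coarse type: 'empty', 'number', or 'text'."""
    text = cell.strip()
    if not text:
        return "empty"
    if NUMERIC.match(text) and any(ch.isdigit() for ch in text):
        return "number"
    return "text"


def column_similarity(rows):
    """Fraction of vertically adjacent cell pairs sharing the same coarse type."""
    same = total = 0
    for upper, lower in zip(rows, rows[1:]):
        for a, b in zip(upper, lower):
            total += 1
            same += cell_type(a) == cell_type(b)
    return same / total if total else 0.0


def is_data_table(rows, min_rows=2, min_cols=2, threshold=0.6):
    """Hypothetical filter: a size rule followed by a similarity rule."""
    # Heuristic rule (assumed): very small grids are treated as layout.
    if len(rows) < min_rows or min(len(r) for r in rows) < min_cols:
        return False
    # Similarity rule (assumed): columns of a data table are type-consistent.
    return column_similarity(rows) >= threshold


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    product_table = [
        ["Model", "Price", "Weight"],
        ["A100", "$350", "1.2"],
        ["B200", "$410", "1.5"],
        ["C300", "$299", "0.9"],
    ]
    layout_table = [["", "Latest news"], ["12", ""]]
    print(is_data_table(product_table), is_data_table(layout_table))
```

In this sketch the header row lowers the similarity score because its text cells sit above numeric columns, which is one reason a real system would likely treat header detection as a separate step before scoring the body.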