Deep Web sites expose data from an underlying database whose conceptual model remains hidden. Access to that model is necessary for several tasks, such as integrating different web sites, extracting information from the web without supervision, or creating ontologies. In this paper, we propose a technique to discover the conceptual model behind a Deep Web site, using a statistical approach to discover relationships between entities. Our proposal is unsupervised and does not require the user to have expert knowledge. Rather than focusing on a single view of the database, it integrates all views containing entities and relationships that are exposed in the web site.