Improving web data annotations with spreading activation

Authors:
Fatih Gelgi;Srinivas Vadrevu;Hasan Davulcu
Affiliations:
Department of Computer Science and Engineering, Arizona State University, Tempe, AZ;Department of Computer Science and Engineering, Arizona State University, Tempe, AZ;Department of Computer Science and Engineering, Arizona State University, Tempe, AZ
Venue:
WISE'05 Proceedings of the 6th international conference on Web Information Systems Engineering
Year:
2005

Abstract

The Web has established itself as the largest public data repository ever available. Even though the vast majority of information on the Web is formatted to be easily readable by the human eye, “meaningful information” is still largely inaccessible to computer applications. In this paper, we present automated algorithms to gather meta-data and instance information by utilizing global regularities on the Web and incorporating contextual information. Our system is distinguished by the fact that it does not require domain-specific engineering. Experimental evaluations were successfully performed on the TAP knowledge base and the faculty-course home pages of computer science departments containing 16,861 Web pages.