Knowledge capture approaches in the age of massive Web data require robust and scalable mechanisms to acquire, consolidate and pre-process large amounts of heterogeneous data, both unstructured and structured. This paper addresses this requirement by introducing the Extensible Web Retrieval Toolkit (eWRT), a modular Python API for retrieving social data from Web sources such as Delicious, Flickr, Yahoo! and Wikipedia. eWRT has been released as an open-source library under the GNU GPLv3. It includes classes for caching and data management, and provides low-level text processing capabilities, including language detection, phonetic string similarity measures, and string normalization.
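
To make the low-level text processing capabilities more concrete, the following is a minimal sketch of string normalization, a phonetic similarity check, and result caching, written with the Python standard library only. It does not use or reproduce eWRT's actual API: the function names `normalize`, `soundex`, and `phonetically_similar` are hypothetical, and American Soundex is assumed here as one common phonetic measure, not necessarily the one eWRT implements.

```python
import unicodedata
from functools import lru_cache


def normalize(text: str) -> str:
    """Strip accents, lowercase, and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(without_marks.lower().split())


# Digit classes of American Soundex; vowels, h, w and y carry no digit.
_SOUNDEX_CODES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


@lru_cache(maxsize=1024)  # simple in-memory cache for repeated lookups
def soundex(word: str) -> str:
    """Return the four-character Soundex code of a word ('' if no letters)."""
    letters = [c for c in normalize(word) if "a" <= c <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = _SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = _SOUNDEX_CODES.get(c, "")
        if digit and digit != previous:
            code += digit
        if c not in "hw":  # h and w do not separate equal digits
            previous = digit
    return (code + "000")[:4]


def phonetically_similar(a: str, b: str) -> bool:
    """Hypothetical helper: treat two words as similar if their codes match."""
    return soundex(a) != "" and soundex(a) == soundex(b)
```

For example, `phonetically_similar("Robert", "Rupert")` holds because both names map to the same code, while `normalize` maps differently accented or spaced spellings of a tag to one canonical form before comparison.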