As web users disseminate more of their personal information on the web, the risk that they become victims of lateral surveillance and identity theft increases. Web resources containing this personal information, which we refer to as identity web references, must therefore be found and disambiguated to produce a unary set of web resources that refer to a given person. The scale of the web makes it infeasible for users to monitor their identity web references manually, so automated approaches are required. However, automated approaches require background knowledge about the person whose identity web references are to be disambiguated. In this paper we present a detailed approach for monitoring the web presence of a given individual, obtaining background knowledge from Web 2.0 platforms to support automated disambiguation. We present a methodology for generating this background knowledge by exporting data from multiple Web 2.0 platforms as RDF data models and combining these models for use as seed data. We present two disambiguation techniques: the first uses a semi-supervised machine learning technique known as Self-training, and the second uses a graph-based technique known as Random Walks. We explain how the semantics of the data support the intrinsic functionalities of these techniques. We compare the performance of both techniques against several baseline measures, including human processing of the same data. We achieve an average precision of 0.935 for Self-training and an average f-measure of 0.705 for Random Walks, in both cases outperforming several baseline measures.
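To make the graph-based idea concrete, the following is a minimal illustrative sketch of a random walk with restart over a graph that links seed data and candidate web pages through shared features. It is not the paper's implementation: the abstract does not specify how the graph is built or which walk-based measure is used, so the function names, the restart probability of 0.15, the iteration count, and the toy node labels are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def random_walk_with_restart(graph, seeds, restart=0.15, iterations=50):
    """Approximate visit probabilities of a walk that restarts at the seeds.

    Assumes every seed is a node of the graph and every node has at least
    one neighbour (true for any graph produced by build_graph).
    """
    nodes = list(graph)
    restart_mass = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart_mass)
    for _ in range(iterations):
        # With probability `restart` the walker jumps back to a seed node.
        nxt = {n: restart * restart_mass[n] for n in nodes}
        # Otherwise it moves to a uniformly chosen neighbour.
        for n in nodes:
            share = (1.0 - restart) * scores[n] / len(graph[n])
            for m in graph[n]:
                nxt[m] += share
        scores = nxt
    return scores


# Hypothetical toy data: "seed" stands for the background knowledge about a
# person; "page1" and "page2" are candidate web resources; the remaining
# nodes are features extracted from them.
edges = [
    ("seed", "name:Alice Smith"),
    ("seed", "email:alice@example.org"),
    ("seed", "knows:Bob"),
    ("page1", "name:Alice Smith"),
    ("page1", "email:alice@example.org"),
    ("page1", "knows:Bob"),
    ("page2", "name:Alice Smith"),
    ("page2", "knows:Carol"),
]
graph = build_graph(edges)
scores = random_walk_with_restart(graph, seeds={"seed"})

for page in ("page1", "page2"):
    print(page, round(scores[page], 4))
```

In this toy graph, page1 shares three features with the seed data while page2 shares only the name, so the walk visits page1 more often. A threshold or ranking over such scores is one plausible way to decide which pages refer to the person, although the decision rule here is an assumption rather than something stated in the abstract.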