Web content outlier mining through mathematical approach and trust rating

  • Authors:
  • G. Poonkuzhali;K. Sarukesi;G. V. Uma

  • Affiliations:
  • Department of Computer Science and Engineering, Rajalakshmi Engineering College, Anna University, Chennai, Tamil Nadu, India;Hindustan Institute of Technology and Science, Chennai, Tamil Nadu, India;Department of Information Science & Technology, Anna University, Chennai, Tamil Nadu, India

  • Venue:
  • ACACOS'11 Proceedings of the 10th WSEAS international conference on Applied computer and applied computational science
  • Year:
  • 2011

Abstract

In this Internet era, the WWW is flooded with a voluminous amount of information, much of it in replicated and irrelevant web pages. Because unnecessary and duplicated web pages increase indexing space and time complexity, finding and removing these pages has become a significant issue in the information retrieval and web mining research communities, as most people rely on search engines to get the information they need. Web content outlier mining plays a decisive role in addressing these problems. Existing algorithms for web content outlier mining focus on applying weightage only to structured documents. In this research work, a mathematical approach based on two-way rectangular representations, a signed approach to trust rating, and a correlation method is developed for retrieving the right information, without duplicates, from both structured and unstructured web documents.
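
The abstract names a correlation method for detecting duplicates but does not give its details. As a rough illustration only, the sketch below flags near-duplicate documents by computing the Pearson correlation between their term-count vectors. The tokenizer, the 0.9 threshold, and the function names (`term_counts`, `pearson`, `drop_near_duplicates`) are assumptions made for this example and are not taken from the paper; the two-way rectangular representation and the trust rating are not modeled here.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count alphanumeric tokens (assumed tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def pearson(a, b):
    """Pearson correlation of two term-count Counters over their joint vocabulary."""
    if a == b:
        # Identical count vectors: treat as perfectly correlated,
        # even when the variance below would be zero.
        return 1.0
    vocab = sorted(set(a) | set(b))
    x = [a[t] for t in vocab]  # Counter yields 0 for missing terms
    y = [b[t] for t in vocab]
    n = len(vocab)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant vector; fall back to 0.0.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_near_duplicates(docs, threshold=0.9):
    """Keep a document only if it correlates below `threshold` with every kept one."""
    kept, kept_counts = [], []
    for doc in docs:
        counts = term_counts(doc)
        if all(pearson(counts, other) < threshold for other in kept_counts):
            kept.append(doc)
            kept_counts.append(counts)
    return kept


docs = [
    "web mining finds outlier pages pages",
    "web mining finds outlier pages pages",
    "cooking recipes for pasta",
]
unique_docs = drop_near_duplicates(docs)
```

With these three sample strings, the second document is an exact copy of the first and is dropped, while the unrelated third document is kept. Because the function works on raw text, it applies equally to content extracted from structured or unstructured pages, although a real system would need a more careful tokenizer and a tuned threshold.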